Dr. Mazhar Ali Dootio: Unicode-8 based linguistics data set of annotated Sindhi text

Wednesday, 24 October 2018

Unicode-8 based linguistics data set of annotated Sindhi text

https://www.sciencedirect.com/science/article/pii/S2352340918305687

The Sindhi Unicode-8 based linguistics data set is a multi-class, multi-featured data set. It was developed to address natural language processing (NLP) and linguistics problems of the Sindhi language. The data set presents information on the grammatical and morphological structure of Sindhi text, as well as the sentiment polarity of Sindhi lexicons. It may therefore be used for information retrieval, machine translation, lexicon analysis, language modeling, grammatical and morphological analysis, and semantic and sentiment analysis.

To read the article, please open the link given above.

Dr. Mazhar Ali Dootio

