Nep-gLUE Benchmark:

This project brings together several Natural Language Understanding (NLU) tasks for Nepali in one place. The benchmark currently contains four tasks: Named Entity Recognition (NER), Part-of-Speech Tagging (POS), Content Classification (CC), and Categorical Pair Similarity (CPS). We will continue to add tasks; the benchmark is a work in progress. All scores are calculated using the macro-F1 metric.
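For readers unfamiliar with the metric, here is a minimal sketch of macro-F1: compute F1 separately for each class, then take the unweighted mean so that rare classes count as much as frequent ones. The function name `macro_f1` and the toy labels are illustrative assumptions, not part of the benchmark code. The sketch scores individual labels, so the project's own evaluation scripts may differ (for example, in how NER spans are scored).

```python
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list.

    Illustrative helper (hypothetical name); not the benchmark's own scorer.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")

    tp = defaultdict(int)  # true positives per class
    fp = defaultdict(int)  # false positives per class
    fn = defaultdict(int)  # false negatives per class
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1
            fn[gold] += 1

    per_class = []
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 if the class never occurs
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy example with made-up labels
gold = ["PER", "LOC", "O", "O"]
pred = ["PER", "O", "O", "O"]
score = macro_f1(gold, pred)
print(f"macro-F1 = {100 * score:.2f}")
```

The result is multiplied by 100 when printed only to match the percentage scale used in the results table.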

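| Model | Params | NER | POS | CPS | CC | Nep-gLUE Score |
| --- | --- | --- | --- | --- | --- | --- |
| multilingual BERT | 172M | 85.45 | 94.65 | 93.60 | 91.08 | 91.19 |
| XLM-R (base) | 270M | 87.59 | 94.88 | 93.65 | 92.33 | 92.11 |
| NepBERT | 110M | 79.12 | 90.63 | 91.05 | 90.98 | 87.94 |
| NepaliBERT | 110M | 82.45 | 91.67 | 89.46 | 90.10 | 88.42 |
| NepBERTa (Ours) | 110M | 91.09 | 95.56 | 94.42 | 93.13 | 93.55 |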
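The text does not say how the Nep-gLUE Score column is derived. The reported values are consistent with a plain average of the four task scores, and the sketch below checks that reading. Treat the averaging rule as an assumption inferred from the numbers, not a documented definition; the names `RESULTS` and `task_average` are illustrative.

```python
from statistics import mean

# Scores copied from the table above: (NER, POS, CPS, CC), then the reported
# Nep-gLUE Score. The averaging rule tested here is an assumption.
RESULTS = {
    "multilingual BERT": ((85.45, 94.65, 93.60, 91.08), 91.19),
    "XLM-R (base)": ((87.59, 94.88, 93.65, 92.33), 92.11),
    "NepBERT": ((79.12, 90.63, 91.05, 90.98), 87.94),
    "NepaliBERT": ((82.45, 91.67, 89.46, 90.10), 88.42),
    "NepBERTa (Ours)": ((91.09, 95.56, 94.42, 93.13), 93.55),
}


def task_average(scores):
    """Unweighted mean of the per-task scores."""
    return mean(scores)


for model, (scores, reported) in RESULTS.items():
    avg = task_average(scores)
    print(f"{model}: mean of tasks = {avg:.3f}, reported = {reported:.2f}")
```

Each computed mean lands within 0.01 of the reported score, which is what one would expect if the column is an unweighted task average reported to two decimals.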

The benchmark datasets are available at the following link:

Note: If you would like to contribute your own dataset to the project, or if you obtain improved results on these tasks, please consider emailing us at our address.