Custom Named Entity Recognition for Gujrati Text Using Spacy

Authors

  • Komil B. Vora, Dr. Avani R. Vasant, Dr. Saurabh Shah

DOI:

https://doi.org/10.17762/msea.v71i3.502

Abstract

Named Entity Recognition (NER) is a method for locating named entities (NEs) in a file or an image, recognizing them, and classifying them into specified entity classes such as Name, Location, Organization, Number, and Other. It is among the most useful components of Natural Language Processing (NLP) and greatly simplifies text extraction. This paper addresses NER for the Gujarati language. Little work has been done on Gujarati NER, and no standard dataset is available for it. We therefore created two datasets for Gujarati NER. In this paper, an NER tagger is built using spaCy. The NER tagger is trained with 100% accuracy and is capable of identifying person, location, and organization names. On the news headlines dataset, the NER tagger is able to identify named entities across categories such as entertainment, business, and technical news.
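The abstract does not describe how the two datasets are formatted. As a minimal sketch only, the snippet below shows one common way to prepare NER examples for spaCy: pairing each text with character-offset spans in the form `(text, {"entities": [(start, end, label)]})`. The helper `annotate`, the sample sentence, and the label names `PERSON` and `LOC` are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, label)


def annotate(text: str, entities: List[Tuple[str, str]]) -> Tuple[str, Dict[str, List[Span]]]:
    """Build one spaCy-style training example from (surface form, label) pairs.

    Entities must be listed in the order they appear in the text; each one is
    searched for after the end of the previous match, so spans never overlap.
    """
    spans: List[Span] = []
    cursor = 0
    for surface, label in entities:
        start = text.find(surface, cursor)
        if start == -1:
            raise ValueError(f"{surface!r} not found in text after offset {cursor}")
        end = start + len(surface)
        spans.append((start, end, label))
        cursor = end
    return text, {"entities": spans}


# Hypothetical Gujarati sentence ("Narendra Modi went to Ahmedabad"),
# used only to illustrate the annotation format.
text = "નરેન્દ્ર મોદી અમદાવાદ ગયા"
pairs = [("નરેન્દ્ર મોદી", "PERSON"), ("અમદાવાદ", "LOC")]
example = annotate(text, pairs)

# Each span slices back to the exact surface form it was built from.
for start, end, label in example[1]["entities"]:
    print(label, text[start:end])
```

Offsets here are Python string indices, which count code points. That matters for Gujarati, where a single visible syllable is often several code points, so spans are best computed programmatically rather than counted by hand.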

Published

2022-08-20

How to Cite

Vora, K. B., Vasant, A. R., & Shah, S. (2022). Custom Named Entity Recognition for Gujrati Text Using Spacy. Mathematical Statistician and Engineering Applications, 71(3), 1483–1495. https://doi.org/10.17762/msea.v71i3.502

Section

Articles