Named Entity Recognition (Darija) Project
Named entity recognition (NER) was performed on Darija text to identify entities such as names and locations. The work involved fine-tuning the CAMeLBERT model to improve recognition accuracy. Data preprocessing and annotation were essential steps in preparing the training data for this specialized task.
• Developed custom annotation guidelines to improve labeling consistency.
• Used Python and HuggingFace for model training and evaluation.
• Focused on improving performance on under-represented language data.
• Manually labeled diverse text samples from social media and news articles.
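One preprocessing step that fine-tuning a BERT-style model for NER typically requires is aligning word-level labels with subword tokens. The sketch below illustrates that step only; it is not the project's actual code. The tag set, the example indices, and the helper name align_labels are hypothetical. The word_ids list mimics what a HuggingFace fast tokenizer returns from word_ids(), and -100 is the label value that PyTorch's cross-entropy loss ignores by default.

```python
from typing import List, Optional

IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def align_labels(word_ids: List[Optional[int]], word_labels: List[int]) -> List[int]:
    """Spread word-level label ids over subword tokens.

    word_ids holds one entry per token: the index of the source word,
    or None for special tokens such as [CLS] and [SEP].
    Only the first subword of each word keeps the word's label.
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special token: excluded from the loss.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # First subword of a new word: carries the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Continuation subword: excluded from the loss.
            aligned.append(IGNORE_INDEX)
        previous = word_id
    return aligned


# Hypothetical example: three words, the second split into two subword pieces.
LABELS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]  # assumed tag set
word_labels = [0, 1, 3]                    # O, B-PER, B-LOC
word_ids = [None, 0, 1, 1, 2, None]        # [CLS] w0 w1 ##w1 w2 [SEP]
aligned = align_labels(word_ids, word_labels)
print(aligned)
```

Labeling only the first subword is one common convention; another is to copy the label to every subword of the word. Which one the project used is not stated, so this choice is an assumption.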