LLM Response Evaluation & Text Annotation Project
Annotated and evaluated large-scale text datasets used for supervised fine-tuning and instruction tuning of language models. Performed response grading, prompt–response alignment checks, taxonomy-based labeling, and peer QA reviews. Maintained 98%+ annotation accuracy while adhering to strict quality guidelines and SLA-driven productivity benchmarks. Flagged ambiguous and edge-case outputs to improve annotation guideline clarity and downstream model performance.