LLM Evaluation & Prompt-Response Annotation
Worked on LLM training workflows, evaluating and ranking AI-generated responses for accuracy, relevance, tone, and safety. Wrote prompt-response pairs for supervised fine-tuning (SFT), validated question-answer data, and performed comparative evaluations of candidate responses to improve model alignment. Followed detailed annotation guidelines, applied consistency checks, and maintained high quality standards across large text datasets.