AI Prompt Evaluation & LLM Alignment – Project Sapphire
Contributed to a large-scale AI model training and evaluation initiative aimed at improving LLM performance and ethical alignment. Wrote and evaluated diverse prompts; reviewed AI-generated responses for factuality, tone, and usefulness; and assigned ratings based on alignment with human values. Performed side-by-side comparisons and classification tasks across domains including general knowledge, technical writing, and ethical reasoning. Maintained quality and consistency through detailed review cycles and adherence to internal guidelines. The project covered thousands of prompt–response pairs and fed directly into fine-tuning efforts for model improvement.