LLM Alignment Specialist - Preference Ranking & Safety
Performed high-complexity data annotation and quality assurance to improve the reasoning and safety capabilities of a Tier-1 Large Language Model.
Scope: Multi-turn dialogue evaluation and Supervised Fine-Tuning (SFT) data creation for coding and creative-writing prompts.
Tasks: Ranked model-generated responses against strict evaluation dimensions (Truthfulness, Helpfulness, and Harmlessness) and wrote "Golden Responses" to serve as ground-truth data for model training.
Project Size: Processed and audited more than 2,500 complex prompt-response pairs.
Quality Measures: Maintained a consistent Quality Audit (QA) score of 97% or higher; adhered to complex, evolving style guides and used "Chain of Thought" reasoning to justify ranking decisions.