Outlier
AI Data Labeling & Model Evaluation

Worked as an independent contributor on AI training and evaluation tasks, focused on model accuracy, reasoning quality, and code correctness.

Responsibilities:
- Evaluated and compared model outputs for logical consistency, factual accuracy, and edge-case handling
- Reviewed AI-generated code and explanations for correctness, robustness, and compliance with specifications
- Identified errors, ambiguities, and failure modes in model responses
- Applied structured evaluation criteria to ensure high-quality training data
- Worked fully asynchronously and independently, following detailed task guidelines and quality standards

This role required strong analytical thinking, attention to detail, and experience with software validation and testing principles.