LLM Evaluation Specialist & AI Data Annotation – Outlier
As an LLM Evaluation Specialist & AI Data Annotation contributor, I evaluated LLM-generated responses to user queries in multiple languages. I annotated answers for reasoning quality, fluency, factual correctness, clarity, and guideline adherence. My tasks included reviewing outputs for linguistic and cultural relevance as well as policy and safety compliance.
• Evaluated and rated AI chatbot outputs for accuracy and appropriateness.
• Translated, reviewed, and processed multilingual content (Ukrainian, Polish, Russian) for guideline alignment.
• Verified that AI outputs followed conversation norms, tone requirements, and guidelines.
• Annotated strengths, weaknesses, and factual errors in model responses for supervised fine-tuning.