Lucas Wangese

AI Evaluation Analyst (RWS)

N/A, USA
Expert
Sama

Key Skills

Software

Sama

Top Subject Matter

Large Language Models
AI Model Evaluation
Generative AI

Top Data Types

Text
Document

Top Task Types

No task types listed

Freelancer Overview

AI Evaluation Analyst (RWS). Brings 6+ years of professional experience across legal operations, contract review, compliance, and structured analysis. Core strengths include internal proprietary tooling and Sama. Education includes a Doctor of Philosophy from the University of Washington (2024) and a Master of Science from Adelphi University (2024). AI-training focus includes data types such as Text and labeling workflows including Evaluation and Rating.

Labeling Experience

AI Evaluation Analyst (RWS)

Text
In this role, I performed expert evaluation of large language model (LLM) outputs, assessing accuracy, relevance, and behavioral alignment. I designed and refined evaluation rubrics tailored to domain-specific tasks and policy compliance. I provided analytical reporting that informed improvements in model behavior and benchmarking processes.
• Reviewed generated text outputs for issues related to logical consistency, hallucinations, and prompt ambiguity.
• Collaborated with research teams to define and refine structured evaluation criteria.
• Utilized remote evaluation tools and followed stringent documentation protocols.
• Supported ongoing benchmarking and human-in-the-loop feedback cycles.

2023 - Present

LLM Quality Assurance Research Assistant (Databricks collaboration)

Text
I led research on evaluation methodologies for assessing reasoning reliability and policy adherence in generative AI systems. I developed structured frameworks to measure alignment, hallucinations, and inconsistencies. I produced reports that were used in PhD research and peer-reviewed technical briefs.
• Conducted comparative evaluations across multiple LLMs for edge-case performance.
• Designed protocols for identifying policy violations and model shortcomings.
• Analyzed structured outputs to benchmark system reliability.
• Ensured rigorous analysis and clear documentation throughout the evaluation cycle.

2023 - 2024

AI Quality & Evaluation Associate (Sama)

Text
I assisted in evaluating AI systems for compliance with operational and safety protocols. I created documentation outlining model failure modes and inconsistencies in reasoning. My contributions improved testing frameworks based on research findings.
• Participated in peer reviews of AI evaluation methodologies and assessment papers.
• Developed guidelines for identifying ambiguous or erroneous outputs.
• Compiled research-driven recommendations to enhance evaluation protocols.
• Maintained detailed quality reports supporting continuous model improvement.

2021 - 2023

Data & AI Quality Review Assistant (Microsoft contract)

Text
I supported data quality review and validation workflows for AI and machine learning systems at Microsoft. I evaluated both structured and unstructured datasets used for model training to ensure labeling accuracy and consistency. I identified errors, escalated quality issues, and documented findings in a guideline-driven remote environment.
• Used detailed checklists and quality rubrics to assess data annotation completeness.
• Flagged annotation inconsistencies and labeling edge cases impacting model training.
• Contributed to continuous model improvement by documenting recurring issues.
• Maintained high confidentiality and attention to detail throughout review cycles.

2019 - 2021

Education

Adelphi University

Master of Science, Artificial Intelligence and Machine Learning

2022 - 2024

Columbia University

Bachelor of Science, Software Engineering

2012 - 2017

Work History

RWS

AI Evaluation Analyst

N/A
2023 - Present

Databricks

LLM Quality Assurance Research Assistant

N/A
2023 - 2024