AI Evaluation Analyst (RWS)
In this role, I performed expert evaluation of large language model (LLM) outputs, assessing accuracy, relevance, and behavioral alignment. I designed and refined evaluation rubrics tailored to domain-specific tasks and policy compliance, and provided analytical reporting that informed improvements to model behavior and benchmarking processes.

• Reviewed generated text outputs for logical inconsistencies, hallucinations, and prompt ambiguity.
• Collaborated with research teams to define and refine structured evaluation criteria.
• Used remote evaluation tools and followed stringent documentation protocols.
• Supported ongoing benchmarking and human-in-the-loop feedback cycles.