Andrew Mondejar - LLM Evaluation, Prompt Eng & Code QA Specialist | Full-Stack Python/MERN

Key Skills

Software

Labelbox

Toloka

V7 Labs

Top Subject Matter

No subject matter listed

Top Data Types

Image

Text

Video

Top Task Types

Computer Programming Coding

Diagnosis

Evaluation Rating

Fine Tuning

Prompt Response Writing SFT

Freelancer Overview

I'm a bilingual (English/Spanish) full-stack engineer turned AI-training specialist who lives in the overlap of LLM evaluation, prompt engineering, and rigorous data labeling. Over the past six years I've shipped MERN and Python microservices at Vicsa & Co., authored 4,000+ Bloom-aligned STEM MCQs, and created, scored, and de-hallucinated 2,000+ prompts for instruction-following, code-generation, and RLHF pipelines. My sweet spot is designing reusable ontologies, writing testable labeling guidelines, and then automating the boring parts with Pandas, NumPy, and scalable REST/GraphQL APIs. On live projects I've lifted model pass@k by 18%, cut page-load time by 40%, and kept 95% test coverage with Jest and Playwright, all while mentoring global annotators on bias detection, consensus QA, and secure data handling (SOC 2).

Intermediate | English | Spanish

Labeling Experience

RLHF Prompt Evaluation & Code Output Rating

Labelbox | Text | Evaluation Rating | Computer Programming Coding

Curated 2,000+ multilingual prompts and completions for instruction-following and code-generation RLHF pipelines. Rated outputs for correctness, bias, and hallucination; wrote gold-standard responses; and auto-scored unit-test pass/fail results. This work raised model pass@1 accuracy by 12% and cut reviewer time by 30% under SOC 2 workflows.

2023 - 2025

Bilingual STEM MCQ Authoring & Difficulty Classification

Labelbox | Text | Classification | Question Answering

Authored and annotated 4,000+ English/Spanish multiple-choice questions across Python, SQL, C++, Physics, and Calculus. Tagged each item for Bloom-level difficulty, distractor quality, and learning-objective alignment; validated with consensus QA scripts that raised item-bank validity to 96% and shortened editorial cycles by 25%.

2022 - 2025

Education

The University of Texas at Arlington

Bachelor's in Computer Science

2019 - 2023

Work History

Vicsa & Co.

AI Prompt Engineer & Model Evaluator

Fort Worth

2023 - Present

Vicsa & Co.

Full-Stack Web Developer

Fort Worth

2023 - Present