AI Developer / AI Training Contributor
Evaluated and annotated LLM-generated outputs, focusing on backend code accuracy and response structure. Reviewed RAG-based chatbot answers for factual consistency and instruction adherence. Provided targeted feedback to improve prompts and model output quality.
• Detected reasoning gaps, hallucinations, and structural errors in model outputs.
• Documented recurring failure patterns to help refine datasets.
• Supported prompt refinement to improve the logic of model responses.
• Improved training data quality through structured evaluation.