AI Trainer – LLM Email Output Rater/Evaluator
As an AI engineer, contributed to building multi-agent email generation and evaluation systems. Labeled and rated email outputs generated by LLM agents, evaluating response quality and content consistency. Performed structured ranking of outputs and selected the top candidates through a central judge agent (sketched below).
• Applied stateful orchestration to manage agent workflows and evaluation.
• Improved reliability by reducing hallucinations through careful human evaluation of outputs.
• Ensured outputs were human-ready for business communication scenarios.
• Utilized OpenAI SDK, CrewAI, AutoGen, and LangGraph as part of the evaluation toolset.
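A minimal sketch of the judge-agent ranking pattern referenced above, using LangGraph for stateful orchestration and the OpenAI SDK for the judge call. The model name, prompts, and state fields are illustrative assumptions, not the production configuration:

```python
from typing import TypedDict
from openai import OpenAI
from langgraph.graph import StateGraph, END

client = OpenAI()  # reads OPENAI_API_KEY from the environment


class EvalState(TypedDict):
    candidates: list[str]  # email drafts produced by worker agents
    ranking: list[int]     # judge-assigned order, best first
    best: str              # top-ranked, human-ready draft


def judge(state: EvalState) -> dict:
    """Central judge agent: rank candidate emails by quality and consistency."""
    numbered = "\n\n".join(
        f"[{i}] {draft}" for i, draft in enumerate(state["candidates"])
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model, not the one used on the project
        messages=[
            {"role": "system",
             "content": "Rank the candidate emails from best to worst for "
                        "business communication. Reply with the candidate "
                        "indices only, comma-separated."},
            {"role": "user", "content": numbered},
        ],
    )
    # Assumes the judge replies with bare indices, e.g. "1, 0, 2".
    order = [int(i) for i in resp.choices[0].message.content.split(",")]
    return {"ranking": order}


def select(state: EvalState) -> dict:
    """Select the top-ranked draft as the final output."""
    return {"best": state["candidates"][state["ranking"][0]]}


# Stateful orchestration: generate -> judge -> select, with shared state.
graph = StateGraph(EvalState)
graph.add_node("judge", judge)
graph.add_node("select", select)
graph.set_entry_point("judge")
graph.add_edge("judge", "select")
graph.add_edge("select", END)
app = graph.compile()

result = app.invoke({"candidates": ["Draft A ...", "Draft B ..."],
                     "ranking": [], "best": ""})
print(result["best"])
```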