LLM Prompt Authoring & Evaluation (SFT, RLHF, Cross-Tool Workflows)
Served as a Subject Matter Expert supporting LLM training and evaluation across structured Non-UI and Cross-Gym environments. Authored and evaluated 700+ prompts simulating real-world business workflows across communication systems, CRM platforms, project management tools, and event management environments. Performed comparative A/B response evaluations using rubric-based scoring frameworks. Assessed outputs for task completion, reasoning depth, tone alignment, hallucinations, constraint adherence, and tool invocation accuracy. Designed multi-step cross-tool scenarios requiring coordinated reasoning across email systems, calendar scheduling, CRM updates, task management platforms, and document workflows. Identified weak, generic, or logically inconsistent responses and wrote improved benchmark outputs aligned with evaluation standards. Maintained structured QA processes and high data-integrity standards to support model alignment and reasoning robustness.