AI Probe Specialist / LLM Evaluator / Prompt Engineer
As an AI Probe Specialist and Prompt Engineer, I led evaluation and probing of enterprise language models. I assessed model outputs to identify subtle failures and refined prompts for more robust performance, and I supported red-team and safety workflows to uncover edge-case behaviors in LLMs.
• Designed and deployed adversarial prompt sets to evaluate model consistency
• Assessed responses using RLHF-style and comparative ranking methodologies
• Documented behavioral drift, truth retention, and instruction following
• Provided written evaluations to guide enterprise deployment and safety decisions