AI Prompt Engineering & Evaluation Researcher
Designed and evaluated prompts for large language models as part of AI prompt engineering and evaluation research. Assessed model outputs for accuracy, instruction adherence, and safety, contributing to the fine-tuning and evaluation of language models. This work improved AI response reliability and contextual understanding, enabling more robust task completion.

• Created, iterated on, and refined prompts for complex instruction-following tasks
• Evaluated LLM responses for accuracy, coherence, and safety
• Used internal and proprietary evaluation software and tools
• Supported research efforts in LLM evaluation and prompt engineering