Prompt Engineering and Evaluation – Outlier AI
Worked as a Prompt Engineering Intern on a large-scale LLM training and evaluation project. Contributed to Natural Language Processing (NLP) tasks including prompt creation, prompt revision, and completion generation for reasoning, coding, and conversational objectives. Designed high-quality inputs for instruction-following models, reviewed AI-generated outputs, and wrote unit test cases for coding problems. Evaluated completions, refined prompts, and ensured alignment with annotation guidelines, maintaining high accuracy and consistency across thousands of examples. The project followed strict quality-assurance protocols aimed at improving state-of-the-art (SOTA) model performance.
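A minimal sketch of the unit-test-writing task described above; the coding problem (`two_sum`) and its test cases are hypothetical illustrations, not taken from the actual project.

```python
import unittest

def two_sum(nums, target):
    """Reference solution for a hypothetical coding problem:
    return the indices of the two numbers that sum to target."""
    seen = {}  # maps value -> index of where it was seen
    for i, n in enumerate(nums):
        if target - n in seen:
            return [seen[target - n], i]
        seen[n] = i
    return None  # no valid pair exists

class TestTwoSum(unittest.TestCase):
    # The kinds of cases a model-generated solution must pass:
    # a typical input, duplicate values, and no valid answer.
    def test_basic_pair(self):
        self.assertEqual(two_sum([2, 7, 11, 15], 9), [0, 1])

    def test_duplicate_values(self):
        self.assertEqual(two_sum([3, 3], 6), [0, 1])

    def test_no_solution(self):
        self.assertIsNone(two_sum([1, 2], 10))
```

Run with `python -m unittest` to verify a candidate completion against the test suite.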