Computing Science and Math Expert
At Outlier.ai, I worked on projects spanning LLM evaluation, reinforcement learning from human feedback (RLHF), image understanding tests, agent persona evaluation, audio evaluation, and model stumping. My tasks included scoring AI outputs, designing failure-inducing prompts, and assessing multimodal data for accuracy and relevance. I annotated these datasets to strict quality standards in order to improve model performance and reliability.