AI Prompt Engineer & RLHF Data Annotator
Responsible for prompt engineering, evaluation, and RLHF data creation for LLMs. Designed and tested prompts, ranked model outputs, and wrote high-quality preference pairs for response evaluation.
• Performed multilingual evaluations and chain-of-thought reasoning assessments across technical and general domains.
• Evaluated AI-generated code and provided feedback on correctness and efficiency.
• Localized and culturally contextualized AI tasks as a native Bahasa Indonesia speaker.
• Contributed to improving LLM outputs through ranking and feedback loops.