Multimodal LLM Annotation & Evaluation Project
Assisted with large-scale annotation and evaluation tasks for training and fine-tuning large language models. Evaluated AI-generated responses, grading them on accuracy, reasoning, safety, and instruction-following. Supported reinforcement learning from human feedback (RLHF) by writing and evaluating prompt-response pairs used to fine-tune the models.