Multimodal LLM Training and RLHF for Generalist AI
This project involved the end-to-end training and optimization of Large Language Models (LLMs) to improve reasoning, safety, and factual accuracy.

Scope: Conducted high-quality Reinforcement Learning from Human Feedback (RLHF) and Supervised Fine-Tuning (SFT) work to align model responses with human intent.

Tasks: Ranked and graded model-generated responses (evaluation/rating), wrote complex prompts to probe model boundaries (SFT), and conducted red teaming to surface safety issues, biases, and hallucinations; a sample ranking record is sketched below.

Data Quality: Adhered to a strict 98%+ accuracy benchmark. Every entry underwent a secondary "Golden Task" review to ensure high-fidelity training data (see the accuracy-check sketch after the ranking example).
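As an illustration of the ranking task, the following is a minimal sketch of a preference record of the kind used to train RLHF reward models. The `PreferencePair` class and all field names are assumptions for illustration, not the project's actual schema.

```python
from dataclasses import dataclass

@dataclass
class PreferencePair:
    """One RLHF ranking record: a prompt plus two model responses and
    the annotator's judgment. Field names are illustrative assumptions."""
    prompt: str
    response_a: str
    response_b: str
    preferred: str       # "a" or "b": which response better matches human intent
    rationale: str = ""  # free-text justification from the annotator

# Example record produced by one ranking/grading pass.
pair = PreferencePair(
    prompt="Explain why the sky is blue.",
    response_a="Rayleigh scattering disperses shorter (blue) wavelengths most.",
    response_b="Because blue light is the heaviest color.",
    preferred="a",
    rationale="Response B is a factual hallucination.",
)
```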
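The 98%+ benchmark implies a measurable agreement rate against known-answer ("golden") tasks. Below is a minimal sketch of that check, assuming exact-match scoring; the `golden_task_accuracy` function and the pass/fail rule are hypothetical.

```python
def golden_task_accuracy(annotations: dict[str, str],
                         golden_answers: dict[str, str]) -> float:
    """Fraction of golden (known-answer) tasks the annotator matched.
    Exact-match scoring is an assumption; real rubrics may be graded."""
    scored = [t for t in golden_answers if t in annotations]
    if not scored:
        return 0.0
    return sum(annotations[t] == golden_answers[t] for t in scored) / len(scored)

# An annotator's batch passes secondary review only at 98%+ agreement.
batch_ok = golden_task_accuracy({"t1": "a", "t2": "b"},
                                {"t1": "a", "t2": "b"}) >= 0.98
```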