AI Training Specialist
Model Alignment (RLHF): Evaluate and rank LLM responses based on truthfulness, safety, and reasoning, providing technical rationales for each preference.
Adversarial Testing: Design complex, multi-turn prompts to stress-test model guardrails and identify potential hallucinations or logic failures.
Instruction Following: Audit model compliance with strict negative constraints and complex formatting requirements (e.g., JSON schemas, word counts, or stylistic rules).
Fact Verification: Conduct deep-dive research to ensure AI outputs are grounded in factual evidence, maintaining high-quality benchmarks for model accuracy.