Independent LLM Evaluator and Safety Annotator (Portfolio-based)
Independently reviewed and evaluated public large language model (LLM) outputs for logical consistency, ethical alignment, and bias, with a focus on advanced reasoning, causal inference, and psychological analysis of hypothetical and real-world prompts. Documented model errors and proposed improved rationales to strengthen model safety and reduce misuse risk.
• Conducted red-teaming and adversarial testing on LLMs to surface policy violations and logical fallacies.
• Labeled model responses for bias, misinformation, hate speech, harassment, and self-harm risk, applying expert-level annotation standards.
• Audited responses for adherence to constitutional free-speech principles and for nuanced emotional and psychological realism.
• Produced structured, reproducible rationales and mitigation strategies to inform model training and evaluation.