Independent AI Content Creator & LLM Researcher
Performed large-scale reinforcement learning from human feedback (RLHF), safety evaluation, and red teaming focused on large language model (LLM) performance and safety. Contributed substantially to LLM training datasets through human-AI collaborative fiction writing and content evaluation, and strengthened model robustness and trustworthiness through rigorous adversarial testing and benchmarking on platforms such as ChatGPT and Gemini.
• Wrote, managed, and evaluated over 10 million words of content for model training and stress-testing.
• Conducted red teaming to surface edge cases and improve bias and safety responses in LLMs.
• Developed prompt engineering workflows enabling nuanced 'vibe coding' and context consistency.
• Used proprietary/internal tools alongside major LLM platforms for data labeling and evaluation.