Senior Software Engineer - RLHF Feedback Loop for AI Training (Ramp)
As a Senior Software Engineer at Ramp, I developed RLHF (Reinforcement Learning from Human Feedback) systems to improve how AI-driven vendor negotiation agents learn. My work focused on integrating AI feedback into orchestration workflows, using LLMs and custom guardrails for financial contract automation. A significant part of the work was orchestrating training data collection through asynchronous human-in-the-loop evaluations and LLM assessment tools.
• Designed and implemented Ruby and Node.js AI communication clients that collect annotated human feedback on LLM-generated contract recommendations.
• Integrated Go safety guardrails and regex interceptors to safeguard the quality and safety of training data.
• Automated data collection pipelines that enable iterative model retraining with RLHF.
• Used proprietary internal AI tooling and Python orchestration services to streamline feedback loop operations.