Code Human Preferences Labeler
For this project, I will select a codebase that is a git repository and ask the AI model to perform a single task in that codebase. Our goal is to iterate on the model’s solution to that task over multiple turns until it reaches a “production-ready” state. This means working with the model on its workflow so that it operates like a real engineer: reviewing the code it wrote, validating that code against the task requirements, committing regularly, etc.