This project focuses on generating Reinforcement Learning from Human Feedback (RLHF) data to enhance code generation capabilities for Python development. The goal is to collect developer feedback to improve the accuracy and quality of Python code produced by large language models (LLMs). Participants will review, correct, and optimize auto-generated Python scripts, functions, and algorithms. The collected data will be used to train LLMs to better follow coding standards, efficient practices, and problem-solving strategies in Python, ultimately leading to higher-quality, more reliable code suggestions for real-world applications.
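As a purely hypothetical illustration of this kind of review task (not an actual prompt from this project), a participant might receive an LLM-generated solution to a LeetCode-style "two sum" problem and submit a corrected, optimized version:

```python
# Hypothetical example of a review task: an auto-generated solution to a
# LeetCode-style "two sum" prompt, followed by the kind of optimization a
# participant might submit as feedback.

from typing import List


def two_sum_generated(nums: List[int], target: int) -> List[int]:
    # Auto-generated draft: correct, but O(n^2) due to nested loops.
    for i in range(len(nums)):
        for j in range(len(nums)):
            if i != j and nums[i] + nums[j] == target:
                return [i, j]
    return []


def two_sum_reviewed(nums: List[int], target: int) -> List[int]:
    # Reviewer's optimized version: single pass with a hash map, O(n).
    seen = {}  # maps each visited value to its index
    for i, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return [seen[complement], i]
        seen[value] = i
    return []


print(two_sum_reviewed([2, 7, 11, 15], 9))  # → [0, 1]
```

A prompt/answer pair like this, together with the reviewer's correction, is the kind of data point that can feed an RLHF training pipeline.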
Total Budget
$42,500
Pay Rate
$50/hr
Time Requirement
Flexible
Duration
3-6 months
Data Type
Leetcode-esque prompts and answers
Subject Matter / Industry
Software
Workload / Schedule
Work 15 hours weekly for 4-6 months; flexible, with no fixed schedule.