Adi Hermanto
AI Training & LLM Evaluation Specialist (Freelance)

Bandung, Indonesia
$25.00/hr · Expert · Appen · Clickworker · Data Annotation Tech

Key Skills

Software

Appen
Clickworker
Data Annotation Tech
OneForma
Other
Labelbox
LabelImg
Mercor

Top Subject Matter

General/Multidomain (Finance, Healthcare, Insurance, Social Services, Software)
LLM Reasoning
Code Generation

Top Data Types

Audio
Computer Code/Programming
Image
Text
Document

Top Task Types

Bounding Box
Classification
Computer Programming/Coding
Segmentation
Translation/Localization
Prompt + Response Writing (SFT)

Freelancer Overview

AI Training & LLM Evaluation Specialist (Freelance) with 21+ years of professional experience spanning complex workflows, research, and quality-focused execution. Primary platforms include Mercor and Appen. AI-training focus covers data types such as Text and labeling workflows including Evaluation, Rating, and Prompt + Response Writing (SFT).

Languages (Expert): Indonesian, Sundanese, English

Labeling Experience

Appen

Speaker Diarization

Appen · Audio · Classification · Audio Recording
The project aims to transcribe audio files to support the development of a cutting-edge automatic speech recognition model. Participants are responsible for transcribing and performing speaker diarization on audio files up to five minutes long. This involves either improving existing pre-transcriptions or creating new transcriptions from scratch. Additionally, non-speech tags must be provided for sounds occurring simultaneously with speech, such as non-word pronunciations. A key aspect of the project is the precise timestamping of audio to mark continuous speech, defined as speech with pauses of less than 0.5 seconds. Participants must also track and identify speakers by adding timestamps at the beginning and end of each speaker change. To maintain project participation, a minimum commitment of 10 hours per week is required, with an accuracy rate of at least 90%. Quality assurance processes involve reviewing an average of 5% of the work to ensure adherence to these standards.

2024

Coders - AI Training

Other · Computer Code Programming · Evaluation, Rating · Computer Programming/Coding
The project focused on validating AI-generated code solutions across multiple programming languages and frameworks, including Python, CSS, PHP, JavaScript, TypeScript, Go, Dart, HTML, MySQL, Kotlin, and Java. Tasks included evaluating code quality, optimizing performance, and providing detailed explanations of solution approaches. The project involved reviewing approximately 1,000 code snippets monthly, ensuring they met industry standards for efficiency, security, and best practices. Quality measures included comprehensive test case development, performance benchmarking, and detailed documentation of improvement rationales. Each code review required thorough analysis of syntax accuracy, error handling, and scalability, along with creating human-readable summaries explaining the logic and potential optimization strategies.

2023
Appen

Prompt Engineer & Multimodal Evaluation Specialist (Freelance)

Appen · Text · Prompt + Response Writing (SFT)
Designed prompt structures and evaluation rubrics for multi-step reasoning and real-world task completion by language models. Developed, refined, and optimized prompts for UI/code generation and tool-use validations in AI systems. Performed quality assurance and evaluations across multimodal (text, audio, image) datasets tied to LLM behavior and instruction following.
• Created and validated prompts for SFT and reasoning benchmarks
• Assessed AI-generated code and generated/improved test suites
• Evaluated multimodal data use in AI system completions
• Supported prompt engineering for factual and safe model outputs

2022 - Present
Mercor

AI Training & LLM Evaluation Specialist (Freelance)

Mercor · Text
As an AI Training & LLM Evaluation Specialist, evaluated over 1,000 AI responses to optimize model accuracy and reasoning. Designed and implemented structured rubric-based evaluation frameworks for consistent AI output assessment. Conducted evaluation of AI tool use, including API and external system interactions, across a diverse set of domains.
• Led evaluations for clients including Mercor, Outlier, Appen, and OneForma
• Specialized in factuality, safety, and step-by-step reasoning in model outputs
• Applied frameworks in software, healthcare, finance, insurance, and social service domains
• Improved AI model reliability through comprehensive response ranking and error analysis

2022 - Present
Appen

Audio Transcriber

Appen · Audio · Audio Recording
Worked on a variety of projects that required a keen ear for detail and an understanding of linguistic nuances in both English and Indonesian. This involved transcribing diverse audio content, from conversational dialogues to technical discussions, while adhering to specific guidelines and formatting requirements.

2023 - 2024

Education

No Education added yet

Adi H. hasn’t added any Education History to their OpenTrain profile yet.

Work History

Outlier, Appen, Data Annotation Tech

Data Annotation & AI Training

Bandung
2022 - Present
Aspirasi Digital Indonesia

CEO, Full-Stack Developer

Bandung
2018 - Present