Suemin Han - LLM Evaluation and Text Generation in English & Korean

Key Skills

Software

Labelbox

LabelImg

Label Studio

OneForma

Scale AI

SuperAnnotate

Internal/Proprietary Tooling

Other

Top Subject Matter

No subject matter listed

Top Data Types

Audio

Image

Text

Top Task Types

Evaluation Rating

Prompt Response Writing SFT

Red Teaming

Text Generation

Translation Localization

Freelancer Overview

I have experience working as a content writer, reviewer, annotator, and evaluator. I have high attention to detail and the ability to categorize content thoroughly and accurately, and I am comfortable working with textual, audio, and video materials. My responsibilities have included:
- Crafted ideal responses to AI prompts (from factual questions to whimsical jokes)
- Evaluated human- and AI-generated text, images, and videos for accuracy, safety, style, and relevance, and answered the associated questions
- Compared multiple responses and selected the best according to detailed rubrics
- Classified and labeled content, flagged policy/safety issues, and provided concise written rationales
- Conducted red teaming exercises to identify adversarial, harmful, or unsafe outputs from large language models (LLMs)
- Designed adversarial prompts to uncover weaknesses, safety risks, or edge cases in LLM behavior
- Developed test cases to assess accuracy, bias, toxicity, hallucinations, and misuse potential in AI-generated responses
- Followed complex, evolving internal written guidelines while maintaining high throughput and accuracy
- Peer-reviewed other writers' work
- Completed calibrations and QA checks
- Collaborated with a remote team to review outputs and escalate flagged issues

Intermediate | French | Korean | English | Spanish | Portuguese

Labeling Experience

Nvidia Constraints Following Multilingual Project

Internal Proprietary Tooling | Text | Text Generation | Evaluation Rating

Create and edit content across formats such as articles, guides, workflows, and scenarios, and then recreate real-world content workflows for AI evaluation. Review AI outputs for quality, accuracy, and adherence to guidelines and provide feedback and annotations to support LLM improvement.

2025

Gemini vs ChatGPT Prompt Generation Evaluation

Other | Text | Evaluation Rating

Systematically test how effectively different prompts guide Gemini and ChatGPT toward high-quality, relevant, accurate, and safe outputs for specific tasks. Compare results across the two models, measure accuracy, relevance, and safety, and refine prompts for better performance and consistency.

2025 - 2025

Speech Evaluation

Other | Audio | Evaluation Rating

Assess a speaker's communication skills, focusing on content, delivery (body language, vocal variety), structure (introduction, body, conclusion), and overall purpose.

2025 - 2025

Audio Emotion Annotation

Other | Audio | Segmentation | Emotion Recognition

Label audio data with specific human emotions (e.g., joy, anger, sadness, fear) or emotional states, often using established psychological models, to produce training data for AI emotion recognition.

2025 - 2025

Audio Paralinguistic Annotation

Other | Audio | Segmentation | Text Generation

Methodically label the non-lexical components of speech in audio data to capture information beyond the literal words spoken, such as tone of voice, pitch, volume, speech rate, pauses and silences, and filler words.

2025 - 2025

Education

University of California, San Diego

Master of Arts, International Affairs

2003 - 2005

Hankuk University of Foreign Studies

Bachelor of Arts, Spanish

1997 - 2001

Work History

Impactt Limited

Associate Consultant

London

2025 - Present

Carrot Global

Corporate Trainer

Seoul

2022 - Present