Tomohiro Kinoshita - LLM Evaluation and Text Generation Specialist in English and Japanese

Key Skills

Software

Data Annotation Tech

Labelbox

Mindrift

Surge AI

Toloka

Other

Top Subject Matter

No subject matter listed

Top Data Types

Audio

Image

Text

Top Task Types

Audio Recording

Classification

Evaluation Rating

Fine Tuning

Prompt Response Writing SFT

Freelancer Overview

I have practical experience evaluating and refining LLM training data, especially for Japanese prompt–response tasks. I created and applied fine-grained, atomic evaluation criteria covering structure, relevance, tone, and factual accuracy, and developed annotation rubrics and response templates to support consistency and quality in large-scale annotation tasks. In a ranking project, I assessed three model outputs (A, B, and C) for each prompt, wrote justifications in English, and edited Japanese responses to improve tone, clarity, and overall quality while preserving their original intent.

I also contributed to Japanese voice data projects, recording speech in noisy environments and evaluating generated speech for pronunciation, dialect, tone, and emotional expression.

On the strength of these contributions, I was later hired as a local marketing consultant by the AI company running these projects. In five months, I helped attract over 5,000 Japanese contributors to its data collection and evaluation programs.

Intermediate, English, Swedish, Japanese, Chinese Mandarin

Labeling Experience

Japanese LLM Prompt–Response Evaluation and Editing

Data Annotation Tech, Text, Classification, Question Answering

I evaluated Japanese prompt–response pairs for LLM training. I created fine-grained evaluation criteria and scored responses based on structure, tone, relevance, and factual accuracy. I also edited generated responses for clarity and naturalness, and wrote prompts and sample completions (SFT-style) to guide model behavior. My feedback contributed to model fine-tuning and annotation guideline improvement.

2024 - 2025

SERP Evaluation Project

Toloka, Text, Evaluation Rating

I am working on a project that evaluates search engine result pages (SERPs) to determine which set of results better matches a user’s intent. Each task compares two SERPs (Left vs. Right) on four key factors (Relevance, Helpfulness, Trustworthiness, and Freshness) to decide which one provides the more accurate and useful answers.

2025

Japanese Speaker & Trainer – Conversational AI Data Collection (Google / Gemini Project)

Other, Video, Question Answering, Translation Localization

Participated as both a speaker and trainer in a short-term AI data project designed to enhance the conversational capabilities of Google Meet and Gemini. The project involved structured online discussions conducted via Google Meet to generate high-quality linguistic and conversational data for improving AI language models.

Key Contributions:

Acted in various simulated roles (interviewer/interviewee, sales representative, training instructor, and marketing professional), performing natural Japanese conversations based on defined scenarios.

Recorded multiple structured dialogue sessions demonstrating realistic tone, clarity, and interaction flow for AI model training.

Served as a bilingual bridge between the India-based coordination team and Japanese speakers, providing live interpretation and onboarding support.

Designed and facilitated training sessions for Japanese participants to improve recording quality, consistency, and compliance with project guidelines.

2025

Native Japanese Audio Transcriber – Mixed Script

Labelbox, Audio, Classification

This role involves identifying and tagging a wide range of speech and non-speech phenomena (e.g., filled pauses, background speech, non-verbal sounds, singing, whispers, and garbled audio) with precise timestamping and classification. I also ensure consistent normalization, correct language identification, and adherence to transcription standards for text, symbols, and catalog entities. Through this work, I contribute to the development and evaluation of speech models used in AI assistants and language learning systems, helping to enhance their accuracy, inclusivity, and contextual awareness.

2025

Japanese Voice Data Collection and Synthetic Audio Evaluation

Data Annotation Tech, Audio, Emotion Recognition, Translation Localization

I recorded Japanese voice data in various noisy environments, following detailed prompts and environmental constraints. I also evaluated synthetic Japanese audio outputs, focusing on pronunciation accuracy, dialectal variation, intonation, tone, and emotional nuance. This helped ensure the generated audio sounded natural, intelligible, and emotionally appropriate.

2025

Education

Penn State University

Business Exchange Program, Marketing/Marketing Management, Gerontology

2006 - 2007

Kansai Gaidai University

Bachelor of Arts, Languages and Intercultural Communication, Business Administration and Management, General

2003 - 2007

Work History

Surge AI

Marketing Consultant

Funabashi

2025 - Present

DataAnnotation.Tech

AI Trainer / AI Annotator

Funabashi

2024 - Present