Felipe Boechat - AI Tutor & Adversarial Tester specializing in LLM evaluation and safety

Key Skills

Software

Appen

Labelbox

Lionbridge

OneForma

Scale AI

Telus

Top Subject Matter

No subject matter listed

Top Data Types

Audio

Image

Text

Top Task Types

Audio Recording

Evaluation/Rating

Red Teaming

RLHF

Translation/Localization

Freelancer Overview

Over the past two years, I have worked across a variety of AI training and evaluation projects as an AI Tutor, Red Team Member, and Adversarial Tester, contributing to the development, refinement, and safety evaluation of large language models. My experience includes data labeling for text, image, and audio tasks, instruction-following assessments, RLHF workflows, content quality rating, translation and localization tasks, and adversarial testing focused on improving model robustness, alignment, and reliability.

With more than 15 years of experience in IT, cybersecurity, and governance, I bring a strong understanding of risk, compliance, and secure system behavior. These skills directly support my ability to evaluate model outputs critically and identify safety gaps, biases, vulnerabilities, and failure modes. I have collaborated with organizations such as Scale AI, Aligner, and RWS on high-impact datasets involving reasoning, classification, dialogue evaluation, red teaming, and scenario-based stress testing.

This combination of technical and analytical experience allows me to deliver precise and consistent annotations, along with well-structured evaluations that support safer and more accurate AI systems. I excel in complex judgment tasks, follow guidelines with high attention to detail, and adapt quickly to new workflows and evolving model behaviors in fast-moving AI environments.

Intermediate

Portuguese

English

Labeling Experience

Action Item

Scale AI

Text

Action Recognition

I worked on a project focused on evaluating and improving an AI system’s ability to extract accurate action items from meeting transcripts. My responsibilities included reviewing AI-generated action items, verifying whether they were correctly derived from the transcript, and determining when items should be added, removed, or rewritten for clarity and correctness. In addition to identifying and validating action items, I assessed the model’s outputs along key dimensions such as truthfulness, groundedness, and instruction following. This involved checking whether each action item was fully supported by the transcript, ensuring there were no hallucinations or fabricated tasks, and confirming that the model followed formatting rules, task-extraction guidelines, and enumeration instructions. I evaluated completeness, relevance, logical consistency, and alignment with the original discussion while ensuring compliance with project standards. My work contributed to improving the model’s re

2024 - 2024

Workspace email check

Scale AI

Text

Text Summarization

I participated in a project focused on evaluating and improving AI-generated short summaries of email communications. The goal was to ensure the model could reliably identify key points, maintain factual accuracy, and produce concise summaries aligned with the user’s instructions and the project’s formatting requirements. My responsibilities included reviewing email texts and comparing them with the model’s summaries to verify whether the extracted information was accurate, complete, and fully grounded in the original content. I assessed whether the summaries captured the essential points without adding or omitting critical information and evaluated them according to dimensions such as truthfulness, groundedness, coherence, and instruction following. This work required identifying hallucinations, correcting misinterpretations, spotting missing key information, and refining summaries when necessary. I also ensured compliance with length constraints, clarity standards, and domain-speci

2023 - 2024

Education

Unigranrio University

Bachelor's Degree, Network Infrastructure

2010 - 2013

INFNET

Certificate, Network Management (Internet of Things and Cloud Computing)

2023

Work History

Concentrix

Cybersecurity Analyst

Barcelos

2021 - Present

Concentrix

IT Analyst

Barcelos

2020 - 2021