Key Skills

Software

Data Annotation Tech

Other

Top Subject Matter

LLM evaluation in Italian

LLM evaluation in English

Evaluation of images

Top Data Types

Audio

Image

Text

Top Task Types

Evaluation Rating

Prompt Response Writing SFT

Question Answering

RLHF

Text Generation

Freelancer Overview

I worked for almost two years evaluating LLM responses in Italian, my native language, and in English. I am a fast learner and open to other kinds of work. I am currently studying Physics at university.

Intermediate

English

Italian

Spanish

Labeling Experience

Prompt Writing and Evaluation (Failure Testing)

Data Annotation Tech

Text

Classification

Question Answering

In this project, I crafted prompts within specific categories (e.g., chatbot, classification, Q&A, summarization). The goal was to design prompts that would cause the AI to fail at least one of the given constraints. Once the AI responded, I compared the outputs, evaluated them against multiple criteria, and rewrote the prompt to be more effective. I also provided detailed comments on the differences between the initial and corrected responses to ensure the AI's performance aligned with the required standards.

2024

Natural Dialogue Audio Recording

Other

Audio

Audio Recording

In this project, I was given a text and collaborated with a colleague to record a natural-sounding conversation. My tasks included adapting the text for a smooth, authentic dialogue in Italian, ensuring clear audio quality, and maintaining relevance to the given topic. I focused on creating fluid, spontaneous exchanges while refining the recording based on feedback to enhance the naturalness of the conversation.

2025

Safety and Restricted Categories Testing

Data Annotation Tech

Text

Classification

RLHF

I was assigned to test the AI’s handling of safety and restricted categories. In one task, I created prompts as a user and assessed whether the AI maintained a safe and appropriate response according to the category guidelines. In another task, I attempted to craft prompts that would lead the AI to fail the safety filters. I evaluated the AI's ability to adhere to safety protocols and provided feedback on its response handling, ensuring that it met the necessary safety standards.

2025

Coding Evaluation (Python, C, C++)

Data Annotation Tech

Computer Code Programming

RLHF

Evaluation Rating

In this project, I evaluated the AI’s responses to coding-related prompts in languages such as Python, C, and C++. I focused on the accuracy of the code, the efficiency of the solution, and the clarity of the explanations. My feedback included providing insights into potential optimizations or improvements in the code and explaining any errors in a clear and understandable way.

2024 - 2025

AI Response Evaluation with Chat History

Data Annotation Tech

Text

RLHF

Fine Tuning

In this project, I reviewed AI responses based on the chat history available. My role was to assess the coherence, relevance, and correctness of the AI's answers, considering the entire conversation's context. I applied various evaluation criteria, including linguistic quality, informativeness, and accuracy, and provided comments on how the AI's performance could be improved in the given context.

2024

Education

Clara M. hasn’t added any Education History to their OpenTrain profile yet.

Work History

Clara M. hasn’t added any Work History to their OpenTrain profile yet.