Data Label Annotator
I worked on a data labelling project focused on evaluating and quality-assuring LLM responses for the UK market (en_GB). I compared multiple AI-generated responses to determine which performed better on linguistic accuracy, factual correctness, and cultural relevance. I assessed how well each response followed its instructions, checking that models adhered to stated constraints such as summarising content or maintaining a specific tone. I also carried out detailed localisation analysis, checking for British English usage, accurate regional references such as Soho, Birmingham, and Oldham, and culturally relevant elements such as cèilidhs, GCSE standards, and UK retail contexts. To evaluate the model’s precision, I extracted key information from both structured and unstructured data, including tables of football statistics and descriptive texts such as band histories. I reviewed tone and style to confirm that responses shifted appropriately between formal and informal English where required, and I ranked responses by preference, selecting the stronger one and giving a clear, reasoned justification. The project spanned a wide range of domains: academic tasks such as GCSE-level literary analysis, commercial evaluations of UK supermarket products, technical and factual assessments such as historical timelines and current events, and cultural or linguistic topics such as regional dialects and traditions.