CMKL Data Science Intern & Researcher for BMEiCON - LLM Training/Data Processing
I contributed to a research project on child abuse detection, working hands-on with textual case data. I preprocessed, grouped, and plotted the data with pandas; the prepared data was then used to train large language models (LLMs). The goal was to improve early detection of child abuse cases through data-driven methods.
• Processed and structured sensitive case text data for AI training
• Assisted a PhD candidate and a professor with ongoing data annotation and aggregation
• Used pandas for data wrangling and basic visualization to prepare data for model training
• Contributed to the development of AI systems intended to assist social services