Data Science and Analytics Intern - Text Summarization Labeling
As a Data Science and Analytics Intern at CoderOne, I contributed to the development and fine-tuning of transformer models for text summarization. I processed and labeled text data for both extractive and abstractive summarization to improve the training datasets. The work focused on dialogue and news article summarization.
• Fine-tuned the BART model for dialogue summarization on the SAMSum dataset.
• Preprocessed and labeled data, including tokenization, attention masking, and dataset exploration.
• Labeled CNN/DailyMail news article summaries to evaluate model accuracy.
• Used Python libraries and transformer frameworks for annotation and data preparation.