Cybersecurity and Cloud Infrastructure Data Labeling Project
Contributed to the annotation and quality assurance of AI datasets used for cybersecurity automation and cloud threat detection. Labeled structured and unstructured data, including phishing samples, network logs, cloud misconfiguration alerts, and vulnerability descriptions. Annotated entities and the relationships among attack techniques, assets, and indicators of compromise (IOCs) to support NLP-based threat analysis and LLM fine-tuning. Ensured data consistency, integrity, and relevance through detailed review and cross-checking against established frameworks (MITRE ATT&CK, OWASP). Maintained annotation accuracy above 98% by following strict labeling guidelines and iterating on feedback. Collaborated on evaluating LLM-generated responses about security events and risk categorization.