Prompt Engineering and LLM Evaluation Contributor
Worked on prompt engineering and optimization for large language models (LLMs) within AI-powered crypto analytics and assistant projects. Contributed to data annotation, prompt curation, and model evaluation, with a focus on refining and assessing AI-generated outputs. Built evaluation pipelines for prompt performance and regularly assessed generated text to improve model accuracy and relevance.
• Designed and evaluated prompt pipelines to optimize LLM outputs
• Performed consistent model evaluation and testing
• Annotated and curated text data for prompt engineering workflows
• Collaborated on continuous improvement of LLM evaluation metrics