Applied LLM Engineer (Evaluation & Alignment Systems)
As an Applied LLM Engineer, I designed and executed Python-based evaluation frameworks to benchmark large language models across a range of NLP tasks. My work involved developing structured prompt-engineering and testing workflows, iteratively refining evaluation criteria, and improving the quality of training data for fine-tuning. This work improved the response consistency and alignment of the models under evaluation.
• Benchmarked LLM performance on accuracy, reasoning, and reliability.
• Authored ideal-response definitions used for downstream model fine-tuning.
• Implemented systematic prompt and evaluation workflow designs.
• Strengthened alignment through rigorous error analysis and data quality checks.
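The evaluation work described above can be illustrated with a minimal sketch of a benchmark harness. All names here (`EvalCase`, `run_benchmark`, the stub model, the exact-match metric) are hypothetical illustrations, not the actual framework, which is not public:

```python
# Illustrative sketch of an LLM evaluation harness.
# All identifiers are hypothetical; this is not the actual framework.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    """One benchmark item: a prompt paired with an ideal response."""
    prompt: str
    ideal_response: str


def exact_match(response: str, ideal: str) -> float:
    """Simplest scoring metric: 1.0 on a normalized exact match, else 0.0."""
    return 1.0 if response.strip().lower() == ideal.strip().lower() else 0.0


def run_benchmark(
    model: Callable[[str], str],
    cases: List[EvalCase],
    metric: Callable[[str, str], float] = exact_match,
) -> Dict[str, float]:
    """Run every case through the model and aggregate metric scores."""
    scores = [metric(model(c.prompt), c.ideal_response) for c in cases]
    return {"mean_score": sum(scores) / len(scores), "n": len(scores)}


# Usage with a stub model standing in for a real LLM call:
cases = [
    EvalCase("What is 2+2?", "4"),
    EvalCase("Capital of France?", "Paris"),
]
stub_model = lambda prompt: "4" if "2+2" in prompt else "Lyon"
print(run_benchmark(stub_model, cases))  # → {'mean_score': 0.5, 'n': 2}
```

In practice the exact-match metric would be replaced by task-specific scorers (e.g. for reasoning or reliability), and the stub model by a real inference call; the harness structure stays the same.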