AI – LLM (Maths and Python) Engineer, Turing
I developed and deployed scalable machine learning pipelines for model training, validation, and deployment using Python and scientific data libraries. I conducted applied research on optimizing large language model (LLM) transformer architectures for efficient inference, with a focus on building and fine-tuning AI models for high-throughput, low-latency production deployment.
• Built RESTful APIs and real-time model inference pipelines with FastAPI and AsyncIO
• Reduced LLM inference latency by 40% through request batching and inference optimizations
• Published research on efficient LLM inference in the internal AI knowledge base
• Integrated advanced ML models into modular backend services for performance and maintainability
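The batching technique behind the latency reduction can be sketched as a dynamic micro-batching loop in pure AsyncIO. This is a minimal illustration, not the production code: the `BatchWorker` class, the `max_batch`/`max_wait_ms` parameters, and the stub model function are all hypothetical names chosen for the example.

```python
import asyncio


class BatchWorker:
    """Collects concurrent requests into micro-batches so a single
    model call serves many callers (dynamic batching sketch)."""

    def __init__(self, model_fn, max_batch=8, max_wait_ms=5):
        self.model_fn = model_fn        # batched inference function (hypothetical stub)
        self.max_batch = max_batch      # flush once this many requests are queued
        self.max_wait = max_wait_ms / 1000
        self.queue: asyncio.Queue = asyncio.Queue()

    async def infer(self, prompt: str) -> str:
        # Each caller enqueues its input with a Future and awaits the result.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((prompt, fut))
        return await fut

    async def run(self):
        while True:
            # Block for the first request, then gather more until the
            # batch is full or the wait window closes.
            prompt, fut = await self.queue.get()
            batch = [(prompt, fut)]
            deadline = asyncio.get_running_loop().time() + self.max_wait
            while len(batch) < self.max_batch:
                timeout = deadline - asyncio.get_running_loop().time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            outputs = self.model_fn([p for p, _ in batch])  # one batched call
            for (_, f), out in zip(batch, outputs):
                f.set_result(out)


async def main():
    # Stub "model": uppercases each prompt in the batch.
    worker = BatchWorker(lambda prompts: [p.upper() for p in prompts])
    task = asyncio.create_task(worker.run())
    results = await asyncio.gather(*(worker.infer(p) for p in ["a", "b", "c"]))
    task.cancel()
    return results


print(asyncio.run(main()))  # ['A', 'B', 'C']
```

The same worker can sit behind a FastAPI endpoint: each request handler simply awaits `worker.infer(...)`, so concurrent HTTP requests are transparently coalesced into batched model calls, which is what trades a few milliseconds of queueing for much higher throughput.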