Local LLM Reasoning Evaluation & Latency Benchmarking
Executed rigorous performance testing and reasoning evaluation of local Large Language Models (including Llama 3 and DeepSeek) in an Ubuntu environment. Audited model outputs for logical consistency, reasoning accuracy, and hallucination rate. Documented inference latency on an RTX 4070 Ti SUPER GPU with 64 GB of DDR5 memory to determine optimal inference parameters.
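The latency measurements described above could be sketched as a minimal timing harness like the one below. This is an illustrative sketch only: `generate` is a hypothetical stand-in for the real streaming model call (e.g. llama.cpp or an Ollama HTTP endpoint), not the actual setup used, and the metrics shown (time-to-first-token, tokens per second) are common choices assumed here rather than taken from the source.

```python
import time

def generate(prompt):
    """Hypothetical stand-in for a streaming local-model call.

    A real benchmark would replace this with token streaming from
    llama.cpp, Ollama, or a similar local inference server.
    """
    for token in ["The", " answer", " is", " 42", "."]:
        yield token

def benchmark(prompt):
    """Measure time-to-first-token and overall tokens/sec for one prompt."""
    start = time.perf_counter()
    first_token_at = None
    n_tokens = 0
    for _ in generate(prompt):
        if first_token_at is None:
            first_token_at = time.perf_counter()  # latency until first token
        n_tokens += 1
    elapsed = time.perf_counter() - start
    return {
        "ttft_s": first_token_at - start,          # time to first token
        "tokens_per_s": n_tokens / elapsed,        # overall throughput
        "n_tokens": n_tokens,
    }

stats = benchmark("What is 6 * 7?")
print(stats["n_tokens"])  # 5
```

Running the same harness across different inference parameters (context length, batch size, quantization level) would give the comparative latency data the entry describes.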