
Introduction

Artificial Intelligence models have grown at an extraordinary pace. Modern Large Language Models (LLMs) often contain billions or even trillions of parameters, enabling remarkable reasoning, language understanding, programming, image generation, and scientific assistance. However, these capabilities come at a significant computational cost. Running a large AI model traditionally requires enormous amounts of GPU…