Thursday, January 30, 2025

The Benefits of Running AI On-Premise | Computer Weekly

AI services like OpenAI’s ChatGPT, Microsoft’s Copilot, and Google’s Gemini typically operate in the cloud. But for businesses, AI isn’t just a cloud game.

The technology is evolving. New small language models and open-source options are emerging, making on-premise setups more attractive, and companies are weighing data privacy, security, and cost-efficiency. Even as the spotlight shines on cloud solutions, there’s a strong argument for on-premise AI.

Derreck Van Gelderen, who leads AI strategy at PA Consulting, explains that most companies run their AI tasks in the cloud because it lets them scale up without major upfront costs. Cloud platforms like AWS, Microsoft Azure, and Google Cloud provide flexible resources needed to handle the heavy lifting of AI, especially during the demanding training stages.

John Gasparini from KPMG sees a similar trend. Many clients are exploring cloud AI to test early ideas, leveraging existing large language models or building their own. However, developing in-house AI requires hefty investments, and the ROI isn’t guaranteed just yet.

While the cloud allows for quick deployment and scaling of AI systems, it comes with a catch. As AI projects evolve, so do their costs. Gasparini stresses that businesses are beginning to worry about tracking their AI expenses. Larger data sets and increased user activity can lead to skyrocketing bills.

On top of that, using vector databases can inflate storage needs tenfold, further driving up costs. Data sovereignty, privacy, and security also increasingly influence the move from cloud to on-premise AI. Van Gelderen points out that sectors like defense, healthcare, and nuclear energy demand strict control over their data.
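To put rough numbers on that tenfold storage figure, here is a back-of-envelope sketch in Python. Every value in it (corpus size, chunk size, overlap, embedding dimension, metadata overhead) is an illustrative assumption, not a figure from the article:

```python
# Back-of-envelope estimate of how chunking plus embeddings can
# multiply the footprint of a document corpus in a vector database.
# All figures below are illustrative assumptions, not measurements.

raw_gb = 100                 # original document corpus
chunk_chars = 1_000          # ~1 KB of text per chunk
overlap = 0.5                # 50% chunk overlap, a common RAG default
dim = 1536                   # embedding dimension (model-dependent)
f32 = 4                      # bytes per float32 vector component

chunks = int(raw_gb * 1e9 / chunk_chars / (1 - overlap))  # overlap inflates chunk count
text_copies_gb = chunks * chunk_chars / 1e9               # chunk text stored alongside vectors
vectors_gb = chunks * dim * f32 / 1e9
metadata_gb = chunks * 200 / 1e9                          # ids, offsets, index structures

total_gb = raw_gb + text_copies_gb + vectors_gb + metadata_gb
print(f"{chunks:,} chunks -> ~{total_gb:,.0f} GB (~{total_gb / raw_gb:.0f}x raw)")
```

On these assumptions, the vector index and duplicated chunk text dwarf the raw corpus, which is how a tenfold or larger blow-up can arise; smaller embedding dimensions or quantized vectors shrink it accordingly.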

Real-time performance is another concern. Delays from transferring data to and from cloud servers can hinder applications that need quick responses, such as edge-based solutions.

These cloud limitations are nudging some enterprises to consider running AI on-premise or in-house. This decision often depends on where data lives and how AI training and inference requirements differ.

Van Gelderen notes that most discussions around AI today center on generative AI (GenAI), but that’s just part of a broader picture: GenAI needs different infrastructure from traditional AI such as machine learning, so enterprises need more than one approach to AI technology.

A growing trend is retrieval-augmented generation (RAG), which lets companies add their own context and security controls to AI outputs. Patrick Smith from Pure Storage emphasizes that RAG mitigates many hallucination issues, lets businesses use their own data without extensive fine-tuning, and keeps insights current.
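To make the pattern concrete, here is a minimal, self-contained RAG sketch. The hash-based embedding is a toy stand-in for a real embedding model, and the documents and query are invented for illustration:

```python
import numpy as np

# Toy stand-in for an embedding model: hash words into a fixed-size
# bag-of-words vector. A production system would call a real model here.
def embed(text: str, dim: int = 256) -> np.ndarray:
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

# In-house documents stay on local storage; only their vectors are indexed.
docs = [
    "Our on-premise cluster has 8 GPUs reserved for inference workloads.",
    "Cloud spend is reviewed quarterly by the finance team.",
    "Model training jobs run overnight on the shared HPC partition.",
]
index = np.stack([embed(d) for d in docs])

def retrieve(query: str, k: int = 2) -> list[str]:
    scores = index @ embed(query)          # cosine similarity (vectors are unit-norm)
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

# Retrieved passages are prepended to the prompt so the language model
# answers from company data instead of guessing.
query = "How many GPUs do we have for inference?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```

A production pipeline would swap in a real embedding model and a vector database, but the shape is the same: retrieve relevant passages, then ground the model’s prompt in them.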

However, as RAG becomes essential, it reshapes the infrastructure requirements for AI. Smith mentions that data location affects AI solutions, drawing many companies back to their data centers.

Not all firms need cutting-edge, cloud-based generative models. Interest is rising in open-source language models and in models suited to less powerful hardware. Researchers are even developing smaller models suitable for sensitive data, which could run on standard servers or even laptops.
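As a sketch of what running a small model on modest hardware can look like, the following assumes the Hugging Face transformers library is installed; the model name is only an example, so substitute whichever small open-weight model fits your hardware and licensing requirements:

```python
# Sketch of running a small open-weight model entirely on local hardware,
# assuming the Hugging Face transformers library. The model name is an
# example, not a recommendation from the article.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",  # ~0.5B parameters, small enough for CPU
    device=-1,                           # -1 = CPU; no GPU required
)

out = generator(
    "Summarise the trade-offs of on-premise AI in one sentence.",
    max_new_tokens=60,
)
print(out[0]["generated_text"])
```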

The benefits of running AI in-house are real, but companies must weigh the technical requirements and initial investment against ongoing cloud costs. Van Gelderen warns that on-premise AI carries high expenses for hardware, cooling, and maintenance. During training, where massive processing power is crucial, the cloud often shines.

Companies must evaluate their data center capacities, power needs, and total costs. GPUs, essential for AI, are tough to acquire as demand outstrips supply, especially with hyperscalers gobbling them up. Enterprises might need to pivot toward less resource-heavy models for on-premise solutions.

But there are compelling efficiency reasons for an in-house approach during the inference stage. If models are running constantly, they might be more cost-effective on-site, given adequate data center resources. Caley from NetApp suggests that moving AI workloads back to the data center can yield cost benefits, especially for models in continuous operation.
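A simple break-even calculation illustrates the reasoning. Every figure here is an assumption invented for the arithmetic, not a real quote; substitute your own pricing and power costs:

```python
# Illustrative break-even between renting cloud GPUs and buying a server
# for a model that serves requests around the clock. All numbers are
# assumptions made for the sake of the arithmetic.

cloud_per_gpu_hour = 3.00        # $/hour to rent one cloud GPU
gpus = 4
hours_per_month = 24 * 30

server_capex = 120_000           # one-off cost of a comparable 4-GPU server
power_cooling_monthly = 1_500    # electricity, cooling, rack space
ops_monthly = 2_000              # maintenance and admin share

cloud_monthly = cloud_per_gpu_hour * gpus * hours_per_month
onprem_opex_monthly = power_cooling_monthly + ops_monthly

# Months until the avoided cloud bills cover the server purchase.
payback_months = server_capex / (cloud_monthly - onprem_opex_monthly)

print(f"cloud:   ${cloud_monthly:,.0f}/month")
print(f"on-prem: ${onprem_opex_monthly:,.0f}/month after ${server_capex:,} up front")
print(f"payback in about {payback_months:.0f} months at 24/7 utilization")
```

On these made-up numbers, a constantly busy server pays for itself in roughly two years, while a lightly used one never would, which is why utilization drives the decision.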

Smith agrees, emphasizing that the cloud is great for testing but can be costly for successful deployments. Companies might prototype AI in the cloud but transition to on-premise solutions once they confirm their value.

In the end, businesses are likely to seek AI models compatible with their existing infrastructure or what they can afford to build.