Saturday, May 24, 2025

Comparing AI Storage Solutions: NAS, SAN, and Object Storage for Training and Inference

Artificial intelligence (AI) is all about data. Businesses diving into AI, especially for large language models (LLMs) and generative AI (GenAI), must gather large volumes of data for training and store the outputs those systems generate.

But here’s the catch: That data won’t sit nicely in one place. Companies pull from various sources, combining structured data from databases with unstructured data like documents, images, audio, video, and even code. This information can live both on-premises and in the cloud.

So how do system architects manage AI’s insatiable appetite for data storage? They need to consider setups involving storage area networks (SAN), network-attached storage (NAS), and even object storage. Each type has strengths and weaknesses, and the right mix varies from one organization to the next.

AI projects today rarely draw on a single data source. Generative AI thrives on a broad array of inputs, much of it unstructured: documents, images, audio, video, and code all play a part.

“Everything about generative AI revolves around understanding relationships,” says Patrick Smith from Pure Storage. Your unstructured data feeds into the system, while vectorized data sits on block storage.

Training LLMs is about casting a wide net: the more data sources, the better. But enterprises also link LLMs to their own data, often using retrieval-augmented generation (RAG) to improve results. That data typically includes documents and records drawn from enterprise applications.
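
To make that concrete, here is a minimal RAG sketch in Python. Everything in it is a toy stand-in: embed() uses a hashing trick in place of a real embedding model, three short strings play the role of enterprise documents, and the assembled prompt would be handed to an LLM. It illustrates the pattern only, not any vendor’s implementation:

    import numpy as np

    def embed(text: str, dim: int = 64) -> np.ndarray:
        """Toy hashing-trick embedding standing in for a real model."""
        vec = np.zeros(dim)
        for word in text.lower().split():
            vec[hash(word) % dim] += 1.0
        norm = np.linalg.norm(vec)
        return vec / norm if norm else vec

    # Hypothetical enterprise documents, pre-embedded into a small "index".
    docs = [
        "Quarterly sales report for the EMEA region.",
        "Storage architecture guide: NAS for unstructured data.",
        "HR onboarding checklist for new engineers.",
    ]
    corpus = np.stack([embed(d) for d in docs])

    def retrieve(question: str, k: int = 2) -> list[str]:
        sims = corpus @ embed(question)  # cosine similarity (unit vectors)
        return [docs[i] for i in np.argsort(sims)[::-1][:k]]

    # Retrieved passages are prepended to the prompt, grounding the answer
    # in the organization's own data.
    question = "Where should unstructured data live?"
    context = "\n".join(retrieve(question))
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    print(prompt)  # this prompt would now go to the LLM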

For architects, deciding where to store this data is complex. You could keep data in its current form, but that’s not always feasible. Sometimes, you need to process data further or isolate AI applications from production systems. Plus, the process of vectorization tends to inflate data volumes significantly—sometimes tenfold—which raises demands on storage systems.
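
A quick back-of-envelope calculation shows where that inflation comes from. The chunk size, embedding dimension, and metadata overhead below are hypothetical round numbers chosen for illustration, not measurements:

    # Estimate how chunking and embedding inflate a 1 MB source document.
    doc_bytes = 1_000_000                 # one 1 MB document
    chunk_bytes = 500                     # split into 500-byte chunks
    n_chunks = doc_bytes // chunk_bytes   # 2,000 chunks

    vector_bytes = 1024 * 4               # 1,024-dim float32 embedding per chunk
    metadata_bytes = 200                  # source pointer, offsets, and so on
    per_chunk = chunk_bytes + vector_bytes + metadata_bytes  # chunk text is kept too

    total = n_chunks * per_chunk
    print(f"{total / doc_bytes:.1f}x the original size")  # ~9.6x, before index overhead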

Storage also needs to be adaptable, because an AI project’s requirements change over its lifecycle. Training demands large volumes of raw data, while inference works on less data but demands higher throughput and lower latency.
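
Some illustrative arithmetic makes the contrast clear; every figure here is a hypothetical round number, not a benchmark:

    TB = 1e12

    # Training: stream the whole dataset every epoch, so sustained
    # throughput dominates.
    dataset_bytes, epochs, window_hours = 50 * TB, 3, 24
    gb_per_s = dataset_bytes * epochs / (window_hours * 3600) / 1e9
    print(f"training needs ~{gb_per_s:.1f} GB/s sustained")  # ~1.7 GB/s

    # Inference with RAG: each query triggers a small read that sits on the
    # request path, so latency dominates. A 5 ms lookup caps one sequential
    # pipeline at ~200 queries/s, however wide the pipe is.
    lookup_ms = 5
    print(f"~{1000 / lookup_ms:.0f} queries/s per sequential pipeline")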

Typically, companies stash their unstructured data on NAS. It’s budget-friendly and easier to manage than alternatives like direct-attached storage (DAS) or block-access SAN storage. For structured data, block storage is the reigning choice, usually hosted on SANs, though small projects might just use DAS.

Performance matters. In enterprise systems like ERP and CRM, data often goes on SAN or DAS as database files. In practice, AI pulls from both SAN and NAS environments.

“AI data can live on either NAS or SAN,” says Bruce Kornfeld from StorMagic. “It’s about how the AI tools access the data.” Sometimes, a SAN is better suited, but other times, NAS can offer the speed required, especially for document- or image-heavy systems. For more demanding applications, like autonomous driving or surveillance, a SAN or high-speed local storage may be necessary.

Architects must distinguish between training and inference phases and weigh the trade-offs of moving data between storage systems against potential performance gains.

This has led some organizations to consider object storage for unifying their AI data sources. Object storage isn’t just for the cloud; on-premises versions are gaining traction too. It offers a simpler structure, lower management demands, and easier scalability at a lower cost. But historically, performance hasn’t been its strong suit, making it better for archiving than for high-speed applications.
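
The difference in access paths is easy to see in code. The sketch below contrasts POSIX file access on a NAS mount with object access through an S3-compatible API; the mount point, endpoint URL, bucket, and key are all made-up examples, and boto3 appears here only because it speaks to many on-premises S3-compatible stores as well as AWS:

    import boto3  # AWS SDK; also works with many on-prem S3-compatible stores

    # 1) NAS: the share is mounted into the filesystem, so ordinary POSIX
    #    file I/O works and most AI tooling can read the data directly.
    with open("/mnt/training-share/corpus/doc-0001.txt", "rb") as f:
        data = f.read()

    # 2) Object storage: data is addressed by bucket and key over HTTP(S).
    #    It scales out easily, but each read is a network request, which is
    #    why object stores have historically suited archives over hot paths.
    s3 = boto3.client("s3", endpoint_url="https://objects.example.internal")
    obj = s3.get_object(Bucket="training-corpus", Key="corpus/doc-0001.txt")
    data = obj["Body"].read()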

Storage providers are addressing these performance gaps. Companies like Pure Storage and NetApp are offering systems that can handle file, object, and even block storage. This flexibility allows storage managers to choose the best options without locking themselves into specific hardware.

Until we see more robust object storage systems or a shift to universal storage platforms, companies will likely continue using a mix of NAS, SAN, object, and even DAS. As AI evolves, so will the balance among these storage types. Smith at Pure has noticed a growing demand for hardware that’s focused on unstructured data, while existing infrastructure handles block and vector database needs for most clients.