Technology

Agentic AI: Storage and ‘the largest tech refresh in IT history’


With agentic artificial intelligence (AI), we could be facing the largest tech refresh event in history, where every organisation might deploy as many as 2,000 agents per employee.

And to meet that need, the entire IT infrastructure – and storage in particular – will be affected.

These are the views of Jeff Denworth, co-founder of VAST Data, who talks in this podcast about the challenges agentic AI infrastructure poses for IT departments, the challenges agentic AI poses for storage, and how customers can begin to meet those challenges across their datacentres and the cloud.

This includes being very careful to clearly specify and provision infrastructure while not over-buying, as well as ensuring storage and compute work hand in hand with application architectures and database teams.

What challenges does agentic AI pose for IT infrastructure?

It’s a really broad question. But, to start, I think it’s important to point out that this is in some respects an entirely new form of business logic and a new form of computing.

And so, the first question becomes, if agentic systems are reasoning models coupled with agents that perform tasks by leveraging reasoning models, as well as different tools that have been allocated to them to help them accomplish their tasks … these models need to run on very high-performance machinery.

Today’s AI infrastructure generally runs best on GPUs [graphics processing units] and other types of AI accelerators. And so, the first question becomes, how do you prepare the compute infrastructure for this new form of computing?



And here, customers talk about deploying AI factories and RAG [retrieval augmented generation], and AI agent deployment tends to be the initial use case people think about as they start to deploy these AI factories.

These are tightly coupled systems that require fast networks interconnecting very, very fast AI processors and GPUs, and then connecting them to the different data repositories and storage resources you might want to feed those agents with.

The interesting thing about agentic infrastructure is that agents can ultimately work across a number of different datasets, and even in different domains. You have essentially two types of agents – workers, and other agents, which are supervisors or supervisory agents.

So, maybe I want to do something simple like develop a sales forecast for my product while reviewing all the customer conversations and the different databases or datasets that could inform my forecast.

Well, that would take me to having agents that work on and process a number of different independent datasets that might not even be in my datacentre. A great example is if you want something to go and process data in Salesforce, the supervisory agent could use an agent that has been deployed within Salesforce.com to go and handle that part of the business system it wants to process data on.
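
To make that supervisor/worker pattern concrete, here is a minimal Python sketch of a supervisory agent fanning a task out to worker agents that each own one independent dataset. The class names, datasets and results are hypothetical placeholders, not any particular agent framework.

```python
# Minimal sketch of the supervisor/worker agent pattern described above.
# All names and datasets here are hypothetical placeholders.
from dataclasses import dataclass


@dataclass
class WorkerAgent:
    name: str
    dataset: str  # the independent data source this worker is scoped to

    def run(self, task: str) -> str:
        # A real worker would call a reasoning model plus the tools allocated
        # to it (for example, an agent deployed inside Salesforce).
        return f"{self.name}: partial result for '{task}' from {self.dataset}"


class SupervisorAgent:
    def __init__(self, workers: list[WorkerAgent]):
        self.workers = workers

    def run(self, task: str) -> str:
        # Fan the task out to each worker, then combine the partial results.
        partials = [worker.run(task) for worker in self.workers]
        return "\n".join(partials)


supervisor = SupervisorAgent([
    WorkerAgent("crm-agent", "customer conversations in the CRM"),
    WorkerAgent("warehouse-agent", "sales history in the data warehouse"),
])
print(supervisor.run("Develop a sales forecast for next quarter"))
```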

So, the first question becomes, how do you define this pipeline? How do you scope out all the various data sources you may want to process on? How do you size for what you’d consider a nominal operational workload, so that you’ve got enough compute resources for the steady state?

There are so many different facets of decision-making that come into play when people think they want to start deploying agentic workloads
Jeff Denworth, VAST Data

And then, the compute discussion takes you down the path of datacentre and power infrastructure readiness, which is a whole different kettle of fish, because some of these new systems – for example, the GB200 NVL72 systems from Nvidia – are very tightly coupled racks of GPUs with very fast networks between them. These require something like 120kW per datacentre rack, which most customers don’t have.
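
As a rough back-of-envelope illustration of why that figure matters, the short sketch below compares a 120kW AI rack against an assumed legacy rack budget and facility power budget; only the 120kW figure comes from the interview, the other numbers are assumptions for illustration.

```python
# Back-of-envelope power check. Only the 120kW-per-rack figure comes from the
# interview; the legacy rack and facility budgets are assumed for illustration.
AI_RACK_KW = 120           # tightly coupled GPU rack (GB200 NVL72 class)
LEGACY_RACK_KW = 15        # assumed power budget of a typical enterprise rack
FACILITY_BUDGET_KW = 600   # assumed power available for the AI deployment

print(f"One AI rack draws roughly {AI_RACK_KW // LEGACY_RACK_KW}x a typical legacy rack")
print(f"A {FACILITY_BUDGET_KW}kW budget supports {FACILITY_BUDGET_KW // AI_RACK_KW} AI racks")
```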

And then you start working through the questions of: what are my GPU requirements, and where can I deploy them? In a colo? In a datacentre I already have? Is it potentially hosted in some cloud or neo-cloud environment? Neo-clouds are the new AI clouds born in the era of AI. There are so many different facets of decision-making that come into play when people think they want to start deploying agentic workloads.

What are the key challenges for storage infrastructure, in particular, in agentic AI?

Well, just as with the first question, it’s really multidimensional.

I think the first thing to size up is: what is storage in agentic AI? And this is something that has radically changed since people started training AI models. Most people typically worked under the assumption that if you have a good, fast file system, that’s sufficient. The difference here is that when people are training in the AI sense, or even fine-tuning, these are typically very well-curated datasets that get fed into AI machinery, and you wait a few hours or a few days, and out pops a new model.

And that’s the extent of interaction you have with the underlying storage systems, other than that storage system also needing to capture intermittent checkpoints, so that if the cluster fails, you can recover from some point in time in a job and start over.
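
For readers who have not seen that pattern, the sketch below shows the checkpoint-and-recover loop in schematic form. It uses plain Python and pickle as stand-ins; a real training job would use its framework’s checkpoint API writing to the shared storage system.

```python
# Schematic checkpoint-and-recover loop; pickle stands in for a real
# framework checkpoint API writing to shared, durable storage.
import os
import pickle

CKPT_PATH = "checkpoint.pkl"  # would live on the shared storage system


def save_checkpoint(step: int, state: dict) -> None:
    # Write to a temp file and rename, so a crash mid-write never corrupts
    # the last good checkpoint.
    tmp = CKPT_PATH + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump({"step": step, "state": state}, f)
    os.replace(tmp, CKPT_PATH)


def load_checkpoint() -> tuple[int, dict]:
    if not os.path.exists(CKPT_PATH):
        return 0, {}
    with open(CKPT_PATH, "rb") as f:
        ckpt = pickle.load(f)
    return ckpt["step"], ckpt["state"]


start_step, state = load_checkpoint()   # recover from the last point in time
for step in range(start_step, 1_000):
    state["weights"] = step * 0.001     # stand-in for a real training update
    if step % 100 == 0:                 # intermittent checkpoints
        save_checkpoint(step, state)
```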

If you think about agents, a user gets on a system and makes a prompt, and that prompt sends the agent off to do some almost unpredictable level of computing, where the AI model will then go and look to work with different auxiliary datasets.

And it’s not just typical storage, like file systems and object storage, that customers need. They also need databases. If you saw some of the announcements from Databricks, they’re talking about how AI systems are now creating more databases than humans are. And data warehouses are particularly important as AI agents look to reason across large-scale data warehouses.

So, anything that requires analytics requires a data warehouse. Anything that requires an understanding of unstructured data not only requires a file system or an object storage system, but also a vector database to help AI agents understand what’s in those file systems, through a process called retrieval augmented generation.
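
The sketch below shows that retrieval step in schematic form: documents from a file or object store are embedded into vectors, indexed, and the closest matches are handed to the reasoning model as context. The character-frequency embedding and in-memory index are toy stand-ins for a real embedding model and vector database.

```python
# Toy retrieval augmented generation (RAG) retrieval step. The embedding and
# index below are deliberately simplistic stand-ins for a real embedding
# model and vector database.
import math


def embed(text: str) -> list[float]:
    # Toy embedding: normalised character-frequency vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


# "Index" documents pulled from the file or object store.
documents = {
    "q3_report.txt": "Quarterly sales grew strongly in the EMEA region",
    "support_log.txt": "Customer reported a login failure on Tuesday",
}
index = {name: embed(text) for name, text in documents.items()}


def retrieve(query: str, k: int = 1) -> list[str]:
    qvec = embed(query)
    ranked = sorted(index, key=lambda name: cosine(qvec, index[name]), reverse=True)
    return ranked[:k]  # these documents become context for the reasoning model


print(retrieve("How did sales do last quarter?"))
```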

The first thing that needs to be wrestled down is a reconciliation of the fact that there are all sorts of different data sources, and all of them need to be modernised, or made ready, for the AI computing that’s about to hit them.

I like to look at what’s changed and what hasn’t changed in the market. It’s true that all sorts of new applications are being deployed that use reasoning agents, and they use reasoning models as part of their business logic. But there are also a lot of legacy applications that are now being up-levelled to support this new type of AI computing.

And so, our general conclusion is that every single enterprise application in the future will have some component of AI embedded into it. And there will be a whole bunch of new, AI-centric applications that we haven’t planned for or that don’t exist yet.

The common thread is that this new style of computing is happening at the application level, on a new type of processor that historically was not popular within the enterprise – a GPU or an AI processor. But the thing people don’t realise is that the datasets they’ll be processing are largely historical data.

So, while the opportunity to modernise a datacentre is greenfield at the application level and at the processor or compute level, [there is] the brownfield opportunity to modernise the legacy data infrastructure that today holds the value and the knowledge these AI agents and reasoning models will look to process.

We may be embarking on what could be the world’s largest technology refresh event in history
Jeff Denworth, VAST Data

Then the question becomes, why would I modernise, and why is this important to me? That’s where scale comes back into the equation.

I think it’s important to checkpoint where we’re at with agentic workflows and how they will affect the business. It’s fair to say that almost anything that’s routine, or a process-bound way of doing business, will be automated as much as humanly possible.

There are now examples of many organisations that aren’t thinking about a few agents across the enterprise, but hundreds of thousands, and in certain cases, hundreds of millions of agents.

Nvidia, for example, made a very public statement that it is going to deploy 100 million agents over the next few years. And that would be at a time when its organisation will be maybe 50,000 employees. Now, if I put those two statements together, what you have is roughly a 2,000-to-one AI agent-to-employee ratio that you might think about planning for.

If this is true, a company of 10,000 employees would require large-scale supercomputing infrastructure just to process this level of agency. So, I think about it in terms of what the drivers are to modernise infrastructure. If just half, or a fraction, of this level of AI agent scale starts to hit a typical enterprise, then every single legacy system holding its data will be incapable of supporting the computational intensity that comes from this level of machinery.
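
The arithmetic behind that ratio, and what it implies for the 10,000-employee example, is easy to lay out:

```python
# The agent-to-employee arithmetic quoted above.
nvidia_agents = 100_000_000   # public statement cited in the interview
nvidia_employees = 50_000     # approximate headcount cited in the interview
ratio = nvidia_agents // nvidia_employees   # 2,000 agents per employee

company_employees = 10_000    # the hypothetical enterprise in the example
print(f"Ratio: {ratio}:1")
print(f"A {company_employees:,}-employee company would plan for ~{ratio * company_employees:,} agents")
```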

And this is what has us thinking we may be embarking on what could be the world’s largest technology refresh event in history. Probably the most recent one before AI hit the market was virtualisation, which created new demands at the storage and database level. The same appears to be true for AI, as the customers we work with start to rethink data and storage infrastructure for large-scale agentic deployment.

How can customers ensure their infrastructure is up to the job for agentic AI?

It definitely requires some level of focus and understanding of the customer workload.

But one of the things I see happening across the market is over-rotation, where infrastructure practitioners might not necessarily understand the needs that come from new business logic or AI research.

And so, they tend to overcompensate for the unknown. That’s also quite dangerous, because it creates a bad taste in the mouth for organisations that are starting to ramp into different AI initiatives when they realise: OK, we overbought here, we bought the wrong stuff there.

The first thing I’d say is that there are best practices out in the market that should definitely be adhered to. Nvidia, for example, has done a terrific job of helping articulate what customers need and sizing according to different GPU definitions, so that they can build infrastructure that’s general-purpose and optimised, but not necessarily over-architected.

The second thing I’d say is that hybrid cloud strategies definitely need to be reconciled, not only for infrastructure-as-a-service – do I deploy in my datacentre? do I deploy some stuff in different AI clouds or public clouds? – but also for different SaaS [software-as-a-service] services.

The reason is that a lot of agentic work will happen there. You now have, for example, Slack, which has its own AI services in it. Pretty much any major SaaS offering also has an AI sub-component that includes some number of agents. The best thing to do is sit down with the application architects’ team, which a lot of our storage customers don’t necessarily have a close connection to.

The next thing is to sit down with the database teams. Why? Because enterprise data warehouses need to be rethought and reimagined in this world of agentic computing, but also new kinds of databases are required in the form of vector databases. These have different requirements at the infrastructure and compute level, as well as at the storage level.

Finally, there needs to be some harmonisation around what will happen with the datacentre and across different clouds. You need to talk to the different vendors you work with, who have a whole practice of helping people with this.

We’ve got something like 1.2 million GPUs that we’ve been powering around the world, and there are all sorts of interesting approaches to not only sizing, but also future-proofing data systems by understanding how to continue to scale if different AI projects stick and prove to be successful.