Scality: AI Has Gone Agentic. Why Hasn’t Your Storage Strategy?

AI Field Day #8 was the first Tech Field Day event where I heard from the folks at Scality, and I was happily surprised to discover some new takes on storage management in this brave new era of everything Agentic AI.

Full disclosure here: I have 25+ years of Oracle DBA experience, and I’ve also built storage arrays from the ground up. I also spent two years as a database subject matter expert at Hitachi Data Systems; my experience there convinced me just how poorly most DBAs and DevOps folks understand storage technology, especially how to monitor it for optimum performance.

Scality’s approach to managing complex storage environments definitely captured my attention.

Storage Performance Tuning Isn’t Easy. Why Let Customers Fumble Through It?

Scality’s CTO Giorgio Regni introduced their approach to modern storage strategies with a clever but refreshing history lesson on how most storage providers have tried to satisfy their customers’ demands for visibility into their storage platforms over the past decade or so:

  • First, customers demanded more control over their storage environments, so providers created complex control structures with a multiplicity of “buttons” anyone could press in hopes performance would improve.
  • When that didn’t work, providers created overladen dashboards that hopefully highlighted what was actually non-performant – as long as you knew exactly where to look at the right time.
  • As dashboards proved a distraction to immediate action, providers built complex alert mechanisms that triggered red flags – often sent right to your mobile phone, because who doesn’t want an alarm at zero dark thirty to fix a failed device?
  • Finally, providers decided the real solution for tuning storage performance was to give customers the power to configure anything they wanted through complex config files.

Giorgio’s point? These approaches have given humans complete control over their storage architecture, but unfortunately that means humans have become the control plane. And that’s trapped system reliability engineers, storage administrators, and maybe even the occasional unfortunate dinosaur DBA who’s their 3rd-tier backup into making performance decisions that exceed their expertise.

Obviously, this approach is unsustainable, especially given that complex modern AI systems demand peak performance from the underlying storage, whether those workloads are intense inference, long-running model training, or a mixture of both.

An Agentic Approach: Scality ADI

Scality chose AIFD8 as their venue to announce the release of their latest product – Scality ADI, short for Autonomous Data Infrastructure. Essentially, their product presents the totality of an enterprise’s storage resources as if it were a single AWS S3 storage endpoint.

While this approach to solving storage flexibility isn’t necessarily dramatically different from some of their competitors’ offerings, what intrigued me was their implementation of their agentic AI assistant, Guardian.

A Scality ADI user can leverage the Guardian agent UI directly to perform day-to-day standard storage management operations, handle data security operations, and even perform storage tasks for which an under-experienced user has limited or no knowledge of the appropriate commands. Scality also enables more sophisticated IT shops that have already begun to embrace agentic AI to construct their own storage management UI via MCP calls.

Sure, I Could Master Storage Needs If I Just RTFM’d. But Which FM Do I Need To R?

But the brief demo of Guardian that Scality performed at AIFD8 helped me understand how I could leverage Scality ADI to perform storage management without knowing what specific commands to issue.

Whichever method is chosen to implement it, Scality ADI encapsulates 15+ years of storage management experience into a single AI-driven toolset that lets even the least experienced user invoke the proper commands to perform complex operations without the microscopic knowledge often required. And regardless of the implementation chosen, Scality ADI keeps a human in the processing loop at all times to make sure nothing incredibly foolish accidentally gets executed.

Selector AI: Bespoke Network Monitoring Tailored for AI

Any company that presents the first whiteboard image for their product at an event like AI Field Day #8 is sure to capture my immediate attention.

This was the first time I’d heard from Selector AI at a Tech Field Day event. I found them to be gutsy and independent while offering unique solutions to figuring out network issues in modern AI system architectures.

A Man’s Got To Know His Limits

Some background: I’m a reasonably experienced Oracle DBA, so I know how important a reliable network is for applications to connect to my databases.

I also know my limits.

I know I’d never be able to fill in as a backup network administrator during an irritating recurrence of a periodic latency, and definitely not during an unexpected crisis. I see the value of Selector AI’s platform as insurance against me doing something stupid after I guessed at a solution based on five minutes’ research on Stack Overflow.

It’s All Fun & Games … Until Somebody Loses a Packet

Finding the true root cause of a network performance latency often becomes a multi-vendor finger-pointing exercise, especially when up to that point in time everything appeared to be working nominally.

When you factor in the obvious – no two IT shops’ network infrastructure is truly identical, even if they used identical hardware – it can be nearly impossible to assess whether sudden latency is expected because of normal business work schedules, versus the failure of a critical hardware component or because a newbie engineer deployed an untested DNS configuration.

That’s why Selector AI built their monitoring platform with the ability to capture the specific context of each client’s network performance during what would be considered normal and acceptable operation.

Selector AI’s platform can then deploy bespoke monitoring via proprietary AI models they’ve developed to filter out the noise from millions of log posts produced from many thousands of network devices to isolate root causes effectively.

When A Single-Shot Solution … Isn’t.

Joby Rudolph, Selector AI’s distinguished engineer, demonstrated in detail how the most recent version of their platform came into being.

Their first version used a single-shot AI approach to solving network issues; it enabled an unskilled human to ask simple questions about the network’s state through a proprietary NLP chatbot. But as more robust AI infrastructure has matured in the past 18 months – especially the capabilities of Model Context Protocol (MCP) tools – Selector built the latest version of their platform around that orientation.

They realized the models already developed for detecting network issues were still valuable, but they migrated their solution from single-shot inferencing to leverage an agentic approach instead.

This latest version deploys three different types of agents to diagnose and solve a non-performant network:

  • An orchestrator agent coordinates the activities of all other agents in the stack.
  • A series of domain-specific agents tackle tasks across the network – for example, querying the health of an individual switch – and then report results cohesively to the orchestrator.
  • Domain-specific agents then leverage one or more MCP agents to obtain the results from the desired network component.
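
One rough way to picture that three-layer arrangement is the minimal Python sketch below. Everything in it is hypothetical: the class names, the `check_switch_health` tool, and its canned response are illustrative stand-ins, not Selector AI's actual API or a real MCP client.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for an MCP tool: a named callable that returns data
# about one network component. A real tool would be invoked over the protocol.
Tool = Callable[[str], dict]


def check_switch_health(switch_id: str) -> dict:
    """Illustrative tool; returns canned data rather than querying a device."""
    return {"component": switch_id, "status": "degraded", "packet_loss_pct": 2.5}


@dataclass
class DomainAgent:
    """Handles one domain (e.g. switches) by calling the tools it was given."""
    domain: str
    tools: Dict[str, Tool]

    def investigate(self, component: str) -> List[dict]:
        return [tool(component) for tool in self.tools.values()]


@dataclass
class Orchestrator:
    """Routes a question to the matching domain agent and collects the findings."""
    agents: Dict[str, DomainAgent] = field(default_factory=dict)

    def register(self, agent: DomainAgent) -> None:
        self.agents[agent.domain] = agent

    def diagnose(self, domain: str, component: str) -> dict:
        findings = self.agents[domain].investigate(component)
        unhealthy = [f for f in findings if f["status"] != "healthy"]
        return {
            "component": component,
            "findings": findings,
            "needs_attention": bool(unhealthy),
        }


orchestrator = Orchestrator()
orchestrator.register(DomainAgent("switch", {"health": check_switch_health}))
report = orchestrator.diagnose("switch", "sw-01")
```

The point of the layering is that the orchestrator never needs to know how a switch is queried; it only needs to know which domain agent to ask.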

During an (all-too-brief!) demo, Selector AI showed how a detected problem (shown as a non-green hexagon in the main monitoring UI) could be drilled into and then queried in natural language to provide advice on how to fix a particular issue.

Need Tools? Great. Got Tools? Perfect!

The three-level agentic model lets Selector AI offer bespoke solutions to each client in their portfolio, meeting each client where they are right now in their network problem resolution methodology:

  • As some clients already have a well-defined toolset for solving network issues, the Selector AI platform acts as an orchestrator to apply those tools directly.
  • Alternatively, their platform also lets a client who already has their own orchestration and solution toolsets leverage the Selector AI models to figure out what’s malfunctioning and – always with a human in the loop where necessary – solve the problem with more precise intelligence than if the client had built their own detection and severity ranking infrastructure.

Selector AI offers a brief demo of its platform that’s roughly analogous to what they showed us, so grab a look to see it in action.

Hammerspace: An AI DatApocalypse, Forestalled

The AI continuum lately feels as if it’s being warped by the sudden appearance of a hidden and super-massive black hole. Spiraling development costs have given rise to feverish discussion of tokenomics as IT organizations struggle to limit their DevOps teams from depleting a year’s worth of tokens in just a few weeks. But runaway token spends are just the most visible part of today’s AI challenges.

Again With the Data.

Our final presenter at AI Field Day #8 – Hammerspace – targeted a key dimension that IT shops must focus on when building robust AI solutions that hopefully will yield meaningful results for their applications’ end users: AI still requires humongous volumes of data to produce accurate intelligence.

Running Out Of Everything Everywhere, All At Once

Hammerspace succinctly summarized the multiple uncomfortable realities about AI resource availability being discussed in just about every C-level boardroom discussion these days:

  • Datacenter power resources are crimped because utilities are unable to increase capacity quickly enough.
  • Even if you could buy more processors, CPUs continue to be expensive.
  • Since storage is mainly SSD- and NVMe-focused, the availability of reasonably priced storage is limited, too.
  • Finally, if a shop cannot secure additional power, compute, or storage within their own confines, cloud capacity is no longer a guaranteed off-ramp because neoscalers – and even the largest hyperscalers – are nearing the maximum limits of their commodity hardware, just to support the existing workloads of their other customers.

Unifying, Rather Than Just Consolidating, Data

Hammerspace’s AI Data Platform solution thus implements a data unification strategy, rather than merely applying typical data consolidation approaches.

When a request is made for a specific set of data – for example, several thousand documents needed for additional AI training, or a vector-driven similarity search during an intense inference operation – the Hammerspace AI platform gathers those resources into the Tier 0 layer so operations can complete as quickly as possible. And when operations cease to require speedy access to resources, they can be moved intelligently to other (s)lower storage tiers.
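
To make the promote-then-demote idea concrete, here is a toy Python model. It is only a sketch under stated assumptions: the `TieredCatalog` class, the two tier names, and the idle-time rule are all invented for illustration and say nothing about how Hammerspace actually decides placement.

```python
import time
from typing import Dict, List, Optional


class TieredCatalog:
    """Toy model: track which tier each file lives in and when it was last used."""

    def __init__(self, idle_seconds: float) -> None:
        self.idle_seconds = idle_seconds
        self.tier: Dict[str, str] = {}          # path -> "tier0" or "capacity"
        self.last_access: Dict[str, float] = {}

    def add(self, path: str) -> None:
        # New files start on the slower capacity tier.
        self.tier[path] = "capacity"

    def request(self, path: str, now: Optional[float] = None) -> str:
        # A request promotes the file to Tier 0 and records the access time.
        now = time.monotonic() if now is None else now
        self.tier[path] = "tier0"
        self.last_access[path] = now
        return self.tier[path]

    def demote_idle(self, now: Optional[float] = None) -> List[str]:
        # Files on Tier 0 that have been idle too long move back down.
        now = time.monotonic() if now is None else now
        demoted = [
            path for path, tier in self.tier.items()
            if tier == "tier0" and now - self.last_access[path] >= self.idle_seconds
        ]
        for path in demoted:
            self.tier[path] = "capacity"
        return demoted


catalog = TieredCatalog(idle_seconds=60)
catalog.add("docs/report-001.pdf")
catalog.add("docs/report-002.pdf")
catalog.request("docs/report-001.pdf", now=0)
catalog.request("docs/report-002.pdf", now=50)
demoted = catalog.demote_idle(now=70)   # only the first file has been idle for 60+ seconds
```

Passing `now` explicitly keeps the example deterministic; a real system would of course weigh far more than idle time.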

Their solution essentially builds a global namespace that appears as a single storage viewpoint. Their platform stands in front of an organization’s present set of storage clusters and can access data from anywhere, regardless of whether it’s retained in local SSDs or NVMe drives – what we typically call Tier 0 – or an on-premises private cloud, or even a public cloud.

The advantage of this strategy? It treats data, wherever it might exist within multiple physical storage layers, as if they were kept within a single massive cluster.

Storage Vendor Partnerships = Even More Interesting Use Cases

Hammerspace’s customer base illustrates the attractiveness of their platform – they’ve implemented it within customers’ HPE solutions as well as neoclouds. And Hammerspace also described several quite divergent use cases, including an AI application development environment supporting data retrieval demands spanning multiple petabytes and a user base of thousands of data scientists.

Hammerspace also co-presented on how their global storage model helps Hitachi Vantara expand the capabilities of its VSP One platform for several of its AI-focused customers.

In concert with Vantara, Hammerspace’s solution was deployed across intriguing use cases, including a complex AI-based fraud detection and risk assessment application, as well as an AI “incubation hub” where the Hammerspace solution sat above VSP One block storage and communicated with an NVIDIA-powered two-node GPU cluster.

(Full disclosure: I’d actually helped launch the first iteration of the VSP in 2010 with extensive experimental research on how best to use Hitachi storage solutions to handle extreme Oracle database workloads as they consumed SSD storage bandwidth. I remember one workload was so intense that we actually melted a few SSDs in their cases.)

Storage Has Always Just “Been There”

If I were back in my role as a senior guy “in the trenches” just trying to keep a team of data engineers and data scientists productive and frustration-free, I can see exactly how valuable Hammerspace’s unified storage approach would be to us. At every IT shop I’ve worked at, the expectation is that the storage layer is transparently ubiquitous, works all the time, and never ever runs out of space.

Of course, that’s a fantasy, and with the onset of increasing AI workload demands like training and inference that are often in direct conflict with each other, anything that helps homogenize and simplify managing storage is worth a closer look.

Our team didn’t get to see how Hammerspace’s AI Data Platform worked in concert with Hitachi VSP One because our final session ran out of time. That’s not a bad sign – it just means our team of delegates found lots of great questions to ask; check out the introductory video to see what we all found interesting.

Solidigm: Anatomy of a (Murderous) Prompt

I’ve seen Solidigm present at Tech Field Day events several times since I’ve been a delegate. They always have interesting perspectives on how storage deeply influences the efficiency and flexibility of an IT shop’s capability to scale their computing resources to maximum potential.

Wait … How Many Tokens?

My most recent encounter with Solidigm at #AIFD8 prompted me to change my perspectives on AI. Kapil Karkra, Senior Principal Engineer for AI Solutions and Software, kicked off their presentations with an in-depth explanation of how a typical LLM processes a prompt to return meaningful value to a user. The novel way he explained its inner workings helped me crystallize some new realizations about generative AI inferencing:

  • Because LLMs are naturally non-deterministic, it’s nearly impossible to predict what an AI prompt will actually demand in terms of resources to answer a query.
  • It’s therefore nearly impossible to determine which resources will be needed for inference ahead of time, which also implies it’s difficult to predict how many tokens will need to be spent to retrieve those resources.
  • It’s also possible that I could issue precisely the same prompt and the LLM will issue a completely different query. I’ve seen this happen with Oracle 26ai’s SELECT AI tools, which typically use ollama LLM variants to construct SQL queries based on available metadata and instructions.
  • And unless a vector database is in use and precisely the same vector embeddings have been retrieved and then cached within database memory from a prior prompt, a new prompt can’t leverage those previously-retrieved embeddings.

The end result? A reasonably carefully prepared prompt – regardless of its simplicity! – may need to request tens of thousands of tokens to provide an answer … even though the original prompt may only consume a handful of tokens.

DBA Means Don’t Bother Asking

I have an unusual background; I’m a long-time DBA with deep background in storage technology. (I worked as a subject matter expert for Hitachi Data Systems in the early 2010s and helped launch the initial VSP.) Leveraging that perspective, I compared how a similar query would be handled by most modern databases today – for example, an Oracle 26ai database.

Bear with me as I explain the memory strategies built over the last few decades:

  • The 26ai query optimizer constructs an optimal execution plan against every table and index needed to satisfy the query using metadata about what’s stored within those objects. (In fact, it might even find that no valid answer exists based on the range of values requested, and then simply return a NULL result immediately.)
  • The database then accesses the database objects and returns only the database blocks containing the rows needed from storage.
  • Even better, if the needed blocks were already in the database buffer cache memory in the proper state, the database would just use those blocks.
  • If another user had earlier issued a similar query, the optimizer wouldn’t waste any time building a new execution plan – it would just use the one already in the database’s library cache.
  • Finally, the database blocks and the query plan itself would be cached until no longer needed; they’d be aged out of memory as more recent requests demanded memory be freed.
  • Best of all, my DevOps team can tune queries before they’re ever run. The optimizer can accurately approximate the execution plan and thus catch foolish mistakes like queries that will run nearly forever because of improper joins or lack of indexes on columns that are most often used for selection criteria.
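
The reuse-then-age-out behavior described in the list above can be sketched with a plain least-recently-used cache. This is a deliberate simplification in Python, not how Oracle actually manages its buffer cache or library cache; the `LRUCache` class, the sample SQL text, and the plan string are all made up for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Least-recently-used cache: the general idea behind aging entries out of memory."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: OrderedDict = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        """Return the cached value, or call load() once and cache its result."""
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)       # mark as most recently used
            return self.entries[key]
        self.misses += 1
        value = load()                           # stands in for a physical read or a fresh parse
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)     # evict the least recently used entry
        return value


plan_cache = LRUCache(capacity=2)
sql = "SELECT * FROM orders WHERE order_id = :id"
first = plan_cache.get(sql, lambda: "INDEX UNIQUE SCAN")    # miss: the plan is built
second = plan_cache.get(sql, lambda: "INDEX UNIQUE SCAN")   # hit: the plan is reused
```

The second call never invokes `load`, which is the whole trick: identical requests are answered from memory instead of being recomputed.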

The whole point of this optimization strategy is to minimize round trips to physical storage. That also limits expensive physical I/O and helps increase application throughput. And it’s practically the exact opposite strategy that AI inference requests use to locate needed data.

Non-Deterministic Prompts Span Different Storage Workload Patterns

Solidigm’s analysis of what their customer base is focused on for AI workloads is particularly pertinent for wringing performance from underlying cloud storage systems. Their research shows at least two key workload patterns that must be accommodated for effective inference throughput.

The first workload pattern is focused on RAG or grounding activity that’s dominated by relatively small random reads against resources typically stored within either local NVMe drives or independent objects / files.

The second AI pattern resembles a typical data warehouse workload with large-block reads and writes dominating in totally different storage stacks, starting at the key-value (KV) cache and extending into other caches depending on how recently resources were propagated for reuse into those caches.

Solidigm’s storage tiering methodology allows them to simulate what a modern database does intrinsically within its carefully-controlled memory cache structures: in essence, systematically maximizing KV cache for higher availability of resources needed to answer AI prompts over longer periods of time while still offering up sufficiently-responsive storage for RAG and grounding. It’s not a perfect solution, of course, but it takes into account how their customers’ AI workloads are evolving from earlier emphasis on model training to leveraging powerful LLMs for inference demands.

An example of the Deep Field photo from Hubble and JWST

When All Seems Dark, Look Backward, Forward, and Up

You know what I dearly love about attending Tech Field Day events? They aren’t just about humdrum, boots-on-the-ground problems we deal with every day. I found myself re-energized about what IT technology can accomplish for humanity at large after hearing more about the history of what we call The Cloud; tireless efforts to preserve trillions of web pages since the Internet’s inception; and just how tiny and precious our pale blue dot is when compared to the vastness of the universe.

The Cloud: A Brief History of Irrational Exuberance

Tom Lyon, a historian of computer science who has been programming since the early 1960s and co-founded DriveScale, stepped up to give us a totally different perspective from how most of us delegates have probably thought about the current state of all things Cloud – public, private, hyper-scaled, neo-cloud, you name it. He posited that what we call The Cloud these days actually has its genesis nearly a century ago.

The interesting part? We’re really just ending yet another cycle of what Tom and other keen observers call irrational exuberance as the famed AI hype cycle plays out.

Tom noted hyperscalers like Amazon, Google, Oracle, and Microsoft have built a house of cards around unbelievably rosy projections of AI growth possibilities, furthered by complex accounting tricks like special purpose vehicles to hide debt on their balance sheets. I commented that I’m just waiting for some 70-something investor to pop up at a shareholders’ meeting to shout at Larry Ellison, Satya Nadella, or Sundar Pichai: “Excuse me, but are you on crack?”

The Real Ministry of Truth. Apologies to Winston Smith.

The continuing degradation of objective truth in the modern world has severe implications for policy-making, corporate planning, and history at large. It’s impossible to ignore the erasure of large swaths of scientific data about everything from climate change to basic public health that continues in real time.

Without the Wayback Machine, the content of many US government websites that’s been effectively erased by the current administration in Washington would be gone for good. That’s why it’s so crucial to preserve the Internet’s billions of pages of content created over the decades since the World Wide Web was born.

We heard from Joy Chesbrough, the Internet Archive’s Chief Philanthropy Officer (now that’s a cool job title!), on how their organization is able to preserve the past digitally on a relatively tiny budget of $30M per year. It was inspiring to hear how this non-profit is able to do the sometimes-thankless work of digital preservation and what those efforts conserve for future generations.

We’ve all seen how governments, politicians, and oligarchs have attempted recently to discard uncomfortable truths – everything from the denigration of the 1619 Project’s attempts to document how slavery played a key part of the founding of the American Experiment to recommended standards for nutrition, vaccine scheduling, and disease prevention. The Wayback Machine is a crucial tool for preserving that treasure trove of human knowledge against the creep of religious conservatism, willful ignorance, and authoritarian governments.

There’s No Intelligent Life Here. Do Look Up!

It’s easy to get discouraged about the current state of humanity, but I found my faith restored during our visit to the Search for Extraterrestrial Intelligence (SETI) Institute in nearby Mountain View.

We took a tour of the SETI offices after we chatted with Dr. Christina Ricci about the humongous scope of the universe and the mediums through which they’re searching for evidence of industrialized and advanced civilizations. She explained the famous Drake Equation, which estimates just how likely we tiny humans are to find evidence of extraterrestrial intelligence, if we just look hard enough. It might even happen in our lifetimes … and it would utterly transform the way humanity thinks about itself once we know we are truly not alone in the universe.
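
For reference, the Drake Equation is usually written as a product of seven factors. This is the standard textbook form, not something reproduced from the SETI presentation itself:

```latex
N = R_{*} \cdot f_{p} \cdot n_{e} \cdot f_{l} \cdot f_{i} \cdot f_{c} \cdot L
```

Here N is the number of civilizations in our galaxy whose signals we might detect, R* is the rate of star formation, f_p is the fraction of stars with planets, n_e is the number of potentially habitable planets per such star, f_l, f_i, and f_c are the fractions of those planets that go on to develop life, intelligence, and detectable communication respectively, and L is how long such a civilization keeps broadcasting. Most of those factors are still educated guesses, which is why estimates of N vary so wildly.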

Dr. Vishal Gajjar explained how his work at SETI applies artificial intelligence – really, complex machine learning algorithms trained across exabytes of radio and visual astronomy data – to determine if there really are advanced civilizations in our galactic backyard. One of the most fascinating ideas involves looking for evidence of construction of mind-bogglingly huge structures like Dyson spheres surrounding exoplanets and stars across years of collected data in both EM and visible light spectrums.

Of course, I had to buy a couple of t-shirts to support their efforts. 🤯

VMware VCF 9.0: When All Prop Blades Work, OSS Database Performance Is Optimal

Our final day at #CFD25 dove deeply into the feature sets of the recently-released VMware Cloud Foundation 9.0 (VCF 9.0). It was refreshing to discuss the intricacies of cloud computing that didn’t immediately turn towards how many hundreds or thousands of Docker or Kubernetes containers we can deploy at scale. Instead we focused on where a lot of the real work still happens out of sight and mind: the rugged families of databases capturing the exabytes of data eventually used to create documents everyone wants their generative AI workloads to consume.

As a long-time beta tester, user, and aficionado of Oracle Cloud Infrastructure (OCI), I understood the challenges laid at VMware’s doorstep to transform their offerings: Modern IT organizations must effectively operationalize their computing, storage, and networking infrastructure. I equate these facets to an aircraft’s three-bladed propeller: If just one blade is under-performing, the power and effectiveness of the other two will be compromised as well.

Blade #1: Managing Memory To Forestall the Impending DRAMpocalypse

As our VMware presenter acknowledged, we’re currently in the throes of a “DRAMpocalypse,” so it’s never been more crucial for IT shops to manage their existing servers’ memory resources effectively. (I recently purchased a new Framework laptop – hopefully the last one I’ll ever need to buy – and the recent spike in DRAM prices was a wallet-shocker.)

Answering this DRAMpocalypse, VCF 9.0 offers advanced memory tiering features to exchange the least-active pages from DRAM to NVMe. While most modern databases provide this tiering capability via software, this actually happens within the VMware configuration itself. It’s a hypervisor-native tiering mechanism that leverages what VMware terms a Logical Memory Unit, comprised of DRAM at the top of the tier and the slower NVMe storage tiered below.

The tiering mechanism’s goal is to keep CPUs from waiting to process pages in memory. As database workloads proceed, VMs consume logical memory and the tiering software dynamically relocates the hottest pages to DRAM and switches out the colder pages to NVMe storage. The tiering algorithm takes into account the I/O access method – read-only vs. read-write – needed for operations, too.
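
As a very rough illustration of hot/cold placement, the Python sketch below ranks pages by access count and keeps the busiest ones in DRAM. It is an assumption-laden toy: `place_pages`, the page numbers, and the counts are invented, and it ignores the read-only vs. read-write distinction mentioned above. VMware's actual algorithm is certainly more sophisticated.

```python
from typing import Dict, Set, Tuple


def place_pages(access_counts: Dict[int, int], dram_slots: int) -> Tuple[Set[int], Set[int]]:
    """Put the most frequently accessed pages in DRAM; the rest go to NVMe."""
    ranked = sorted(access_counts, key=lambda page: access_counts[page], reverse=True)
    dram = set(ranked[:dram_slots])
    nvme = set(ranked[dram_slots:])
    return dram, nvme


# Hypothetical access counts per page number over some sampling window.
counts = {101: 950, 102: 3, 103: 400, 104: 12, 105: 610}
dram, nvme = place_pages(counts, dram_slots=3)
```

With three DRAM slots, the three busiest pages stay close to the CPU and the two rarely touched pages are the ones that land on NVMe.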

VMware claims this tiering method at least doubles the effective use of memory and returns a corresponding 40% reduction in TCO because the hottest pages are placed essentially closer to the CPU. The tiering algorithm is configured automatically so it doesn’t need constant monitoring for effectiveness.

Again, this isn’t a revolutionary concept – Oracle Database 12c implemented this feature 10+ years ago – but since the memory management is native to the hypervisor itself, less sophisticated or open source databases like MySQL or Postgres can take advantage of these performance enhancements.

And since this strategy ensures that the hottest pages aren’t being constantly exchanged between DRAM and NVMe, there’s also a side benefit: the potential to extend NVMe useful life by preventing extensive read/write operations over time.

Finally, several data encryption security features are supported, and it can be deployed at either the host or VM level. Check out this detailed video demonstrating these features, and here are the deeper details from VMware: https://blogs.vmware.com/cloud-foundation/vcf-advanced-memory-tiering/

Blade #2: Managing MySQL, Postgres, and SQL Server Databases with Data Services Manager 9 (DSM)

I thought it was pretty gutsy for VMware to show up in a room populated with several experienced DBAs from at least three database families – SQL Server, Oracle, and MySQL – to talk about the second prop blade: Data Services Manager 9 (DSM).

DSM offers full support for MySQL and Postgres – two of the most popular open-source databases these days – as well as SQL Server. DSM gives vSphere administrators a central management portal to manage and control related resources via specific data service policies and infrastructure policies that limit access to database resources to specific users.

Infrastructure policies make it simple to grant privileges to qualified users – perhaps a trusted DevOps resource, or a junior DBA – to deploy clones of existing production databases, even permitting deployment of prior versions of database engines (releases) for researching issues related to prior releases.

Our VMware presenter also demonstrated how to deploy resources to support a MySQL database through DSM, including the ability to quickly deploy a clustered MySQL environment – a non-trivial exercise – with just a few mouse clicks.

Finally, as an experienced DBA, let me assure you if your DBAs aren’t constantly fretting about backing up your organization’s crucial databases – which should include development and staging databases! – then you haven’t got the right people on staff. VMware showed how DSM 9 made it simple to enable backup strategies, including selection of the appropriate storage targets for backup files.

I did probe our presenters about preserving Transparent Data Encryption (TDE) for MySQL databases. TDE is a particularly valuable feature for MySQL environments; it ensures data is truly encrypted within the database itself. This implies that any database blocks within encrypted tablespaces remain encrypted when they’re backed up, which eliminates a potential vector for discovering or accessing data.

Here’s a detailed look at this set of features; you can watch our CFD25 delegates’ spirited questioning too.

Blade #3: Tying It All Together Within Virtual Private Clouds (VPCs)

I’ll admit that the final blade on the prop – networking – tends to be the least interesting (and thus most often ignored) feature for data engineers and experienced DBAs.

Deploying the network infrastructure to support a modern database with proper restrictions is crucial to keeping data secured properly within any application and database environment. (I’ve recently struggled to set up relatively complex networking within OCI environments, so trust me: if networking isn’t in your regular wheelhouse, this operation is potentially error-prone.)

VMware demonstrated how VCF Networking NSX services made short work of building out robust public / private network infrastructure within a Virtual Private Cloud (VPC) in a matter of minutes without having to worry about choosing exactly the right IPv4 addresses to make everything work. A particularly useful feature: VCF will not allow subnets to be deployed within overlapping CIDR address blocks accidentally, thus ensuring network communication isn’t compromised by face-palm-level mistakes.
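
If you want to see what an overlap check like that amounts to, Python's standard `ipaddress` module can do it in a few lines. This is only a sketch of the general technique; the `find_overlap` helper and the sample address blocks are made up, and it has nothing to do with how VCF implements its own validation.

```python
import ipaddress
from typing import List, Optional


def find_overlap(existing: List[str], candidate: str) -> Optional[str]:
    """Return the first existing CIDR block that overlaps the candidate, or None."""
    new_net = ipaddress.ip_network(candidate)
    for cidr in existing:
        if ipaddress.ip_network(cidr).overlaps(new_net):
            return cidr
    return None


subnets = ["10.0.0.0/24", "10.0.1.0/24"]
clash = find_overlap(subnets, "10.0.1.128/25")   # sits inside 10.0.1.0/24
clear = find_overlap(subnets, "10.0.2.0/24")     # does not touch either block
```

A provisioning tool would refuse the first candidate and accept the second, which is exactly the kind of guardrail that saves a non-networking person from themselves.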

What wasn’t readily apparent was how to ensure that particular ports within IP addresses are blocked or opened. To their credit, our VMware colleagues explained the best way to guarantee that protection level was to deploy their vDefend toolset to control port-level permissions.

Here’s a detailed look at this offering from VMware’s perspective, and here’s our delegates’ in-depth discussions and questions.

Conclusion: When All Blades Work …

Overall, VCF 9.0 looks to me like a full-featured yet evolving toolset valuable for open-source database management and corresponding support for application development and production deployments. Its self-service features mean IT shops can relegate complex performance monitoring, database environment management, and networking to reasonably qualified or junior team members without incurring significant risks of self-harm.

Hammerspace: Metadata Will Be Assimilated, and Their Uniqueness Added to AI’s Demands

Cloud Field Day 25 (#CFD25) was my first introduction to Hammerspace’s unique technology offerings specifically aimed at managing workloads in today’s modern cloud environments.

As a long-time Oracle DBA with a deep background in storage technology, I found their analysis of what their customer base is focused on to be particularly pertinent for wringing performance from underlying cloud storage systems. Meeting the demands beyond modern OLTP applications and business intelligence dashboards – specifically, the impact of generative and agentic AI requirements – requires new ways of handling diverse I/O patterns and compute demand cycles, especially for AI training and inference operations.

Leaving Unstructured Data Where It Already Lives. Well, Most of It.

First, I was pleasantly surprised to see Hammerspace lead with an example of their AI Data Platform technology assisting an environment I know quite well: Oracle Cloud Infrastructure (OCI). I quite literally had to rub my eyes in relief, since most Tech Field Day presenters usually avoid Oracle technology like the plague. But OCI plays extremely well with advanced storage systems, whether a database is accessing flat files like tables in ORGANIZATION EXTERNAL mode or when chunking and transforming documents and images into its 26ai VECTOR datatype for generative AI purposes.

Of course, it’s no secret that AI workloads often demand huge compute and GPU resources to be brought to bear to tackle both training and inference operations against equally ginormous unstructured data sources. Instead of migrating terabytes or petabytes of these data to cloud storage, Hammerspace’s Global File System (GFS) enables extremely efficient access to unstructured data for massive, parallel cloud-based compute because it decides exactly which files should be moved to the fastest intra-cloud storage for handling while leaving the majority of unneeded files in their current location.

Those Metadata Matter.

The trick, of course, is selecting only the files most likely to benefit – and that means looking closely at each file’s metadata to determine which files should be gathered from on-premises data and brought to the public cloud to take advantage of VMs, NVMe storage, and GPUs.

Hammerspace’s solution involves first deploying an Anvil server that interrogates unstructured files’ metadata to assimilate their key attributes as a cone file system. One big advantage of this strategy is that files can span multiple environments, mount points, and file systems to capture a holistic view of all possible targets.

Once all mount points, file systems, directories, and individual files are assimilated, that information is captured so its Data Services Server (DSX) can then use it to intelligently gather the files that would benefit most when processed for a particular workload’s demand.
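As a rough illustration of metadata-driven selection, the sketch below picks files for a fast tier by tag and recency until a capacity budget is used up. Everything in it is hypothetical: the FileMeta fields, the select_for_workload function, and the sample catalog are my own assumptions, and none of it reflects how Anvil or DSX actually make these decisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical metadata record for one assimilated file."""
    path: str
    size_bytes: int
    days_since_access: int
    tags: frozenset


def select_for_workload(catalog, required_tag, budget_bytes):
    """Pick tagged files, most recently accessed first, within a size budget."""
    candidates = [f for f in catalog if required_tag in f.tags]
    candidates.sort(key=lambda f: f.days_since_access)
    chosen, used = [], 0
    for f in candidates:
        if used + f.size_bytes <= budget_bytes:
            chosen.append(f)
            used += f.size_bytes
    return chosen


# Made-up catalog spanning two mount points.
catalog = [
    FileMeta("/onprem/a/train_001.parquet", 400, 1, frozenset({"training"})),
    FileMeta("/onprem/b/archive_2019.tar", 900, 700, frozenset({"archive"})),
    FileMeta("/nas/c/train_002.parquet", 500, 3, frozenset({"training"})),
    FileMeta("/nas/c/train_003.parquet", 300, 10, frozenset({"training"})),
]

selected = select_for_workload(catalog, "training", budget_bytes=1000)
print([f.path for f in selected])
```

The point is simply that the decision is made from metadata alone; no file contents are read or moved until the selection is done.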

And before you ask: While the extensive demos we saw at #CFD25 were all GUI-based, Hammerspace also offers access to all these operations through API calls made via their Hammerspace development toolkit (HSTK) or their HSCLI command line interface.

Result: A Logical Tier 0 NVMe-Based Storage Layer, With File Resiliency Built-In

Once assimilation is complete, the Hammerspace AI Data Platform solution enables instantiation of sufficient VMs with direct access to GPU processors. Each VM also accesses and gathers the files needed to complete the training or inference tasks at hand. Files are placed within Hammerspace’s proprietary NVMe-based logical Tier 0 storage layer. I/O operations then proceed apace in parallel and without the limitations of some files being stored on slower, less responsive storage systems.

Obviously, file replication can be a time- and resource-consuming process. To guard against the possible loss of data after the potentially lengthy process of replication, Hammerspace’s solution also ensures files are spread across Availability Zones (AZs) to avoid placing a file image onto two NFS file buckets within the same risk zone.
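The placement rule described above (no two copies of a file in the same risk zone) can be sketched in a few lines. This is only an illustration under my own assumptions: the place_replicas function, the target names, and the zone labels are hypothetical and say nothing about Hammerspace's actual placement logic.

```python
def place_replicas(targets, copies):
    """Choose `copies` storage targets so that no two share a zone.

    `targets` is a list of (target_name, zone) pairs, in preference order.
    Raises ValueError if there are not enough distinct zones.
    """
    chosen, used_zones = [], set()
    for name, zone in targets:
        if len(chosen) == copies:
            break
        if zone not in used_zones:
            chosen.append(name)
            used_zones.add(zone)
    if len(chosen) < copies:
        raise ValueError("not enough distinct zones for the requested copies")
    return chosen


# Made-up targets: two buckets share zone az-a, one lives in az-b.
targets = [("nfs-1", "az-a"), ("nfs-2", "az-a"), ("nfs-3", "az-b")]
placement = place_replicas(targets, copies=2)
print(placement)
```

With these inputs the second copy skips nfs-2, because it shares a zone with nfs-1, and lands on nfs-3 instead.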

Even Meta Needs Meta(Data)

As someone who has manually built storage arrays for fastest database I/O processing and worked with high-speed and high-capacity systems for storage access like Hitachi’s Universal Storage Platform and Oracle’s Exadata Database Machine, this strikes me as an elegant solution to the varying demands of generative AI and agentic AI workloads to get compute and GPU resources closest to the right data needed to solve crucial business problems at scale.

And Hammerspace reports that one of their largest customers – Meta (yes, that Meta, as in Facebook, Instagram, et al.) – uses their solution to manage access to 40PB of unstructured data in production, with plans to expand that footprint to 100PB in the future. Meta’s use case also leverages several thousand servers accessing tens of thousands of GPUs.

Glean Insights & Value from Unstructured Data With Qlik Answers

I’ve found that every once in a while, it doesn’t hurt to see what everyone else is doing in the same technology space I’m currently focused on. For the past 18 months, that’s been the Generative AI space and the impending implementation of Agentic AI across diverse industries and applications.

Getting to Generative AI: Like Learning a New Foreign Language

Full disclosure: I’m an Oracle DBA with 25 years of experience in data engineering and 45 years of experience in application programming. Lately I’ve focused on building out simple Generative AI and Retrieval Augmented Generation (RAG) chatbot applications with Oracle Database 23ai technology and Oracle Application Express (APEX) within the Oracle Cloud Infrastructure (OCI) public cloud. That meant learning how to use LLMs to chunk and create embeddings for a corpus of documents, how to perform cosine similarity searches against vectorized content, and how to prepare appropriate system prompts within a chatbot framework to return cogent answers from that corpus based on questions asked.
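For anyone who hasn't seen a cosine similarity search up close, here is a minimal pure-Python sketch of the idea. The three-element vectors are hand-made toy values, not real embeddings, and the function names are my own; in a real build the embeddings come from a model and the database's vector search does this work for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=2):
    """Return the text of the k chunks most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "embeddings" for three document chunks (illustrative values only).
chunks = [
    ("refund policy", [1.0, 0.0, 0.0]),
    ("shipping times", [0.0, 1.0, 0.0]),
    ("returns and refunds", [0.9, 0.1, 0.0]),
]

results = top_k([1.0, 0.0, 0.0], chunks, k=2)
print(results)
```

The chunks returned by a search like this are what get handed to the LLM, along with the system prompt, as context for answering the question.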

This was a decidedly non-trivial task – it took me several weeks to master these concepts and then build demos that yielded relatively hallucination-free answers, and it was at least two months before I felt I could comfortably present my work to colleagues at user group conferences. I came away with a new respect for the depth of knowledge required to deliver qualified answers from LLMs and Generative AI applications.

Qlik Connect 2025: Qlik Answers

Back in May 2025 I had a chance to take a close look at the latest version of Qlik Answers for developing Generative AI solutions. While at Qlik Connect, I spoke with executives and developers about their vision for capturing valuable business insights into their customers’ data, especially if it was unstructured information strewn across thousands of pages of documents.

The folks at Qlik granted me a trial account and I dove into what Qlik Answers could achieve. I was pleasantly surprised that it was a relatively straightforward path to construct a chatbot that could search through several hundred pages of documents from multiple sources – scholarly papers, digital news reports, blog posts by reputable authors – to return cogent answers to business questions.

What impressed me was how quickly this all came together: Importing my corpus of nearly 30 documents, indexing them for use, and constructing a basic chatbot that could chew on the corpus to provide answers took less than 15 minutes.

Handling Outliers Is What Matters

I’m not some starry-eyed dreamer about AI capabilities; indeed, my earlier work with generative AI had yielded some surprising confabulations depending on what questions I asked of my chatbot.

Thus my evaluations included some tricks I’d learned during my prior experiences – things like prompt injection attempts and even hiding system prompt overrides within a source document. I discovered that Qlik Answers was able to handle the twists I threw at it quite well without any additional fine-tuning.

Over to You: Have An In-Depth Look At Qlik Answers

Obviously, this brief blog post isn’t going to convince anyone of how well Qlik Answers performed during this process, so please have a look at my complete evaluation of this tool. It contains detailed screenshots and explanations of every step I took, including references to each document I used as a source for my corpus so you can quickly run similar evaluations. Of course, please feel free to post comments to let me know what you have discovered.

Reports of My Blog’s Demise Are Premature

Never thought THIS would happen!

I realize that it’s been an incredibly long time since I’ve last posted here, but I have a plethora of reasons – and no, it’s not because an AI agent has taken over my job.

  • Helping run events for the Analytics and Data Oracle User Community (AnDOUC), including my latest role as User Community Organizer, has taken quite a bit of time since our major event, Summit 2025, last April. I’m responsible for a majority of the production of their monthly TechCast episodes. (Shameless promotion: If you like Oracle technology and like presenting on it virtually, consider submitting a virtual session.)
  • I’ve traveled to several Oracle User Group events this past year as well, including nlOUG in the Netherlands and various other smaller events in the USA. In fact, I’m off to Vancouver, BC this Friday to keynote on the state of Generative and Agentic AI in our industry, so if you’re nearby, please consider coming to hear what I’ve got to report.
  • And finally, I’ve been working closely with The Futurum Group and their Tech Field Day events team to expand my portfolio as an expert in the field of data engineering and possibly even get recognized as an influencer in this space. (Hey, if it’s on a badge, it must be true, right?)

I promise to keep up with blog posts in the future – keep an eye out for an upcoming lengthy article on my exploration of the latest features of Qlik Answers in just a few weeks!

GenDev, Meet MultiOpenCloud: Oracle Cloud World 2024

Something was different this year at Oracle Cloud World 2024: There was a buzz this time that just wasn’t there in the last two OCWs I’d attended. It may have been the ubiquitous presence of Generative AI in just about every event I attended, or acceptance that it’s not the flash in the pan some prognosticators are predicting.

Keynote: One Cloud, One World, Zero Humans

It wouldn’t be OCW without Larry Ellison’s keynote, and he didn’t disappoint us with some surprises. For starters, Ellison announced that AWS and Oracle are partnering to build Exascale technology directly within AWS data centers. This will alleviate some sticking points for folks who want to run Oracle database technology and services within AWS environments. (I didn’t have that one on my OCW24 bingo card!)

The total number of private and public Oracle Cloud sites will top 160 near the end of this year, with the largest one projected to exceed 1 gigawatt for power requirements. Ellison has mentioned that he’s even considering using small modular reactor (SMR) technology to power the largest of those planned.

Ellison also talked about the need to dramatically increase the security of any public/private cloud, as Oracle has done within Autonomous Database (ADB). In fact, he said, Oracle will be moving all Oracle Applications within their Cloud environments into ADB to ensure the security of autonomous systems.

Passwords are a terrible idea – too easy to steal or compromise! – and are the key infiltration points for any cyberattack. The solution, Ellison said, is to use biometrically authenticated logins for all access to the Oracle Cloud.

Finally, I was thrilled to hear him give a shout-out for Application Express (APEX) as the premier low-code application generation environment, given its capability to build applications from schema to screen with only minimal DBA or developer involvement.

Keynote: Oracle 23ai, APEX, and GenDev

Juan Loaiza, Oracle EVP for Mission Critical Database Technologies, dived in quite a bit deeper during his keynote into what generative AI portends for the future of intelligent application development, which he called Generative Development (GenDev for short).

Juan showed us how APEX – Oracle’s premier low-code application environment – can right now generate a complete application, including the database schema, application pages, and user interfaces with just a few natural language prompts to generative AI. I was glad to see that JSON Relational Duality is one of the cornerstones for removing an inexperienced data engineer or under-qualified Oracle DBA from the schema design process.

Later that day at our Oracle ACE dinner, I had a chance to chat with Juan about his keynote to let him know I really appreciated his perspective on GenDev, and also thank him for all his support for the Oracle ACE Program over the past several years. Juan, ever the humblest C-level executive I’ve met, told me he thought his session was a bit boring. I explained that to the contrary, it was one of the most informative sessions I’d seen recently, as it foretold a future of much simpler application development on the near horizon.

My Sessions: RAG, APEX, AIM, RAS

Turnout for every single one of my three sessions was surprisingly high at OCW24. I led off with a talk on how I’d built a prototype RAG solution for a stalled social media campaign, including all the steps to complete the APEX app, from gathering a corpus to my test results revealing some surprising hallucinations. (You can check out that session’s demo video here.)

I also teamed up with Andy Rivenes, the PM for Database In-Memory technology, to show off the latest features of 23ai Database in that realm. And my colleague and fellow ACE Director Karen Cannell and I teamed up at the last theatre session of the conference to show off the tricks we’d learned over the past few months when we deployed Real Application Security within a complex government agency’s environment.

All Right, Mr. DeMille, I’m Ready for My Closeup

Since I’m an Oracle ACE Director, I spent a lot of time on the exhibition floor: at the Oracle ACE Lounge, the Database Swag + Sweets booth, and (my personal favorite!) the Generative AI hologram station that Paul Parkinson from Oracle had built (video on that here).

I’m working on making my events reporting much more dynamic, so this year I brought along a new steady-cam camera rig and remote Bluetooth microphone that let me record videos while walking around the floor and interacting with my fellow ACEs and Oracle staffers. My YouTube OCW24 playlist has a few other impressions too – please have a look, and I’d appreciate some feedback!

OCW25: Predictions

Finally, I’d be remiss if I didn’t hang out over my skis a bit and make some predictions for Oracle Cloud World 2025 next year:

  • Generative AI will become more focused on generating full-blown applications with even more limited involvement from DBAs, data engineers, and DevOps … which means we all should consider improving our prompt engineering skillsets to stay relevant.
  • LLMs will improve to the point where worrying about things like the proper chunk size for corpus documents will be unnecessary.
  • Security will still be a major focus; the only real unknowns will be how seriously huge the next breach will be and how many millions (or billions!) of people it will affect.