
Cloud Field Day 25 (#CFD25) was my first introduction to Hammerspace’s unique technology offerings, which are aimed squarely at managing data and workloads in today’s cloud environments.
As a long-time Oracle DBA with a deep background in storage technology, I found their analysis of their customers’ priorities particularly pertinent to wringing performance from underlying cloud storage systems. Meeting demands beyond modern OLTP applications and business intelligence dashboards – specifically, the impact of generative and agentic AI requirements – calls for new ways of handling diverse I/O patterns and compute demand cycles, especially for AI training and inference operations.
Leaving Unstructured Data Where It Already Lives. Well, Most of It.

First, I was pleasantly surprised to see Hammerspace lead with an example of their AI Data Platform technology assisting an environment I know quite well: Oracle Cloud Infrastructure (OCI). I quite literally had to rub my eyes in relief, since most Tech Field Day presenters avoid Oracle technology like the plague. But OCI plays extremely well with advanced storage systems, whether a database is accessing flat files as tables in ORGANIZATION EXTERNAL mode or chunking and transforming documents and images into its 26ai VECTOR datatype for generative AI purposes.
Of course, it’s no secret that AI workloads often demand huge compute and GPU resources to tackle both training and inference operations against equally ginormous unstructured data sources. Instead of migrating terabytes or petabytes of that data to cloud storage, Hammerspace’s Global File System (GFS) enables extremely efficient access to unstructured data for massive, parallel cloud-based compute: it decides exactly which files should be moved to the fastest intra-cloud storage for processing while leaving the majority of files – the ones the workload doesn’t need – right where they already live.
Those Metadata Matter.
The trick, of course, is selecting only the files most likely to benefit – and that means looking closely at each file’s metadata to determine which files should be gathered from on-premises storage and brought to the public cloud to take advantage of VMs, NVMe storage, and GPUs.
Hammerspace’s solution involves first deploying an Anvil server that interrogates unstructured files’ metadata and assimilates their key attributes into a unified file system view. One big advantage of this strategy is that the view can span multiple environments, mount points, and file systems, capturing a holistic picture of all possible targets.
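As a rough mental model of this assimilation step, here’s a minimal Python sketch that walks a set of mount points and captures each file’s key attributes into a single metadata catalog. The function and record names are my own illustration of the concept, not Hammerspace’s actual implementation.

```python
# Illustrative sketch only: metadata-only "assimilation" across mount points.
# Walks each mount point and records key attributes (path, size, mtime) for
# every file, without ever reading file contents -- the point being that a
# unified catalog can be built cheaply and then used for placement decisions.
import os
from dataclasses import dataclass

@dataclass
class FileRecord:
    path: str       # full path, identifying the file across mounts
    size: int       # file size in bytes
    mtime: float    # last-modified time (epoch seconds)

def assimilate(mount_points):
    """Collect metadata records for every file under each mount point."""
    catalog = []
    for root_dir in mount_points:
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            for name in filenames:
                full = os.path.join(dirpath, name)
                try:
                    st = os.stat(full)
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                catalog.append(FileRecord(full, st.st_size, st.st_mtime))
    return catalog
```

Because only metadata is touched, a catalog like this can cover directories that live on entirely different storage systems – which is the holistic view the Anvil approach is after.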

Once all mount points, file systems, directories, and individual files are assimilated, that information is captured so that Hammerspace’s Data Services Server (DSX) can use it to intelligently gather the files that would most benefit a particular workload’s demand.
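To make that selection step concrete, here’s a hedged sketch of what such a policy could look like: rank the assimilated files by recency and fill the fast tier up to its capacity. The ranking criteria, names, and record layout here are illustrative assumptions on my part, not the actual DSX placement logic.

```python
# Illustrative sketch only: choosing which assimilated files to promote to
# fast storage. Policy assumed here: match the workload's file types, prefer
# the most recently modified ("hottest") files, and never exceed capacity.
from collections import namedtuple

# (path, size in bytes, last-modified epoch seconds) for one assimilated file
FileMeta = namedtuple("FileMeta", "path size mtime")

def select_for_fast_tier(catalog, wanted_suffixes, capacity_bytes):
    """Return the most recently modified matching files that fit in the tier."""
    candidates = [r for r in catalog
                  if r.path.endswith(tuple(wanted_suffixes))]
    candidates.sort(key=lambda r: r.mtime, reverse=True)  # hottest first
    chosen, used = [], 0
    for rec in candidates:
        if used + rec.size > capacity_bytes:
            continue  # would overflow the fast tier; leave this file in place
        chosen.append(rec)
        used += rec.size
    return chosen
```

The key design point survives even in this toy version: files that don’t make the cut simply stay on their current storage rather than being migrated.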
And before you ask: while the extensive demos we saw at #CFD25 were all GUI-based, Hammerspace also offers all these operations via API calls through their Hammerspace development toolkit (HSTK) or their HSCLI command-line interface.
Result: A Logical Tier 0 NVMe-Based Storage Layer, With File Resiliency Built-In

Once assimilation is complete, the Hammerspace AI Data Platform solution enables instantiation of sufficient VMs with direct access to GPU processors. Each VM then gathers the files needed to complete the training or inference tasks at hand. Files are placed within Hammerspace’s proprietary NVMe-based logical Tier 0 storage layer, and I/O operations proceed in parallel without the limitations of some files sitting on slower, less responsive storage systems.
Obviously, file replication can be a time- and resource-consuming process. To guard against the possible loss of data after that potentially lengthy process, Hammerspace’s solution also ensures files are spread across Availability Zones (AZs), never placing two images of the same file onto NFS file buckets within the same risk zone.
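A toy version of that placement constraint – assuming a simple round-robin policy of my own invention, not Hammerspace’s actual algorithm – looks like this:

```python
# Illustrative sketch only: assign each file's replicas to distinct
# Availability Zones so that no two copies of a file share a risk zone.
# Round-robin over the zone list is an assumed policy for illustration.
def place_replicas(files, zones, copies=2):
    """Map each file to `copies` distinct AZs; fail fast if zones are scarce."""
    if copies > len(zones):
        raise ValueError("cannot keep all copies in distinct zones")
    placement = {}
    for i, f in enumerate(files):
        # take `copies` consecutive zones, starting at a rotating offset
        placement[f] = [zones[(i + j) % len(zones)] for j in range(copies)]
    return placement
```

Whatever the real algorithm, the invariant is the same: every file’s replica set must span more than one AZ, so an outage in a single risk zone never takes out all copies.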
Even Meta Needs Meta(Data)
As someone who has manually built storage arrays for the fastest possible database I/O and worked with high-speed, high-capacity storage systems like Hitachi’s Universal Storage Platform and Oracle’s Exadata Database Machine, this strikes me as an elegant answer to the varying demands of generative and agentic AI workloads: getting compute and GPU resources closest to exactly the right data needed to solve crucial business problems at scale.
And Hammerspace reports that one of their largest customers – Meta (yes, that Meta, as in Facebook, Instagram, et al.) – uses their solution to manage access to 40PB of unstructured data in production, with plans to expand that footprint to 100PB in the future. Meta’s use case also leverages several thousand servers accessing tens of thousands of GPUs.
