The Computational Storage (CS) initiative of the Storage Networking Industry Association (SNIA) includes 51 companies and 261 individual members. The group has been working with the NVM Express Computational Storage Task Group. At the SNIA Storage Developer Conference (SDC) 2021 there were numerous presentations on important developments that bring storage and processing closer together.
SNIA’s computational storage effort is an attempt to establish standards and definitions for implementing computational functions in or near digital storage devices. These efforts take advantage of the high internal bandwidth available in storage devices and reduce latency for processes that are offloaded from a server CPU. Additionally, computational storage can save energy by reducing the transfer of data from storage to the CPUs. CS can be implemented using field-programmable gate arrays (FPGAs), graphics processing units (GPUs), application-specific integrated circuits (ASICs), and data processing units (DPUs).
Jerome Gaysse said that in a 1,000 virtual machine environment, a computational storage solution could improve system density by 7%, energy consumption by 25% and total cost of ownership by 15% (from the SNIA Persistent Memory and Computational Storage Summit 2021). Gaysse points out that computational storage, once energy savings in manufacturing, transport and recycling are included, could lead to a 50% reduction in carbon footprint.
The following figure gives some insight into SNIA’s computational storage terminology. Computational storage devices (CSx) can reside on the network as a computational storage processor (CSP), in a drive as a computational storage drive (CSD), or in a storage array as a computational storage array (CSA). These different types of devices perform computational storage functions. In addition to hardware and software architectures, the committee also deals with security and deployment models.
The committee’s work includes developing application programming interfaces (APIs) for computational storage devices. The image below from Scott Shadley’s introductory talk shows an example of an API flow. Vendors define a library for their device that implements the API. The initiative is building a CS API library that provides a single set of APIs for all CSx (Computational Storage Device) types. The API is independent of the operating system, and functions that are not available on a specific CS device can be implemented in software. Vendor-specific implementation details are kept within the CS API library. Plugins connect computational storage devices to the CS interfaces.
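The flow described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the SNIA CS API: the names `CSLibrary`, `DevicePlugin`, `register_plugin`, `register_fallback` and `run` are hypothetical, and the "device" is simulated in memory with `zlib`. It shows the one idea the paragraph describes: the caller uses a single interface, a vendor plugin handles the functions its device supports, and a software implementation is used otherwise.

```python
import zlib
from typing import Callable, Dict, Tuple

Function = Callable[[bytes], bytes]


class DevicePlugin:
    """Hypothetical vendor plugin: maps function names to on-device code."""

    def __init__(self, name: str, functions: Dict[str, Function]) -> None:
        self.name = name
        self.functions = functions

    def supports(self, function: str) -> bool:
        return function in self.functions

    def execute(self, function: str, data: bytes) -> bytes:
        return self.functions[function](data)


class CSLibrary:
    """Hypothetical single entry point that hides which plugin does the work."""

    def __init__(self) -> None:
        self.plugins: Dict[str, DevicePlugin] = {}
        self.fallbacks: Dict[str, Function] = {}

    def register_plugin(self, device_id: str, plugin: DevicePlugin) -> None:
        self.plugins[device_id] = plugin

    def register_fallback(self, function: str, impl: Function) -> None:
        self.fallbacks[function] = impl

    def run(self, device_id: str, function: str, data: bytes) -> Tuple[str, bytes]:
        """Return (where the function ran, result)."""
        plugin = self.plugins.get(device_id)
        if plugin is not None and plugin.supports(function):
            # The device offers this function: offload it through the plugin.
            return "device:" + plugin.name, plugin.execute(function, data)
        if function in self.fallbacks:
            # The device lacks this function: run it in host software instead.
            return "software", self.fallbacks[function](data)
        raise NotImplementedError(f"no implementation for {function!r}")


lib = CSLibrary()
# Simulated device "csx0" that only offers compression (assumption for the demo).
lib.register_plugin("csx0", DevicePlugin("vendor_a", {"compress": zlib.compress}))
lib.register_fallback("compress", zlib.compress)
lib.register_fallback("crc32", lambda d: zlib.crc32(d).to_bytes(4, "big"))

where_compress, compressed = lib.run("csx0", "compress", b"a" * 1000)
where_crc, checksum = lib.run("csx0", "crc32", b"abc")
print(where_compress, where_crc)
```

In this sketch the calling code is identical in both cases; only the returned label reveals whether the simulated device or the host software did the work.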
Fungible’s Jai Menon said CS can speed up many use cases. These include database acceleration, with scans and aggregation performed at the data; big data analytics, where insights are generated directly at the data; image classification, where meta-tagging takes place directly on the data; intelligent vehicles, with direct processing of vehicle telemetry data; scientific experiments, where data filtering is performed near the data; and content delivery networks, where processing happens at the source of the data. Fungible offers a DPU for data-centric processing applications. The first product runs 192 hardware CPU threads and dozens of accelerators in parallel, as shown below.
Fungible also has a programming model that is said to simplify computational storage. Using NVMe over TCP (an NVMe-oF transport), rack-mounted Fungible FS 1600 storage appliances can provide elastic block storage. These appliances are 2U high with 2 DPUs and 24 SSDs. The system also has a single point of management for customer data, virtual machines, containers, and bare metal hardware. Menon also discussed calculating the Performance Efficiency Percentage (PEP) as a performance metric for storage systems.
In another talk, Scott Shadley spoke about Computational Storage with Kubernetes and Containerized Applications. He pointed out that, with the exponential growth of unstructured data, a new way to compute on data where it resides is required, and that computational storage makes this possible. The following figure shows various form factors of NGD devices.
He mentioned partnerships with VMware for virtualization and database acceleration and with Los Alamos National Laboratory (LANL) through its Efficient Mission-Centric Computing Consortium. Note that LANL’s Brad Settlemyer also gave a talk on accelerating file systems and data services with computational storage. Shadley works for NGD, whose computational storage drive runs the Linux operating system and works with standard NVMe storage and Linux commands.
Shadley reported improvements in several applications. Elasticsearch showed a 20% increase in performance with 30% less power consumption, and the utilization of DRAM and CPU was reduced by over 50%. He also demonstrated benefits for Greenplum databases running on vSphere virtual machines as well as for MongoDB on Hadoop. He then went through a Spark cluster deployment as part of a Kubernetes cluster with computational storage drives. He spoke about future advances in SSDs running on-drive Linux, such as quad-core computational storage CPUs.
Tony Afshary from Pliops spoke about the company’s Extreme Data Processor (XDP), a network-based computational storage processor. The company says it increases compute power by 3 to 15 times and allows the most affordable storage to be used for any workload. The following figure shows the Pliops XDP architecture.
The device provided significant advantages over software-only approaches across multiple database applications, including a 50% reduction in total cost of ownership. He also talked about Pliops drive failure protection and how the XDP can be used in cloud deployments for high performance computing and for offloading database backups. He described the impact of Pliops at a public cloud provider, which saw a 43% capacity increase and a 36% cost reduction.
Jayjeet Chakraborty of UC Santa Cruz gave a talk on offloading CPU work by using Ceph-based object storage devices with a file system “shim” within the storage device that provides a file-like view of the object storage. The work also used Apache Arrow as a columnar data format, along with pluggable components, to build data processing systems in the storage devices. His results showed significant reductions in client CPU usage and network bandwidth usage.
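To illustrate why processing columnar data inside the storage device reduces both host CPU work and network traffic, here is a minimal Python sketch. It is not the UC Santa Cruz implementation and uses neither Ceph nor Apache Arrow APIs: the table is a plain dictionary of columns, and `StorageNode`, `scan_all`, `scan_pushdown` and `cells` are hypothetical names. The sketch compares fetching a whole table and filtering it on the client with pushing the filter and column selection down to the storage node.

```python
from typing import Callable, Dict, List

Table = Dict[str, List[int]]  # column name -> values (a toy columnar layout)


class StorageNode:
    """Hypothetical storage device holding one columnar table."""

    def __init__(self, table: Table) -> None:
        self.table = table

    def scan_all(self) -> Table:
        # No offload: every column and every row is shipped to the client.
        return {name: list(values) for name, values in self.table.items()}

    def scan_pushdown(
        self,
        columns: List[str],
        filter_column: str,
        predicate: Callable[[int], bool],
    ) -> Table:
        # Offload: the filter and the column selection run on the storage node,
        # so only matching rows of the requested columns are returned.
        keep = [i for i, v in enumerate(self.table[filter_column]) if predicate(v)]
        return {name: [self.table[name][i] for i in keep] for name in columns}


def cells(table: Table) -> int:
    """Number of values transferred, a crude stand-in for network traffic."""
    return sum(len(values) for values in table.values())


# Synthetic data for the demo: 1,000 rows, three columns.
node = StorageNode({
    "id": list(range(1000)),
    "temperature": [20 + (i % 50) for i in range(1000)],
    "pressure": [1000 + i for i in range(1000)],
})

# Client-side filtering: fetch everything, then filter on the host CPU.
full = node.scan_all()
client_ids = [full["id"][i] for i, t in enumerate(full["temperature"]) if t > 65]

# Pushdown: only the ids of matching rows cross the network.
pushed = node.scan_pushdown(["id"], "temperature", lambda t: t > 65)

print(cells(full), cells(pushed))  # prints: 3000 80
```

Both paths return the same ids, but in this toy example the pushdown path transfers 80 values instead of 3,000 and leaves no filtering work for the client.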
Computational Storage is a major initiative by SNIA and the NVM Express Group. Progress has been made in defining various CS modes and building hardware and software that allow for lower CPU usage, reduced bandwidth usage, and lower power consumption. Computational storage will play an increasing role in modern storage architectures.