Data storage consumes an estimated 4% of the global electrical energy supply, and the demand for data storage will accelerate for the foreseeable future. We need innovations in storage software and hardware to enable economical, resilient, and environmentally sustainable data management.
This blog is based on a DCIG Executive Whitepaper commissioned by Swiss Vault.
The Need for Sustainable Storage
Nearly everyone has lost valuable data at some point, and much of it can never be recreated, in whole or in part. In NASA’s case, valuable technical design data stored on 7-track magnetic tape has been lost because functioning tape drives are no longer available.[i] The same will soon be true of data stored on floppy disks. (When was the last time you saw an 8-inch floppy disk reader?)
Another major storage problem is power demand. Within the data center, storage typically accounts for a significant portion of electricity use. Overall, data storage consumes an estimated 4% of the global electrical energy supply, and demand for data storage will accelerate for the foreseeable future with the coming Data Tsunami. Thus, sustainability matters when it comes to enterprise storage.
The Data Tsunami Quantified
High-performance computing (HPC) requires vast amounts of data capacity and velocity. That data holds value for organizations and society.
One person I spoke with at the SC22 supercomputing conference is responsible for managing technology for a commercial genomics company. The company processes 2PB of genomics data daily but keeps “just” 100TB of it. They would like to save more of the raw data for future research but have not found a practical solution that enables them to do so.
Such a cost-effective solution would necessarily use commonly available non-proprietary hardware and provide a seamless transition to supporting new storage capacities and media types as technology evolves.
Raphael Griman, Compute Team Lead for EMBL-EBI, said his organization currently stores more than 250PB of public science data and expects to cross the exabyte boundary within a year. Growing from 250PB to an exabyte in a year means EMBL-EBI will add an average of roughly 2PB to its environment each day.
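The daily figure follows from simple arithmetic on the numbers quoted above. Here is a minimal sketch, assuming decimal units (1 EB = 1,000 PB), a 365-day year, and growth spread evenly across the year. The function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the growth rate implied by the quoted figures.
# Assumptions (not from the source): 1 EB = 1,000 PB, a 365-day year,
# and growth spread evenly across that year.
CURRENT_PB = 250     # "more than 250PB" stored today
TARGET_PB = 1_000    # one exabyte
DAYS = 365           # "within a year"


def average_daily_growth_pb(current_pb: float, target_pb: float, days: int) -> float:
    """Average capacity added per day to grow from current_pb to target_pb."""
    return (target_pb - current_pb) / days


daily = average_daily_growth_pb(CURRENT_PB, TARGET_PB, DAYS)
print(f"Average growth: {daily:.2f} PB/day")  # prints "Average growth: 2.05 PB/day"
```

Because the starting point is "more than" 250PB and the crossing may come sooner than a full year, the result is best read as an order-of-magnitude estimate.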
Sustainability – Reducing Data’s Carbon Footprint
Bhupinder Bhullar, Swiss Vault Co-founder and CEO, also worked in genomics. He recognized the need to preserve and secure this sensitive data for a lifetime. He concluded that “Every CIO should have a strategy for 100-year data retention.”
The co-founders started Swiss Vault to solve the problem of economically and reliably archiving huge volumes of data for decades. Swiss Vault’s separate software and hardware innovations enable economical, resilient, and environmentally sustainable data management.
“Our Big Thing is Flexibility”
Swiss Vault’s focus on sustainability caused it to prioritize flexibility. The flexibility to fully utilize existing infrastructure. The flexibility to incorporate diverse storage resources into a single namespace with no limits on the filesystem size. The flexibility to adapt to the customer’s needs instead of expecting the customer to adapt to the technology provider’s restrictions.
Flexibility to Double or Triple the Life of Existing Infrastructure
The vast majority of data center infrastructure enters the waste stream within a few short years. Therefore, extending the life of existing infrastructure is one way of reducing both CAPEX and data’s carbon footprint. Swiss Vault’s Vault File System (VFS) software can stretch the typical three-to-four-year data center refresh cycle to double or perhaps triple that time. Douglas Fortune, Swiss Vault’s CTO and Co-founder, stated, “By making flexibility a core value in the design of the Vault File System, Swiss Vault is maximizing the useful hardware lifespan and minimizing the waste of electronic components.”
Erasure Coding Flexibility for Sustainable Storage
Erasure coding enables higher resilience and more efficient capacity utilization than traditional RAID schemes. Further, VFS distributes the data over multiple servers, so the system can tolerate a limited number of node failures. The Vault File System’s erasure coding differs from other solutions in that it is easy to implement and much more flexible.
Because Swiss Vault’s erasure coding can be set to any level of resilience, an organization can run existing HDDs to failure without risking downtime. It can also replace older low-capacity drives with more energy-efficient high-capacity drives on whatever schedule suits the organization best.
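To make the idea concrete, here is a minimal sketch of the simplest erasure code: D data chunks plus one XOR parity chunk. This is the same arithmetic RAID 5 uses, and it survives only a single loss. Codes such as Reed-Solomon generalize it to multiple parity chunks. The sketch illustrates the general technique under those assumptions. It is not the Vault File System's algorithm, and all function names are hypothetical.

```python
# Simplest possible erasure code: D data chunks plus one XOR parity chunk (D+1).
# Illustrative only; this is NOT the Vault File System's implementation.
from functools import reduce
from typing import List, Optional


def xor_chunks(chunks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def encode(data_chunks: List[bytes]) -> List[bytes]:
    """Return the data chunks followed by one parity chunk."""
    return list(data_chunks) + [xor_chunks(data_chunks)]


def recover(stripe: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild a stripe in which at most one chunk (data or parity) is None."""
    missing = [i for i, chunk in enumerate(stripe) if chunk is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one lost chunk")
    rebuilt = list(stripe)
    if missing:
        survivors = [chunk for chunk in stripe if chunk is not None]
        # XOR of all surviving chunks equals the one that was lost.
        rebuilt[missing[0]] = xor_chunks(survivors)
    return rebuilt


# Three data chunks, each imagined to live on a different server.
stripe = encode([b"GENE", b"DATA", b"SETS"])
damaged = [stripe[0], None, stripe[2], stripe[3]]  # the second server fails
print(recover(damaged)[1])  # prints b'DATA'
```

Spreading the chunks of a stripe across servers is what lets a cluster keep serving data when a node fails, not just when a disk fails.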
Customers can configure and re-configure erasure coding using any combination of data plus parity (D+P) chunks. Most competitors only allow customers to choose from a few vendor-defined options. With Swiss Vault, customers at any time can assign a different D+P per directory, file, file type, or file class to match the customer’s current requirements. This flexibility applies across the life of the hardware infrastructure, regardless of media size, speed, or vintage.
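The trade-off behind a D+P choice can be sketched in a few lines. The example below assumes an ideal code in which any D of the D+P chunks are enough to rebuild the data, so P chunks can be lost. The `ErasureProfile` class, the directory paths, and the chosen D+P values are hypothetical illustrations, not VFS configuration syntax.

```python
# Capacity and resilience trade-off for data-plus-parity (D+P) schemes.
# Assumes an ideal code: any D of the D+P chunks suffice to rebuild the data.
# Class name, paths, and D+P values are hypothetical, not VFS configuration.
from dataclasses import dataclass


@dataclass(frozen=True)
class ErasureProfile:
    data: int    # D: chunks holding file data
    parity: int  # P: redundant chunks

    def __post_init__(self) -> None:
        if self.data < 1 or self.parity < 0:
            raise ValueError("need data >= 1 and parity >= 0")

    @property
    def tolerated_losses(self) -> int:
        """Chunks (drives or nodes) that can be lost without losing data."""
        return self.parity

    @property
    def raw_per_usable(self) -> float:
        """Raw bytes stored for every byte of user data."""
        return (self.data + self.parity) / self.data


# A hypothetical per-directory policy, matched to how much each data set matters.
policies = {
    "/scratch": ErasureProfile(data=4, parity=1),
    "/archive/raw": ErasureProfile(data=10, parity=4),
    "/critical": ErasureProfile(data=6, parity=6),
    "3x replication": ErasureProfile(data=1, parity=2),  # shown for comparison
}

for path, p in policies.items():
    print(f"{path:<15} {p.data}+{p.parity}: tolerates {p.tolerated_losses} "
          f"lost chunks, {p.raw_per_usable:.2f}x raw per usable byte")
```

In this sketch a 10+4 layout survives four lost chunks while storing 1.4 bytes per byte of data, whereas three-way replication survives two lost copies at 3 bytes per byte.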
Swiss Vault’s customers can re-use existing standard networks, servers, and storage to extend the life of that infrastructure, and can also take advantage of recent hardware advances. By connecting existing and new systems in a single namespace, organizations can scale their data storage
Networking. VFS clusters can run on basic networking but will use RDMA (RoCE or InfiniBand), if available, for inter-server communication. VFS does not require expensive SmartNICs or GPUs for computational acceleration. The VFS client can also be installed on Linux workstations and instruments for parallel reading and writing of data (a great use case is research-intensive data capture from particle physics experiments and genomics).
Servers. VFS is compiled for x86 and ARM, runs on most modern hardware (even Raspberry Pi+), and does not require GPUs or other hardware accelerators. VFS is developed and tested on Mint and Ubuntu and should run on any Linux/Unix.
Storage. VFS can integrate existing JBOD-formatted storage and newer higher-density storage into a single storage pool, with unlimited filesystem size. Customers can put a VFS directory on an existing XFS/EXT4 disk to co-exist with other JBOD data. The VFS directory will grow dynamically as data is migrated into VFS.
VFS media can be moved from server to server and slot to slot without admin configurations. VFS automatically recognizes moved disks and their contained data, even when moving disks across CPU architectures or geographic boundaries.
Incremental scaling. Some storage solutions require adding capacity in multi-disk packs or complete nodes. VFS enables incremental expansion by as little as a single drive of arbitrary capacity. This incremental scaling reduces waste by enabling the organization to acquire capacity as it is needed instead of the traditional practice of buying several years of capacity up front and then powering it for months or years before it is needed.
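A rough way to see the effect is to count powered drive-months under each purchasing model. The sketch below uses purely hypothetical numbers: demand grows by one drive per month over a 36-month horizon, and every installed drive draws power whether or not it holds data. Energy use is then proportional to drive-months.

```python
# Powered drive-months: buying everything up front vs. adding drives as needed.
# All numbers are hypothetical; they are not Swiss Vault or customer figures.
# Assumes each installed drive draws the same power whether full or empty.

def drive_months_upfront(drives_needed: int, months: int) -> int:
    """All drives are installed and powered from the first month."""
    return drives_needed * months


def drive_months_incremental(months: int, drives_per_month: int = 1) -> int:
    """Drives are added at the start of each month and run to the end of the horizon."""
    return sum(drives_per_month * (months - m) for m in range(months))


MONTHS = 36  # hypothetical planning horizon; demand grows by one drive per month
upfront = drive_months_upfront(drives_needed=MONTHS, months=MONTHS)
incremental = drive_months_incremental(MONTHS)
saved = 1 - incremental / upfront

print(f"Up front:    {upfront} drive-months")      # 1296
print(f"Incremental: {incremental} drive-months")  # 666
print(f"Reduction:   {saved:.0%}")                 # 49%
```

Under these assumptions, adding capacity only as it is needed roughly halves the powered drive-months. Real savings depend on actual growth curves and drive power profiles.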
Take advantage of hardware innovations. In addition to extending the life of existing infrastructure, Swiss Vault customers can take advantage of its low-power hardware innovations and the latest networking gear to achieve an optimal balance of capacity, cost, and sustainability based on the customer’s priorities.
Tailored to big data workloads. Swiss Vault’s first customers include organizations in genomics, particle physics, remote sensing, and other scientific big data environments. Those industry-specific customers benefit from Swiss Vault’s specialized, file-type-specific data compression, which is often more efficient than generalized compression.
Swiss Vault’s Power-efficient Hardware Innovations
Though the focus of this report is on Swiss Vault’s Vault File System (VFS), the company is also developing lower-carbon storage through low-power storage systems. These systems use low-wattage AMD EPYC, ARM, and RISC-V architectures and other mechanisms to enable a 10x improvement in storage density and in power efficiency measured in watts per PB.
Swiss Vault Best Fit Use Cases
As noted above, Swiss Vault’s first customers are research and science-oriented organizations generating big data. Thus, organizations with similar workloads may be especially interested in joining these early adopters.
Swiss Vault may be a good fit for you if one or more of the following applies to your organization:
- You have sub-petabyte or multi-petabyte storage requirements in support of scientific big data.
- You need to enhance the resilience of your big data environment.
- You are approaching a data center refresh cycle.
- You are implementing an active archive or are rapidly outgrowing your archive storage.
- You are considering a replacement for your tape-based data archive.
As with many innovators, Swiss Vault’s founders’ vision put them in touch with emerging requirements earlier than most. They recognized the need to store and securely access large amounts of sensitive data for a lifetime, and they embarked on a journey to meet that need.
Swiss Vault emerged from stealth in 2022. The company has secured multiple startup awards, including an Energy Globe Gold Award from the Energy Globe Foundation (Austria) for innovation in sustainable technologies. The Vault File System already delivers the core features needed to enable a longer-lived storage infrastructure, and the company has an exciting software and hardware roadmap to realize and expand on that vision.
Swiss Vault has now reached the commercialization stage and is offering incentives (such as industry-specific optimizations and customizations) for early adopters. Suitable early adopters are organizations with IT staff who 1) understand their organization’s data requirements, 2) are able to communicate effectively with product developers, and 3) are willing to implement software updates.
If these early adopter requirements and use cases describe you and your organization, I urge you to contact the Swiss Vault team.