Two general techniques have now emerged that reclaim storage on thinly provisioned volumes: zero page reclamation and file system-based, intelligent storage reclamation. Both techniques identify when blocks are freed on a thinly provisioned volume and mark them as available for reclamation and restoration to the storage array’s general storage pool. However, other factors come into play that should influence whether organizations select intelligent or zero page reclamation as their preferred method of storage reclamation.
As more organizations adopt thin provisioning, a new problem is emerging: Staying Thin. As an application deletes or even moves data, it creates pockets of allocated but unused storage capacity on thinly provisioned volumes. The problem is that the underlying storage array has no way to reclaim this storage capacity since it still “thinks” that those blocks are in use by the application.
This is where zero page reclamation comes into play. Some applications, and some operating systems with third-party utilities, now support a technique of writing zeros to these allocated but freed blocks of storage capacity. Once these zeros are written, the blocks become eligible for reclamation. The storage array then scans for these zero-filled blocks, identifies them as available for reclamation and returns them to its general storage pool.
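The two halves of this process, the host writing zeros and the array scanning for them, can be sketched with a toy model. This is only an illustration under stated assumptions: the `ThinVolume` class, its methods and the 4 KB page size are hypothetical and do not correspond to any vendor's implementation.

```python
PAGE_SIZE = 4096  # hypothetical page size; real arrays use their own granularity


class ThinVolume:
    """Toy model of a thinly provisioned volume (hypothetical, for illustration)."""

    def __init__(self):
        # page number -> bytes, present only for physically allocated pages
        self.pages = {}

    def write(self, page_no, data):
        # Any write, even a write of zeros, forces the page to be allocated.
        self.pages[page_no] = data.ljust(PAGE_SIZE, b"\x00")[:PAGE_SIZE]

    def zero_fill(self, page_no):
        # Host-side step: overwrite a freed page with zeros.
        self.write(page_no, b"")

    def reclaim_zero_pages(self):
        # Array-side step: scan allocated pages and release the all-zero ones.
        zero_page = bytes(PAGE_SIZE)
        reclaimed = [n for n, data in self.pages.items() if data == zero_page]
        for n in reclaimed:
            del self.pages[n]
        return sorted(reclaimed)


volume = ThinVolume()
for n in range(4):
    volume.write(n, b"application data")

# The application deletes the data on pages 1 and 2. Until zeros are written,
# the array still sees four allocated pages.
allocated_before = len(volume.pages)
volume.zero_fill(1)
volume.zero_fill(2)
reclaimed_pages = volume.reclaim_zero_pages()
allocated_after = len(volume.pages)
```

In this model the freed pages only return to the pool after both steps have run, which is the dependency the rest of this article examines.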
In that sense, zero filling these blocks is as elegant and inventive as it is simple to implement. It solves the rather thorny problem of reclaiming allocated but unused storage on thinly provisioned volumes with what, on the surface, sounds like a very simple method.
However, there are at least four potential problems with using zero page reclamation that organizations need to consider before implementing it.
- The write penalty. The zeros on these allocated but unused blocks of storage do not just magically appear. The application server has to initiate an algorithm that writes zeros to them or they cannot be reclaimed by the storage array.
In addition, in the typical scenario where not all the space in a given file system is being used, all of the free space in the file system will need to be zeroed out, not just the space that was recently freed. Imagine a 100 GB file system with 20 GB of used space. If 10 GB is now deleted, there will be 10 GB of allocated but unused space in the file system.
However, because that 10 GB is ‘mixed’ in with the original 80 GB of unused space, all 90 GB will need to be zeroed out before the 10 GB that was freed up can be released. In most cases, that means the 80 GB that was never written must first be physically allocated, zeroed out and then released via zero page reclamation.
All this extra work creates a significant write penalty. If there is a large number of these blocks, writing zeros to all of them can introduce substantial overhead on both the server generating the writes and the storage array servicing them. Generating these write I/Os can degrade the application running on that server. It can also have an adverse ripple effect on other applications that use the same storage array.
- The lag time. Using zero page fill reclamation only makes the recapture of allocated but unused storage capacity possible; it does not make it immediate. Even once storage capacity is identified as available for recapture, there will still be some lag time before it is reclaimed.
As mentioned, the application server first has to write zeros to all of these allocated but unused blocks of data and then the storage array has to recapture them. The period of time between when these blocks of data are identified as ready for recapture and then returned to the available storage pool varies and could be hours, days or even weeks before this task is complete.
- Zero page fill support. Not all applications or operating systems support zero page fill yet, so unless the storage array vendor provides some sort of utility that writes these zeros, organizations will not be able to take advantage of this feature. Further, not all storage arrays that support thin provisioning support zero page fill, so they cannot offer this functionality.
- Manual scripting. Applications and operating systems do not yet automatically write back zeros when files are deleted. While a few vendors provide scripts that write zeros and then trigger a reclamation, mistakes in the script can trigger extra allocation. Further, if the script allocates all available physical space in the pool, read and write errors could start to appear, causing downtime for the applications.
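The 100 GB example from the write-penalty item above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes, as the example does, that free space which was never written is not yet physically allocated and that the host cannot tell it apart from recently freed space.

```python
def zero_fill_cost_gb(fs_size_gb, used_gb, deleted_gb):
    """Return (gb_zeroed, gb_newly_allocated, gb_net_reclaimed) for one pass.

    Hypothetical helper. Assumes never-written free space is not physically
    allocated and is indistinguishable, on the host, from freed space.
    """
    if not 0 <= deleted_gb <= used_gb <= fs_size_gb:
        raise ValueError("expected 0 <= deleted <= used <= file system size")
    never_used_gb = fs_size_gb - used_gb     # free space before the delete
    gb_zeroed = never_used_gb + deleted_gb   # all free space after the delete
    gb_newly_allocated = never_used_gb       # allocated only to hold zeros
    gb_net_reclaimed = deleted_gb            # what the pool actually gains
    return gb_zeroed, gb_newly_allocated, gb_net_reclaimed


# The article's example: 100 GB file system, 20 GB used, 10 GB deleted.
zeroed, newly_allocated, net_reclaimed = zero_fill_cost_gb(100, 20, 10)
```

Under these assumptions, 90 GB of zeros are written and 80 GB is temporarily allocated in order to return 10 GB to the pool. The 80 GB of temporary allocation is also what makes a faulty script risky when the pool is nearly full.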
Organizations concerned about these downsides of zero page fill should instead consider leveraging the intelligent storage reclamation that Symantec’s Storage Foundation Thin Reclamation API affords. Storage Foundation eliminates the need to write zeros to storage blocks devoid of data or wait to recapture the freed storage blocks.
Storage Foundation has always tracked which storage blocks have had data written to them. Now, using its Thin Reclamation API, it can tell storage arrays that support thin provisioning and integrate with the API exactly which blocks they can reclaim, with little or no performance impact on the host or the storage array.
To accomplish this, Storage Foundation sends a standard SCSI command to the storage array that tells it which blocks or ranges of blocks on a thin provisioned volume have been freed and can be reclaimed. Issuing this simple command eliminates the need for zeros to be written and the requirement to wait for the storage array to scan for these zeros and then recapture the storage.
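The difference from zero page fill is that the host names the freed ranges directly instead of leaving the array to discover them. A minimal sketch of that idea follows. It is not Storage Foundation's code or API: `coalesce`, `ThinArray` and `reclaim` are hypothetical names, and `reclaim` merely stands in for the SCSI command, which the article does not identify.

```python
def coalesce(blocks):
    """Merge freed block numbers into (start, length) ranges."""
    ranges = []
    for block in sorted(set(blocks)):
        if ranges and block == ranges[-1][0] + ranges[-1][1]:
            # Block is adjacent to the previous range: extend it by one.
            start, length = ranges[-1]
            ranges[-1] = (start, length + 1)
        else:
            ranges.append((block, 1))
    return ranges


class ThinArray:
    """Toy array that tracks which blocks are physically allocated."""

    def __init__(self, allocated):
        self.allocated = set(allocated)
        self.zero_writes = 0  # never incremented: no zero-fill I/O on this path

    def reclaim(self, start, length):
        # Hypothetical stand-in for the reclaim command; not a real SCSI binding.
        self.allocated -= set(range(start, start + length))


array = ThinArray(allocated=range(10))
freed_by_file_system = [3, 4, 5, 8]  # blocks the file system knows are free

reclaim_ranges = coalesce(freed_by_file_system)
for start, length in reclaim_ranges:
    array.reclaim(start, length)
remaining = sorted(array.allocated)
```

Because the file system already knows which blocks it freed, the work in this model is proportional to the number of freed ranges rather than to the total free space, and no data is written at all.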
Further, Symantec has just recently extended the number of storage arrays it supports. The Thin Reclamation API was initially supported by just three storage providers (3PAR, HDS and IBM); last week Symantec announced that it is now also supported by Compellent, EMC, Fujitsu, HP and NetApp.
Zero page fill is a useful utility that helps organizations automate the recapture of freed but allocated blocks of storage capacity. However, organizations seeking an intelligent and standardized way to implement and manage reclamation across their entire storage infrastructure are advised to seek out Symantec’s Veritas Storage Foundation with its Thin Reclamation API. In so doing, organizations can schedule the automatic reclamation of storage on all leading storage arrays that support the Thin Reclamation API without either the performance overhead or the wait that zero page fill introduces.