Online Hierarchical Storage Manager
Intel, SanDisk and Samsung are investing billions of dollars in SSD technology and manufacturing capacity. Unfortunately, because manufacturing facilities are extremely costly to build, SSD manufacturing capacity is not likely to exceed HDD manufacturing capacity for at least 10 years, and it may take 20 years or more. Most data center workloads lean heavily toward database applications, which generate random read/write disk activity. For random read/write activity, the performance of an SSD is 10x to 100x that of a single rotational disk. Unfortunately, the cost is also 10x to 100x that of a single rotational disk.
Because SSD manufacturing capacity is limited, most applications will remain on rotational disk for the foreseeable future. We have developed OHSM to allow SSD and traditional HDD storage (including RAID) to be seamlessly merged into a single operational environment, gaining the benefit of SSD performance while using only a modest amount of SSD capacity.
In an OHSM-enabled environment, data is migrated between high-performing SSD storage and traditional storage based on user-defined policies. Thus, if widely deployed, OHSM can improve computer performance significantly without a commensurate increase in cost. Because OHSM is developed as open source software, it also removes the licensing issues and costs involved in using storage solution software. Because OHSM works online, data migration requires no file system downtime and no changes to the existing namespace.
Online Hierarchical Storage Manager (OHSM) is the first attempt at an enterprise-level open source data storage manager that automatically moves data between high-cost and low-cost storage media. HSM systems exist because high-speed storage devices, such as hard disk drive arrays, are more expensive (per byte stored) than slower devices, such as optical discs and magnetic tape drives. While it would be ideal to have all data available on high-speed devices all the time, this is prohibitively expensive for many organizations. Instead, HSM systems store the bulk of the enterprise's data on slower devices, and then copy data to faster disk drives when needed. In effect, OHSM turns the fast disk drives into caches for the slower mass storage devices.
Data center administrators set policies that determine which data can safely be moved to slower devices and which data should stay on the fast devices. When data is moved manually, data centers suffer downtime and changes to the namespace. Policy rules specify both initial allocation destinations and relocation destinations as priority-ordered lists of placement classes. Files are allocated in the first placement class in the list if free space permits, in the second class if no free space is available in the first, and so forth.
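The priority-ordered selection of a placement class can be illustrated with a minimal Python sketch. This is not OHSM's actual implementation: the `PlacementClass` type, the `choose_placement_class` function, and the tier names and sizes are all hypothetical, chosen only to show the fall-through behaviour described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlacementClass:
    """Hypothetical stand-in for a storage tier and its remaining capacity."""
    name: str
    free_bytes: int


def choose_placement_class(priority_list: List[PlacementClass],
                           file_size: int) -> Optional[PlacementClass]:
    """Return the first class, in priority order, with room for the file.

    Returns None when no class in the list has enough free space.
    """
    for placement_class in priority_list:
        if placement_class.free_bytes >= file_size:
            return placement_class
    return None


# Illustrative tiers: a small SSD class listed first, a large HDD class second.
tiers = [PlacementClass("ssd", 4096), PlacementClass("hdd", 1 << 30)]

small_file_class = choose_placement_class(tiers, 1024)      # fits in the first class
large_file_class = choose_placement_class(tiers, 1 << 20)   # falls through to the second
```

The list order carries the priority, so an administrator changes the preference between tiers simply by reordering the list.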
The policies fall into two broad categories: allocation policies and relocation policies. Allocation policies come into play whenever a new file is created on the file system. The allocation of physical blocks is decided by the policies set by the administrators. If none of the criteria match, the file falls back to the default allocation policy used by the file system. Relocation policies, in contrast, take effect at different time intervals, as and when they are enforced by the administrator. Because the relocation of data happens at a lower level than the file system, it is completely hidden from file system users. Deciding which data is eligible for relocation requires a complete file system scan, but such scans are infrequent.
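As a rough illustration of how an allocation policy might be matched at file creation time, the Python sketch below checks a new file name against an ordered list of rules and falls back to the file system default when nothing matches. The rule format (glob pattern mapped to a tier name), the `allocation_tier` function, and the `FS_DEFAULT` marker are assumptions made for this example; they do not reflect OHSM's real policy format.

```python
import fnmatch
from typing import List, Tuple

# Hypothetical marker meaning "let the file system use its own default allocation".
FS_DEFAULT = "fs-default"

# Hypothetical allocation rules: (file name pattern, target tier), checked in order.
ALLOCATION_RULES: List[Tuple[str, str]] = [
    ("*.db", "ssd"),   # database files go to the fast tier
    ("*.log", "hdd"),  # log files go to the slow tier
]


def allocation_tier(filename: str,
                    rules: List[Tuple[str, str]] = ALLOCATION_RULES) -> str:
    """Return the tier of the first matching rule, or FS_DEFAULT if none match."""
    for pattern, tier in rules:
        # fnmatchcase gives case-sensitive matching on every platform.
        if fnmatch.fnmatchcase(filename, pattern):
            return tier
    return FS_DEFAULT


print(allocation_tier("orders.db"))
print(allocation_tier("notes.txt"))
```

Relocation differs only in when the check runs: the same kind of rule evaluation would be applied during a periodic scan rather than at file creation.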
Fundamentally, enterprises organize their digital information as hierarchies (directories) of files. Files are usually closely associated with business purpose—documents, tables of transaction records, images, audio tracks, and other digital business objects are all conveniently represented as files, each with a business value. Files are therefore obvious objects around which to optimize storage and I/O cost and performance.
In a typical HSM scenario, data files which are frequently used are stored on disk drives, but are eventually migrated to tape if they are not used for a certain period of time, typically a few months. If a user does reuse a file which is on tape, it is automatically moved back to disk storage. The advantage is that the total amount of stored data can be much larger than the capacity of the disk storage available, but since only rarely-used files are on tape, most users will usually not notice any slowdown.
Not all data is created equal. Some types of data, such as databases, financial data, and online records, are transactional in nature, meaning they change or are updated frequently. The majority of data, however, is not transactional; it is written once and read infrequently (persistent data). These two classes of data require different storage approaches. Transactional data requires high-performance, disk-based storage. For persistent data, the only option was previously tape, which, while relatively inexpensive, was difficult to manage and unreliable for backups and restores.
Not all data is treated equally, which is why the storage used by enterprise-level market players is divided into placement classes. Different applications have different data usage patterns. Take the example of a search engine, where not all data is referenced frequently. In OHSM, such files or objects are moved to a lower-level storage tier, or placement class. A placement class is usually identified with a storage tier. Policy rules cause files to be created and extended within specified placement classes, and to be relocated to other placement classes when they meet certain naming, activity, access rate, and size-related qualifications.
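A relocation rule of this kind can be sketched in Python as a function that inspects a file's attributes and returns the placement class it should live in. Everything here is hypothetical: the `FileInfo` fields, the `relocation_target` function, the tier names, and the thresholds (90 idle days, 1 MiB minimum size) are assumptions for illustration, not values taken from OHSM.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass
class FileInfo:
    """Hypothetical per-file record gathered during a file system scan."""
    path: str
    size_bytes: int
    last_access: float  # seconds since the epoch
    tier: str           # current placement class, e.g. "ssd" or "hdd"


def relocation_target(info: FileInfo, now: float,
                      idle_days: int = 90,
                      min_size: int = 1 << 20) -> str:
    """Return the tier the file should occupy after applying one demotion rule.

    A file on the fast tier is demoted when it has been idle longer than
    idle_days and is at least min_size bytes; otherwise it stays where it is.
    """
    idle_seconds = now - info.last_access
    if (info.tier == "ssd"
            and idle_seconds > idle_days * SECONDS_PER_DAY
            and info.size_bytes >= min_size):
        return "hdd"
    return info.tier


# Usage with a fixed "now" so the result does not depend on the wall clock.
NOW = 1_000_000_000.0
stale_segment = FileInfo("/index/old.seg", 5 << 20, NOW - 120 * SECONDS_PER_DAY, "ssd")
target = relocation_target(stale_segment, NOW)
```

Passing `now` as a parameter keeps the rule deterministic, which makes a policy easy to dry-run against a scan result before any data is actually moved.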
What OHSM offers
- A very user friendly GUI to generate XML based policy files.
- Facility to write XML based policy files and assign directly to a namespace.
- Easy-to-use service level based volume migration.
- Non-disruptive, completely transparent data object movement.
- Safely moves data when application requirements increase or decrease.
- Users can respond to business initiatives with faster storage provisioning.
- Easily fine-tunes volume provisioning via command line interface (CLI).
- The GUI removes complexity in managing storage tiers.
- Flexibility to add or remove storage to any placement class or tier.
For any feedback or suggestions, please mail us at email@example.com