Intel, Sandisk and Samsung are investing billions of dollars in SSD
technology and manufacturing capacity. Unfortunately, due to the
extreme cost of building manufacturing facilities, SSD manufacturing
capacity is not likely to exceed HDD manufacturing capacity for at
least 10 years, and it may take 20 years or more. Most data center
workloads lean heavily toward database applications, which generate
random read/write disk activity. For random read/write activity, the
performance of an SSD is 10x to 100x that of a single rotational
disk. Unfortunately, the cost is also 10x to 100x that of a single
rotational disk.
Because SSD manufacturing capacity is limited, most applications are
going to remain on rotational disk for the foreseeable future. We
have developed OHSM to allow SSD and traditional HDD (including RAID)
to be seamlessly merged into a single operational environment, thus
gaining the benefits of SSD while using only a modest amount of SSD
capacity.
In an OHSM-enabled environment, data is migrated between
high-performance SSD storage and traditional storage based on various
user-defined policies. Thus, if widely deployed, OHSM has the ability
to improve computer performance significantly without a commensurate
increase in cost. Because OHSM is developed as open source software,
it also removes the licensing issues and costs involved in using
storage solution software. Because OHSM works online, migration
requires no file system downtime and no changes to the existing
namespace.
Online Hierarchical Storage Manager (OHSM) is the first attempt at an
enterprise-level open source data storage manager that automatically
moves data between high-cost and low-cost storage media. HSM systems
exist because high-speed storage devices, such as hard disk drive
arrays, are more expensive (per byte stored) than slower devices,
such as optical discs and magnetic tape drives. While it would be
ideal to have all data available on high-speed devices all the time,
this is prohibitively expensive for many organizations. Instead, HSM
systems store the bulk of the enterprise's data on slower devices,
and then copy data to faster disk drives when needed. In effect, OHSM
turns the fast disk drives into caches for the slower mass storage
devices. Data center administrators set policies that determine which
data can safely be moved to slower devices and which data should stay
on the fast devices. When such migration is done manually, the data
center suffers downtime and changes to the namespace.
Policy rules specify both initial allocation destinations and
relocation destinations as priority-ordered lists of placement
classes. Files are allocated in the first placement class in the list
if free space permits, in the second class if no free space is
available in the first, and so forth.
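
The fallback order described above can be sketched as follows. This
is a minimal illustration under stated assumptions, not OHSM's actual
implementation: the PlacementClass type, its free_bytes field, and the
choose_placement_class function are hypothetical names introduced
here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlacementClass:
    """Hypothetical model of a placement class (not an OHSM type)."""
    name: str
    free_bytes: int


def choose_placement_class(
    priority_list: List[PlacementClass], size_bytes: int
) -> Optional[PlacementClass]:
    """Return the first class, in priority order, with enough free space.

    Returns None when no class in the list can hold the file.
    """
    for placement_class in priority_list:
        if placement_class.free_bytes >= size_bytes:
            return placement_class
    return None


# Illustrative tiers with made-up capacities.
ssd = PlacementClass("ssd", free_bytes=1_000)
hdd = PlacementClass("hdd", free_bytes=1_000_000)

small = choose_placement_class([ssd, hdd], 500)     # fits in the first class
large = choose_placement_class([ssd, hdd], 50_000)  # falls through to the second
```

The same function would serve for both allocation and relocation
destinations, since both are expressed as priority-ordered lists.
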
The policies fall into two broad categories: allocation policies and
relocation policies. Allocation policies come into play whenever a
new file is created on the file system. The allocation of physical
blocks is decided by the policies that the administrators have set.
If none of the criteria match, the file falls back to the default
allocation policy used by the file system. Relocation policies, in
contrast, are applied at intervals, as and when the administrator
enforces them. Because data is relocated at a lower level than the
file system, the relocation is completely hidden from file system
users. Deciding which data is eligible for relocation requires a
complete file system scan, but such scans are infrequent.
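
To make the eligibility scan concrete, here is a rough user-space
sketch. It is only an illustration: OHSM relocates data below the
file system, whereas this example walks a directory tree with
standard Python calls. The function name relocation_candidates and
the choice of idle time (seconds since last access) as the criterion
are assumptions made for the example.

```python
import os
import time
from typing import Iterator, Optional


def relocation_candidates(
    root: str, max_idle_seconds: float, now: Optional[float] = None
) -> Iterator[str]:
    """Yield paths under root whose last access is older than the limit.

    Hypothetical helper: the idle-time criterion is an assumption,
    not OHSM's documented eligibility rule.
    """
    if now is None:
        now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                idle = now - os.stat(path).st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if idle > max_idle_seconds:
                yield path
```

Note that access times depend on how the file system is mounted;
options such as noatime or relatime limit how often they are updated,
so a real scan would need to account for that.
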
Fundamentally, enterprises organize their digital information as
hierarchies (directories) of files. Files are usually closely
associated with business purpose—documents, tables of transaction
records, images, audio tracks, and other digital business objects are
all conveniently represented as files, each with a business value.
Files are therefore obvious objects around which to optimize storage
and I/O cost and performance.
In a typical HSM scenario, data files which are frequently used are
stored on disk drives, but are eventually migrated to tape if they are
not used for a certain period of time, typically a few months. If a
user does reuse a file which is on tape, it is automatically moved
back to disk storage. The advantage is that the total amount of stored
data can be much larger than the capacity of the disk storage
available, but since only rarely-used files are on tape, most users
will usually not notice any slowdown.
Not all data is created equal. Some types of data such as databases,
financial data, and online records are transactional in nature -
meaning they change or are updated frequently. The majority of data,
however, is not transactional; it is written once and read
infrequently (persistent data). These two classes of data require
different storage approaches. Transactional data requires high
performance, disk-based storage. For persistent data, previously, the
only option was tape which, while relatively inexpensive, was a pain
to manage and unreliable for backups and restores.
Not all data is treated equally, which is why the storage used by
enterprise-level market players is divided into placement classes.
Different applications have different data usage patterns. Take the
example of a search engine, where not all data is referenced
frequently. In OHSM, such files or objects are moved to a lower-level
storage tier or placement class. A placement class is usually
identified with a
storage tier. Policy rules cause files to be created and extended
within specified placement classes, and to be relocated to other
placement classes when they meet certain naming, activity, access
rate, and size-related qualifications.
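
As an illustration of how such qualifications might be evaluated, the
sketch below checks a file against an ordered list of rules and
returns the destination of the first rule that matches. All names
here (FileInfo, RelocationRule, destination_for) and the specific
fields are hypothetical; they do not reflect OHSM's XML policy format.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class FileInfo:
    """Hypothetical summary of a file's attributes."""
    name: str
    size_bytes: int
    idle_days: float


@dataclass
class RelocationRule:
    """Hypothetical rule combining naming, size and activity criteria."""
    name_pattern: str      # shell-style pattern, e.g. "*.log"
    min_size_bytes: int
    min_idle_days: float
    destination: str       # name of the target placement class

    def matches(self, info: FileInfo) -> bool:
        return (
            fnmatchcase(info.name, self.name_pattern)
            and info.size_bytes >= self.min_size_bytes
            and info.idle_days >= self.min_idle_days
        )


def destination_for(
    info: FileInfo, rules: List[RelocationRule]
) -> Optional[str]:
    """Return the destination of the first matching rule, else None."""
    for rule in rules:
        if rule.matches(info):
            return rule.destination
    return None


# Illustrative rules with made-up thresholds.
rules = [
    RelocationRule("*.log", 0, 30, "hdd"),
    RelocationRule("*", 1 << 30, 90, "hdd"),
]
old_log = FileInfo("app.log", 4096, idle_days=45)
hot_db = FileInfo("orders.db", 2 << 30, idle_days=1)
```

Here old_log qualifies under the first rule, while hot_db matches no
rule and would stay where it is.
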
What OHSM offers:
* A very user-friendly GUI to generate XML-based policy files.
* Facility to write XML based policy files and assign directly to
* Easy-to-use service level based volume migration.
* Non-disruptive, completely transparent data object movement.
* Safely moves data when application requirements increase or decrease.
* Users can respond to business initiatives with faster storage
* Easily fine-tunes volume provisioning via command line interface (CLI).
* The GUI removes complexity in managing storage tiers.
* Flexibility to add or remove storage to any placement class or tier.
For feedback or suggestions, mail us at fscops@...