I. Executive Summary
There is no question that we live in the age of information. The focus of the world economy has shifted from physical production to the importance of data: statistics, facts, figures, numbers and records are highly valued in the business world. As this shift continues, the importance of information in a business escalates. Nearly all of a professional organization’s data exists in an electronic format, and as the value and volume of data increase, so does the demand for adequate storage space to house it.
The solution? Data archiving.
Companies have large amounts of data, and a large percentage of it needs to be retained. Only 25% of the data within an organization is freshly created; the rest is redundant data, or data that was created in the past and must be preserved for future reuse.
This situation has created a high demand for information storage, a demand that carries both monetary and logistic concerns. Data archiving allows organizations to efficiently retain this mass of redundant data, often for very long periods of time, so that it can be accessed when necessary.
II. Why Archive?
To put it simply, an organization’s electronic data is a valuable asset and needs to remain available over time. In today’s fast-paced world, data archives are no longer a luxury. Effective, reliable and affordable archiving
Data archives help professional organizations achieve three main goals:
• Protect and retain data for the future
• Meet the world’s increasing data storage needs
• Live up to the world’s compliance and legislation requirements
Protect and retain data for the future
From contracts to medical records, employee documents to client communications, organizations expend a great deal of time and energy on creating, managing and maintaining electronic documentation.
As a result of the influx of electronic data, and the importance it carries, a large share of a company’s data holds significant value.
Once this electronic data and documentation has been created, it must be managed and protected. In order to safeguard important data,
Many organizations already have backup systems in place that allow them to save and protect data. So what is the difference between a data archive and a data backup?
A data archive is a storage device that is used to house data for long-term preservation. Data archives store and protect historic data that is not needed on a day-to-day basis, but is important and necessary for future reference, such as information that must be retained for regulatory compliance. Data archives are indexed for search optimization, so files can be located and retrieved quickly and easily. The
A data backup is a large repository that is used to store multiple copies of data. The backup is used to restore data in case it is corrupted or destroyed. Typically, data backups are used more for storage and less for retrieval. In contrast to data archives, backups are not indexed or designed for swift data location and recovery, and the data has a shorter retention period. The
The two key differences between a data backup and a data archive are the length of data retention and the ability to retrieve the data. Data archives not only offload content from primary, short-term storage to safe, long-term storage, but they also retain information for future use and for frequent recall, all at a low cost. This ensures that important data remains safe, retrievable and easily accessible for decades. Furthermore, as archives streamline data storage and recovery processes, productivity increases.
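To make the retrieval difference concrete, here is a minimal Python sketch rather than a description of any particular product. It contrasts an indexed lookup, as an archive might provide, with a sequential scan, as a search through backup copies might require. The Record and ArchiveIndex classes, the scan_backup function and the sample records are all hypothetical names introduced for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str
    text: str
    retain_years: int  # hypothetical retention period attached to the record


class ArchiveIndex:
    """Keyword index over archived records (illustrative only)."""

    def __init__(self):
        self._records = {}
        self._by_word = defaultdict(set)

    def add(self, record):
        # Index every word once, at archive time, so later lookups are direct.
        self._records[record.record_id] = record
        for word in record.text.lower().split():
            self._by_word[word].add(record.record_id)

    def search(self, word):
        ids = sorted(self._by_word.get(word.lower(), set()))
        return [self._records[i] for i in ids]


def scan_backup(records, word):
    """Backup-style lookup: every record is read in order on every search."""
    return [r for r in records if word.lower() in r.text.lower().split()]


records = [
    Record("r1", "supplier contract 2008", 10),
    Record("r2", "patient medical record", 15),
    Record("r3", "client contract renewal", 10),
]
index = ArchiveIndex()
for r in records:
    index.add(r)

contract_ids = [r.record_id for r in index.search("contract")]
print(contract_ids)
```

Both approaches return the same records here. The difference is the work done per request: the index answers from a lookup table built once, while the scan reads the whole collection each time.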
Meet the world’s increasing data storage needs
In the last five years, the amount of electronic data that needs to be retained has outgrown online storage capacities. Currently, there is a gap in data storage capabilities. Data that is stored on local or personal hard drives or online tapes remains static and unprotected.
One significant reason for the lack of adequate storage space is that IT departments have not been able to keep up with the sustained rapid growth of data. Consequently, they have had limited resources to work with in order to efficiently and cost-effectively manage current storage needs, as well as create new capacities. As a result, many organizations began meeting their data storage requirements by purchasing extra hard disk space and adding servers. Although effective in the short term, these methods tend to clog up networks and waste valuable online space. In addition, duplicate data is often stored and backed up many times over, eating up storage room that could be better and more efficiently utilized.
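As a rough illustration of how duplicate data can be spotted, the sketch below groups files by a SHA-256 digest of their contents, so identical copies fall into the same group regardless of file name. This is one common technique, offered as an assumption; the paper does not prescribe a method. The find_duplicates function and the sample file names are hypothetical.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files):
    """Group file names by the SHA-256 digest of their contents.

    `files` maps a file name to its bytes. Only groups containing two or
    more names (that is, actual duplicates) are returned.
    """
    groups = defaultdict(list)
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


# Hypothetical sample data: two identical files under different names.
sample_files = {
    "report_v1.doc": b"Q3 financial summary",
    "copy_of_report.doc": b"Q3 financial summary",
    "invoice.pdf": b"Invoice 1001",
}
duplicates = find_duplicates(sample_files)
print(duplicates)
```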
The growth of electronic data is certainly not dawdling, and space continues to run short. Hence, the need for additional offline storage becomes crucial for both cost and efficiency reasons. Data archives can fill the gap that this data race has created, allowing companies to increase efficiency, manage old and current data more effectively and reduce storage costs.
The data-storage gap results in:
• Inactive and active data being managed and maintained in one ineffective storage bundle
• Stored data that cannot be recalled or recovered quickly and easily
Data archives can meet storage needs by:
• Moving data from online storage space to offline storage capacities, thereby liberating space for new, more current data
• Reducing storage management costs by implementing affordable, logistical and automated data management systems
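The first point, moving data offline, can be sketched as a simple selection rule: anything not accessed within a chosen window becomes a candidate for the archive. The example below is a minimal Python illustration under that assumption. The select_for_archive function, the 365-day threshold and the file names are hypothetical, and a real policy would likely also weigh retention rules and file type.

```python
from datetime import date, timedelta


def select_for_archive(last_access, today, inactive_days=365):
    """Return names of files not accessed within `inactive_days` of `today`."""
    cutoff = today - timedelta(days=inactive_days)
    return sorted(name for name, accessed in last_access.items() if accessed < cutoff)


# Hypothetical last-access dates for three files on online storage.
last_access = {
    "2007_contracts.zip": date(2008, 1, 15),
    "current_budget.xls": date(2010, 5, 30),
    "old_email_export.pst": date(2009, 2, 1),
}
to_archive = select_for_archive(last_access, today=date(2010, 6, 1))
print(to_archive)
```

Automating a rule like this is one way the "automated data management" in the second point might be realized.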
Fulfill the world’s compliance and legislation requirements
Thousands of regulations exist today that require record retention, and many mandatory compliance standards are in place to ensure that original records can be reproduced unaltered. From FDA regulations to Sarbanes-Oxley standards, industries worldwide find themselves under a mountain of highly specific, strict and standardized compliance and legislation requirements. In addition to compliance standards, companies that face tough litigation issues must often produce documents and other information to be used as evidence in courts of law.
Clearly, electronic documents and records play an increasingly important role in these areas: contracts, medical records, e-mails, financial records and images are only a few examples of information that organizations may need to recall under compliance and legislation requirements. As a result, innumerable professional organizations worldwide must have good archiving architecture in place to store and manage their electronic data, as well as the capability to produce original, authentic information at random and unpredictable times. By nature of their
Compliance and legislation require companies to:
• Retain vital information for extended periods of time
• Store and manage original records so that they can be recalled in their initial, unaltered form, and so that unauthorized persons are denied access to them
• Maintain data so that it can be easily navigated or retrieved
Effective data archives ensure that data is:
• Available for recall for years or decades
• Secure from unauthorized access and modification
• Easily recallable with random access
The three archive media considered here are:
• Hard disk drives (HDDs)
• Tape drives
• Optical media
Hard disk drives – successes and failures
Hard disk drives are routinely used in online or near-line storage archives, which have a role in nearly every aspect of the digital world as we know it today. People and organizations keep significant amounts of data on various versions of HDDs. MP3 players, cell phones, personal business computers, web applications and corporate storage systems are some examples of where disk drives are used around the world. HDDs are widely used because they are affordable magnetic drives with large individual capacities. These capacities are used to store significant amounts of space-hogging data, such as music, records and video. Although hard disk drives can be an effective means of data storage, they are not the perfect archive. When it comes to HDD archiving, there is good news and bad news.
First, the good news:
• Accessibility. Data on HDD archives can be rapidly accessed. Whether a hard disk drive is being used as primary or secondary storage, data can be retrieved in a matter of seconds.
• Affordability. Over the years, the cost of near-line HDDs has been driven down, making them more affordable in terms of acquisition than higher performance HDDs used in primary storage.
• Compliance. Hard disk drives have the basic capabilities needed to meet compliance and regulatory requirements, ensuring that data can be recalled in its original, intended format.
Now for the bad news:
• Long-term retention? Not likely. It’s true that HDDs have the basic archiving capabilities to meet storage needs. However, HDDs are normally only designed for a 3-5 year life span. For organizations that need to reproduce data ten or fifteen years down the road due to compliance and legal regulations, this does not bode well.
• Out with the old. Hard disk drives are not designed for unpowered shelf storage. HDDs are designed to heat up only when powered on, and they tend to fail more rapidly when they are sitting unpowered on a shelf. Therefore, offline management of old information is simply not possible with this form of archive.
• What goes up stays up. Power and air conditioning consumption are significant contributors to HDD operating costs. Furthermore, as a result of their short lifespan, migration to new disk drives is necessary every 3-5 years. Throw in the time and money spent on reliability issues and what happens? The organizational and environmental costs go up, just when they need to go down.
• To err is the nature of HDD. Mistakes happen, which is why humans often rely on
Hard disk drive readability failures
HDDs have a tendency to fail as a result of readability errors, which are either operational or latent in nature. Both types hinder the ability of HDDs to reliably archive data, yet the two failures behave differently.
Operational failures often occur when data cannot be written to the disk drive because the HDD itself has stopped working. Latent failures occur when the disk drive works and data can be written to it, but electronic or mechanical errors prevent the content from being retrieved. Latent errors are the dominant source of errors in HDD archiving.
Latent failures are a little more complex than operational failures, simply because there are several factors that can cause them to occur, and these factors are often lurking, unseen and undetected, until it’s too late.
The causes of HDD latent failures
Latent failures cause hard disk drives to be unreadable and unstable.
Causes of latent failures include:
• Thermal instability and self-demagnetization: Even at room temperature, an HDD’s thermal energy (the internal energy of the HDD system and its components) slowly disorders the bits stored on the drive. As a result, thermal decay occurs. Unpowered hard disk drives are more susceptible to areas of data loss due to thermal decay.
• Corrosion: The internal components of the disk drives are subject to corrosion, including the media, motor parts and connectors. The most severe type of corrosion occurs on the media itself. If corrosive sites develop on the disk platter, data loss could result.
• Particulate infiltration or contamination: It is simpler than it sounds: airborne contamination settles on the disk surface, often rendering it unreadable. This can either create a site for possible corrosion or cause data loss directly.
• Out-gassing: Out-gassing usually refers to release of detrimental vapor from the HDD cartridge’s internal parts or hard case over time. Out-gassing can deposit detrimental films upon the disk platter, which leads to a loss of space or a severe chemical reaction. This process ultimately results in data loss.
• Adhesive breakdown: Some HDD components, such as the filter and desiccant inside each disk, are mounted with adhesives. This adhesive might break down due to time, temperature or humidity, causing the filter to loosen. In turn, this can cause the internal components to rub or make contact with the disk, resulting in areas of non-recoverable data.
To add insult to injury, there is significant research that shows HDD latent errors increase over time. Near-line HDDs (the most common form of archive HDDs) are more likely to develop latent errors than Enterprise HDDs. In one study, 3.45% of 1.53 million disks developed latent errors, and this percentage increased super-linearly for near-line disk drives. Furthermore, drives that have experienced errors are more likely to develop additional errors in the future.
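To put the study's figure in perspective, the short Python calculation below converts the quoted percentage into drive counts. It is a back-of-the-envelope sketch only: the 200-drive archive is a hypothetical example, and applying a study-wide average to a specific installation is an assumption, not a prediction.

```python
def expected_affected(drive_count, error_rate_percent):
    """Expected number of drives with latent errors at a given rate."""
    return drive_count * error_rate_percent / 100


# Figures quoted above: 3.45% of 1.53 million disks.
study_affected = expected_affected(1_530_000, 3.45)

# Hypothetical 200-drive archive, assuming the same average rate applies.
archive_affected = expected_affected(200, 3.45)

print(round(study_affected), round(archive_affected, 1))
```

In the study population this corresponds to roughly 52,785 drives; for the hypothetical 200-drive archive, about seven.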
So the question remains, what good is your data if it can’t be read? The negative consequences to an organization’s time, data and finances as a result of HDD readability issues are plentiful.
When it comes to archiving and hard disk drives, the good does not always outweigh the bad. Due to the increased volumes of electronic data, the ever-growing demands of data archives are beginning to outpace the industry’s ability to create adequate HDD storage capacities. The growing amount of digital data stored on the internet, as well as on personal computers and servers, has created a demand for additional offline storage that prevents HDDs from becoming an effective archiving tool. In addition, readability failures, high operational costs and the inability of HDDs to retain data for the long term significantly compound the issue. The long-term storage deficiencies of hard disk drives make this
Tape drives – good for backup, bad for archive
A tape drive is a data storage device that uses magnetic tape to read and write data. The tape itself is primarily packaged in a cassette or cartridge, which is then loaded into the drive. Individual drives can be connected to computers via cable connections, such as SATA, USB or FireWire, while multiple tape drives are often housed in autoloaders or large tape libraries. These devices often include built-in barcode readers that identify the tapes and an automated system that loads the tapes into drives, so no human intervention is necessary.
The greatest benefit of tape drives is that they are able to store tremendous amounts of data. Tape drive capacities can range from a few MB to over 100 GB, well exceeding the storage capabilities of hard drives and network storage. However, this benefit comes with one large drawback: access is slow. Tape typically offers sequential access to data (versus the random access of disk drives), and retrieving data from tape can take anywhere from a few seconds to two or three minutes. Despite their bulk and slow retrieval times, tape drives are capable of transferring large, linear streams of data at once. It is for this reason that tape drives are most commonly used for data backup.
Although the data storage capacity is there, tape drives are not a reliable choice for data archiving needs, as they are not designed to read or write individual files. In addition, tape is fundamentally rewritable, a huge drawback when considering any regulatory and compliance requirements. Compatibility is also not a tape drive’s strong suit. Tape standards tend to change every decade, sometimes more frequently than that. This means that almost every tape format is proprietary and not backwards compatible. When efficiency is paramount, updating drives to current standards or adjusting
Here is the good and bad news about tape archives.
The good news:
• Low incremental cost. Most organizations already have some form of a tape drive in place for data backup purposes, which means little additional cost if the same drive were to be used in place of primary storage. In addition, the power consumption of tape is low, resulting in low operating costs.
• Long-term retention. Old information can be removed and stored off-line, and the average life span of tape is 7-10 years.
The bad news:
• Incompatibility. Because tape
• Slow access times. For all intents and purposes, data stored on tape drives is accessible. However, tape is best suited to sequential access. Due to tape’s need to spool, data retrieval times lag. Ultimately, slow retrieval time results in an unmanageable and unpredictable archive.
• Vulnerability. Although old data can be removed from the drive and stored offline, tape is vulnerable to electromagnetic radiation, and it requires regular maintenance to prevent tapes from adhering together. As a result, support staff need to be on hand to condition the tapes on a consistent basis.
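The lifespans quoted so far (3-5 years for HDDs, 7-10 years for tape) determine how often archived data must be copied to fresh media during a retention period. The Python sketch below works through that arithmetic for a 15-year requirement. It assumes, for simplicity, that media is replaced exactly at the end of its rated life; the migrations_needed function and the 15-year figure are illustrative, not taken from any standard.

```python
import math


def migrations_needed(retention_years, media_life_years):
    """Number of times data must be copied to fresh media during retention.

    Assumes media is replaced exactly at the end of its rated life.
    """
    generations = math.ceil(retention_years / media_life_years)
    return max(generations - 1, 0)


# Lifespan ranges quoted in this paper, in years: (shortest, longest).
media_life = {"HDD": (3, 5), "tape": (7, 10)}

retention = 15  # illustrative 15-year compliance requirement
# For each medium: (fewest migrations, most migrations) over the period.
plan = {
    name: (
        migrations_needed(retention, longest),
        migrations_needed(retention, shortest),
    )
    for name, (shortest, longest) in media_life.items()
}
print(plan)
```

Under these assumptions an HDD archive needs two to four migrations over 15 years and a tape archive one to two, each of which carries cost and risk.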
As an archiving
Optical discs – a reliable data archive choice
Most people are familiar with Blu-ray discs’ high-quality ability to store video, games and other interactive content. However, this form of optical media is also a superior choice when it comes to data archiving. Optical archives use Blu-ray disc
The primary benefits of optical media archiving include:
• Reduced risk of data loss
• Reduced storage costs
• Long-term data retention, durability and compatibility
• Low cost, low power and minimal carbon footprint over time
Reduced risk of data loss
Blu-ray media is 100% WORM (write once, read many), meaning that while the original data stored on the disc cannot be altered, it can always be accessed. This is great news when it comes to compliance and legislation requirements. Optical
Reduced storage costs
A Blu-ray disc is the same physical size as a standard CD or DVD. The differentiating factor between the three types of optical media is that Blu-ray disc
In addition, Blu-ray
Long-term data retention, durability and compatibility
The average lifespan of a Blu-ray dual-layer 50 GB disc is 20 to 100 years. Two physical factors contribute to the superiority and dependability of optical media.
First, Blu-ray discs contain a protective hard-coating on the outside surface, making scratches and fingerprints a non-issue. In addition to the built-in WORM support, which ensures that the data stored on the disc will not be harmed by internal or external forces, Blu-ray’s advanced coating
Second, optical media is created in industry standard formats: ISO 9660 and UDF, both of which are backwards compatible. These formats are supported in all major operating systems: Windows, Linux, UNIX and Mac OS. In other words, optical media is highly compatible, assuring that both now and years ahead, Blu-ray discs can be utilized and read by a standard PC. (Further proof: CDs from 25 years ago are still readable today by a standard PC.) There is no need to worry about the proprietary issues that are associated with tape drives: optical media will not outgrow the
Low cost, low power and minimal carbon footprint over time
Over time, Blu-ray discs are more economical than hard disk drives. Blu-ray ownership costs are low because:
• Blu-ray discs can hold a significant amount of data, and have inherent
• Blu-ray media has the longest shelf life of all data archive solutions.
• Blu-ray’s wide capabilities and platform support drastically reduce the need to migrate to new
The cost benefits of optical media ultimately translate into environmental benefits. Optical media is the data archiving
• Optical media is a passive storage device, requiring no energy over decades of storage: it does not consume any power when it is not being used to access data and information.
• The energy consumption required to operate and run Blu-ray media is extremely low due to the use of shared resources.
• Blu-ray media generates very little heat, which in turn means that little to no energy is spent on cooling.
When it comes to data archiving, optical media storage is a robust and reliable option. First, Blu-ray discs are specially formatted for the wide demands of the ever-growing video industry, a clear advantage over disk drives and tape drives. Second, optical archiving
As a result of its robust nature, optical media fully supports the increasing demands of compliance and legislation requirements. These requirements include:
• Long-term record retention
• Reproduction of original, unaltered records
• Quality archiving architecture in place for compliance and legislation
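Whatever the medium, the requirement to reproduce original, unaltered records can be checked in software by storing a cryptographic fingerprint when a record is archived and comparing it on retrieval. The Python sketch below shows the idea using SHA-256. It is a generic integrity check offered as an assumption, not a feature the paper attributes to any product; the function names and sample record are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """SHA-256 digest of a record, stored alongside it at archive time."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, recorded_fingerprint: str) -> bool:
    """True if the retrieved bytes still match the fingerprint on file."""
    return fingerprint(data) == recorded_fingerprint


# Hypothetical record: fingerprint taken when the record is archived.
original = b"Contract 4471: signed 2010-03-12"
on_file = fingerprint(original)

# Years later, the retrieved copy is checked before it is produced.
intact = is_unaltered(original, on_file)
tampered = is_unaltered(b"Contract 4471: signed 2011-03-12", on_file)
print(intact, tampered)
```

A check like this complements write-once media: the medium prevents alteration, and the fingerprint gives a way to demonstrate that none occurred.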
In our digital world, electronic data permeates and dominates business industries across the globe. As our world economy continues to be more information-based, mass amounts of electronic data continue to accrue, and this growth is by no means slowing down. As a result, the world’s need for data archiving is ever-growing, placing more and more demand on professional organizations to ensure reliable archiving
HDDs are effective in the short term, but are prone to operational and latent failures that prove this
© 2010 Rimage Corporation. All rights reserved.
This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Rimage is a registered trademark of the Rimage Corporation. All other brand or product names are trademarks of their respective owners and are used without intention of infringement.