COMPUTER STORAGE: AN INTRODUCTION

This page provides an overview of the most widely available means of storing and backing-up computer data, and in doing so provides a supplement to the hardware and security pages. For more information you may also want to watch my video on "data wrangling", in which I discuss the handling and long-term storage of large quantities of data.

Computer storage is measured in bytes, kilobytes (Kb), megabytes (Mb), gigabytes (Gb) and increasingly terabytes (Tb). One byte is one character of information, and is comprised of eight bits (or eight digital 1's or 0's). Technically a kilobyte is 1024 bytes, a megabyte 1024 kilobytes, a gigabyte 1024 megabytes, and a terabyte 1024 gigabytes. This said, whilst this remains true when it comes to a computer's internal RAM and solid state storage devices (like USB memory sticks and flash memory cards), measures of hard disk capacity often take 1Mb to be 1,000,000 bytes (not 1,048,576 bytes) and so on. This means that the storage capacity of two devices labelled as the same size can be different, which remains an ongoing source of debate within the computer industry.

Any sensible computer user will plan for two categories of storage. These will comprise the storage necessary to hold files internally on their computer, as well as those media required to back-up, transfer and archive data (as also explored in the security section). In turn, when deciding on suitable external storage devices, the key questions to be asked should be how much data actually needs to be stored, and whether the external data archive will be subject to random-access or incremental change.

STORAGE CAPACITY AND REQUIREMENTS

If a computer user is usually only going to create word processor documents and spreadsheets, then most of their files will probably be in the order of a few hundred Kb or maybe occasionally a few Mb in size.
If, however, a computer is being used to store and manipulate digital photographs, then average file sizes will be in the region of several Mb in size (and potentially tens of Mb if professional digital photography is being conducted). Yet another level of storage higher, if a computer is being used to edit and store video, individual file sizes will probably be measured in hundreds of Mb or even a few Gb. For example, an hour of DV format video footage consumes about 12Gb of storage. Non-compressed video requires even more space -- for example 2Gb for every minute of standard definition footage, and 9.38Gb for each minute of non-compressed 1920x1080 high definition video. Knowing what a computer is going to be used for (and of course many computers are used for a variety of purposes) is hence very important when planning storage requirements. In addition to capacity requirements, whether the data in a user's back-up archive will have to change in a random-access or incremental fashion can be a critical factor in the choice of external storage devices. A digital photographer, for example, will probably have incremental back-up requirements where each time they complete a shoot they will want to take a back-up of several hundred Mb or a few Gb of photographs that will subsequently never change. In other words they will want to keep a permanent record of an historical digital state of the world. Writing data like photographs to write-once media (such as CD-R or DVD-R as discussed below) would hence be perfectly acceptable. The photographer's total archive may be hundreds of Gb in size, but would only be added to incrementally with previously stored data never being changed. In contrast, somebody producing 3D computer animation may be re-rendering tens of Gb of output on a regular basis to replace previous files in a random-access fashion. 
In this situation not only would re-writable media be more suitable, but the speed of the back-up device would become far more critical. Having to take a copy of even 50Gb of data at the end of a working day is a very different proposition to a few Gb, let alone a few tens or hundreds of Mb. Further discussion of the suitability of different media for incremental and random-access back-up continues within the following explanation of available storage devices and technologies.

HARD DISK STORAGE

Spinning hard disk (HD) drives are today the most common means of high capacity computer storage, with almost every desktop and laptop computer still relying on a spinning hard disk to store its operating system, applications programs and at least some user data. Traditional, spinning hard disk drives consist of a number of disk "platters" stacked one above the other, and coated in a magnetic medium that is written to and read by the drive heads. As discussed in the hardware section, hard disk drives can transfer data directly to other computer hardware via one of three interface types (SATA, IDE/UDMA, or SCSI) and come in a range of speeds from 4200 to 15000 revolutions per minute (rpm).

Hard disks are almost always manufactured with either 3.5" or 2.5" platters (although just to break the rule a few smaller -- most notably 1.8" -- and even some larger platter disks are made by some manufacturers). For many years 3.5" hard disks have been standard for desktop computers and servers, and 2.5" hard disks for laptops. However, this is now starting to change, with enterprise class 2.5" hard disks now increasingly being used in servers and some desktop computers due to their low power requirements.
Indeed, the fact that Western Digital's top-of-the-range Velociraptor hard drives now use a 2.5" rather than a 3.5" mechanism (with the drives being supplied in a metal "sled" for fitting into a 3.5" bay) speaks volumes and probably indicates that within a few years most spinning hard disk drives are likely to be 2.5".

Whilst at least one hard disk is usually required inside a computer as the "system disk", additional hard disk drives can be located either "internally" inside the main computer case, or connected "externally" as an independent hardware unit. A second internal hard disk is highly recommended where a user regularly works on very large media files (typically digital video files) that are always accessed directly off the hard disk, rather than loaded into RAM. Where such files are loaded off a computer's system disk, the disk drive heads are inevitably constantly nipping back and forth between accessing the large media file and writing temporary operating system files, and this both degrades performance and reduces the life of the disk.

The other good reason to install a second internal hard disk is where it is set up in combination with the first disk in a "RAID" configuration (standing for a redundant array of independent disks, or sometimes a redundant array of inexpensive disks). What this means is that two or more hard disk units are configured to appear to the computer and its user as a single storage device. Many possible RAID configurations are available. These basically permit either performance improvements (by spreading or "striping" data across more than one disk), and/or increased fault tolerance and hardware failure protection (achieved for example by "mirroring" the same data across two disks). Many modern personal computer motherboards permit two SATA hard disk drives to be set up in a RAID configuration. However, for most users there are relatively few benefits to be gained.
More broadly, it also needs to be remembered that any hardware setup featuring more than one internal hard disk -- whether or not in a RAID configuration -- at best provides marginal improvements in data security and integrity. Not least this is because it provides no more tolerance to the theft of the base unit, nor to power surges or computer power supply failures (which can simply fry two hard drives at once rather than one). Except where two internal hard disks are considered essential on the basis of performance (and possibly convenience), a second hard disk is today most advisably connected as an external unit, or what is sometimes now known as a "DAS" or direct attached storage drive. DAS external hard disks connect via a USB, firewire or an E-SATA interface (see the hardware section), with USB being the most common. The highest quality external hard drives routinely include at least two of these interfaces as standard, hence maximising their flexibility for moving data between different computers. As explained in the networking section, today some external hard disks can also be purchased as NAS (network attached storage) devices that can easily be shared between users across a network. Except when connected via a USB 1.1 port on an older computer, external hard disks will usually perform as effectively as most internal hard disks -- even when used for highly disk intensive processes such as video editing. External hard disks also have the added convenience of being easily physically separable from the computer for secure and/or off-site storage. A user can also purchase additional external hard disks as their data storage requirements dictate. External hard disk units normally include one 3.5" or 2.5" hard disk inside their case. Units with a 3.5" disk tend to offer a cheaper cost per megabyte. 
Units based on 2.5" drives are smaller and usually do not require an external power adapter (as a computer can supply enough electricity down the USB or firewire hard disk connection cable). Some external hard disks now include several physical disks inside one unit in some form of RAID configuration.
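The "striping" and "mirroring" ideas behind RAID described above can be illustrated with a small sketch. This is a toy model only, in which each "disk" is simply a Python bytes object; the function names and the tiny 4-byte chunk size are assumptions made for illustration, and a real RAID controller works at the block-device level. Striping across disks is commonly called RAID 0 and mirroring RAID 1.

```python
from typing import List


def stripe(data: bytes, n_disks: int, chunk: int = 4) -> List[bytes]:
    """Striping: deal fixed-size chunks of data round-robin across the disks."""
    disks = [bytearray() for _ in range(n_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % n_disks] += data[i:i + chunk]
    return [bytes(d) for d in disks]


def unstripe(disks: List[bytes], chunk: int = 4) -> bytes:
    """Rebuild the original data by reading chunks back in round-robin order."""
    out = bytearray()
    offsets = [0] * len(disks)
    d = 0
    while any(offsets[i] < len(disks[i]) for i in range(len(disks))):
        out += disks[d][offsets[d]:offsets[d] + chunk]
        offsets[d] += chunk
        d = (d + 1) % len(disks)
    return bytes(out)


def mirror(data: bytes, n_disks: int = 2) -> List[bytes]:
    """Mirroring: every disk holds a complete copy of the data."""
    return [bytes(data) for _ in range(n_disks)]


if __name__ == "__main__":
    data = b"ABCDEFGHIJ"
    striped = stripe(data, n_disks=2)
    mirrored = mirror(data, n_disks=2)
    print(striped)   # each disk holds only part of the data
    print(mirrored)  # each disk holds a full copy
    # With all disks present, the striped data can be rebuilt in full.
    assert unstripe(striped) == data
    # Lose the second disk: the mirror survives, the stripe set does not.
    assert mirrored[0] == data
    assert unstripe([striped[0]]) != data
```

The sketch shows the trade-off in miniature: striping spreads the work (and the data) across disks, so losing any one disk loses the whole set, whereas mirroring doubles the storage used but survives the loss of a single disk.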
EXTERNAL HARD DISK LIMITATIONS

External hard disks offer a user fast and high-capacity external storage with a low cost-per-megabyte. In most instances, they are also the only real option where high capacity, random access data archives have to be maintained. This said, most users will never have such archives, and there are several other disadvantages to DAS-style external hard disks.

For a start, whilst their cost-per-megabyte is low, their cost-per-unit is high compared to most optical media and solid state storage devices. External hard disks are also fairly easy to physically damage via impact or by getting them wet. Reliance on a single external hard drive can also place an entire data archive "in one basket", and is of no use at all where data either needs to be physically exchanged between users (as still happens even in the days of the Internet), or has to be accessed via a media device to which an external hard disk cannot be connected.

External hard drive units are also somewhat cumbersome for those wrangling tens of terabytes of data. For this reason, some people now transfer and store large quantities of data on bare hard disks connected to their computer as required (and usually via a flying E-SATA lead). However, this is hardly ideal, not least because both connectors and the drives themselves can become damaged. As shown in my Explaining Data Wrangling video, one solution for those who need to work with a great many hard drives is to house the disks in caddies that then slot into a PC-mounted bay. Such caddies can sometimes also be connected to other computers via USB or E-SATA.

As a consequence of the above limitations, computer users handling both small and large quantities of data tend not to rely entirely on hard disk technology, and will therefore also make use of optical and/or solid-state storage.

OPTICAL DISK STORAGE

Almost all optical storage involves the use of a 5" disk from which data is read by a laser.
Optical media can be read only (such as commercial software, music or movie disks), write-once, or rewritable, and currently exist in one of three basic formats. These are compact disk (CD), digital versatile disk (DVD) and Blu-Ray disk (BD). A fourth format called High-Definition DVD (HD DVD) is now dead-in-the-water.

Compact disk is a very mature, low-cost and reliable storage medium particularly well suited for most personal computer users for incremental data archiving, as well as for the physical exchange of moderate-sized quantities of data (even today, e-mailing a 500Mb attachment is still neither that common nor that welcome!). Writable compact disks can be either CD-R (which are a write-once media) or CD-RW (to which data can be written and erased typically a few hundred times). The storage capacity of a compact disk is up to about 700Mb for CD-R and somewhat less for CD-RW media (and depending on the format used to write the data).

For the reliable back-up or exchange of up to 700Mb of data there is still little to beat a compact disk. Problems accessing a CD-R disk are now very rare, and the cost of the disks is low if bought in bulk in "pancakes" of 25, 50 or 100 disks. The media are also physically very durable -- and certainly considerably more so than an external hard disk. The only real drawbacks to compact disks for data storage are the speed of access (even if a modern drive will write and verify a CD-R in well under five minutes) and the relatively limited capacity.

DVD followed compact disk into the optical storage arena, and most new computers are now equipped with an optical drive that will read and write both CD and DVD media. Due to format battles as yet unresolved (and now unlikely ever to be resolved!), DVD comes in two write-once formats (DVD-R and DVD+R), as well as two re-writable formats (DVD+RW and DVD-RW).
Many older DVD writers will only write to either DVD-R and DVD-RW or to DVD+R and DVD+RW, so users need to take care to purchase the right media. Also many DVD drives will only read one type of rewritable media, and again users need to carefully take this into account when producing disks for other people. In general, it is fairly widely accepted that DVD+R is the most "stable" widely-readable write-once format (especially in domestic DVD video players) due to having better error correction and burning control than DVD-R, whilst DVD+RW is the most flexible re-writable format.

To make matters a little more confusing, Panasonic also created a format called DVD RAM. This is actually a superb re-writable technology (disks can reliably be re-written tens of thousands of times, as opposed realistically to hundreds of times for DVD-RW or DVD+RW). DVD RAM disks are also starting to be widely used in domestic DVD recorders, and are available in caddy units that can be either single or double sided. For video recording purposes and stable data archiving, DVD RAM is the media of choice. The only constraint is that many DVD drives still won't read or write DVD RAM disks (although the number that will is rapidly growing), with even fewer drives accepting the caddied disks that offer the media the best protection from dust, and hence maximum durability. Windows XP also has only limited support for DVD RAM.

The standard capacity for any format of DVD media is 4.7Gb. Commercial read-only disks (as used to distribute movies) increase this to 8.5Gb by storing the data on two layers. Yet two more formats of DVD write-once disk (DVD-R DL and DVD+R DL) also exist, and use the same trick to raise writeable DVD data storage capacity to 8.5Gb. However, once again not all drives will write these media, and in terms of cost per gigabyte it remains far cheaper (if less environmentally or archive-space friendly) to write two DVD-R or DVD+R disks rather than a single double layer (DL) disk.
Double-sided DVD RAM disks -- that physically have to be turned over to read or write the other side -- have a capacity of 9.4Gb.

Blu-Ray disk is the high-capacity successor to DVD, and the only surviving new optical disk media on the block. It was developed by the Blu-Ray Disk Association (BDA) as a higher-capacity replacement for DVD (and especially to allow for the distribution and home recording of movies in high definition). Whilst most of the attention in this area has until recently been focused on Blu-Ray's battle with HD DVD (see below), for computer users Blu-Ray already offers write-once (BD-R) and re-writable (BD-RE) disk capacities of 25Gb on a single-layer disk and 50Gb on a dual-layer disk. Just as importantly for the format, multi-hundred Gb disks are already in the lab and on the consumer horizon. More information on Blu-Ray can be found via the FAQ files at Blu-Ray.com.

It is worth noting for completeness that HD DVD was the contender to Blu-Ray Disk to replace DVD as the next generation optical storage media for both computer data storage and domestic video use. HD DVD disks had a 15Gb capacity (lower than Blu-Ray disk at 25 or 50Gb, and not that much higher than dual layer DVD-R DL or DVD+R DL disks at 8.5Gb). HD DVD was created by Toshiba and NEC, and was backed by Microsoft. However, most movie studios and other computer industry players (including Sony, Panasonic, Philips, Samsung, Pioneer, Sharp, JVC, Hitachi, Mitsubishi, TDK, Thomson, LG, Apple, HP and Dell) were on the side of Blu-Ray. Indeed, it was following the defection in early 2008 of Warner Bros from the HD DVD camp that Blu-Ray won the high capacity optical disk format wars. Hurrah!

As an aside, in the television industry, Sony now sells professional video cameras and recorders that use its own 23.3Gb XDCAM optical disk storage system.
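To put the capacities quoted above into perspective, the following rough sketch counts how many disks of each type an archive of a given size would need. It is an illustration only: it assumes 1Gb is 1,000Mb, takes the nominal capacities from the text at face value, and ignores file system overhead and the fact that individual files cannot always be split neatly across disks. The names used are hypothetical.

```python
# Nominal capacities in Mb, as quoted in the text (1Gb taken as 1,000Mb).
CAPACITY_MB = {
    "CD-R": 700,
    "DVD single layer": 4_700,
    "DVD dual layer": 8_500,
    "BD single layer": 25_000,
    "BD dual layer": 50_000,
}


def disks_needed(archive_mb: int, capacity_mb: int) -> int:
    """Smallest whole number of disks whose combined capacity holds the archive."""
    if archive_mb < 0 or capacity_mb <= 0:
        raise ValueError("archive size must be >= 0 and capacity must be > 0")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-archive_mb // capacity_mb)


if __name__ == "__main__":
    archive_mb = 100_000  # a 100Gb archive, chosen purely as an example
    for name, capacity in CAPACITY_MB.items():
        print(f"{name}: {disks_needed(archive_mb, capacity)} disks")
```

On these assumptions a 100Gb archive needs 143 CD-Rs, 22 single-layer DVDs, or just 4 single-layer Blu-Ray disks, which is why capacity matters as soon as an archive grows beyond a few Gb.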
Whatever format of optical disk media users choose, an ongoing debate concerns the archival qualities of all forms of optical media (ie how likely it is that data is going to remain on a disk in the long-term). Everybody seems to agree that archives should never be made on re-writable media (ie CD-RW, DVD+RW, DVD-RW or BD-RE), and advice to make new copies of optical media at least once every few years is not uncommon. For an in-depth discussion of this issue, see this excellent article on How To Choose CD/DVD Archival Media. And if you don't want an in-depth discussion, the short recommendation from this article is to archive on write-once media manufactured by Taiyo Yuden (the creators of recordable CD), and as available in the UK from retailers including DVDshoponline. To make matters far easier, in 2010 Taiyo Yuden bought the JVC Media brand, meaning that Taiyo Yuden media can now be purchased in (some) JVC boxes. Another solid archival option is to purchase "gold archival" DVD media made by Verbatim or Kodak, which are fairly widely available (if at about triple the cost of standard DVD-R or DVD+R disks).

SOLID STATE STORAGE

Solid state storage devices store computer data on non-volatile "flash" memory chips rather than by changing the surface properties of a magnetic or optical spinning disk. With no moving parts, solid state disks (SSDs) are also very much the future for almost all forms of computer storage.

Sometime in the second half of this decade, solid state disks are likely to replace spinning hard disks in most computers, with several manufacturers now offering hard-disk-replacement SSDs. These are often very fast indeed, extremely robust and use very little power. Typically today most hard disk replacement SSDs are the same size as -- and hence a direct replacement for -- a 2.5" hard drive. They also usually connect via a SATA interface.
Unfortunately the prices of solid state disks are currently high, with the lowest capacity disks (of 30 to 64Gb) costing in the £60 to £120 bracket, and the highest capacity disks (currently up to 512Gb) being over £1,100(!). At present SSDs are therefore generally only being used in high-end PCs and laptops, and as a means of increasing robustness, reducing noise, decreasing power consumption, and often significantly decreasing boot-up times.

As a notable exception, for a couple of years some ultramobile "netbook" computers and some low-power desktop computers -- such as the Asus Eee PC range and the Cherrypal desktop -- have used a solid-state disk rather than a traditional hard drive, which has been made cost-effective by limiting disk sizes to around 4-8Gb. The launch of Google Chrome OS devices (in late 2010 onwards) will also see a move back towards SSD-based netbooks and then tablets, as Chrome OS is intended only for SSD-based devices that will access applications and data from the cloud. For more information on solid state disks as hard disk replacements, you may also like to watch the accompanying video.

The above discussion of hard-disk replacement SSDs noted, at present for most people most solid state storage devices come in two basic forms: flash memory cards and USB memory sticks.

Flash memory cards were developed as a storage media for digital cameras and mobile computers. They consist of a small plastic package with a pin array that slots into the camera or other mobile computing device, or an appropriate computer memory card reader. Such readers usually have several slots (to accommodate the various formats of flash memory cards now available), and can either be integrated into a desktop computer or laptop's case, or connected via a USB port as an external hardware unit. Many mobile phones and audio recorders now also have slots for reading and writing a flash memory card, as do an increasing number of domestic DVD recorders.
The capacity of flash memory cards on the market currently ranges from 8Mb to 64Gb. There are also six major card formats, each with its own type of card slot. The most common format is the secure digital or SD card, which is also available in a "micro" format allowing its use on very small devices like mobile phones. Next most popular are compact flash (CF) cards, which were the first popular format introduced, and which are used by most professional digital cameras and audio recorders. Finally come Sony's Memory Stick format (not to be confused with a USB memory stick), the multi-media card (MMC) and the xD picture card (XD card).

Which flash memory card format you use will probably depend on the devices other than your computer that you own. This said, if you are choosing such devices, it is safest to stick with either the compact flash (CF) or secure digital (SD) format. This point noted, you should be aware that secure digital cards can prove more problematic in use due to compatibility issues with cards having a capacity over 2Gb or even 512Mb. An additional standard SDHC (Secure Digital High Capacity) now exists alongside non-standard SD cards with capacities of up to 32Gb. However, many fairly recent SD/SDHC card readers are not capable of reading all SD cards. Oh dear!

Recently, adapters that allow a compact flash card to be connected to a computer's motherboard instead of a hard disk have entered the market, and these are becoming popular on small-format computers running the Linux operating system.

As another aside, Panasonic have their own video recording flash memory card format called the P2 card. This is internally based on four high-speed SD cards, currently available in 16, 32 or 64Gb capacities, and is used instead of tape on some professional video equipment. In April 2007, Sandisk and Sony also released an alternative flash memory card format -- the SxS card -- currently available in 8, 16 and 32Gb capacities.
This said, even in professional video, CF and SDHC cards are becoming the dominant recording media.

USB memory sticks (or USB memory keys, USB memory drives, or whatever you choose to call them!) are basically a combination of a flash memory card and a flash memory card reader in one handy and tiny package. Over the past five years, USB memory sticks have also become the dominant means of removable, re-writable portable data storage, and look set to remain so for some time. Not least this is because of their size, ever-increasing capacity (which currently ranges from about 512Mb to 256Gb), and perhaps most importantly their inherent durability.

As with other storage devices, there are two key factors to consider when selecting a USB memory stick: capacity and data transfer speed. Whilst most consumer attention remains on the former, the latter can be at least as critical. It is not uncommon for some USB memory sticks to transfer data ten or more times slower than others (I recently compared transferring 1Gb of files to a high-specification Corsair Voyager USB memory stick and to a cheaper "own brand" model, and measured transfer times of under 2 minutes and approaching 30 minutes respectively).

The extent to which this matters depends, as discussed previously, on whether the data in your archive is only updated incrementally (with each new document), or more completely (with a large number of files or a few large files replaced on a regular basis). A USB memory stick that takes 30 minutes to shift a gigabyte of data is fine if you only copy a few tens of Mb or less to it per day. However, if you regularly have to back-up multiple Gb, you need a fast USB memory key if you are not to lose your sanity.

Fortunately, just why some solid state disks are slower than others is not a mystery. Rather, it is a function of the type of flash memory chips used to hold the data.
Without going into great technicalities, these chips come in two varieties called single level cell (SLC) and multi level cell (MLC). Basically, MLC flash chips store two or more bits of data in each memory cell, whilst SLC chips store only one. MLC solid state disks are therefore cheaper to produce than SLC disks at any given capacity, but due to storing more than one bit of information in each memory cell take longer to write and read data. If you need a fast USB key, memory card or indeed hard-disk replacement SSD then you need to pay more to obtain an SLC device.

NETWORK AND ONLINE STORAGE

Many computer users may never have to back-up their data to a removable media or external hard drive (and indeed may be discouraged or banned from doing so) because their files will be stored and backed-up on their company's network servers. Even in the home (and as discussed in the networking section), back-up to a server is also now an option for many. Far more fundamentally, all of those switching in whole or part to cloud computing are now storing at least some of their data in the cloud. And even those not using online applications and processing power now have the option of backing up moderate amounts of data online, and often for free!

Files stored and/or backed-up online are still saved to a hard disk rather than to some magic, new alternative media. However, the fact that the disk is located remotely to your computer, can be accessed from anywhere, and is probably backed up by the service provider(?), can make online storage and back-up very attractive. Indeed, when Google added 1Gb of free online storage for any type of file to its Google Docs online office suite it even stated in the press release that one of their intentions was to remove the need for people to use and carry USB memory keys.

Cloud data storage services come in two flavours. Some simply provide online filespace, whilst others additionally include a back-up synchronization service.
An online filespace can be thought of as a hard disk in the cloud that can be accessed with a web browser to upload or download files. One example is Microsoft's Windows Live Skydrive, which provides 25Gb of personal storage absolutely free (although there is a maximum file size limit of 50Mb). As already noted, Google Docs offers 1Gb of free online storage to which any kind of file can be uploaded up to a maximum size also of 1Gb. Google then charge $5 a year for each additional 20Gb. Another popular online filespace provider is box.net.

For those people who may forget to regularly back-up their data to one of the above, there are cloud storage services that automate the process. These require the installation of a piece of software on each computer that uses them. This local application then automatically backs up data to the cloud, and may also synchronize it across PCs. Such a service is offered by Dropbox, which describes itself as a kind of 'magic pocket' that becomes available on all of your computing devices.

For a more extensive listing of online storage services, please look in the cloud computing directory.

STORAGE SUMMARY
Every major media has now gone digital, and as a result both companies and individuals are creating an increasing volume of data not just to initially store, but just as importantly to manage and back-up into a coherent archive. Indeed, in the film industry where the digital storage requirements for high-speed, random access archives can run into tens of terabytes on a major blockbuster, the job title of "data wrangler" has been born to signal the requirement for people to take on effective data management in order to keep the production running effectively. (With the decline of the Western, there has been a decline in the need for horse wranglers, though sadly the skill sets required of data wranglers and horse wranglers are not similar, and no former horse wrangler has been reported to have taken up residence in a data center.)
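As a closing illustration of why back-up speed matters as archives grow, the following minimal sketch scales up the rough USB memory stick timings quoted earlier (about 2 minutes per Gb for the fast stick and about 30 minutes per Gb for the slow one). It assumes that transfer time grows in direct proportion to the amount of data, which real devices will only approximate, and the function names are purely illustrative.

```python
def backup_minutes(data_gb: float, minutes_per_gb: float) -> float:
    """Estimated transfer time, assuming time scales linearly with data size."""
    return data_gb * minutes_per_gb


def format_duration(minutes: float) -> str:
    """Render a number of minutes as whole hours and minutes."""
    hours, mins = divmod(round(minutes), 60)
    return f"{hours} h {mins} min"


if __name__ == "__main__":
    FAST = 2.0   # minutes per Gb, rounded from the "under 2 minutes" timing
    SLOW = 30.0  # minutes per Gb, rounded from the "approaching 30" timing
    for size_gb in (0.05, 5, 50):
        fast = format_duration(backup_minutes(size_gb, FAST))
        slow = format_duration(backup_minutes(size_gb, SLOW))
        print(f"{size_gb} Gb: fast stick {fast}, slow stick {slow}")
```

On these assumptions the 50Gb end-of-day copy mentioned earlier would take about 1 hour 40 minutes on the fast device and around 25 hours on the slow one, whereas for a few tens of Mb the difference is a matter of seconds.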