
How valuable would an INSTANT backup/restore be?

Profile: stranger

Hard drives are great. For about 20 cents a gigabyte you can buy a number of high capacity drives in a variety of form factors. You can stick a couple of them together with a RAID controller in a portable box and have a couple of terabytes to pack around and plug into just about any USB or FireWire port.

A big problem with such high capacity devices is the time it takes to do a full backup or restore operation. Even using the best imaging tools, a nearly full 1 TB drive can take hours to back up or restore. Snapshot backup software can reduce the time needed to back up, because only the changes need to be saved. But this solution may not be very useful for portable drives, since they may be shuffled back and forth between computers that may not be running the monitoring software.

In any case, if your file system gets trashed by a virus or a bad internet download, or an OS update refuses to uninstall, your only option may be to do a full restore from backup.

But what if you could buy a hard drive that could do a full internal backup or restore operation in just a few seconds? Changes to the hard drive firmware could allow the user to configure the drive to have a full backup space. For example, a 1 TB drive could have 500 GB of current data and a 500 GB backup space. Unlike partitioning methods, the backup space would be completely invisible to the computer, so the drive would behave just like a 500 GB drive.

When the user wanted to perform a backup, a command could be sent to the drive (e.g. by pushing a button on the drive itself) and the drive would transfer internally any data that changed since the last backup to the backup space.

Likewise, a restore command would transfer any data that changed since the last backup from the backup space to the working data space.

Obviously, since hard drives can fail, it would be prudent to still perform periodic backups to another hard drive, but this solution would be sufficient for many cases.

I have invented a method whereby these internal backup or restore operations can occur instantly. That is, once the command to back up the data is sent to the drive, the computer can resume normal operations within a few seconds. Likewise, a restore command would allow normal operations as quickly as you could reboot the computer.

This feature would be completely independent of any operating system, file system, or disk partitioning method. No special backup software would be required.

I would like to know if such a feature would add significant value to a hard drive. I personally would pay $50 more for a drive that had this feature over an otherwise identical drive that did not. My time is valuable and I feel that even a single restore operation saving me about three hours would make it worth it.

I would like to know if anyone else would find this feature to be valuable. Please tell me what you think.

Bonus: If a large storage array were configured using a bunch of disk drives with this feature, the entire array could be backed up or restored in an equally short time. Each drive in the array would perform its own backup or restore operation internally, in parallel, without using any host computer resources. Imagine being able to back up or restore a petabyte of data with only 10 seconds of system downtime and with no complicated backup software.

In addition to the time savings, this feature could also change the user's behavior. How much more often would someone be willing to download software from less reputable sites, try beta software, or play around with disk utilities if they knew they could get back to where they started as quickly as they could reboot? Even a complete wipe of the visible disk space by a nasty virus would no longer ruin the user's day.


Profile: enthusiast

Unless I misread you here, you will have 1 backup, with that backup being a copy of the original drive volume as opposed to an image of some sort where more than 1 could be stored. So this might be a good idea for certain backup scenarios. It will not work for a daily archive where you need Monday through Friday backups, for example. It would also not work for offsite backup or in cases of a computer actually being stolen.

But this would be a cool thing for some home users. This is the same as having a ghosted drive to use in case of disaster, except that your solution is much faster at recovery. But your idea is pretty much backup and redundancy all in one, which might not be such a good thing. You never want to confuse redundancy with backup. For a redundancy solution you need to survive a drive failure. You say your solution works with RAID, so it might be OK anyway for some users, but at a cost. How many would pay extra for this idea with such a high overhead? With a RAID 1 array the 50% overhead is high for this solution. For example, if I want to use your backup solution I already lose 50% of my drive space; if I also RAID 1 this, then I lose an extra 50% on top of the original 50%, so two 1TB drives would yield 500GB of space. Not very sexy, but some may jump on this anyway. I have seen people throw concern for cost out the window when data is in jeopardy.

The other side of this is backup. In a conventional system backup is a good idea and your solution certainly works if you need to restore from corruption but you are hosed if the drive itself goes down.

As for risky behavior, that should be done inside a VM or on another machine altogether. I would not do that to my main machine even with your idea. Since I only have 1 backup, what would happen if I didn't know I was infected with something for a few days and I did a backup? Then even the good backup would be bad too, since I only have 1.

So I think your idea would work in some situations, such as where someone needed a really fast recovery from corruption and didn't mind paying for the overhead of 50%.

Just curious, in what sector do you see a demand for this? I do think you have a good idea though.


---------------
Intel DX48BT2 bone trail 2 || Xeon X3350 with Xigmatek S1283 || 4GB Gskill DDR3 1600 || 1 - 300GB 15k SAS boot , 3 - 750GB SATA Raid 5 || Adaptec 5805 SAS RAID controller || ATI 3870 || Antec 300 Chassis with Nspire 600 watt PS
Profile: old hand

In the enterprise sector, this technology already exists and is implemented as the Snapshot feature in most SAN units. There is no downtime at all when taking a Snapshot (the host system(s) don't even know that a Snapshot was taken). To do a restore, just shut down the host system(s), restore from the Snapshot (which takes a few seconds) and start the host system(s) back up.


---------------
- SomeJoe7777

"Did he dazzle you with his extensive knowledge of mineral water? Or was it his in-depth analysis of, uh, uh, Marky Mark that finally reeled you in?" - Troy Dyer (Ethan Hawke), Reality Bites, 1994
cjl
Rocket Scientist
Profile: nimble knuckle

If the virus wiped the disk, a backup faster than a couple of hours is impossible simply due to the physical transfer rate limitation on the disk. Restoring changes is faster, if the quantity of changes is small, but you certainly can't guarantee seconds to do a backup in all cases.

 

So basically, the restoration of a petabyte of data in a few seconds is nothing but a rather imaginative dream right now.

 

Admittedly, if it were actually physically possible, it could be nice, but unfortunately, the laws of physics are still valid.


Message edited by cjl on 07-10-2008 at 01:03:49 AM
Profile: stranger

There are several reasons why this invention would seem to do the impossible.

First of all, all data transfers happen within the disk itself. No data must be sent to the host computer and back again which is what happens when you try to Ghost a partition and store the image into another partition on the same drive. In fact, if the drive had power it could perform a backup or restore operation without even being attached to a host computer.

Second, the backup space is located on the opposite side of each disk platter from the "live data" space. This means that every modified track can be transferred very efficiently. When the read head is located over the source track, the write head is also located over the destination track. Each track can be transferred with a couple rotations of the disk without head movement. If every single track on the "live data" space had been completely wiped and a full restore was required, the disk could do it with a single "sweep" of the disk. The first track would transfer in a couple rotations, the head would move to the next track, that track would transfer in a couple rotations, the head would move to the third track, ... continue loop until the last track was transferred.

Third, only the modified tracks would need to be transferred. If only a thousand tracks had been modified since the last backup, only those thousand transfers would occur during either a backup or restore. The drive would not care if those tracks contained data files, operating system files, file system structures, or partition tables - to the drive it is just block data. The drive would set a "modified" bit for a physical track whenever it receives a "write" command for a block in that track. This way the drive could tell exactly how many tracks were modified since the last backup and which ones.

Finally, since the drive controls both the "live data" space and the "backup" space, it can finish any backup or restore command in the background while it continues to receive read or write requests from the host. The host would think the operation completed in just a few seconds even if the drive had to take 20 minutes to transfer all the modified tracks. For example, if the drive received a read request during a backup it would service that request from the "live data" space. If it received a write request during the backup it would first check to see if the track being written to was scheduled for transfer but not yet transferred. If it was, it would first transfer the track and then process the write command. The same kind of logic would be used during a restore process. Using this technique you can instantly restore a space without putting the backup space at risk.
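
To make the bookkeeping above concrete, here is a toy Python sketch of the scheme as described. It is only an illustration under loud assumptions: tracks are modeled as list entries rather than real media, the class and method names (SnapshotDrive, start_backup, background_step) are made up for this example, and restore is done in one synchronous pass instead of in the background.

```python
class SnapshotDrive:
    """Toy model of the proposed firmware logic; names are hypothetical."""

    def __init__(self, num_tracks):
        self.live = [b""] * num_tracks         # host-visible "live data" surface
        self.backup = [b""] * num_tracks       # hidden "backup" surface
        self.modified = [False] * num_tracks   # one "modified" bit per track
        self.pending = set()                   # tracks queued for a background backup

    def read(self, track):
        # Reads are always served from the live surface.
        return self.live[track]

    def write(self, track, data):
        # If this track is scheduled for backup but not yet transferred,
        # transfer it first so the snapshot keeps the pre-write contents.
        if track in self.pending:
            self._transfer(track)
        self.live[track] = data
        self.modified[track] = True

    def start_backup(self):
        # Returns right away; copying happens later in background_step().
        self.pending = {t for t, m in enumerate(self.modified) if m}
        self.modified = [False] * len(self.modified)
        return len(self.pending)

    def background_step(self):
        # Copy one queued track per call, sweeping in track order.
        if self.pending:
            self._transfer(min(self.pending))

    def restore(self):
        # Simplification: finish any queued backup, then roll back every
        # track written since the last backup command.
        while self.pending:
            self.background_step()
        for track, was_modified in enumerate(self.modified):
            if was_modified:
                self.live[track] = self.backup[track]
                self.modified[track] = False

    def _transfer(self, track):
        self.backup[track] = self.live[track]
        self.pending.discard(track)
```

In this model, a write that arrives while a backup is still queued triggers the transfer of that one track before the write lands, which is the "transfer first, then write" rule from the paragraph above.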

Profile: stranger

The number of backups allowed by this invention would only be limited by the number of disk platter surfaces located in the drive itself. If a disk drive had two platters (four surfaces) it could be configured to have three separate backups in addition to the "live data".

Because the drive would be configurable, if you filled up the "live data" space you would need to make a decision: reconfigure in order to double the "live data" space and only have a single backup, or get another drive to add more disk space.

If this invention were used in conjunction with current RAID technologies, you could have redundancy and backup. If you had two 1TB drives in a portable enclosure, you could configure each drive to have a single backup space and then mirror the "live space" using RAID. This would give you the ability to instantly backup or restore and recover from a damaged physical drive as well.

If you had these drives in a striped set with parity (RAID 5, for example), you could get through a drive failure without mirroring.

Also, snapshot technology works great as long as two conditions are met.

1) The snapshot data itself (and the backup software needed to restore it) are undamaged by bugs, a virus, or by accidental or intentional user action.

2) The drive being monitored is always connected to the host running the snapshot software. For portable drives that are plugged into several different computers in order to access data at different times, monitoring each change to the drive is next to impossible using snapshot technology.

I
Profile: addict

It would be wasteful to have redundancy of raid and backup, and your proposed backup alone won't otherwise do anything about drive failure.

In short, this is a topic without merit. No, it wouldn't be a significant added value. Right now the industry is transitioning towards solid state, and in that arena every last GB/$ counts; people won't consider a second reserved area for that.

BTW, if it were useful, this is not something that would be terribly difficult to implement. For example, I could do it tomorrow if I cared to: all one has to do is hide the partition, do parallel writes, and redirect the partition table when a restoration is needed.


Message edited by I on 07-10-2008 at 02:47:44 AM
Profile: stranger

Your example of using partitioning instead of this invention is completely off base. Partitioning does almost nothing to prevent data corruption. I could write a virus that wipes out not only your partition table and all the data in your partition, but also your backup partition table and the hidden partition you created. In that case you are out of luck.

Also, assuming your hidden partition was not corrupted (and I can't see how it would remain intact, seeing as you are doing parallel writes, i.e. software mirroring), if you simply redirected the partition table you really haven't restored anything. You may have a working set of data until it too gets hit with a bug, virus, or other corruption. Now you don't have a backup.

Also, I never said that this backup alone would guard against drive failure.

cjl
Rocket Scientist
Profile: nimble knuckle

StorageInventor wrote :

There are several reasons why this invention would seem to do the impossible.

 

First of all, all data transfers happen within the disk itself. No data must be sent to the host computer and back again which is what happens when you try to Ghost a partition and store the image into another partition on the same drive. In fact, if the drive had power it could perform a backup or restore operation without even being attached to a host computer.

 

Second, the backup space is located on the opposite side of each disk platter from the "live data" space. This means that every modified track can be transferred very efficiently. When the read head is located over the source track, the write head is also located over the destination track. Each track can be transferred with a couple rotations of the disk without head movement. If every single track on the "live data" space had been completely wiped and a full restore was required, the disk could do it with a single "sweep" of the disk. The first track would transfer in a couple rotations, the head would move to the next track, that track would transfer in a couple rotations, the head would move to the third track, ... continue loop until the last track was transferred.


2 things: First, the reason Ghost takes a long time is that it is often waiting on the CPU to compress the files. If you don't use high compression, and have a fast computer, it would be much, much faster (and drive limited, not interface limited)

 

Second, the backup and restore operations would take far longer than you expect, as it takes a hard drive time to transition from a read to a write (not much time mind you, but enough that it wouldn't be able to perform both on the same revolution), and it is still bandwidth limited. A full disk write from the buffer on a 250GB notebook drive takes about 1 hour and 20 minutes. This requires zero interface bandwidth or anything like that - I'm talking about writing a simple pattern from the on disk RAM, over and over again until the disk is full. This represents the absolute fastest that an operation like this could occur. Now, it is true that if only a few things were modified, this would be quite fast. That is also true of many of the other backup software packages. However, in my hypothetical full disk wipe scenario, it would be several hours before the backup could complete simply due to the physical disk transfer rate.
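
As a rough check on the numbers in the paragraph above, the following Python sketch works out the average write rate implied by filling a 250GB drive in 1 hour 20 minutes, and how long one sequential pass over 500GB would take at that same rate. The assumptions are decimal units (1 GB = 1000 MB), a constant transfer rate across the whole platter, and the function names, which are made up for this example.

```python
def sustained_rate_mb_s(capacity_gb, minutes):
    """Average rate implied by writing capacity_gb in the given time."""
    return capacity_gb * 1000 / (minutes * 60)


def full_pass_hours(capacity_gb, rate_mb_s):
    """Time for one sequential pass over capacity_gb at rate_mb_s."""
    return capacity_gb * 1000 / rate_mb_s / 3600


rate = sustained_rate_mb_s(250, 80)
print(round(rate, 1))                        # 52.1 (MB/s)
print(round(full_pass_hours(500, rate), 2))  # 2.67 (hours)
```

A real 3.5 inch desktop drive would be faster than the notebook drive measured here, so the second figure is only an order-of-magnitude guide, but it is consistent with the "several hours" estimate for a full restore.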

 
StorageInventor wrote :


Third, only the modified tracks would need to be transferred. If only a thousand tracks had been modified since the last backup, only those thousand transfers would occur during either a backup or restore. The drive would not care if those tracks contained data files, operating system files, file system structures, or partition tables - to the drive it is just block data. The drive would set a "modified" bit for a physical track whenever it receives a "write" command for a block in that track. This way the drive could tell exactly how many tracks were modified since the last backup and which ones.


Very true. However, even in this scenario, it would not be a matter of seconds as you presume. If 1000 tracks have been modified, and you assume 2 revolutions are necessary to restore each track (one to read the backup, and one to write it back to the data region, since as I said, it can't instantly switch from reads to writes), this would require 2000 revolutions. At 7200RPM, this would take about 20 seconds at best, and this is a TINY quantity of data. More likely, tens of thousands of tracks would have been modified since the last backup. Also, how does it know which ones have been modified? Where does it store the change log?
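
The revolution count above can be turned into a time estimate with a few lines of Python. This is a lower bound only: it assumes exactly two revolutions per track and ignores seek time, head switching, and any host I/O that interrupts the transfer. The function name and the 50,000-track figure are illustrative assumptions, not numbers from any real drive.

```python
def transfer_seconds(tracks, revs_per_track=2, rpm=7200):
    """Minimum time to copy the given number of tracks, rotation only."""
    revs_per_second = rpm / 60
    return tracks * revs_per_track / revs_per_second


print(round(transfer_seconds(1000), 1))        # 16.7 (seconds)
print(round(transfer_seconds(50000) / 60, 1))  # 13.9 (minutes)
```

So 1,000 tracks come to a little under 17 seconds of pure rotation, and a hypothetical 50,000 modified tracks would already take about 14 minutes.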

 
StorageInventor wrote :


Finally, since the drive controls both the "live data" space and the "backup" space, it can finish any backup or restore command in the background while it continues to receive read or write requests from the host. The host would think the operation completed in just a few seconds even if the drive had to take 20 minutes to transfer all the modified tracks. For example, if the drive received a read request during a backup it would service that request from the "live data" space. If it received a write request during the backup it would first check to see if the track being written to was scheduled for transfer but not yet transferred. If it was, it would first transfer the track and then process the write command. The same kind of logic would be used during a restore process. Using this technique you can instantly restore a space without putting the backup space at risk.


True, this could make it seem like you could continue work without significant interruption, however it has a couple of problems. First, it would cause a fairly horrendous slowdown of the computer during this period of restoration, as it would constantly be seeking back and forth, and would also be attempting to restore while also being fed constant tasks (background activity is quite high in most operating systems). Combine this with the halving of capacity (who on earth would buy a 500GB drive for $200 when one can be had for $80?), and you have an excellent recipe for something that nobody will likely buy. This also shows that it is far, far more than a $50 price premium - it is actually a $50 price premium over a drive with double the capacity, or a price premium of about a factor of 3 over what a comparable drive would cost ($240ish vs $80 for 500GB, assuming a $50 premium over a current, high performance TB drive).

 


Message edited by cjl on 07-10-2008 at 09:43:33 AM
Reformulated with 20 percent less ahole !
Profile: nimble knuckle

Acronis True Image lets you back up during use right now, and even restore in most cases while using the computer, sorry to tell you :) GL


---------------
X2 5400+, Biostar TA780G M2+ MATX, 2 gig mushkin, 8800gts 512 , CM 532, Kingwin 450w ATX 2.2

"Now if the 4870x2 was actually notably faster than the 280 for about the same price, then I might even take a chance on it. However, that won't be the case."
Profile: member

I agree that the cost would kill these drives' value. The biggest problem is that you're talking about mirroring, which is not useful for software related issues that destroy file systems. Mirroring is really only useful for physical drive failures, meaning one drive dies, but you have an identical copy ready to go. Of course, restoring from the mirror is instant as it simply starts looking at the other drive and waits for you to replace the failed drive before it can start rebuilding the mirror. This happens transparently to the end user. So technically, adding mirroring inside a single physical drive divided by partitions or platters is simply useless. If the drive fails, BOTH sets of data are GONE. If the software screws up the data, BOTH sets of data are SCREWED. It's a good thought and could lead to some innovation if properly researched... I think the idea of transparent backups is definitely worth the time.

Profile: old hand

I always liked mirror RAID myself: both drives writing the data at the same time, instant backups, instant restores, and zero CPU overhead if the motherboard has a decent chipset.

Mirroring has some downsides (the need for more than one drive), but that's a good thing: if one drive fails, having two is a good thing.

It's the same reason Norton Ghost will not let you store a backup on the same drive it's backing up. When hard drives fail, 99% of the time it's not the platters; it's the drive electronics, the bearings, or the OS itself being corrupted.

Also, having two separate drives lets you do really cool things like incremental backups (going back to a specific date and time, a la Windows System Restore). Ghost can do that, and also differential backups (backing up only the data that has changed since the last backup), which speeds things up tremendously.

Profile: stranger

There still seems to be some confusion about what this invention does. Apparently, I didn't do a very good job explaining it so I'll take another crack at it.

This invention would be completely implemented in the firmware of the disk drive. Like a RAID controller in an external drive box, it functions completely independent of the host computer. It wouldn't rely on any backup or snapshot software running on the computer in order to work.

Also, it is configurable, meaning that if you use it on a 1 TB drive you don't "lose" 500 GB. If you need more than 500 GB of storage space, you turn off the feature and the drive will behave exactly like every other 1 TB drive. Until your drive is half full, however, instead of having the other half sit empty, it can be used for this backup/restore feature. Again, it would be just like an external storage box with two drives in it that you can configure as RAID 0 or RAID 1, depending on whether you want more space or mirroring.

It functions similarly to disk mirroring using RAID, except that writes to the second mirror would not occur immediately, as they do with mirroring. Instead they would be batched up and occur whenever a backup command is given. Between backups, all writes go to one side of the "mirror"; the other "mirror" is in "freeze frame". If something goes wrong, the restore process can roll back to the last backup by synchronizing with the "freeze frame mirror". At any time the user can move the "freeze frame mirror" up to the current point by doing a backup operation.

So obviously this isn't the same as "in-disk mirroring". Since both mirrors would be in the same drive the only thing such redundancy could save you from is a head crash where the data is unreadable from one side of the platter but the drive could still read the other side.

Someone asked how the drive keeps track of which tracks need to be updated. The drive would have a bitmap with a single bit assigned to every physical track in the drive. Whenever a write occurred on the drive, the firmware would set this "modified" bit for that track.
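
As an illustration of how small such a bitmap would be, here is a minimal Python sketch of a one-bit-per-track map. The class name, its methods, and the track count used in the usage note are assumptions made for this example; real firmware would be written very differently and would also need to keep the map somewhere that survives power loss.

```python
class TrackBitmap:
    """One 'modified' bit per physical track, packed eight to a byte."""

    def __init__(self, num_tracks):
        self.num_tracks = num_tracks
        self.bits = bytearray((num_tracks + 7) // 8)  # round up to whole bytes

    def mark_modified(self, track):
        # Called on every write that touches a block in this track.
        self.bits[track // 8] |= 1 << (track % 8)

    def is_modified(self, track):
        return bool(self.bits[track // 8] & (1 << (track % 8)))

    def modified_tracks(self):
        # The list of tracks a backup or restore would need to transfer.
        return [t for t in range(self.num_tracks) if self.is_modified(t)]

    def clear(self):
        # Reset after a backup or restore completes.
        self.bits = bytearray(len(self.bits))
```

For a hypothetical drive with 1,000,000 tracks, TrackBitmap(1_000_000) needs 125,000 bytes, so the map itself is tiny compared with the data it describes.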

Again, I know that a lot of the functionality is currently available using good snapshot software. But snapshot software has its drawbacks. You have to buy it. You may have to update it whenever your operating system or file system changes. It must be running correctly whenever the disk is written to, so if you must boot in "safe mode" or from a flash drive or from a CD/DVD, it may not be running. If your boot drive is configured for booting multiple operating systems, the snapshot software must be running on each OS. And finally, it only works if the snapshots themselves are not damaged by whatever corruption has occurred.

With this technology, I could take my portable drive and plug it into a Windows box, make lots of changes, plug it into my iMac, make more changes, plug it into a Linux box at my friend's house, make even more changes, and when I was done have the ability to push a button on it and have it automatically roll back to my last backup and undo all the changes I made using each of those computers.

Profile: enthusiast

I could see schools, libraries, or computer cafes possibly using this: anywhere the general public has access to a computer and you would like to undo all changes at the end of the day. There are other things now that do this, but your solution may be faster.
My favorite imaging solution as of now is BartPE using Acronis. I can preload my RAID controller driver in PE and image to and from a USB hard drive. Restore operations are very fast.


---------------
Intel DX48BT2 bone trail 2 || Xeon X3350 with Xigmatek S1283 || 4GB Gskill DDR3 1600 || 1 - 300GB 15k SAS boot , 3 - 750GB SATA Raid 5 || Adaptec 5805 SAS RAID controller || ATI 3870 || Antec 300 Chassis with Nspire 600 watt PS
Profile: stranger

Back to my original question. How much more would anyone be willing to pay for this feature?

The industry sweet spot for hard drives seems to always be the 2 platter design. A two platter 500 GB drive currently runs about $75. Based on pricing models I have seen in the past, it costs the industry about $15 more to install an additional 2 platters. So drive makers could sell their 1 TB drives for about $100 and still make a profit. This is why the 1 TB drives are falling in price so fast as competition heats up for the high capacity drives.

So if the industry implemented this feature and locked down the extra 2 platters so you couldn't reconfigure it to 1 TB and compete with the current 1 TB drives, would you be willing to pay extra for the feature? In other words, if you could either buy a plain 500 GB drive for $75 or a 500 GB Instant Backup/Restore drive for $125, would you pay the extra money for the feature?

cjl
Rocket Scientist
Profile: nimble knuckle

No.

Profile: stranger

Well, it's nice to know how much some rocket scientists value their time.

Here's the problem I have that I was hoping this invention would solve. (If any of you have a better solution, please let me know.)

I've got a 250 GB USB drive (2.5 inch) that I like to pack around with me a lot. It isn't full yet (a little less than 200 GB) but it has a lot of video, music, software, documents, etc. that have come in real handy on many occasions. Almost nothing on the drive is original, in other words it is a collection of data that consists of copies of files that exist on at least one of my four computers. Although I might be able to recreate the data set from scratch by going to each computer and copying files from each, it would be a lot of hassle and very time consuming to do so. So, I need some kind of backup/recovery solution.

Because I regularly plug this drive into each of my four computers (two running Windows, two running OS X), as well as other computers I may encounter, snapshot software will not work. My only option seems to be to regularly image the whole drive so I can restore the image if my file system(s) get hosed.

It currently takes a couple hours to image the drive. I have to remember to manually create the image every month or so. (Since the invention would not save me from having the drive itself lost, stolen, damaged, or otherwise fail, I would need to do this step anyway.) Although it would be nice to image the drive every few days so my latest image is always very recent, backing up is such a hassle and since I have to remind myself to do it, it only occurs about once a month on average.

The main problem is the recovery. I have had to restore an image twice since I bought the drive. Both times, the file system was screwed up so badly I couldn't access the files. Once I was home, so I just had to wait a few hours after I started the restore process. The other time I was on a trip (my image was at home), so I had to wait until I got back home before I could access any of my data.

In both cases, if my drive had been equipped with this technology I could have pushed a button and been on my way. (And yes I know, I would only be able to store 125 GB on this particular drive and I also realize that if the damage was much more extensive than a few critical file system blocks, I might have some loss in performance while the restore process completed in the background.)

But for me, it would have been worth 50 bucks to spare me from the time and the hassle I have spent so far (not to mention the next X number of times I will have to do this for the life of the drive).

Profile: member