Image Software Backup file size

PAUL41

Distinguished
Dec 10, 2009
8
0
18,510
I am just curious about how image backup software works.

I would think that it would work by reading all the 1's and 0's held on the drive and storing them as one very large number. If this is the case, then why is an image file so large when all it should be is one large number? Not having a clue how imaging software really works, I am obviously missing something. I would be grateful if someone could explain this.

Thanks in advance
 
If you have a 1TB drive then you have about 8,000,000,000,000 "ones" and "zeroes". It's a lot of digits to store - in fact it takes a terabyte (8 terabits) to store them.
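To put a rough number on that, here is a minimal Python sketch of the arithmetic. It assumes the drive makers' decimal definition of a terabyte (10^12 bytes); the variable names are made up for illustration.

```python
# Assumption: "1 TB" means 10**12 bytes, the decimal definition drive makers use.
BYTES_PER_TB = 10**12
BITS_PER_BYTE = 8

bits_on_drive = BYTES_PER_TB * BITS_PER_BYTE
print(f"{bits_on_drive:,} bits")  # 8,000,000,000,000 bits

# Treating those bits as one huge number does not shrink them:
# a number with that many binary digits still needs that many bits of storage.
bytes_needed = bits_on_drive // BITS_PER_BYTE
print(bytes_needed == BYTES_PER_TB)  # True
```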

The space isn't reduced just by treating it as a "number" composed of a string of digits. If I tell you to write down the word "supercalifragilisticexpialidocious", does it take less space if you write down the individual letters or if you write it down as a "word"? Doesn't matter. It still takes 34 letters to write it down.

You can reduce the size of backups if you do things like:
1) Don't bother backing up the part of the drive that isn't currently used for data
2) Handle special cases like a file full of "0" bits separately (i.e., store something that notes "this file contains 10,000 zeros" instead of actually storing 10,000 zeros).

The latter is an example of compression, and it can work very well if the file has fairly regular patterns in its data. But it's close to useless for things like photos and movies, because those are already compressed and so there are no patterns left to take advantage of.
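As a rough illustration of both points, here is a small Python sketch. The first half is a toy run-length encoder (it works on zero bytes rather than individual bits, and the function names are invented for this example); the second half uses the standard zlib module, with random bytes standing in for already-compressed photo or movie data. It is not how any particular imaging product works.

```python
import os
import zlib
from itertools import groupby

def rle_encode(data):
    """Collapse each run of identical bytes into a (byte_value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(data)]

def rle_decode(pairs):
    """Expand (byte_value, run_length) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for value, count in pairs)

# Case 2 from the list: a long run of zeros collapses to one short note.
zeros = bytes(10_000)
encoded = rle_encode(zeros)
print(encoded)                        # [(0, 10000)]
print(rle_decode(encoded) == zeros)   # True

# A general-purpose compressor on regular vs. pattern-free data.
patterned = bytes(1_000_000)
random_like = os.urandom(1_000_000)   # assumption: stand-in for already-compressed data
patterned_size = len(zlib.compress(patterned))
random_size = len(zlib.compress(random_like))
print(patterned_size)   # tiny next to the 1,000,000-byte input
print(random_size)      # about 1,000,000: no real saving
```

The exact compressed sizes depend on the zlib version and settings, so treat the comments as orders of magnitude only.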
 

PAUL41

Thanks for your reply.

I understand that writing out 34 letters takes the same space either way.

However, on a hard drive they are stored as 1's and 0's. So wouldn't it take less space to store a number such as 65535 as its hexadecimal equivalent, FFFF? The ones and zeros only need to exist on the hard drive itself; to reload an image, shouldn't the software only need to load the equivalent number?

So if all the 1's and 0's were read from the drive, could the total not simply be represented as one very large number? This would mean a small file size, as the imaging software would deal with converting it back into binary output on the hard drive.

01010101010100000001111111111100000000000001111111111111100000000000 = 12294897317110087424 (decimal)
AAA03FF8003FFF00 (Hexadecimal)

Obviously there will be limits due to 32-bit/64-bit, but I don't see why this should not be possible. Any large number could later be converted back to its binary equivalent. Maybe this would require a lot more processor time than the conventional method currently used?
 
I think the mistake you're making is in thinking that hexadecimal "FFFF" somehow takes less "space" than its binary equivalent. In fact, there's absolutely no difference between a binary 1111111111111111, hex FFFF, or decimal 65535. When stored in a computer, they are all EXACTLY the same thing: two bytes with all of the bits set to '1's.

If you try to convert it to anything else, it will NOT save any space and in a lot of cases it will take up more space. For example, if you convert it to a string of ASCII characters containing "FFFF", it now takes up 4 bytes instead of just 2.
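You can check this for yourself with a few lines of Python. This is only a sketch using the standard struct module; the variable names are arbitrary.

```python
import struct

value = 65535

# Binary, hex, and decimal literals are just three spellings of one value.
print(0b1111111111111111 == 0xFFFF == 65535)  # True

# Stored as an unsigned 16-bit integer, it occupies two bytes, both 0xFF.
packed = struct.pack(">H", value)
print(packed, len(packed))      # b'\xff\xff' 2

# Stored as the ASCII text "FFFF", it occupies four bytes.
as_text = format(value, "X").encode("ascii")
print(as_text, len(as_text))    # b'FFFF' 4
```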

It's true that a binary number has more DIGITS, but each DIGIT only takes up one bit of storage. The same value expressed in base 16 can be written with fewer DIGITS, but each hexadecimal digit represents four bits of storage.

Binary numbers can be written in hex using 1/4 of the digits, but since each hex digit takes up 4 times as much storage, the result requires exactly the same amount of storage.
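The same bookkeeping can be done on the 64-bit example posted earlier in the thread. A short Python sketch, counting digits and multiplying by the bits each digit stands for:

```python
value = 0xAAA03FF8003FFF00              # the hex value quoted earlier in the thread
print(value == 12294897317110087424)    # True: same number as the decimal form

binary_digits = format(value, "b")
hex_digits = format(value, "X")

print(len(binary_digits) * 1)   # 64 digits x 1 bit each  = 64
print(len(hex_digits) * 4)      # 16 digits x 4 bits each = 64

# Either way the value fits in, and needs, 8 bytes.
print(len(value.to_bytes(8, "big")))    # 8
```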