
IBM Builds Monster 120-Petabyte Data "Drive"

by - source: Technology Review

This massive storage system is for a supercomputer used by an unnamed client, and includes 200,000 physical drives.

The data storage group at IBM's research lab in Almaden, California, is building a 120-petabyte "drive" made up of 200,000 conventional hard disk drives working together. The team is assembling this storage monster for an unnamed client that needs a new supercomputer for detailed simulations of real-world phenomena such as weather and climate.

Despite the insane capacity, the technologies that were developed to handle the monstrous repository could enable similar systems for more conventional commercial computing, claims Bruce Hillsberg, director of storage research at IBM and leader of the project. "This 120 petabyte system is on the lunatic fringe now, but in a few years it may be that all cloud computing systems are like it."

The hardware behind the 120-petabyte "drive" includes modified horizontal drawers, stacked inside typical data center racks, that are significantly wider than usual so that more disks can be crammed into nearly the same amount of physical space. The IBM engineers also ditched the standard fan-based cooling setup and went with a more reliable liquid cooling design to keep the drives chilled and to reduce overall energy consumption.

In addition to modifying the rack system, IBM relies on GPFS, a file system it developed to give supercomputers faster data access. GPFS spreads individual files across multiple disks so that many parts of a file can be read or written simultaneously. It also enables a large system to keep track of its many files without "laboriously" scanning through every one. Ultimately, the system as a whole is expected to go a million years without losing any data, and without compromising on performance.
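The idea of spreading one file over several disks can be illustrated with a toy round-robin striping sketch. This is not GPFS's actual layout or API: the function names `stripe` and `reassemble`, the in-memory "disks", and the block size are all assumptions made for illustration only.

```python
def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them round-robin across disks.

    Hypothetical helper: each "disk" is just a Python list of blocks.
    """
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def reassemble(disks: list[list[bytes]]) -> bytes:
    """Rebuild the original data by reading blocks back in round-robin order."""
    blocks = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return b"".join(blocks)


if __name__ == "__main__":
    payload = bytes(range(10))
    striped = stripe(payload, num_disks=3, block_size=2)
    # Consecutive blocks land on different disks, so a real system
    # could read or write them in parallel.
    print([len(disk) for disk in striped])
    print(reassemble(striped) == payload)
```

Because neighbouring blocks sit on different disks, a large read can in principle pull from every disk at once, which is the benefit the article describes.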

Hillsberg added that keeping track of the names, types, and other attributes of the files stored in the system will consume around two petabytes of its capacity. To put the total in perspective, 120 petabytes equals 120 million gigabytes, which theoretically could hold 24 billion 5MB MP3 files or 60 copies of the Internet Archive's Wayback Machine.
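The figures above can be checked with a few lines of arithmetic. This is a rough sketch that assumes decimal units (1 PB = 10^15 bytes); the per-drive number is simply the stated capacity divided by the stated drive count, since the article does not give the actual drive sizes or the redundancy overhead.

```python
# Decimal storage units (an assumption; the article does not say which it uses).
MB = 10**6
GB = 10**9
PB = 10**15

total_bytes = 120 * PB       # capacity quoted in the article
metadata_bytes = 2 * PB      # space Hillsberg says file attributes will consume
drive_count = 200_000        # number of physical drives quoted in the article

total_gb = total_bytes // GB                            # gigabytes in 120 PB
mp3_count = total_bytes // (5 * MB)                     # how many 5MB files fit
avg_gb_per_drive = total_bytes // drive_count // GB     # usable capacity per drive, on average
metadata_percent = 100 * metadata_bytes / total_bytes   # share spent on metadata

print(f"{total_gb:,} GB in total")
print(f"{mp3_count:,} MP3 files of 5MB")
print(f"{avg_gb_per_drive} GB per drive on average")
print(f"{metadata_percent:.2f}% of capacity used for metadata")
```

The average of 600 GB per drive is well below the capacity of the largest drives sold at the time, which is consistent with some of the raw disk space going to redundancy rather than usable storage.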

To read more about IBM's 120 petabyte drive, head here.

Comments
burnley14 08/30/2011 2:06 AM
Hide
-9+

They built this for me. I have a pretty big music library. . .

NapoleonDK 08/30/2011 2:10 AM
Hide
-20+

It's for porn, obviously. =\ The new ".xxx" domain now has a home.

Also, inb4 "can I install Crysis?" XD

otacon72 08/30/2011 2:14 AM
Hide
-4+

Still not enough to hold my porn collection.

daygall 08/30/2011 2:17 AM
Hide
--2+

thats alot of pron O_O

on a more serious note *cough*

cant wait for more miniaturization :D 3tb now hopefully 6-8 in by 2015... baring the Apocalypse lol

bavman 08/30/2011 2:23 AM
Hide
-15+

"unnamed client" = someone with too much porn and not enough harddrive estate

techseven 08/30/2011 2:42 AM
Show
Zanny 08/30/2011 2:43 AM
Hide
--3+

daygall :
thats alot of pron O_O
on a more serious note *cough*
cant wait for more miniaturization 3tb now hopefully 6-8 in by 2015... baring the Apocalypse lol



We have been pushing the limits of mechanical disk reading lasers. Blue spectrum is the smallest imprint we are going to get, and the data error limits on drives past 3 terabytes are really small, in that it is very likely to have a bad sector somewhere on the disk by that point.

oparadoxical_ 08/30/2011 2:59 AM
Hide
-4+

techseven :
120 Petabytes = 120.000 Terabytes / 2TB per drive would equal 60.000 2TB drives, but they store it on 200.000 drives? 120.000 Terabytes / 200.000 drives = 600MB per drive. So they must need all these drives to make it fast enough...?


Actually, 120 petabytes=122,880Tb which equals 61,440 individual 2Tb HDDs.
Then, the 122,880tb=125,829,120Gb and if you divide that by 200,000, you get about 630gb per HDD.

PennyLife 08/30/2011 3:04 AM
Hide
-0+

I can only imagine two different types of entities that would want to purchase this:

- A military agency, OR

- A company that is preparing for and aiming to be a big provider of cloud services.

Pyree 08/30/2011 3:17 AM
Hide
-9+

In raid 0 (good luck finding the 1 drive which fails)!



Only joking, of course they won't be in raid 0.

jsanthara 08/30/2011 3:18 AM
Hide
-15+

Do you think it will cost more than $150, because I have a little over $300 and I want to buy 2.

dalethepcman 08/30/2011 3:25 AM
Hide
-5+

This is probably to backup facebook, so the advertisers can find historical data a million years from now, and match your great great great.....(x100) great grand sons face to yours for their next gen ancestry.com ad's

zoemayne 08/30/2011 3:47 AM
Hide
--1+

facebook data isn't that large...... not the text info

Azimuth01 08/30/2011 4:06 AM
Show
a sandwhich 08/30/2011 4:07 AM
Hide
-1+

jsanthara :
Do you think it will cost more than $150, because I have a little over $300 and I want to buy 2.


lol wut?

balister 08/30/2011 4:07 AM
Hide
-0+

techseven :
Advanced tagging, mapping and storage of porn? 120 Petabytes = 120.000 Terabytes / 2TB per drive would equal 60.000 2TB drives, but they store it on 200.000 drives? 120.000 Terabytes / 200.000 drives = 600MB per drive. So they must need all these drives to make it fast enough...?



You're forgetting the overhead for running as a RAID 10. So likely 1.5 TB with double the drives for the mirroring.

So, 90k for one RAID 0 and then another 90k for the other side of the mirror of the RAID 0.

Azimuth01 08/30/2011 4:07 AM
Show
FloKid 08/30/2011 4:07 AM
Hide
-3+

Oh boy I hope they have a pretty good Anti Virus

Azimuth01 08/30/2011 4:08 AM
Show
Azimuth01 08/30/2011 4:09 AM
Show
mcd023 08/30/2011 4:40 AM
Hide
--1+

I'm not the RAID expert, but would a RAID 5 be better for redundancy? I know that the 2 drives in a RAID 10 that have the same data failing are improbable, but I was just wondering.

On another thought: think of how many drives they'll be replacing like, what, every day? I've heard of large arrays needing several replacement drives/wk. Imagine this!

chickenhoagie 08/30/2011 4:50 AM
Hide
--1+

mcd023 :
I'm not the RAID expert, but would a RAID 5 be better for redundancy? I know that the 2 drives in a RAID 10 that have the same data failing are improbable, but I was just wondering.
On another thought: think of how many drives they'll be replacing like, what, every day? I've heard of large arrays needing several replacement drives/wk. Imagine this!

I think you're right. RAID 5 is the best RAID solution for mirroring/backup as far as I know. Certainly better than RAID 10

Anonymous 08/30/2011 5:02 AM
Hide
-0+

It stores every phone call, text message, email, message board, forum post, and news comment (and miscellaneous) from around the world. It stores, searches for specific words, names, and phrases, and determines the relevance, if any, then permanently saves (xferred to smaller system) or purges the data. All voice and data communications are being rerouted through specific satellites in geosynchronous orbit around the earth.

Anonymous 08/30/2011 5:44 AM
Show
Anonymous 08/30/2011 5:45 AM
Hide
--2+

uh yeah, DoDPA is on the right track. i can't believe this hasn't been mentioned yet. it's obviously a storage shed for government surveillance of citizens. that means you and me.

seriously, isn't this obvious? what is wrong with you sheeple.

kinggremlin 08/30/2011 5:55 AM
Hide
-3+

Zanny :
We have been pushing the limits of mechanical disk reading lasers. Blue spectrum is the smallest imprint we are going to get, and the data error limits on drives past 3 terabytes are really small, in that it is very likely to have a bad sector somewhere on the disk by that point.



Hard drives don't use lasers, never have. Optical drives and sharks use lasers. Current generation hard drives are somewhere in the 600Gb/in² range, with current PMR techniques likely to top out around 1Tb/in². Seagate has predicted that the next generation of hard drives (HAMR) will be able to push capacity into the 50Tb/in² range. That's over 80 times more dense than current drives. So we are nowhere near the limits of mechanical hard drive capacities.

velocityg4 08/30/2011 6:01 AM
Hide
--1+

mcd023 :
I'm not the RAID expert, but would a RAID 5 be better for redundancy? I know that the 2 drives in a RAID 10 that have the same data failing are improbable, but I was just wondering.
On another thought: think of how many drives they'll be replacing like, what, every day? I've heard of large arrays needing several replacement drives/wk. Imagine this!



RAID 6 is much better, as you have much more redundancy. Though with that many drives they must be using some sort of proprietary system to handle redundancy and all the read/write cycles those drives will be going through. I would speculate that the system more likely has thousands of RAID 60 arrays, with the GPFS filesystem viewing each array as an individual sector of the giant array. Though they likely developed some sort of RAID system designed specifically for handling massive numbers of hard drives.

It would just be too inefficient to break apart every file among 200,000 drives then reassemble it. Not to mention you would severely limit the number of files that could be accessed at one time before slowing the array to a crawl.

Anonymous 08/30/2011 6:41 AM
Hide
--3+

.Azimuth01 08/30/2011 4:06 AM


Google must be about to hatch their master plan....
First: consolidate all the information they ever collected onto one machine
Next: Begin analyzing trends using variables and timetables from every known source
Lastly: Use this information to predict the future and take over the world

Still based on ideas of relevent information

agnickolov 08/30/2011 7:38 AM
Hide
-0+

My guess would be 1TB 2.5in drives in 6- to 8- drive RAID 6 configurations with some overhead to manage all the RAID 6 arrays. 3.5in drives just don't cut it for the density requirements of such a massive storage solution.

Also to oparadoxical_: You are misrepresenting units quite a bit. Lowercase b is for bits, not bytes. Lowercase t is meaningless. I suppose you were trying to illustrate binary vs decimal units, in which case it'd be TiB vs TB, GiB vs GB etc.

wiinippongamer 08/30/2011 8:06 AM
Hide
-0+

Misleading title as always....

