IBM Builds Monster 120-Petabyte Data "Drive"
This massive storage is for a supercomputer used by an unnamed client, and includes 200,000 physical drives.
The data storage group at IBM's Almaden, California, research lab is currently building a 120 petabyte drive made up of 200,000 conventional hard disk drives working together. The team is putting this storage monster together for an unnamed client that needs a new supercomputer for detailed simulations of real-world phenomena such as weather and climate.
Despite the insane capacity, the technologies that were developed to handle the monstrous repository could enable similar systems for more conventional commercial computing, claims Bruce Hillsberg, director of storage research at IBM and leader of the project. "This 120 petabyte system is on the lunatic fringe now, but in a few years it may be that all cloud computing systems are like it."
The technology behind the 120 petabyte "drive" includes modified horizontal drawers, stacked inside typical data center racks, that are significantly wider than usual so that more disks can be crammed into nearly the same amount of physical space. The IBM engineers also ditched the standard fan setup as a cooling system and went with a more reliable liquid cooling design to keep the drives chilled and to reduce the overall energy consumption.
In addition to modifying the rack system, IBM also developed a file system known as GPFS to give supercomputers faster data access. The file system spreads individual files across multiple disks so that numerous parts of a file can be read or written simultaneously. GPFS also enables a large system to keep track of its many files without "laboriously" scanning through every one. Ultimately, the system as a whole is not expected to lose any data for a million years, without any compromise on performance.
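The idea of spreading one file across several disks can be illustrated with a toy round-robin striping sketch. This is only an illustration under stated assumptions: the function names `stripe` and `reassemble`, the block size, and the in-memory "disks" are all hypothetical, and nothing here reflects how GPFS itself lays out data.

```python
def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Cut data into fixed-size blocks and deal them round-robin across disks."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        # Block k lands on disk k mod num_disks.
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def reassemble(disks: list[list[bytes]]) -> bytes:
    """Interleave the per-disk blocks back into the original byte order."""
    blocks = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return b"".join(blocks)


if __name__ == "__main__":
    payload = bytes(range(10))
    layout = stripe(payload, num_disks=3, block_size=2)
    # Each disk holds only part of the file, so separate disks
    # could in principle serve their blocks at the same time.
    for number, disk in enumerate(layout):
        print(number, disk)
    print(reassemble(layout) == payload)
```

Because consecutive blocks sit on different disks, a large read can be split into independent per-disk requests, which is the intuition behind the parallel access described above.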
Hillsberg added that keeping track of the names, types, and other attributes of the files stored in the system will consume around two petabytes of its capacity. To put this number in perspective, 120 petabytes equals 120 million gigabytes, which theoretically could hold 24 billion 5MB MP3 files or 60 copies of the Internet Archive's WayBack Machine.
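The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes); the variable names are illustrative, and the per-copy archive size is simply what the article's "60 copies" figure implies, not an independent measurement.

```python
PB = 10**15  # decimal petabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes
MB = 10**6   # decimal megabyte, in bytes

total_bytes = 120 * PB

gigabytes = total_bytes // GB            # capacity expressed in GB
mp3_count = total_bytes // (5 * MB)      # number of 5 MB files that fit
metadata_share = (2 * PB) / total_bytes  # fraction used for file attributes
implied_archive_pb = 120 / 60            # PB per archive copy, per the article

print(gigabytes)                         # 120000000
print(mp3_count)                         # 24000000000
print(round(metadata_share * 100, 2))    # 1.67
print(implied_archive_pb)                # 2.0
```

So the two petabytes of metadata amount to roughly 1.7 percent of the total capacity.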
They built this for me. I have a pretty big music library. . .
It's for porn, obviously. =\ The new ".xxx" domain now has a home.
Also, inb4 "can I install Crysis?" XD
Still not enough to hold my porn collection.
thats alot of pron O_O
3tb now hopefully 6-8 in by 2015... baring the Apocalypse lol
on a more serious note *cough*
cant wait for more miniaturization
"unnamed client" = someone with too much porn and not enough harddrive estate
Advanced tagging, mapping and storage of porn?
120 Petabytes = 120.000 Terabytes / 2TB per drive would equal 60.000 2TB drives, but they store it on 200.000 drives?
120.000 Terabytes / 200.000 drives = 600MB per drive. So they must need all these drives to make it fast enough...?
thats alot of pron O_O on a more serious note *cough* cant wait for more miniaturization 3tb now hopefully 6-8 in by 2015... baring the Apocalypse lol
We have been pushing the limits of mechanical disk reading lasers. Blue spectrum is the smallest imprint we are going to get, and the data error limits on drives past 3 terabytes are really small, in that it is very likely to have a bad sector somewhere on the disk by that point.
120 Petabytes = 120.000 Terabytes / 2TB per drive would equal 60.000 2TB drives, but they store it on 200.000 drives? 120.000 Terabytes / 200.000 drives = 600MB per drive. So they must need all these drives to make it fast enough...?
Actually, 120 petabytes=122,880Tb which equals 61,440 individual 2Tb HDDs.
Then, the 122,880tb=125,829,120Gb and if you divide that by 200,000, you get about 630gb per HDD.
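For anyone following the arithmetic in the comments above, here is a short check of the per-drive figure in both decimal and binary units. It is only a sketch: the variable names are made up for illustration, and which unit convention IBM used is not stated in the article. Note that the plain decimal division gives 600 GB per drive, not 600 MB.

```python
DRIVES = 200_000

# Decimal interpretation: 1 PB = 10**15 bytes, 1 GB = 10**9 bytes.
decimal_total_bytes = 120 * 10**15
decimal_per_drive_gb = decimal_total_bytes / DRIVES / 10**9

# Binary interpretation: 1 PiB = 2**50 bytes, 1 TiB = 2**40, 1 GiB = 2**30.
binary_total_bytes = 120 * 2**50
binary_tib = binary_total_bytes // 2**40
binary_gib = binary_total_bytes // 2**30
binary_per_drive_gib = binary_gib / DRIVES

print(decimal_per_drive_gb)   # 600.0
print(binary_tib)             # 122880
print(binary_gib)             # 125829120
print(binary_per_drive_gib)   # about 629.15
```

Either way the average works out to roughly 600 to 630 gigabytes of usable space per physical drive, which leaves room for redundancy if the drives themselves are larger.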
I can only imagine two different types of entities that would want to purchase this:
- A military agency, OR
- A company that is preparing for and aiming to be a big provider of cloud services.
In raid 0 (good luck finding the 1 drive which fails)!
Only joking, of course they won't be in raid 0.
Do you think it will cost more than $150, because I have a little over $300 and I want to buy 2.
This is probably to back up facebook, so the advertisers can find historical data a million years from now, and match your great great great.....(x100) great grandson's face to yours for their next gen ancestry.com ads
facebook data isn't that large...... not the text info
Google must be about to hatch their master plan....
First: consolidate all the information they ever collected onto one machine
Next: Begin analyzing trends using variables and timetables from every known source
Lastly: Use this information to predict the future and take over the world
Google must be about to hatch their master plan....
First: consolidate all the information they ever collected onto one machine
Next: Begin analyzing trends using variables and timetables from every known source
Lastly: Use this information to predict the future and take over the world
Do you think it will cost more than $150, because I have a little over $300 and I want to buy 2.
lol wut?
Advanced tagging, mapping and storage of porn? 120 Petabytes = 120.000 Terabytes / 2TB per drive would equal 60.000 2TB drives, but they store it on 200.000 drives? 120.000 Terabytes / 200.000 drives = 600MB per drive. So they must need all these drives to make it fast enough...?
You're forgetting the overhead for running as a RAID 10. So likely 1.5 TB with double the drives for the mirroring.
So, 90k for one RAID 0 and then another 90k for the other side of the mirror of the RAID 0.
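The RAID 10 guess above can be sanity-checked with simple arithmetic. This sketch assumes plain two-way mirroring, where half the drives hold copies, and decimal terabytes; the helper name `raid10_usable_tb` is hypothetical, and the article does not say what redundancy scheme IBM actually uses.

```python
TARGET_TB = 120_000  # 120 PB expressed in decimal terabytes


def raid10_usable_tb(total_drives: int, drive_tb: float) -> float:
    """Usable capacity when every drive has exactly one mirror partner."""
    return (total_drives // 2) * drive_tb


# The commenter's guess: 90k drives per side of the mirror, 1.5 TB each.
guess_usable = raid10_usable_tb(180_000, 1.5)

# Drive size needed if all 200,000 drives were mirrored pairs.
needed_per_drive_tb = TARGET_TB / (200_000 // 2)

print(guess_usable)          # 135000.0
print(needed_per_drive_tb)   # 1.2
```

Under these assumptions, 180,000 drives of 1.5 TB would give 135 PB usable, and a fully mirrored set of 200,000 drives would need about 1.2 TB per drive to reach 120 PB.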
Oh boy I hope they have a pretty good Anti Virus
Google must be about to hatch their master plan....
First: consolidate all the information they ever collected onto one machine
Next: Begin analyzing trends using variables and timetables from every known source
Lastly: Use this information to predict the future and take over the world
multiple posts....
I can't remove them, why?
I'm not the RAID expert, but would a RAID 5 be better for redundancy? I know that the 2 drives in a RAID 10 that have the same data failing are improbable, but I was just wondering.
On another thought: think of how many drives they'll be replacing like, what, every day? I've heard of large arrays needing several replacement drives/wk. Imagine this!
I'm not the RAID expert, but would a RAID 5 be better for redundancy? I know that the 2 drives in a RAID 10 that have the same data failing are improbable, but I was just wondering. On another thought: think of how many drives they'll be replacing like, what, every day? I've heard of large arrays needing several replacement drives/wk. Imagine this!
I think you're right. RAID 5 is the best RAID solution for mirroring/backup as far as I know. Certainly better than RAID 10
It stores every phone call, text message, email, message board, forum post, and news comment (and miscellaneous) from around the world. It stores, searches for specific words, names, and phrases, and determines the relevance, if any, then permanently saves (xferred to smaller system) or purges the data. All voice and data communications are being rerouted through specific satellites in geosynchronous orbit around the earth.
I got 600 000 000 000 000 , which should be 600tbs for drive. Or would be.
Given that 120pbs would equal 120 000 000 000 000 000 000.
Still seems like theres a varible of different not included somewhere even on this scale.
But after checking again for like the 5 time be 600gs per drive for peta.
K-zon:
Then whatever that would round too or from on ideas of how hard drives say calculate.
Shouldn't be that hard of a problem at like all, but seems to have its place of "catch" for some reason.
uh yeah, DoDPA is on the right track. i can't believe this hasn't been mentioned yet. it's obviously a storage shed for government surveillance of citizens. that means you and me.
seriously, isn't this obvious? what is wrong with you sheeple.
We have been pushing the limits of mechanical disk reading lasers. Blue spectrum is the smallest imprint we are going to get, and the data error limits on drives past 3 terabytes are really small, in that it is very likely to have a bad sector somewhere on the disk by that point.
Hard drives don't use lasers, never have. Optical drives and sharks use lasers. Current generation hard drives are somewhere in the 600Gb/in2 range with current PMR techniques likely to top out around 1Tb/in2. Seagate has predicted that the next generation of hard drives (HAMR) will be able to push capacity into the 50Tb/in2 range. That's over 80 times more dense than current drives. So we are nowhere near the limits of mechanical hard drive capacities.
I'm not the RAID expert, but would a RAID 5 be better for redundancy? I know that the 2 drives in a RAID 10 that have the same data failing are improbable, but I was just wondering. On another thought: think of how many drives they'll be replacing like, what, every day? I've heard of large arrays needing several replacement drives/wk. Imagine this!
RAID 6 is much better as you have much more redundancy. Though with that many drives they must be using some sort of proprietary system to handle issues of redundancy and all the read/write cycles those drives will be going under. I would speculate that the system would more likely have thousands of RAID 60 arrays, with their GPFS filesystem viewing each array as an individual sector of the giant array. Though they likely developed some sort of RAID system designed specifically for handling massive numbers of hard drives.
It would just be too inefficient to break apart every file among 200,000 drives then reassemble it. Not to mention you would severely limit the number of files that could be accessed at one time before slowing the array to a crawl.
.Azimuth01 08/30/2011 4:06 AM
Google must be about to hatch their master plan....
First: consolidate all the information they ever collected onto one machine
Next: Begin analyzing trends using variables and timetables from every known source
Lastly: Use this information to predict the future and take over the world
Still based on ideas of relevent information
My guess would be 1TB 2.5in drives in 6- to 8- drive RAID 6 configurations with some overhead to manage all the RAID 6 arrays. 3.5in drives just don't cut it for the density requirements of such a massive storage solution.
Also to oparadoxical_: You are misrepresenting units quite a bit. Lowercase b is for bits, not bytes. Lowercase t is meaningless. I suppose you were trying to illustrate binary vs decimal units, in which case it'd be TiB vs TB, GiB vs GB etc.
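The 1TB-drive RAID 6 guess above is easy to test against the 120 PB figure. This is a rough sketch only: it assumes standard RAID 6 with two drives' worth of parity per group, decimal terabytes, and no hot spares or filesystem overhead, and the helper name `raid6_usable_tb` is made up for illustration.

```python
def raid6_usable_tb(total_drives: int, group_size: int, drive_tb: float) -> float:
    """Usable capacity when drives are split into RAID 6 groups.

    Each group gives up two drives' worth of space to parity.
    Drives left over after forming whole groups are ignored.
    """
    groups = total_drives // group_size
    return groups * (group_size - 2) * drive_tb


six_drive_groups = raid6_usable_tb(200_000, 6, 1.0)
eight_drive_groups = raid6_usable_tb(200_000, 8, 1.0)

print(six_drive_groups)     # 133332.0
print(eight_drive_groups)   # 150000.0
```

Both layouts come out above 120,000 TB, so under these assumptions 200,000 drives of 1 TB in 6- to 8-drive RAID 6 groups would cover 120 PB with some capacity to spare.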
Misleading title as always....