Tuesday, September 7, 2010

Soon Petabytes Won't be Enough

The Guardian published an article that quotes researchers as estimating that the amount of data published in the last year - what they call the "digital universe" - grew 62% from the prior year, reaching a total of 800,000 petabytes. That's a pretty freaking large number - a single petabyte is 1,000,000 gigabytes, which makes it more than 1,000,000,000,000,000 bytes (oh wait, note my digression below, apparently that makes it exactly 1,000,000,000,000,000 bytes).

They also note that, with an expected further 40% or more growth in the next year, they will be switching over to zettabytes, which are a million petabytes each, or 10 to the 21st power bytes*. When numbers get that large it's hard to really make any sense out of them. It does seem like lots of storage will continue to be needed, and all that content should drive lots of traffic, which is all good from my employer's point of view.
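A quick back-of-the-envelope check of why the switch to zettabytes follows from those figures. This is only a sketch: it assumes decimal units (1 PB = 10^15 bytes, 1 ZB = 10^21 bytes) and exactly 40% growth, where the article says "40% or more". The function name is mine, not anything from the article.

```python
PB = 10**15  # decimal petabyte, in bytes
ZB = 10**21  # decimal zettabyte, in bytes

def projected_bytes(total_pb, growth_percent):
    """Next year's total in bytes, using integer math to avoid float rounding."""
    return total_pb * PB * (100 + growth_percent) // 100

next_year = projected_bytes(800_000, 40)
print(f"{next_year // PB:,} PB")  # the total expressed in petabytes
print(f"{next_year / ZB:.2f} ZB")  # the same total expressed in zettabytes
```

Under those assumptions the total comes to 1,120,000 petabytes, which is just over one zettabyte.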

* I'm having one of my nerd-driven, slightly off-topic reactions to this article: what happened to the powers of 2?
Since kilobytes and megabytes originally entered common usage as measurements of computer memory capacity, and computers use binary for addressing, computer units are based on powers of two. A kilobyte is defined as 2 to the 10th power bytes of memory, or 1,024 bytes, and a megabyte is 2 to the 20th power, or 1,048,576 bytes. I've always assumed, then, that a gigabyte would be 2 to the 30th power, or 1,073,741,824 bytes, and by extension that petabytes are 2 to the 50th power and zettabytes would be 2 to the 70th power. I wonder whether they still use 1,024 for each step up of three digits, or not any more? Enquiring minds want to know.
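To see how far the two readings drift apart, here is a small sketch that puts the power-of-ten and power-of-two values side by side for each prefix. The prefix list and helper names are just for illustration.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta"]

def decimal_bytes(n):
    """Bytes in the n-th prefix if each step is a factor of 1,000."""
    return 1000 ** n

def binary_bytes(n):
    """Bytes in the n-th prefix if each step is a factor of 1,024."""
    return 1024 ** n

for n, name in enumerate(PREFIXES, start=1):
    dec = decimal_bytes(n)
    bin_ = binary_bytes(n)
    gap = (bin_ - dec) / dec * 100  # how much larger the binary value is
    print(f"{name:>5}: 10^{3 * n:<2} = {dec:,}")
    print(f"       2^{10 * n:<2}  = {bin_:,}  (+{gap:.1f}%)")
```

The gap starts at 2.4% for kilobytes and compounds with every step, reaching roughly 18% by the time you get to zettabytes, so which definition is in use matters more the bigger the numbers get.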

Update: I see that Wikipedia defines a petabyte as 10 to the 15th power bytes, so they've dropped the powers of two from that measurement. So much for being easily able to figure out the number of bits required to index mass storage in my head. Another nifty mental ability becomes a dinosaur skill. Sigh. On the other hand, Wikipedia helpfully pointed out that 2 to the 50th power bytes is a pebibyte: nice thought, but I've never seen anyone use pebibytes when measuring capacity.
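For what it's worth, the bit-counting trick can be checked with a few lines. This sketch assumes flat byte addressing (one address per byte, numbered from zero), which is my simplification; the function name is made up for the example.

```python
def address_bits(capacity_bytes):
    """Bits needed to give every byte a distinct address, numbering from 0."""
    if capacity_bytes < 1:
        raise ValueError("capacity must be at least one byte")
    # The highest address is capacity_bytes - 1; bit_length() gives its width.
    return (capacity_bytes - 1).bit_length()

for label, size in [
    ("pebibyte (2^50)", 2**50),
    ("petabyte (10^15)", 10**15),
    ("zettabyte (10^21)", 10**21),
]:
    print(f"{label}: {address_bits(size)} bits")
```

A decimal petabyte falls between 2 to the 49th and 2 to the 50th, so it still needs 50 bits, and a decimal zettabyte still needs 70. For these sizes, at least, the mental shortcut gives the same answer under either definition.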
