Stuffed: Why Data Storage Is Hot Again. (Really!)
The world’s hard drive is almost full.
Most people think digital storage technology is unlimited — you can get free gigabytes of storage by just opening up a Gmail or Box account, after all.
But in about two years, we will start to generate more data than we can save with existing technologies.
For CIOs at banks, insurance companies and online retailers, the shortage is a looming crisis. For investors and startup entrepreneurs, it represents a huge opportunity, because it means demand will outstrip supply.
You can see the heart of the mismatch in the charts below. Market research firm IDC predicts that digital data will grow at a compound annual growth rate (CAGR) of 42 percent through 2020, thanks to the proliferation of cellphones, digital entertainment and new technologies like Big Data and the Internet of Things that tend to generate vast amounts of files.
The total amount of digital data generated in 2013 will come to 3.5 zettabytes. (A zettabyte is a 1 followed by 21 zeros, counted in bytes, or about the storage of one trillion one-gigabyte USB keys.) That is triple the amount of data created in 2010. By 2020, the world will generate 40 zettabytes of data annually, or more than 5,200 gigabytes of data for every person on the planet.
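A quick back-of-the-envelope check shows how these figures hang together. The sketch below assumes the 42 percent rate compounds annually starting from the 2013 figure (the base year of the forecast is not stated here), and its function and variable names are purely illustrative.

```python
# Back-of-the-envelope check of the growth figures quoted above.
# Assumption: 42 percent CAGR is applied year over year from 2013's 3.5 ZB.

START_YEAR = 2013
START_ZB = 3.5          # zettabytes generated in 2013
CAGR = 0.42             # compound annual growth rate

BYTES_PER_ZB = 10**21   # decimal zettabyte
BYTES_PER_GB = 10**9    # decimal gigabyte


def project(start_zb, cagr, years):
    """Compound start_zb forward by the given number of years."""
    return start_zb * (1 + cagr) ** years


# Year-by-year projection through 2020.
for year in range(START_YEAR, 2021):
    print(year, round(project(START_ZB, CAGR, year - START_YEAR), 1), "ZB")

zb_2020 = project(START_ZB, CAGR, 2020 - START_YEAR)

# Population implied by "40 ZB, or more than 5,200 GB per person".
implied_population = 40 * BYTES_PER_ZB / (5200 * BYTES_PER_GB)
print("Implied 2020 population:", round(implied_population / 1e9, 2), "billion")
```

Under that assumption the 2020 projection lands just above the 40 zettabytes cited, and the per-person figure implies a world population of roughly 7.7 billion.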
By industry estimates, 92 percent of all the data in the world has been generated in the last two years alone. As Jason Taylor, director of capacity engineering and analysis at Facebook, put it recently, we live in a “Write Once, Keep Forever” world.
The explosion of data is part of the reason venture capitalists and conglomerates have invested millions into next-generation storage arrays over the last three years. Growth will be particularly strong in emerging nations. China alone will generate 20 percent of the world’s data by 2020.
But here’s the problem: It’s far easier to generate zettabytes of data than to manufacture zettabytes of data capacity. A yawning gap is emerging between the production of data and the production of hard drives and flash memory. The purple curve below charts data growth. The bars track factory capacity. By 2020, demand for capacity will outstrip production by six zettabytes, or nearly double the demand of 2013 alone. Even if the gap shrinks to three zettabytes, it will still be massive.
We’re not just in the Zettabyte Era. We’re in the “Unavailable” Zettabyte Era.
While digital-storage companies will continue to add capacity, we are unlikely to close the gap. The hard-drive industry probably won’t even have a zettabyte of capacity until 2015 or 2016. Building enough factory capacity to keep pace with demand would take billions in investment.
Flash is even more challenging. By 2020, the cost of a fabrication facility for making flash memory could rise from $4 billion today to between $15 billion and $20 billion. It would take $210 billion in factory investments for flash to displace 15 percent of the demand for drives in 2014, according to John Monroe at Gartner. Memory makers are chronically skittish about building new capacity because of the escalating costs, and that $210 billion figure is outside the realm of possibility.
What happens when you have rising demand that can’t be met by business as usual? You see the development of new business models and technologies. It’s the classic situation where innovation emerges to create new opportunities. Cloud services represent a first step in that direction. By keeping files in a centrally managed facility where equipment, drive space and bandwidth can be more efficiently utilized, companies like Dropbox and Box can drive out waste and redundancy. Customers are gladly paying fees because the cost for these services is less painful than trying to continually upgrade their own systems.
We will also likely see architectures where intelligent summaries replace raw data. Data from smart meters and building-management systems, for instance, can be converted to trend lines, while the original records get sent to tape archives or even optical disks. Data on tape isn’t rapidly accessible, but it’s cheaper. Better software for eliminating duplicate files and more quickly navigating sprawling databases will also emerge.
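To make those two ideas concrete, here is a minimal sketch, not a description of any vendor's product. It assumes meter readings arrive as (day, value) pairs and that duplicates are byte-for-byte identical files; the function names and sample data are hypothetical.

```python
import hashlib
from collections import defaultdict
from statistics import mean


def daily_trend(readings):
    """Collapse raw (day, value) readings into one mean value per day."""
    by_day = defaultdict(list)
    for day, value in readings:
        by_day[day].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


def deduplicate(blobs):
    """Keep one copy of each distinct byte string, keyed by SHA-256 digest."""
    unique = {}
    for blob in blobs:
        # setdefault stores the blob only the first time its digest is seen.
        unique.setdefault(hashlib.sha256(blob).hexdigest(), blob)
    return list(unique.values())


# Hypothetical sample data: three meter readings and three stored files.
readings = [("2013-04-01", 1.0), ("2013-04-01", 3.0), ("2013-04-02", 4.0)]
trend = daily_trend(readings)

files = deduplicate([b"report", b"report", b"invoice"])

print(trend)
print(len(files), "unique files kept")
```

The summary keeps one number per day on fast storage while the raw readings can move to a cheaper tier, and the hash comparison drops exact copies without comparing files pair by pair.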
And of course we’re going to see continued advances in density, capacity, retrieval speed and other performance metrics. Heat-assisted magnetic recording will lead to drives in 2015 that can store a terabit of information on a square inch. Nick Goldman and Ewan Birney of the European Bioinformatics Institute (EBI) are experimenting with ways of storing data on engineered strands of DNA.
At this point, it’s impossible to predict winners and losers. But make no mistake: Space is filling up fast.
Rocky Pimentel serves as Seagate’s president of global markets and customers, and is responsible for the company’s global customer engagement, sales, sales operations, product line management, marketing and retail activities. He previously served as COO and CFO at McAfee and as senior vice president and CFO at LSI Logic, and held senior leadership positions at Glu Mobile, Zone Labs, WebTV Networks and Redpoint Ventures.