NSA to store yottabytes of surveillance data in Utah megarepository (update: not so much)

There’s an interesting article in the current New York Review of Books (predictably, a book review) detailing the history of the National Security Agency, that shadowy power-behind-the-power to which we surrender much of our privacy. That in itself is interesting, but I found the introduction a bit shocking: the NSA is constructing a datacenter in the Utah desert that they project will be storing yottabytes of surveillance data. And what is a yottabyte? I’m glad you asked.

There are a thousand gigabytes in a terabyte, a thousand terabytes in a petabyte, a thousand petabytes in an exabyte, a thousand exabytes in a zettabyte, and a thousand zettabytes in a yottabyte. In other words, a yottabyte is 1,000,000,000,000,000GB. Are you paranoid yet?
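If you'd rather check that arithmetic than take my word for it, here is a minimal sketch in Python. It assumes the decimal (powers of 1,000) prefixes used above, and the names PREFIXES and bytes_in are just made up for illustration.

```python
# Decimal (SI) storage prefixes, each a factor of 1,000 larger than the last.
# Assumption: powers of 1,000, as in the text, not the binary powers of 1,024.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def bytes_in(prefix: str) -> int:
    """Return the number of bytes in one <prefix>byte."""
    return 1000 ** (PREFIXES.index(prefix) + 1)


# How many gigabytes fit in a single yottabyte?
gb_per_yb = bytes_in("yotta") // bytes_in("giga")
print(f"{gb_per_yb:,} GB in a yottabyte")
```

Each step up the ladder multiplies by a thousand, and there are five steps from giga to yotta, which is where the fifteen zeros come from.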

The more salient question is, of course, what are they storing that, by some estimates, is going to take up thousands of times more space than all the world’s known computers combined? Don’t think they’re going to say; they didn’t grow to their current level of shadowy omniscience by disclosing things like that to the public. However, speculation isn’t too hard on this topic. Now more than ever, surveillance is a data game. What with millions of phones being tapped and all data duplicated, constant recording of all radio traffic, and 24-hour high definition video surveillance by satellite, there are terabytes at least of data coming in every day. And who knows when you’ll have to sift through August 2007’s overhead footage of Baghdad for heat signatures in order to confirm some other intelligence?

As for the medium on which the data might be stored, that’s anybody’s guess. Whoever’s making the estimates is probably playing a bit fast and loose with exponential curves, but if any of the alternative storage technologies we cover here on CG are any indication, yottabytes won’t seem so big a few years from now. We can be sure, however, that despite their better dollars-per-gigabyte cost, spinning hard disks won’t be in use as a main medium. The electricity required, mean time between failures, and other maintenance issues are probably unacceptable for an economy-minded government agency. Interestingly, it seems that lack of electricity is one of the NSA’s primary concerns.

The article mentions that the NSA’s equivalent in the UK, the Government Communications Headquarters, asked that all telecoms providers store and hand over a huge amount of customer data for an entire year. They refused, citing “grave misgivings” and noting that at any rate the level of data collection expected was “impossible in principle.” Tut tut! Those Brits lacked the American can-do spirit. Thus it was that AT&T and other telecoms instantly complied with US mandates following September 11. The extent of the government’s meddling with switches, routers, antennas, and so on may never be fully known, but I wouldn’t be surprised if everyone reading this article is on the record somewhere. Storage capacity of this magnitude implies a truly unprecedented number of subjects for monitoring.

There is talk of the NSA shutting down altogether or being rolled into another agency, but I suspect that the “too big to fail” idea, as well as the “our safety is worth any price” dogma, will prevent that eventuality. It’s more reasonable to ask when or if its expansion will cease being sustainable. These datacenters, and the yottabytes they will hold, are extremely expensive and practically have bulls-eyes painted on them for the enemy (whoever he is), though at under $10bn the NSA’s budget is a footnote compared to other programs and agencies. So is the increasingly (to use a semi-word that is only rarely usable) tentacular NSA a necessary evil of the digital age, or a cancerous money sink born from the colossal intelligence competition of the Cold War?

The answer will only be visible in retrospect years from now, perhaps when a sequel to the book being reviewed (The Secret Sentry: The Untold History of the National Security Agency, by Matthew M. Aid) is released covering the heavily-redacted records of the early 2000s. In the meantime, it’s probably best to assume that the walls have ears.

(Updated with a note on storage medium)

Update 2: A commenter points out that in the study cited, yottabytes are only one possible estimate for total storage requirements. The more realistic estimates are in the hundreds of petabytes, which is much easier for a datacenter to accommodate. That said, I’m leaving the post as it is because the speculation still stands with “only” hundreds of petabytes being stored in these datacenters. However, adjust your tinfoil hats accordingly. (Crunch Gear, 11.01, Devin Coldewey) http://www.crunchgear.com/2009/11/01/nsa-to-store-yottabytes-of-surveillance-data-in-utah-megarepository
