The Google Book initiative is a huge undertaking. Google is digitizing tens of millions of books and promising to make them available to the world forever. Sergey Brin, the co-founder of Google, wrote an impassioned editorial for the New York Times a few days ago, citing the destruction of the library at Alexandria as evidence of how important it is to preserve books for the ages, which Google promises to do.
Google gives away so many of its services that it takes an effort to remember that the company makes fourteen kajillion dollars every day. There are real issues in its book project: advertising, fairness to competitors, the prices charged to institutions, the conditions set on public access, the rights of authors and copyright holders, and much more.
I can’t cover those disputes here. The project is huge and multifaceted, and Google’s work has been met with strenuous objections and lawsuits from competitors, libraries, publishers, authors, and many other affected groups. (There’s a pretty good FAQ here.) We might have the most comprehensive library in human history within our lifetime, and all we have to do is trust Google. Well, trust it and grant it a de facto monopoly. It’s hard to imagine another company with the technical and financial resources to do anything similar.
But I can point to a couple of times that Google has failed to live up to its promises and its reputation. It’s a little unnerving.
Wired Magazine has a fascinating article recalling Google’s offer to preserve the written conversations that gave life to the Internet – the Usenet archives, literally hundreds of millions of newsgroup postings spanning two decades. People exchanged public text messages through their dialup connections about everything imaginable, including the birth of the Internet itself. These are the chronicles of the early days of Microsoft, the birth of the Mosaic web browser, the early development of computers – interesting little bits of history buried in mountains of inconsequential words.
In 2001, Google acquired two different storehouses of Usenet messages, 700 million messages in 35,000 newsgroups spanning twenty years, and promised to safeguard them and make them available forever.
The archives were abandoned. They were technically available online but the search index had fatal bugs that made the archives effectively unusable. Searches by date range frequently returned no results; searching within a newsgroup would often also fail. Complaints went unanswered.
When Wired turned a public searchlight on the broken archives, Google quickly got some engineers working on fixing the bugs. By the next day, there were signs of progress. But that was after nearly ten years of neglect! (The Wired follow-up article also mentions in passing that the magazine had contacted Google a month earlier about the search failures; Google did not respond until the magazine went public.)
This resonates with me because of my own experience learning that many blogs hosted by Google’s Blogger service are not being indexed and cannot be searched. Complaints about the Blogger search problems continue to pile up in the Blogger forums. The problem has existed for months, and in the last couple of them no one from Google has responded to acknowledge it or to promise a fix.
It’s not the first time I’ve wondered about Google. It sometimes seems to flit from project to project without quite keeping its early commitments or delivering all of its promising technology.
Google’s book project is far bigger and more public, and will undoubtedly get more attention from Google’s executives and engineers. But I can’t help but feel a little uncertain about whether we should trust Google unreservedly with the world’s library.