According to the article, Google plans to scan at least 32 million volumes and to finish within ten years. The article estimates the cost of this at $800,000,000; elsewhere I've read numbers as low as $100,000,000.
“Previously, when people have done scanning, they always were constrained by their budget and their scale,” Clancy told me. “They had to spend all this time figuring out which were the perfect ten thousand books, so they spent as much time in selection as in scanning. All the technology out there developed solutions for what I’ll call low-rate scanning. There was no need for a company to build a machine that could scan thirty million books. Doing this project just using commercial, off-the-shelf technology was not feasible. So we had to build it ourselves.”
A few years ago I was begging Frank Campbell to let me scan some titles at the ANS library. He had so many concerns that I hinted it might be better for me to just wait for Google. Frank seemed surprised that I would believe Google would scan numismatic works and auction catalogs.
As the article implies, Google doesn't have a choice. They cannot afford to decide which books are worthy of scanning. At $30 to scan a book, it's cheaper to scan everything than to pay a graduate student to decide which books are worth scanning! If librarians in any of Google's member libraries thought a title was worth acquiring, then it'll be on Google.
What won't be there? 20th-century auction catalogs? Anything else?
In related news, Google's copyright lawyer has a blog.