Google Book Search treats government documents as copyrighted material
But there's one really distressing thing happening here. U.S. government documents are, by law, in the public domain, as they are produced with taxpayer funds. The only exceptions to public domain status are a limited number of cases in which private contractors have been hired to produce certain materials to which they retain copyright. But Google seems to be treating govdocs published after 1923 the same way it treats non-government materials -- as potentially copyrighted. This is unnecessarily cautious, and it means that only "snippets" are available and the full document remains unreadable.
Take a look at an example or two:
Here's what appears to be Volume I of Violations of Free Speech and Rights of Labor, the famous LaFollette Committee hearings on industrial espionage, strikebreaking and other unsavory weapons used against labor in the 1930s. If you click on the link, you'll just see snippets. This is a public domain document published in 1936 by the Government Printing Office.
Here are published hearings before the Judiciary Committee in 1974 on the subject of "Warrantless Wiretapping and Electronic Surveillance." This is another public domain document that seems highly relevant at this moment.
And here's one volume of a large collection of Senate hearings investigating the munitions industry, probably sometime in the 1930s (the Google "metadata" shows no date).
By contrast, here's a pre-1923 government publication that is readable in full through Google's book reader: Benton MacKaye's fascinating report, Employment and Natural Resources (1919). MacKaye, as many of you may know, was the visionary planner and early "network thinker" who came up with the idea for the Appalachian Trail. It's great to see this report online.
I found dozens of govdocs that were presented only as "snippets," and there are probably thousands more. I've written to Google Book Search to ask what's going on. The real question, as I see it, is whether they are being overcautious (which the 1923ish cutoff would suggest) or worried that making complete page images of government documents easily available might somehow be a threat to their business model. Either way, it's not a good thing to treat government documents as if they were copyrighted material.
(Rick, speaking in his individual capacity)
UPDATE: I've heard from people who may be in a position to know something about this issue. They remind me that many Congressional hearings reprint copyrighted information (e.g., news articles and excerpts from publications). Although the copyright situation involving these reprinted extracts is uncertain, Google is proceeding with extreme caution. I haven't yet checked to see whether post-1923 govdocs containing no reprinted material are restricted from full access. If this is indeed the case, it's unfortunate that negative reaction to Google Book Search has put them in such a defensive position.
FURTHER UPDATE: The embedded copyrights issue doesn't seem to apply. It appears that many other post-1923 government documents are presented only as "snippets." Examples:
The GPO Monthly Catalog for July 1934, a "pure" government document containing only government-generated content
Laws Relating to Shipping and Merchant Marine (1927)
United States Government Manual (1971-72)
and Copyright Law of the United States (date not indicated).
It looks as if Google is definitely treating all post-1923 works as being under copyright.