Today I had the pleasure of attending the Digital Public Library of America plenary session at the National Archives and Records Administration. It was one of those moments where you see something and you instantly know that this is going to be huge. The heaviest hitters in library science and digital access were there in full force, all of them throwing their support behind this new coordinated initiative that, if successful, will revolutionize digital access not only in the United States, but around the world. And I’m not just saying that. I really, really believe that this is going to be an utterly transformative movement in the world of internet culture.
Let me get all the name-dropping out of the way. Harvard, Stanford, The Internet Archive, Wikipedia, Public Knowledge, The National Archives and Records Administration, The Library of Congress, The Smithsonian, The National Endowment for the Humanities, The Institute for Museum and Library Services, The American Library Association, State Library of Texas, The Sloan Foundation, The Arcadia Foundation, The Gates Foundation… Carl Malamud, Brewster Kahle, Bob Darnton, Susan Hildreth, Maureen Sullivan… This thing was HUGE. Nothing on this scale has been attempted before, or was even possible before, and they brought everyone to the table, including interested parties from a similar project called Europeana, and the director of the British Library just happened to stop by. The other fascinating aspect of this was the participation of rank-and-file librarians (like myself) and library school students. The organizers are really making an effort to spread the word and reach out to get the kind of feedback that they need to develop a service that’s going to transform society.
One small and interesting detail: the entire conference was illustrated simultaneously by two different live artists. It was like watching RSA Animate live! All I could see of them was their pixie-like heads and their colored pens zooming along, but these ladies were incredible. They were able to summarize hours and hours of presentations into cool wall-sized graphics. I’ve never seen anything like it done before my eyes. I want these ladies at every meeting I ever have.
I’m going to try and reconstruct the day from my tweets. Hopefully it won’t be too mangled.
Up first there was a welcoming prologue from the National Archivist David Ferriero, who turned it over to James Leach from the NEH. Leach talked about C.P. Snow’s concept of the Two Cultures, Sciences and Humanities, and how today’s culture is merging those two fields via projects like this. His driving note was that we need to develop an “infrastructure of ideas.” This was immediately followed by a generous donation from the Alfred P. Sloan Foundation of $2.5 million toward the project, which was then followed by an equally generous matching $2.5 million from the Arcadia Foundation. Yeah. That was just announced almost randomly in front of everyone there. The speaker from Arcadia talked about how digitization projects to date have been haphazard boutique kinds of projects with a little money here and a little money there to make a small thing accessible online. His call to action was to develop the big box version of that, going from the boutique to the Wal-Mart phase. Everyone kind of gasped and chuckled.
The money bomb was followed by a report from some big players in the digitization movement here in Washington. The Library of Congress has scanned and made available 28 million of the 148 million items in its collections, and is itching to get the rest out there, much of it public domain books, with priority scanning for American History titles. IMLS was looking for projects that they could directly fund to help advance the DPLA movement. The resounding statement here was collaborate and conquer. NARA spoke about their mandate to make available a trove of some 400 million declassified documents by 2013, all of which are pending review by relevant agencies. The National Archivist wants to digitize absolutely every piece of paper in the archive and make it freely available. BOLD. In the follow-up questions it was asked if they were considering the difference between making something accessible versus making something discoverable. Ferriero made the statement that “if it’s not online it doesn’t exist.” I’ll come back to that in a minute. This was followed up by a lot of talk about massive amounts of metadata as well as accessibility for the blind and others. Lynne Brindley, the director of the British Library, stood up and mentioned that they have opened up all of their metadata under a Creative Commons Zero (CC0) license. A director from the Smithsonian also chimed in, stating that they have 137 million items that they want to make available as well, most of them natural history specimens.
Now let me take a moment to just talk about metadata. Many of you who read this blog already know what that is, but for those of you who don’t, let me try and explain it in plain English. When you go to the library and you use their online catalog to search for a book, that catalog is created from a database containing about 80-100 fields of information about that book, from the title, author, and subject to really obscure things like the height of the book, its language, its illustrator (if it has one), I could go on and on and on. Anyhow, that data is data about the properties of that book. We call that metadata. Now, books aren’t the only things that have metadata. Everything does! Pictures online have metadata, items in museums have metadata, archives are loaded with metadata. The crazy thing is that wildly different standards have arisen for different industries, and all of that unique information is often only readable by systems specifically designed to read that particular database format. That’s one of the major hurdles in a project like this that wants to combine the forces of libraries, museums, archives and user-generated content. It’s a metadata nightmare! But they are thinking about this, and in a major way. More on metadata in the beta sprints.
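For the more code-minded readers, here is a tiny sketch of what that hurdle looks like in practice. To be clear, this is my own illustration and not anything the DPLA presented: the two records, their field names, and the `crosswalk` helper are all made up, and the common field names (title, creator, language) are just borrowed loosely from the kind of simple shared vocabulary these projects tend to settle on. The idea is that each institution describes the same sort of property under a different label, so you need a mapping per source before you can search them together.

```python
# Hypothetical records: every field name here is invented for illustration.
library_record = {"245a": "Moby Dick", "100a": "Melville, Herman", "041a": "eng"}
museum_record = {"object_title": "Whale tooth scrimshaw", "maker": "Unknown", "lang": "eng"}

# One mapping per source, from its local field names to a shared vocabulary.
LIBRARY_MAP = {"245a": "title", "100a": "creator", "041a": "language"}
MUSEUM_MAP = {"object_title": "title", "maker": "creator", "lang": "language"}


def crosswalk(record, field_map):
    """Rename the fields we have a mapping for; park the rest under 'unmapped'."""
    common = {}
    unmapped = {}
    for field, value in record.items():
        if field in field_map:
            common[field_map[field]] = value
        else:
            unmapped[field] = value
    if unmapped:
        common["unmapped"] = unmapped
    return common


# After the crosswalk, both records can be searched by the same field names.
combined = [
    crosswalk(library_record, LIBRARY_MAP),
    crosswalk(museum_record, MUSEUM_MAP),
]
titles = [record["title"] for record in combined]
```

Notice that anything without a mapping gets set aside rather than thrown away. With 80-100 fields per record and a different standard in every industry, deciding what to do with that leftover pile is a big part of the nightmare.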
Bob Darnton from Harvard wrapped up that session with a very inspirational vision that this is not just a project for America, but a project that is international in scope via partnerships with similar cultural heritage projects like Europeana. It’s easy to see that coming via open metadata standards between DPLA and Europeana. In fact they plan to do a digital exhibit on the history of European migration to the United States as one of their earliest partnership projects.
The next panel consisted of many of the visionary people behind the DPLA movement. The first was John Palfrey from Harvard. His vision of this system was not one unique repository, but rather an access point that coordinated online access to the digital treasures that are the purview of local institutions. He reinforced that the metadata itself needed to be open to everyone, and that the code that powers the DPLA be made available for local customization projects, like a SourceForge for Libraries. He concluded with a hilarious idea about creating “scannebagos” to go out to different little towns and scan their documents and get them online. Peggy Rudd from State Library of Texas pushed the idea of making the DPLA so resourceful that it would itself spawn a verb, à la Googling, viz. DPLAing. Doesn’t have the same ring, but I like this vision of saying “I’m going to check The Library for it.” Brewster Kahle spoke about three simple ideas to build a digital America. The thing is we are already living in the digital America, and the services that we create today are what is going to drive the future of digital access online. His three points were to make everything in the public domain freely available, make orphaned works available to lend, and buy digital copies of new works and lend them. Straightforward, and covers everything. Amanda French from the Center for History and New Media gave the most poetic speech about the vision of the DPLA. She began by reading an aubade by John Donne, and talked about how we are clinging to our love of books as the sun is rising on a digital era. Her conclusion was that we need to find the balance between the digital products that we absolutely need and the necessity of the physical space of the library, and that this would lead us to the gleeful rendezvous with the soul of the library. Carl Malamud was the final speaker and his was a call to action. He sounded a rallying cry to create a new public works program of digitizing our nation’s heritage. “Deploy the Internet Corps of Engineers!” It was astounding.
It was in this last panel’s question and answer session that we revisited the sentiment “if it’s not online, it doesn’t exist.” Several other people, Kahle and Malamud I believe, echoed that sentiment. An audience member questioned this, asking “doesn’t this denigrate the physical work? Won’t people decide to not go to that museum, if they’ve already seen the entire collection online?” Amanda French chimed in and restated it: “If it’s not online people don’t know it exists.” Making content freely available increases its value by exposing and promoting it. How would anyone know if a museum in Iowa has a Caravaggio painting? Perhaps, knowing that, a person may plan a trip to Des Moines, thus increasing tourism through open access.
It was at this point that we went to lunch. I had a great conversation with some lawyers from Public Knowledge and Berkeley about the HathiTrust / Authors Guild lawsuit and the ridiculousness of it. It was great and the food was awesome.
I’m going to take a break in the narrative here and post the second half of the day with all of the technical details and visionary work as well as my questions and dreams in the next post.