From my understanding of Abbie Grotke’s “Web Archiving at the Library of Congress,” the main idea behind archiving online content is “to reflect the true evolution of society, government, and culture online…[in order to] ensure that a representative sample of the web is preserved.” While this seems a simple enough notion, the actual implementation of this preservation process carries with it a number of challenges and questions.
The largest of these challenges is the question of what exactly should be preserved in order to fully realize the goal of presenting the changing ideas, values, and beliefs of contemporary society for future study. In some cases, such as Roy Rosenzweig’s case study of the online collection effort following the terrorist attacks of September 11th, 2001, the answer is obvious. Events carrying such heavy global, national, and historic significance clearly require documentation, as does any digital content associated with them. The question of what sort of content is worthy of preservation is where the proverbial waters tend to get murky. Functioning as a simultaneously public and personal space, the web serves as a platform not only for “official” responses to such high-profile events (in the form of online newspapers, web pages for government organizations, and the like), but also for the reactions of everyday individuals through social media networks, interpersonal messages, and online journals and blogs.
This combination of professional and recreational content offers historians a vast array of material from which to assemble a well-rounded snapshot of the core ideals, beliefs, and values of contemporary society. As pointed out by Jinfang Niu in “An Overview of Web Archiving,” this snapshot is not limited to what can be considered the “better” aspects of society (“literature, scientific publishing”), but is also representative of some of the “worst” (“advertising, pornography”). Such an array of material, ranging from the ephemeral to the long-lasting and from the enlightening to the unsavory, forms the basis for the challenge of deciding which content to preserve within the digital archiving system. This initial question spawns further issues, such as those of copyright, ownership, and fair use.
How far should an institution go when it comes to archiving online material? Should a certain type of content take priority for preservation over others (scholarly vs. personal, for example)? If so, does this skew the “well-rounded” snapshot of society that historians are striving to preserve?
Check out some video interviews for the Association of Research Libraries’ Code of Practices in Fair Use here. Does this Code of Practices ease some of the challenges faced in web archiving?