Archival Description for Web Archives

If you follow me on Twitter, you may have seen that the task I set out for myself this week was to devise a way to describe web archives using the tools available to me: Archivists’ Toolkit, Archive-It, DACS and EAD. My goals were both practical and philosophical: to create useful description, but also to bring archival principles to bear on the practice of web archiving in a way that is sometimes absent in discussions on the topic. And you may have seen that I was less than entirely successful.

Appropriate to the scope of my goals, the problems I encountered were also both practical and philosophical in nature:

  • I was simply dissatisfied with the options that my tools offered for recording information about web archives. There were a lot of “yeah, it kind of makes sense to put it in that field, but it could also go over here, and neither is a perfect fit” moments that I’m sure anyone doing this work has encountered. A Web Archiving Roundtable/TS-DACS white paper recommending best practices in this area would be fantastic, and may become reality.
  • More fundamentally, though, I came to understand that the units of arrangement, description and access typically used in web archives simply don’t map well onto traditional archival units of arrangement and description, particularly if one is concerned with preserving information about the creation of the archive itself, i.e., provenance.

Records Management for Discards

Maybe this is a familiar problem for some other archivists. You have a collection that you’ve just finished processing (maybe it’s a new acquisition, or maybe it’s been sitting around for a while), and you have some boxes of weeded papers left over, waiting to be discarded. But for some reason, usually one that falls outside your job purview, you are not able to discard them. Maybe the gift agreement insists that all discards be returned to the donor, and you can’t track down the donor without inviting another accession, and you just don’t have time or space for that right now. Maybe your library is about to renovate and move, and your curators are preoccupied with trying to install 10 exhibitions simultaneously. Maybe the acquisition was a high-value gift, for which the donor took a generous tax deduction, and your library is legally obligated to keep all parts of the gift for at least three years. Maybe your donor has vanished, the gift agreement is non-existent, or the discards are actually supposed to go to another institution and that institution isn’t ready to pay for them. The reasons don’t matter, really. You have boxes of archival material and you need to track them, but they aren’t a part of your archival collection any more. How do you manage these materials until the glorious day when you are actually able to discard them?

We’ve struggled with this at Duke for a long time, but it became a more pressing issue during our recent renovation and relocation. Boxes of discards couldn’t just sit in a corner of the stacks anymore; we had to send them to offsite storage, which meant they needed to be barcoded and tracked through our online catalog. We ended up attaching them to the collection record, which was not ideal. Because the rest of the collection was processed and available, we could not suppress the discard items from the public view of the catalog. (“Discards Box 1” is not a pretty thing for our patrons to see.) Plus, it was too easy to attach them to the collection and then forget about the boxes, since they were out of sight in offsite storage. There was no easy way to regularly collect the discard items from across all our collections for curators to review. It was messy and hard to use, and the items were never going to actually be discarded! This was no good.

I ended up making a Discards 2015 Collection, which is suppressed in the catalog and therefore not discoverable by patrons. All materials identified for discard in 2015 will be attached to this record. I also made an internal resource record in Archivists’ Toolkit (soon to be migrated to ArchivesSpace) that has one series for each collection whose discards we are tracking for the year. Each series is linked to the AT accession records, where possible. In the resource record’s series descriptions, I record the details about the discards: what is being discarded, who processed it, who reviewed it, why we haven’t been able to discard it immediately, and when we expect to be able to discard the material (if known). The Discard Collection’s boxes are numbered, barcoded, and sent to offsite storage completely separated from their original collection, as they should be. No co-mingling, physically or intellectually! Plus, all our discards are tracked together, so from now on, I can remind our curators and other relevant parties at regular intervals about the boxes sitting offsite that need to be returned, shredded, sold, or whatever.
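For anyone who wants to keep a similar list outside of Archivists’ Toolkit, here is a rough sketch of the same structure in Python. It is not how AT or ArchivesSpace store anything; the class names, field names, and the due_for_review helper are all made up for illustration, and they simply mirror the details recorded in each series description.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class DiscardSeries:
    """One series in the internal discards record: one source collection.

    Hypothetical field names; they mirror the details listed in the post.
    """
    source_collection: str
    what: str                                # what is being discarded
    processed_by: str
    reviewed_by: str
    hold_reason: str                         # why it can't be discarded yet
    expected_discard: Optional[date] = None  # None when not yet known
    box_barcodes: List[str] = field(default_factory=list)
    accession_id: Optional[str] = None       # link to accession, if any


@dataclass
class DiscardCollection:
    """The suppressed, per-year record that all discard boxes attach to."""
    year: int
    series: List[DiscardSeries] = field(default_factory=list)

    def due_for_review(self, today: date) -> List[DiscardSeries]:
        """Series to raise with curators: the expected discard date has
        arrived, or no date was ever recorded."""
        return [
            s for s in self.series
            if s.expected_discard is None or s.expected_discard <= today
        ]
```

Calling due_for_review with today’s date at each reminder interval would give the list of series to bring to curators. Treating an unknown date as always due is a design choice of this sketch, on the theory that undated boxes are the ones most likely to be forgotten.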

I’d love to hear other approaches to discards — this is a new strategy for us, so maybe I’ve missed something obvious that your institution has already solved. Let me know in the comments. Happy weeding, everyone!