When it comes to legacy data, I have a whole lot of “legacy” to contend with; the Rare Book & Manuscript Library was officially established in 1930, though our collections date back to the founding of the University in 1754. There are 12 different units within the library, and, for a long time, collection management was practiced at the unit rather than the department level, with different curatorial units keeping their own collection lists and donor files. This all leaves me with a LOT of non-standardized data to wrangle.
This legacy data is scattered across various spreadsheets, shelf lists, catalog records, accession registers, and accessions databases (one homegrown, the other Archivists’ Toolkit). This means that, as much information as we have about our collections, it is all dispersed, and I’ve never really been able to answer questions like “How big is your collection?”, “How much of that is processed?”, or “How many manuscript collections does the RBML hold?” I know that’s not super unusual, but it can be frustrating; especially because we do know… sort of… I mean, let me get together a bunch of spreadsheets… and then count lines… and most of our collections are represented on at least one of those… well, except the ones that aren’t… or are there multiple times… and the binders, don’t forget the binders! I think a lot of you know what I mean. We HAVE collection data, we just don’t have it stored or structured in a way that can be used or queried particularly effectively, and what we do have is in no way standardized.
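To make the “gather a bunch of spreadsheets and count lines” problem concrete, here is a minimal sketch (using pandas) of what consolidating two unit-level lists might look like. The unit names, column names, and values below are invented for illustration; they are not our actual data, and real title matching is messier than a lowercase comparison.

```python
import pandas as pd

# Hypothetical stand-ins for two scattered unit-level collection lists;
# every name and value here is invented for illustration.
unit_a = pd.DataFrame({
    "collection": ["Smith Family Papers", "Jones Papers"],
    "extent_feet": [10.0, 2.5],
})
unit_b = pd.DataFrame({
    "collection": ["smith family papers", "Brown Records"],  # same Smith collection, different casing
    "extent_feet": [10.0, 7.0],
})

# Stack the lists, then normalize titles so a collection that appears
# on more than one unit's list is only counted once.
combined = pd.concat([unit_a, unit_b], ignore_index=True)
combined["key"] = combined["collection"].str.strip().str.lower()
deduped = combined.drop_duplicates(subset="key")

print(len(deduped))                   # distinct collections → 3
print(deduped["extent_feet"].sum())   # total extent in linear feet → 19.5
```

Even a rough pass like this answers “how big is the collection?” better than counting spreadsheet rows by hand, which is part of why getting everything into one queryable system matters.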
Getting intellectual control over the collection has been an ongoing project of mine since I started my position, but it has ramped up as we have begun thinking more seriously about a transition to ArchivesSpace. If we are going to make a significant commitment to a new collection management system, the first step is to be able to populate that system with reliable, complete, and standardized information about our collections. This has led me to spend a significant chunk of my time over the last year either updating our AT accession records to make sure that they are compliant with DACS’s requirements for single-level minimal records, or adding collection information from other sources into AT so that we have collection-level information about every collection in one place. I chose minimal rather than optimal records since we are not using AT to generate finding aids or to hold the bulk of our descriptive metadata or any component metadata (more on that later!). My goal here is to get baseline intellectual control over our collection, so I am keeping the records lean. I am, however, adding both processing status and physical location on top of the DACS-required elements so that I can more easily determine the size of our backlog, set processing priorities, and encourage myself and my colleagues to get used to looking in one central place for all key collection information. More about some of the strategies I’m using to make all of this happen in upcoming posts!
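For readers who like to see the shape of a “lean” record, here is a sketch of one as a simple data structure. The DACS single-level minimum elements are paraphrased from my reading of the standard (check DACS itself for the authoritative list), the field names are my own shorthand rather than AT or ArchivesSpace schema names, and the sample values are invented.

```python
from dataclasses import dataclass, asdict

@dataclass
class MinimalRecord:
    """A lean collection-level record: DACS single-level minimum
    elements (paraphrased) plus two local additions."""
    reference_code: str        # DACS 2.1
    repository: str            # DACS 2.2, name and location of repository
    title: str                 # DACS 2.3
    date: str                  # DACS 2.4
    extent: str                # DACS 2.5
    creator: str               # DACS 2.6, if known
    scope_and_content: str     # DACS 3.1
    access_conditions: str     # DACS 4.1
    languages: str             # DACS 4.5
    # Local additions beyond the DACS minimum:
    processing_status: str     # e.g. "unprocessed", "processed"
    location: str              # physical/shelf location

# An invented example record.
record = MinimalRecord(
    reference_code="MS#0001",
    repository="Rare Book & Manuscript Library",
    title="Example Family Papers",
    date="1850-1900",
    extent="4 linear feet",
    creator="Example, Jane",
    scope_and_content="Correspondence and diaries.",
    access_conditions="Open for research.",
    languages="English",
    processing_status="unprocessed",
    location="Offsite",
)

print(asdict(record)["processing_status"])  # → unprocessed
```

Keeping `processing_status` and `location` right alongside the minimum description is what makes it possible to size the backlog and set priorities from one place instead of chasing binders.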