This is a talk that I gave at the Radcliffe Workshop on Technology and Archival Processing on April 3, 2014. I hope you enjoy what I had to say. I think it dovetails nicely with the work the four of us do on this blog.
I’m very happy to be here today. As Ellen mentioned, my name is Maureen Callahan. I currently work at the Tamiment Library at New York University in a technical services role. In our context, which I know isn’t unique, almost all of our arrangement and description work is done by very new professionals or pre-professionals. This means that most of my job is teaching, coaching and supervising – and making sure that all of the workers I supervise have the infrastructure, support, and knowledge they need to meet our obligations to users and donors.
Because I work with pre-professionals, I think it’s important to be deliberate and take the time to explain the values behind archival description – what our obligations are, how to make our work transparent, what’s valuable and what isn’t, how we should be thinking about how we spend our time, and how to look at the finding aid that we’ve created from a researcher’s point of view.
When the organizers asked me to present, they included a few questions, questions that have been weighing on my mind too.
In their initial email, Ellen Shea and Mary O’Connell Murphy asked, “Is the product of a finding aid worthy of the time required to make them considering emerging technologies? Where do you think research guides might be headed in the future? How do you think they must change in order to improve access to archival collections and meet today’s user’s needs?”
Most provocatively, they asked, “What do researchers really want from finding aids? Do they want them at all?”
And I think that the answer is no. And maybe. And yes.
At its core, I think that this question gets at what is and isn’t valuable about what archivists do, and what might be good for us to pay more attention to.
So, what do finding aids do? Why do we create them?
OK, so we can start by looking at finding aids as a way to address the practical problem of giving potential researchers access to unique or rare material that can only be found in a single location, behind a locked door in closed stacks. Until you come here and show us your ID and solemnly swear that you’re going to follow our rules, the finding aid is all you get. This is the deal. So, to answer the question of whether researchers want finding aids – no. They don’t. They want the records. But they get the guide first.
And many parts of a finding aid – the parts that we spend so much of our time creating – take this imperfect surrogate role. Many finding aids are built on the model of looking at a body of records, dividing it into groupings (either physically or intellectually, usually both), and then faithfully representing files in that grouping to a mind-numbing level of meticulous detail. I’m going to call this model a map.
And what this slide, which is based on an analysis of the finding aids at the Tamiment Library, will show you is that yes, this work is getting done. We have plenty of information about what the materials tell us about their titles and dates and how much we have of it. But this slide only tells us information about finding aids that have been created. I also know that backlogs are still a problem at a lot of repositories, especially mine. This “mapping” model of tedious representation, starting at the beginning and going to the end, means that often the end never comes. We have plenty of collections that aren’t represented at all. Is this serving our users? Does this meet our donors’ expectations? Can’t we find a better way?
I’m looking forward to hearing from speakers today and tomorrow who will talk about how we can get machines to do some of this mapping for us. Because, as far as I’m concerned, good riddance. I don’t think that archivists are just secretaries for dead people, and I welcome as much automation as we can get for this kind of direct representation of what the records tell us about themselves.
Indeed, it’s already happening. At my institution, we’re just starting to work through the process of accessioning electronic records, and I can already see how tools like Forensic Toolkit help us to get electronic records to describe themselves.
After all, electronic records are records. Digital archives are archives. This is our present, future, and poorly-served past. And in the case of electronic records, we have ways of transcending the problem of our collections being singly, uniquely sited, requiring a mapping of what’s inside.
But some collections are, indeed, unique and sited. Before going on, I want to be pragmatic about the idea of scanning everything that isn’t born-digital, because those collections do require a certain degree of mapping. I think we should be scanning a lot, I think we should be scanning much more than we are, but I don’t think that we necessarily should be scanning everything. I think we should scan what the people want. The city archives of Amsterdam, which has the most complete and sophisticated scanning operation that I’ve encountered, has committed to providing researchers with what they want not by scanning everything (they estimate that it would take 406 years to do so for all 739 million pages in their holdings, even in an extremely robust production environment) but by scanning what the users want to see. After all, what if you want to see the 739 millionth scan? And in order to figure out what the people want, we need some minimal level of mapping. Not every file, not in crazy, tedious detail, but some indication of what’s in a collection.
So, we’ve dispensed with much of the map. Didn’t that feel good? What else is a finding aid? What else does the archivist do? What else do our researchers need from us?
At the next level of abstraction, a really good finding aid is a guide. In this painting by Eugene Delacroix, we see Virgil leading Dante across the river Styx. I don’t want to take this metaphor too far, but I do think that there’s a role for the archivist to help researchers understand our materials by explaining the collections, pointing out pitfalls and rich veins of content, rather than just representing titles on folders.
I can see, in some contexts, that it makes sense for an archivist to spend quality time really understanding the records and explaining this understanding so that each researcher doesn’t have to wade through it every time. When I teach description, I urge workers to evaluate rather than represent records. For instance, does a correspondence series include long, juicy, hand-written letters wherein the writer pours his heart out? Or are they dictated carbon copies based on forms? A title of “Letter from John Doe to Jane Smith” doesn’t tell us this, but an archivist’s scope and content note can. It takes a lot of time to type “Correspondence” and the date a zillion times. Wouldn’t a researcher prefer an aggregate description and date range with a nice, full note about what kinds of correspondence with what kinds of information she can expect therein? This is a choice to guide rather than map.
So here, we’re representing information about the collection that a researcher would need to spend a lot of time to discover on his own. And by the way, I’m not claiming a breakthrough. Seasoned archivists do this all the time. It’s also what Greene and Meissner were talking about in their 2005 article – our value is in our focus on the aggregate and the judgment required to make sense of records, rather than just representing them.
So to answer the original question, I would say that maybe, yes, maybe, researchers do want these kinds of finding aids where some of the sensemaking has already been done for them. The scale of archives is large, and it may indeed be inefficient to expect researchers to browse scanned document after scanned document to get a good understanding of what this all means together.
But there’s an even higher level of abstraction central to our role as archivists that should be included in our finding aids, which I rarely see documented comprehensively or well. This is the information about a collection that no amount of time with a collection will reveal to a researcher – it has to do with the archivists’ interventions into a collection, the collection’s custodial history, and the contexts of the records’ creation.
This last bit – getting to understand who created records, why they were created, and what they provide evidence of – really gets to the nature of research. These are the questions that historians and journalists and lawyers and all of the communities that use our collections ask – they don’t just see artifacts, they see evidence that can help them make a principled argument about what happened in the past. They want to know about reliability, authenticity, chain of custody, gaps, absences and silences.
This is the core work of archivists. This is what we talked about over and over again when I was in graduate school, and what has been drilled into me as the true value we, as archivists, add to the research process. We occupy a position of responsibility, of commitment to transparency and access. Researchers expect us to tell them this information, and we do a terrible job of doing so.
The above slide is based on the same corpus of finding aids at the Tamiment Library. While we did a great job of documenting what we saw before us, we did an abysmal job of explaining who gave us the collection and under what circumstances, how we changed the collection when we processed it, and what choices we made about what stays in the collection and what’s removed. And from what I can tell, it’s pretty consistent with the kinds of meta-analysis done by Dean and Wisser, and also by Bron, Proffitt, and Washburn in their recent articles analyzing EAD tag usage.
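For anyone who wants to run a similar count on their own finding aids, here is a minimal sketch of how a tag-usage tally might be done. It is not the analysis behind the slide. It assumes EAD 2002 element names (`acqinfo`, `custodhist`, `processinfo`, `appraisal` for provenance and intervention; `unittitle` and `unitdate` for the map), and the two sample finding aids, along with the names `ELEMENTS`, `elements_present` and `tally`, are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# EAD 2002 elements for provenance and archivist intervention,
# plus two "map" elements for comparison. The selection is an assumption.
ELEMENTS = ["unittitle", "unitdate", "acqinfo", "custodhist", "processinfo", "appraisal"]


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def elements_present(ead_xml):
    """Return the subset of ELEMENTS that appears at least once in one finding aid."""
    root = ET.fromstring(ead_xml)
    found = {local_name(el.tag) for el in root.iter()}
    return found & set(ELEMENTS)


def tally(finding_aids):
    """Count how many finding aids use each element at least once."""
    counts = Counter()
    for ead_xml in finding_aids:
        counts.update(elements_present(ead_xml))
    return counts


# Two invented finding aids, for illustration only.
SAMPLE = [
    "<ead><archdesc><did><unittitle>Papers</unittitle>"
    "<unitdate>1950-1970</unitdate></did>"
    "<acqinfo><p>Donated in 1992.</p></acqinfo></archdesc></ead>",
    "<ead><archdesc><did><unittitle>Records</unittitle></did></archdesc></ead>",
]

counts = tally(SAMPLE)
for name in ELEMENTS:
    print(f"{name:12} {counts[name]} of {len(SAMPLE)}")
```

Counting presence per finding aid, rather than total occurrences, keeps a single long container list from swamping the numbers. A real corpus would be read from files rather than from inline strings.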
Like I say, communicating this in the finding aid is some of the most important work we do, and we do a pretty bad job of it. I have no reason to believe that my library is unique in this.
Because I also know, when I go to describe records, especially legacy collections that have sat unprocessed for a long time, I often have to do this by guessing. I’m like an archaeologist who tries to figure out the life of these documents before they came to me based on the traces left behind. It’s what I most want to explain, and what I often have the least evidence of.
This is an area where curators — collectors — whatever you call them — can intervene, where the best of breed are invaluable. After all, we’re not doing archaeology and working with the remains of long-dead civilizations. Creators, heirs or successors are usually around — they’re the ones who packed the boxes and dropped off the materials. Let’s make sure that we sit them down and talk with them then. Let’s make sure we’re getting all of the good stuff. Let’s make sure we really understand the nature of the records before we ask the processing archivist — usually a person fairly low in the organizational hierarchy, often a new professional, and almost always the person with the least access to the creator — to labor at reconstruction when just asking the creator might reveal all.
I have one short anecdote from my own repository to help illustrate this problem. In 1992, the Tamiment Library acquired the records of the Church League of America from Liberty University in Lynchburg, Virginia. The Church League of America was a group created in the 1930s to oppose left-wing and social gospel influences in Christian thought and organizations through research and advocacy. The first iteration of the finding aid for this collection could be described as a messy map – a complicated rendering of the folder titles found in this extensive collection, without much explanation of what it all means and how it came to us.
Two years ago, before I came to Tamiment, my colleagues did a re-processing project. In doing so, they realized that these records had a rich history and diverse creators – far richer than what the finding aid had indicated. It turns out that the collection is an amalgamation of many creators’ work, including the files of the Wackenhut Corporation, which started as a private investigations firm and moved on to be a government contractor for private prisons. The organization maintained files on four million suspected dissidents, including files originally created by Karl Baarslag, a former HUAC staff member, and only donated them to the Church League of America in 1975 as a way of side-stepping the Fair Credit Reporting Act.
Until re-processing happened, researchers had an incomplete picture of the relationship between private commerce and non-profit organizations that converged to become the lobbying arm of the anti-Communist religious right.
So back to our original question. Do researchers want finding aids qua finding aids? No, maybe, yes. They want the stuff, not descriptions of the stuff. They might want some help navigating the stuff. And they absolutely want all the help that they can get with uncovering the story behind the story.
Before I turn this over to Trevor, I want to add a brief coda about how we should be thinking of finding aids as discovery tools as long as we decide to have them.
Let’s start with a reality check – how are finding aids used? What do we know about information-seeking behavior around archival resources?
The first and most important thing that we know is that discovery happens through search engines. It is true that some sophisticated researchers know what kinds of records are held at what repositories – that the Tamiment Library holds records of labor and the radical left, or that Salman Rushdie’s papers are at Emory. But I think that we can all agree that “just knowing” isn’t a good strategy to make sure that researchers discover our materials!
This was the understanding that we started with at Princeton (my previous job) when we decided to revise our finding aids portal. Previously, our finding aids looked like a lot of other finding aids – very, very long, often monograph-length webpages that give a map – and the better ones (there were many better ones there) would be a good guide as well.
Basically, we decided to surrender to Google. We hoped that by busting apart the finding aid into the components that archivists create (collections, series, files and items), and letting Google index it all, our users would be able to come directly to the content that they want to find.
This is the dream. A researcher searches Google for George Kennan’s Long Telegram, and we can give her exactly what she’s looking for, in the context of the rest of the papers. We also wanted the finding aid to be actionable – a researcher can ask a question about the material, request to see it in the reading room, and, if it has been scanned, look at images directly in the context of the finding aid.
In this case, you can see a report on Jack Ruby from Allen Dulles’s Warren Commission files.
While we’re putting so much effort into making our finding aids into structured data, let’s make our finding aids function as data. Let’s make it so that we can sort, filter, compare, comment and annotate. Why do we take our EAD, which we’ve painstakingly marked up, and render it in finding aids as flat HTML?
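To make "function as data" concrete, here is a rough sketch of what sorting and filtering could look like once the markup is treated as records instead of a page. It assumes unnumbered EAD 2002 `<c>` components with a `level` attribute and a `normal` attribute on `<unitdate>`. The sample fragment, the `flatten` function and the record fields are all invented for illustration and do not describe any particular portal.

```python
import xml.etree.ElementTree as ET

# An invented EAD-like fragment. Real finding aids are larger and often
# namespaced, which this sketch does not handle.
SAMPLE = """
<dsc>
  <c level="series">
    <did><unittitle>Correspondence</unittitle></did>
    <c level="file">
      <did><unittitle>Letters to editors</unittitle>
        <unitdate normal="1952/1960">1952-1960</unitdate></did>
    </c>
    <c level="file">
      <did><unittitle>Family letters</unittitle>
        <unitdate normal="1946/1949">1946-1949</unitdate></did>
    </c>
  </c>
  <c level="series">
    <did><unittitle>Writings</unittitle></did>
    <c level="file">
      <did><unittitle>Drafts</unittitle>
        <unitdate normal="1947/1951">1947-1951</unitdate></did>
    </c>
  </c>
</dsc>
"""


def flatten(node, path=()):
    """Yield one dict per <c> component, keeping its place in the hierarchy."""
    for c in node.findall("c"):
        did = c.find("did")  # assumes every component has a <did>
        title = did.findtext("unittitle", default="")
        date = did.find("unitdate")
        normal = date.get("normal") if date is not None else None
        # Take the first year of a "start/end" range as the sort key.
        start = int(normal.split("/")[0]) if normal else None
        yield {"level": c.get("level"), "title": title, "path": path, "start": start}
        yield from flatten(c, path + (title,))


records = list(flatten(ET.fromstring(SAMPLE.strip())))

# Filter to files, then sort by start year across series boundaries.
files = [r for r in records if r["level"] == "file"]
by_date = sorted(files, key=lambda r: r["start"])
for r in by_date:
    print(r["start"], " > ".join(r["path"] + (r["title"],)))
```

Each record carries its path of parent titles, so a file can be reordered or shown on its own without losing the context of its series.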
Let’s work together to take the next step, to think critically about the metadata we’re creating, and then make sure that it’s readable by the machines that present it to our users.