Rehousing is Not Processing

This post has been stewing since last July, but it builds nicely on our Extensible Processing book reviews and Maureen’s post on containers.

In supervising processing projects I’ve come across some wacky decisions made over the years. While I’ve started some projects from the beginning, a large portion of the projects I’ve supervised were already in progress, started under someone else (or multiple others).

One recurring issue I’ve noticed is how often people say processing when they mean something else, or mean only a very small part of processing. Rehousing is a common culprit, but other activities fall into this too. Here are two examples of “processing projects” that mostly aren’t processing at all:

1. Collection was processed in the 1970s. A paper finding aid exists with notes, series descriptions, and a folder (or item) level container list. Materials are in acidic boxes and acidic folders, but labeled. Actions taken during the project: Rehoused all materials into acid-free folders and acid-free boxes, including creating additional, smaller folders. Changed the series arrangement and physically re-arranged materials. Excel container list was created by retyping (retyping!) the content of the paper finding aid and adding in the newly created folders for conversion to EAD.

So many times I came across similar ongoing processing projects with a justification that the materials needed better housing. Often, other tasks got tacked on, such as redoing the series outline (even if there were no additional materials to add and no evidence that the current series outline had issues).

2. A preliminary inventory in Word exists at the folder level, using creator-created titles, for a large volume of organizational records. A finding aid exists with summary notes and links out to a PDF of the preliminary inventory. Actions taken during the project:

  • Collection was rehoused into acid-free folders, staples removed, some preservation photocopying done, oversize materials removed and rehoused separately (separation sheets completed)
  • Materials were reviewed at the item level and marked as restricted. Some redaction might have happened. Sometimes the restricted materials were removed to a new folder with the same title and marked as restricted (using separation sheets in the original folder). Sometimes the restricted materials were left in place and the whole folder was labeled restricted.
  • Excel container list was created by retyping (retyping!) the exact information on the folder (aka the exact information already in the preliminary Word list) as materials were re-foldered. Largely, the creator titles were kept with some additions. Dates for folders were added or edited. Excel list will be converted to EAD.
  • Folders were physically grouped by letter of alphabet based on the folder title. Ex: All the folders starting with “A” are physically together in “A” boxes, but not in actual alphabetical order yet. (Currently, those folders are being arranged in alphabetical order in acid-free boxes. Look for an update on how long/how expensive just this one phase takes!)

Both of these examples were large projects that occurred over many years (often with pauses due to turnover and lack of resources). Looking back, what value did we add? The collections are in more stable housing than before, and in one case we know more about restricted material. But otherwise, what have we gained for our users that we didn’t already have?

Essentially, these were called processing projects but were really rehousing and restriction review projects. They were not projects to create access to materials or to bring intellectual or physical order to the materials. After all, both collections already had a documented intellectual and physical order that should have been described in our finding aid notes (at whatever level).

What we should do instead:

  • Put resources towards creating access to materials over rehousing materials.
  • Develop a baseline housing standard that you can live with. It might be that all materials are in acid-free boxes. Or maybe it’s just that your boxes aren’t falling apart.
  • Get over the idea that all collections need to be physically arranged and rehoused during processing (or re-processing). Rehousing a collection into acid-free folders and/or acid-free boxes is not the main goal of processing. The task does not create access to collections or describe the materials. It’s housekeeping. It’s not necessary to include in a processing project.
  • Specifically state what rehousing tasks will occur in the processing plan and at what level. Justify spending processing resources on this. Don’t include it just because you’re used to including this task during processing.
  • At a repository level, prioritize by importance the materials that risk severe damage or information loss due to their current housing. Develop a specific budget/set of resources for this type of work. Tap into the resources of your preservation/conservation department when available.

When facing resistance to not including rehousing in a processing project, numbers are your friend. “Do we want to rehouse this collection that’s already pretty stable, or do we want to take those resources and create access to more collections?” is often too abstract for people. Attaching actual costs to rehousing work (labor AND supplies) can help push people who are resistant to, or nervous about, dropping rehousing to focus instead on activities that create access. Treating rehousing work as separate from processing can also help to decouple the idea that your intellectual and physical order must always match.
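
For a sense of what those numbers can look like, here is a rough back-of-the-envelope sketch. Every figure in it is an invented placeholder — plug in your own supply costs and wages — but even placeholder math makes the trade-off concrete:

# Back-of-the-envelope cost of rehousing a hypothetical 100-box collection.
# Every figure below is a made-up placeholder -- substitute your own.
boxes = 100
folders_per_box = 30
folder_cost = 0.25        # dollars per acid-free folder (placeholder)
box_cost = 4.00           # dollars per acid-free box (placeholder)
minutes_per_folder = 2    # labor to refolder and relabel (placeholder)
hourly_wage = 15.00       # dollars per hour of labor (placeholder)

supplies = boxes * box_cost + boxes * folders_per_box * folder_cost
labor_hours = boxes * folders_per_box * minutes_per_folder / 60
labor_cost = labor_hours * hourly_wage

print(f"Supplies: ${supplies:,.2f}")                          # $1,150.00
print(f"Labor: {labor_hours:.0f} hours, ${labor_cost:,.2f}")  # 100 hours, $1,500.00
print(f"Total: ${supplies + labor_cost:,.2f}")                # $2,650.00

That is a hundred hours of labor and more than a thousand dollars in supplies that could have gone toward creating baseline access to several more collections.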

On Containers

I’m here to talk about boxes. Get excited.

I’ve been spending a LOT of time lately thinking about containers — fixing them, modelling them, figuring out what they are and aren’t supposed to do. And I’ve basically come to the conclusion that as a whole, we spend too much time futzing with containers because we haven’t spent enough time figuring out what they’re for and what they do.

For instance, I wrote a blog post a couple of months ago about work we’re doing to remediate stuff that should not be happening with containers but is — barcodes being assigned to two different containers, two different container types with the same barcode/identifier information, etc. Considering the scale of our collections, the scale of these problems is mercifully slight, but these are the kinds of problems that turn into a crisis when a patron expects to find material in the box she ordered and the material simply isn’t there.

I’m also working with my colleagues here at Yale and our ArchivesSpace development vendor Hudson Molonglo to add functionality to ArchivesSpace so that it’s easier to work with containers as containers. I wrote a blog post about it on our ArchivesSpace blog. In short, we want to make it much easier to do stuff like assigning locations, assigning barcodes, indicating that container information has been exported to our ILS, etc. In order to do this, we need to know exactly how we want containers to relate to archival description and how they relate to each other.

As I’ve been doing this thinking about specific container issues, I’ve had some thoughts about containers in general. Here they are, in no particular order.

What are container numbers doing for us?

A container number is just a human-readable barcode, right? Something to uniquely identify a container? In other words, speaking in terms of the data model, isn’t this data that says something different but means the same thing? And is this possibly a point of vulnerability? At the end of the day, isn’t a container number something that we train users to care about when really they want the content they’ve identified? And don’t we have a much better system for uniquely identifying things with barcodes than we do with box numbers?

In the days when humans were writing box numbers on call slips and other humans were reading them and using that information to work out shelf locations, it made sense to ask the patron to be explicit about which containers were associated with the actual thing she wanted to see. But I think that we’ve been too good at training them (and training ourselves) to think in terms of box numbers (and, internally, locations) instead of creating systems that do all of that on the back end. Information about containers should be uniform, unadorned, and reliable, and it should interact seamlessly with data systems. Boxes should be stored wherever is best for their size and climate, and that should be tracked in a locations database that interacts with the requesting database. And the actual information should be associated seamlessly with containers.

This means that instead of writing down a call number and box number and reading a note about how materials of this type are stored on-site and materials of another type are stored off-site, let’s take a lot of human error out of this. Let’s let patrons just click on what they want to see. Then, the system says “a-ha! There are so many connections in my database! This record is in box 58704728702861, which is stored in C-29 Row 11, Bay 2, Shelf 2. I’ll send this to the queue that prints a call slip so a page can get that right away!” And instead of storing box numbers and folder numbers in the person’s “shopping cart” of what she’s seen, let’s store unique identifiers for the archival description, so that if that same record gets rehoused into box 28704728702844 and moved to a different location, the patron doesn’t have to update her citation in any scholarly work she produces. Even if the collection gets reprocessed, we could make sure that identifiers for stuff that’s truly the same persist.
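
To make that a little more concrete, here is a toy sketch of that kind of back-end resolution — hypothetical names and structures, not anything we’ve actually built — where the patron’s request carries a persistent identifier for the description and the system works out the barcode and shelf location on its own:

# Toy sketch: the patron requests a description; the system resolves the
# barcode and shelf location behind the scenes. All identifiers are hypothetical.
from dataclasses import dataclass

@dataclass
class ArchivalObject:
    id: str       # persistent identifier -- survives rehousing and reprocessing
    title: str

@dataclass
class Container:
    barcode: str  # the identifier that actually matters internally

@dataclass
class Location:
    shelf: str    # e.g. "C-29, Row 11, Bay 2, Shelf 2"

# Link tables the patron never sees; rehousing or shifting only updates these.
object_to_container = {"ao_001": Container("58704728702861")}
container_to_location = {"58704728702861": Location("C-29, Row 11, Bay 2, Shelf 2")}

def call_slip(ao: ArchivalObject) -> str:
    box = object_to_container[ao.id]
    loc = container_to_location[box.barcode]
    return f"Page container {box.barcode} from {loc.shelf} for '{ao.title}'"

print(call_slip(ArchivalObject("ao_001", "Correspondence, 1851")))

If that record moves to box 28704728702844 tomorrow, only the link tables change; the patron’s citation to the description identifier still works.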

Also, don’t tell me that box numbers do a good job of giving cues about order and scale. There are waaaaaayyyyy better ways of doing that than making people infer relationships based on how much material fits into 0.42 linear feet.

We have the concepts. Our practice needs to catch up, and our tools do too.

Darn it, Archivists’ Toolkit, you do some dumb things with containers

Archival management systems are, obviously, a huge step up from managing this kind of information in disparate documents and databases. But I think that we’re still a few years away from our systems meeting their potential. And I really think that folks who do deep thinking about archival description and standards development need to insert themselves into these conversations.

Here’s my favorite example. You know that thing where you’re doing description in AT and you want to associate a container with the records that you just described in a component? You know how it asks you what kind of an instance you want to create? That is not a thing. This is just part of the AT data model — there’s nothing like this in DACS, nothing like it in EAD. Actual archival standards are smart enough to not say very much about boxes because they’re boxes and who cares? When it exports to EAD, it serializes as @label. LABEL. The pinnacle of semantic nothingness!

This is not a thing.

Like, WHY? I can see that this could be the moment where AT is asking you “oh, hey, do you want to associate this with a physical container in a physical place or do you want to associate it with a digital object on teh interwebz?” but there’s probably a better way of doing this.

My problem with this is that it has resulted in A LOT of descriptive malpractice. Practitioners who aren’t familiar with how this serializes in EAD think that they’re describing the content (“oh yes! I’ve done the equivalent of assigning a form/genre term and declaring in a meaningful way that these are maps!”) when really they’ve put a label on the container. The container is not the stuff! If you want to describe the stuff, you do that somewhere else!

Oh my gosh, my exclamation point count is pretty high right now. I’ll see if I can pull myself together and soldier on.

Maybe we should be more explicit about container relationships.

Now, pop quiz, if you have something that is in the physical collection and has also been microfilmed, how do you indicate that?

In Archivists’ Toolkit, there’s nothing clear about this. You can associate more than one instance with an archival description, but you can also describe nested levels of containers that (ostensibly) hold the same stuff — say, a numbered item within a folder, within a box.

Anything can happen here.

So this means that in the scenario I mentioned above, it often happens that someone will put the reel number into container 3, making the database think that the reel is a child of the box.

But even if all of the data entry happens properly, EAD import into Archivists’ Toolkit will take any three <container> tags and, instead of making them siblings, bring them together into a parent-child instance relationship like the one you see above. This helps maintain relationships between boxes and folders, but it is a nightmare if you have a reel in there.

EAD has a way of representing these relationships, but the AT EAD export doesn’t really even do that properly.

 <c id="ref10" level="file">
   <did>
     <unittitle>Potter, Hannah</unittitle>
     <unitdate normal="1851/1851">1851</unitdate>
     <container id="cid342284" type="Box" label="Mixed Materials (39002038050457)">1</container>
     <container parent="cid342284" type="Folder">2</container>
   </did>
 </c>

 <c id="ref11" level="file">
   <did>
     <unittitle>Potter, Horace</unittitle>
     <unitdate normal="1824/1824">1824</unitdate>
     <container id="cid342283" type="Box" label="Mixed Materials (39002038050457)">1</container>
     <container parent="cid342283" type="Folder">3</container>
   </did>
 </c>

Here, we see that these box 1’s are the same — they have the same barcode (btw, see previous posts for help working out what to do with this crazy export and barcodes). But the container id makes it seem like these are two different things — they have two different container ids, and their folders refer to two different parents.

What we really want to say is “This box 1 is the same as the other box 1’s. It’s not the same as reel 22. Folder 2 is inside of box 1, and so is folder 3.” Once we get our systems to represent all of this, we can do much better automation, better reporting, and have a much more reliable sense of where our stuff is.
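
For what it’s worth, you can start approximating this today by treating the barcode, rather than the container id, as the identity of the box. Here is a rough sketch — not our actual remediation code, and the file name is an assumption — that parses an un-namespaced export like the one above, pulls the barcode out of each container’s @label, and groups by it, so the two box 1 records resolve to a single physical box holding both folders:

# Rough sketch: reconcile containers in an AT EAD export by barcode.
# Assumes an un-namespaced EAD file shaped like the snippet above.
import re
from lxml import etree

tree = etree.parse("at_export.xml")  # hypothetical file name
root = tree.getroot()

boxes = {}  # barcode -> {"box": box number, "folders": [folder numbers]}
for c in root.iter("c"):
    did = c.find("did")
    if did is None:
        continue
    containers = did.findall("container")
    box = next((el for el in containers if el.get("type") == "Box"), None)
    if box is None:
        continue
    match = re.search(r"\((\d+)\)", box.get("label", ""))
    barcode = match.group(1) if match else "no-barcode-box-" + (box.text or "")
    entry = boxes.setdefault(barcode, {"box": box.text, "folders": []})
    entry["folders"] += [el.text for el in containers if el.get("type") == "Folder"]

# boxes now records that barcode 39002038050457 is box 1 and holds folders 2
# and 3 -- one physical container, many descriptions pointing at it.
print(boxes)

This is essentially the direction of the container-management work I mentioned earlier: make the physical container a first-class record that many descriptions point to, rather than a string repeated in every component.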

So if we want to be able to work with our containers as they actually are, we need to represent those properly in our technology. What should we be thinking about in our descriptive practice now that we’ve de-centered the box?

“Box” is not a level of description.

In ISAD(G) (explicitly) and DACS (implicitly), archivists are required to explain the level at which they’re describing aggregations of records. There isn’t a vocabulary for this, but traditionally, these levels include “collection”, “record group”, “series”, “file” and “item.” Note that “box” is not on this list or any other reasonable person’s list. I know everyone means well, and I would never discourage someone from processing materials in aggregate, but the term “box-level processing” is like nails on a chalkboard to me. As a concept, it should not be a thing. Now, series-level processing? Consider me on board! File-group processing? Awesome, sounds good! Do you want to break those file groups out into discrete groups of records that are often surrounded by a folder and hopefully are associated with distinctive terms, like proper nouns? Sure, if you think it will help and you don’t have anything better to do.

A box is usually just an accident of administrivia. I truly believe that archivists’ value is our ability to discern and describe aggregations of records — a box is not a meaningful aggregation, and describing it as such gives a false impression of the importance of one linear foot of material. I’d really love to see a push toward better series-level or file-group-level description, and less file-level mapping, especially for organizations’ records. Often, unless someone is doing a known-item search, there’s nothing distinct enough about individual files as evidence (and remember, this is why we do processing — to provide access to and explain records that give evidence of the past) to justify sub-dividing them. I also think that this could help us think past unnecessary sorting and related housekeeping — our job isn’t to make order from chaos*, it’s to explain records and their context of creation and use. If records were created chaotically and kept in a chaotic way, are we really illuminating anything by prescribing artificial order?

This kind of thinking will be increasingly important when our records aren’t tied to physical containers.

In conclusion, let’s leave the robot work to the robots.

If I never had to translate a call number to a shelf location again, it would be too soon (actually, we don’t do that at MSSA, but still). Let’s stop making our patrons care about boxes, and let’s start making our technology work for us.


* This blog’s title, Chaos –> Order, is not about bringing order to a chaotic past — it’s about bringing order to our repositories and to our work habits. In other words, get that beam out of your own eye, sucka, before you get your alphabetization on.

 

Book Review: Extensible Processing. Case Studies and Conclusion

And we’ve come to the end. For me, the most fun part of this book is the case studies at the end. Here, everything that Dan had been talking about in previous chapters comes together and we see the concrete ways that extensible processing principles help solve big problems (huge problems, really — repositories in disarray huge, processing 2,500 feet in two years huge, giving access to huge volumes of records without violating HIPAA huge).

Instead of going through each case study, I thought I would pull out some winning strategies that helped archivists move mountains. But first the roll-call of devoted archivists taking smart approaches to their projects (I’ve tried to link to relevant materials online — really, though, read the case studies in Dan’s book).

So, what worked really well? What made it possible for these archivists to do such amazing remediation and program-building work?

  • Focus, deadlines, and scoping a project properly are the winning combination to finish a project. Giving a project a finite timeline forces participants to articulate our central values. Don’t let yourself become consumed by unimportant details.
  • Change your repository today to avoid the backlog of tomorrow — start with accessioning. A lot of what’s done as processing in these projects is what I would describe as retrospective accessioning (getting intellectual and physical control, understanding groupings of materials by creator/function, hunting for any agreements with donors that may impact access, use, or permission to dispose of materials), but with important information lost to time. Dan’s chapter on accessioning and Audra Eagle Yun’s case study on building an accessioning program make such a strong case that you’ll never know more about these materials than the moment they come through the door, so that’s the time to determine and meet a baseline level of control.
  • Re-use existing description — wherever you may find it — whenever possible. Creators know much more about their records than the rest of us ever will — Adriane’s case study made a great case for finding, recording, and re-using high-level description to help stay at a high-level understanding of the records. This means that you need to get comfortable with description as data, so that you can make it good and put it where it belongs (see the sketch after this list). Maybe some posts on this blog can help you think through that!
  • If you’re in a position of responsibility over people, processes or systems, be smart about how you spend your time. Create a ranked list of the biggest things that you could do to improve access to the records you collect. Maybe that’s working with IT to make sure that the time-consuming, nagging process that your staff has to work around gets fixed. Maybe that means filling some training gaps. Maybe this means that you stop processing on a single collection and organize a survey of your entire holdings. Maybe it’s making sure you have a better database for tracking locations. If you ever find yourself saying, “I’m just too busy to think it through,” you’re already in the danger zone — you’re implicitly admitting that the way work is being done now is probably not the best way. Put two hours on your calendar to do some deep thinking, read these case studies, consult with your colleagues, and make sure that work is being done the way that works best for everyone.
  • Principles are sacred, procedures are not. You’re here to provide authentic, reliable evidence of the past through the records people leave behind which others can access during the course of research. Make sure that every procedure in your repository exists in service to that goal. Maybe this means that instead of doing item-level review for restrictions, you figure out that it makes more sense from an access and resources perspective to do review on demand. Maybe this means that you allocate staff that used to do arrangement to doing digitization.
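
Since we have all groaned about retyping existing lists, here is one small example of what treating description as data can mean in practice — a rough sketch, with hypothetical file and column names, that turns an existing folder list into EAD file-level components instead of keying it in again:

# Rough sketch, not a production converter: turn an existing folder list
# (a hypothetical CSV with box, folder, title, and date columns) into EAD
# file-level components rather than retyping it.
import csv
from xml.sax.saxutils import escape

COMPONENT = """<c level="file">
  <did>
    <unittitle>{title}</unittitle>
    <unitdate>{date}</unitdate>
    <container type="Box">{box}</container>
    <container type="Folder">{folder}</container>
  </did>
</c>"""

with open("folder_list.csv", newline="") as f:  # hypothetical file name
    for row in csv.DictReader(f):
        print(COMPONENT.format(
            title=escape(row["title"]),
            date=escape(row["date"]),
            box=escape(row["box"]),
            folder=escape(row["folder"]),
        ))

The point is not this particular script; it is that a list which already exists in structured form never needs to be retyped.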

Like we’ve said all week, this is a great book — practical, principled, helpful, approachable, and rooted in the language and values of archivists. Anyone seeking a way to improve her own practice will find valuable advice.

Book Review: Extensible Processing. But What About…

Chapter 9 addresses questions and concerns raised about extensible processing. Dan provides responses based on archival theory, practices, projects, and goals to a wide range of topics, details how extensible processing can actually help solve the issues raised, and calls for more critical analysis (and actual change) of other archival functions. There are gem quotes/talking points in every section (I resisted listing them all!) that show why objections aren’t reasons to not pursue extensible processing. He reiterates the strengths of extensible processing and its flexible nature to accommodate many situations. Dan offers data points to gather to make decisions about additional description work for selected materials, which may also help to address some of the issues raised.

As someone who has worked at two institutions building extensible processing programs, I have heard every single one of the arguments presented in this chapter against changing how we provide access to materials (sometimes all of them in the same meeting!). To me, lots of the arguments against extensible processing techniques really come down to two fundamental experiences or beliefs:

We care about creating access to select collections, want to do it in the same ways as before, and think we can’t really do anything about the backlog without a major influx of resources (which we won’t ever have). OR: We care about creating access to the greatest number of collections possible, realize our methods have created a backlog, and are willing to try different approaches to eliminate the backlog.

Why do so many people still fall into the first category? We have it in our power to change our practices to create basic access to all our holdings. Why wouldn’t you get behind that idea?

Because you want control? Because you want your boxes to look pretty? Because you want your folders in a very specific order? Because you’re nervous about changing your daily tasks? Because you’re worried that a step/detail for one collection/series/folder/item won’t get done as it has before? Because you’re scared to make harder decisions and think more broadly?

Dan continually shows in this chapter (and the whole book) that extensible processing offers a way out. Even if you don’t happen to like the details, it gets you much closer to your goal of providing access to all your collections. A good extensible processing program will push for systemic decisions and changes in other areas. It also means being able to talk about our work differently. Consider that a common thread among the objections (regardless of the topic/specifics) is intimately tied to the archivists’ identity and professional status. Dan’s last two paragraphs are so well said:

Rather than damaging the profession, extensible processing practices have the potential to enhance the profession’s standing with researchers, donors, and resource allocators. Gains in intellectual control of collection materials, the rates at which newly donated material is made available, and the removal of barriers to access can all be used to demonstrate the value of extensible processing and of archivists themselves. Archivists should strive to stress these aspects of their work, rather than the traditional housekeeping of physical processing, boxing, and labeling.

If archivists are not refoldering, weeding, arranging, or describing the same way every time, what is left to do? Making difficult decisions and looking at the big picture, including when to stop and move on to the next collection. Looking at complex collections and recognizing the patterns and relationships between and within them. Making the high-level arrangement and appraisal decisions. Responding to users by basing processing priorities and decisions about levels of processing on information about what collections are used the most. Solving problems and being creative in finding ways to provide access to collections. All of these are incredibly valuable, and highly valued, skills for archivists who will lead the way in delivering archival material to users. [1]

I think this chapter is a must read for everyone at institutions with backlogs. It will provide those advocating for extensible processing with additional talking points and evidence. For those who may be resisting extensible processing techniques, chances are that the chapter has covered your concern and could lead to productive conversations and shared understandings with your colleagues.


[1] Santamaria, Daniel A. Extensible Processing for Archives and Special Collections: Reducing Processing Backlogs. Chicago: ALA Neal-Schuman, 2015, 139-140.

 

Extensible Processing: Who is Involved and Who Cares?

So earlier in this series Maureen looked at the chapters dealing with why repositories should implement an extensible processing program and Meghan looked at the chapters that talk about the hows of implementation. I am focusing here on who is involved in implementing and maintaining an extensible processing program. My review focuses on Chapters 6-8, sections that in one way or another assess the ways that an extensible processing program plays well with others: with the professional community and its systems (through the rigorous application of standards-based description), with repository staff and administration (through effective management of staff and advocacy to management and administrators), and with users (through seeing online digitized content as an end goal of processing).

One really important aspect of this book is that it makes a very serious case that while archival collections may all be unique, the ways that we approach them are not. The fundamentals of our work stay the same, as does the end goal of quickly and effectively serving our user communities. Extensible processing techniques are carried out in similar ways at the collection level and the repository level, and they are supported and guided by widely accepted professional standards. While some detractors of baseline processing and other extensible processing techniques claim that these approaches are incompatible with standardized archival practice, Dan moves point by point through the most relevant sections of DACS, explaining why careful adherence to standardized description, far from being incompatible with minimal processing, in fact undergirds the entire enterprise of an extensible processing program. Archival descriptive standards are specifically designed to be flexible and to accommodate a range of levels of description and local practices. If they work right, and we do our jobs, they provide a way for the entire professional community to participate in and guide the principles behind individual processing programs at individual repositories.

So this sort of processing program is firmly based in broad professional standards, but on a more localized level there are any number of people involved in arrangement and description work. Chapter 8 focuses on the repository level, and addresses how to lead, administer, and manage an extensible processing program, with a major focus on project planning and management. This section highlights one of the real strengths of the book — its concrete, realistic, and implementable advice. Santamaria walks the reader through various decision-making processes, discusses criteria for priority setting, lays out specific elements of a processing plan, discusses resource allocation and personnel decisions, and explains how and why to adhere to firm timelines. This chapter is an excellent road map for a manager interested in taking the principles throughout the book and making them a reality. The specific suggestions are supplemented by a series of appendices that provide examples of processing plans and other forms of documentation to assist archivists in codifying their practice and moving toward an extensible processing model. This is a chapter I will be coming back to and reviewing when I need to manage new projects, create buy-in from staff, and advocate for extensible processing procedures to my management and administration.

The final people affected by our arrangement and description decisions are, of course, our users. Chapter 7, Digitization and Facilitating Access to Content, investigates user expectations around digital delivery of archival content (and our remarkable failure to meet them). Dan not only calls for digitization to be an integrated aspect of archival processing work (rather than a separate program) but frames this argument, usefully and importantly, as an ethical consideration of equitable access to collection resources. He states that

Just as with processing, if our goal is to provide ‘open and equitable access’ to collections material, archivists need to use all the tools at our disposal to allow researchers of all backgrounds access, not just those who can afford to travel to our repositories and visit during the work week. [1]

He then goes on to suggest models for broad digitization and concrete ways that repositories can work digitization into workflows, work with vendors, and manage privacy and copyright issues. But, for me, the heart of the chapter is the same message that is at the heart of the book and of this processing model as a whole: the insistence on equitable access.

These three chapters clearly articulate that the adherence to standards, the focus on end-user access, and the high levels of planning and management acumen that go into an extensible processing program serve to reiterate to the archival community that minimal processing is not lazy, sloppy processing. Dan reminds us, in what I think is one of the most important lines in the book, that

In an efficient extensible processing program the intellectual work of arranging material into broad groupings takes the place of neatly ordering items in folders and folders in series [2]

As archivists we add value to collections by applying our knowledge of how people and organizations work and how to think critically about the records that they create in that process. As a community we need to use our real professional skills to assess the records that our repositories hold. Quickly and competently assessing the nature of records is difficult, skilled, high-level work; refoldering is not. We need to focus our professional skills and our repositories’ resources where it counts and where it is most likely to provide value to our various communities of stakeholders.


[1] Santamaria, Daniel A. Extensible Processing for Archives and Special Collections: Reducing Processing Backlogs. Chicago: ALA Neal-Schuman, 2015, 85

[2] Ibid., 72.

Book Review: Extensible Processing from Accessioning to Backlogs

Book Club time! I’ve really enjoyed reading Daniel Santamaria’s new book, Extensible Processing for Archives and Special Collections. I asked for the chance to review Chapters 3, 4, and 5, because I thought that they would most directly relate to the accessioning and processing that I handle in my job. Since starting to read, however, I’ve found that Extensible Processing is much more holistic than its chapter headings imply — so, my first recommendation is to just read the whole thing. There are pieces of each chapter that I have found incredibly relevant to my work, beyond the three chapters designated for processing, backlogs, and accessioning.

I really like how Dan offers practical steps for establishing an extensible processing program. In Chapter 3, he explains how processing collections should be viewed as cyclical, rather than linear; he argues that collections and their descriptions should “always be seen as works in progress” (p. 29). His approach to processing always focuses on getting the material available quickly, with the caveat that more arrangement and description can always follow later, if use demands it. Chapter 4 is tailored to addressing backlogs, and encourages institutions to use collections assessments and surveys as a means of defining and prioritizing their backlog in a structured way. Chapter 5 takes the processing guidelines from Chapter 3 and applies them to accessioning, encouraging processing at accessioning. (Dan is not a fan of archives’ traditional approach of adding accessions to the backlog.)

Of the three chapters I am reviewing for this blog, I found Chapter 5, on accessioning, to be the most thought-provoking. We implemented processing at accessioning years ago at Duke, but we constantly acquire additions to collections and find ourselves pondering the most practical approach to integrating these new accessions with their existing collections. Dan advises adding accessions to the end of the collection, in their own series; this is certainly an approach I have used, but it has always felt like something temporary. Dan suggests that I just get used to this lack of closure. (I’m fine with that idea.) Another solution we sometimes use at Duke is to interfile smaller additions within the original collection, a tactic that Dan opposes because it can take a lot of time and upsets original order (p. 61-62). My view is that interfiling one or two folders can be much easier (and more practical, in the long term) than adding folders to a random box at the end of the collection. But, even when I disagree with the logistics of Dan’s approach, I agree with his overall argument: when adopting extensible processing, accessions are no different than any other archival collection — and therefore it should be the user’s needs that drive our processing energies.

Whenever I read something well-written with lots of practical tips, I end up excited about getting back to work and just fixing everything. I found Dan’s review and application of MPLP more practical, and less perplexing (see what I did there?) than a number of other approaches I’ve read or heard about in recent years. It helps to remind ourselves that collections can always be revisited later. I appreciate his flexible approach to processing, which constantly keeps the user’s needs at the forefront. He believes in collecting and then using data, which eliminates the guessing game we often play when trying to prioritize processing demands. Furthermore, he repeatedly references the importance of buy-in from other staff and the library administration, a topic he explores further in Chapter 8. “Oooh, my colleagues would not like this” was definitely something at the front of my mind as I read his guidelines emphasizing the intellectual, rather than physical, arrangement and description of materials. I look forward to applying his suggestions as we continue to refine our processing procedures at Duke.

Book Review: Extensible Processing. Why Extensible Processing is Essential

This week, our core group of editors will review Extensible Processing for Archives and Special Collections: Reducing Processing Backlogs by Daniel A. Santamaria.


Many successful archival repositories have, for a very long time, operated in ways to make sure that their practices scale to their collections sizes, staffing resources, and user needs. But it seems that it’s only been in the last ten years, since the publication of Mark Greene and Dennis Meissner’s “More Product, Less Process: Revamping Traditional Archival Processing” and the associated cascade of conference presentations, case studies, and affiliated articles, that processing procedures as a whole have moved toward something that we can talk about, think critically about, and ultimately re-examine the purpose of.

This book provides the first comprehensive framework that I’ve seen about how to run a repository based on extensible processing principles — principles that are firmly rooted in deeply-held archival values and the logical extension of Greene and Meissner’s argument that every procedure in a library needs to be held to the scrutiny of materials’ availability for use. And, since this blog is largely about repository-wide projects (and shifting our thinking toward taking care of everything in our care instead of thinking about processing project after processing project), it seems like an excellent fit for our interests and audience.

Chapter one starts with a sobering analysis of the backlog problem. In short, backlogs are growing, staffing is flat, collecting continues, the records we collect as evidence of our creators’ lives and work are more voluminous than ever, and few of us are doing anything differently to help address the fact that patrons can’t see our collections. He pulls what I found to be a shocking statistic — according to an OCLC research survey of special collections libraries in late 2010, internet-accessible finding aids only exist for 44% of collections [1], despite the fact that it seemed like one couldn’t throw a rock at a conference between 2005 and 2010 without hitting someone having a discussion about Greene and Meissner’s article.

So, there’s obviously a problem. Despite MPLP’s very good advice that we need to be willing to look at our work differently if we want to overcome the problem of scale, it’s simply not happening in too many repositories. And here, I think, is where this book makes an important intervention in the archival literature. Santamaria provides reasoned, step-by-step advice toward building a program where patrons are better served, donors’ expectations are met, and staff aren’t constantly trying to climb out from a hole of tasks yet to be performed with no relief in sight.

Given the choice, it’s a lot more professionally satisfying to work in a place that doesn’t accept the inevitability of backlogs. I worked for Dan at Princeton from the beginning of 2011 through 2013. If you’re wondering what it’s like to work at a place with a true philosophy of access first, and where one examines, each time, what processing means for that collection (and in the context of the other work that needs to be done) and why you’re doing it that way — well, it’s a lot of fun. I had come in at a particularly exciting time — because of the smart decisions that Dan and other archivists at Mudd had made in years previous, the backlog was dead. We were able to work on projects (like the Princeton Finding Aids site), that relied on creative engagement with our description, our materials, and our users. I believe that this kind of project was only possible because Dan had already built a culture of intellectual engagement with our work, where each member of the team understood our mission and the purposes of archival description.

For anyone overwhelmed by her repository, things can be different. But relief can only come if you’re willing to take a hard look at why you do what you do. More than that, you might have to spend more time managing and planning (and less time treading water, hoping that change will come externally). Chapter two provides six principles for an extensible processing program.

  1. Create a baseline level of access to all collections material
  2. Create standardized, structured description
  3. Manage archival materials in the aggregate
  4. “Do no harm”: limit physical handling and processing
  5. Iterate: conduct further processing in a systematic but flexible way
  6. Manage processing holistically

I believe that what separates professional archivists from interested enthusiasts is a commitment to managing our time in ways that are best for researchers and collections. This book makes a compelling case for a deliberate approach, which requires that archivists make prudent decisions and hard choices every day.

Throughout this book… emphasis is placed on decision-making, prioritization, and adherence to archival principles and standards — concepts that apply to archivists at many levels and in every kind of organization. [2]

I’m convinced that we all have the capability to approach our work this way — but that 44% number doesn’t lie. We need to treat the problem of backlogs like the crisis it is. I look forward to Meghan’s review tomorrow, which will cover chapters 3-5 and discuss concrete steps any archivist can take to effectively manage processing and kill the backlog.


[1]  Santamaria 2, quoting Dooley, Jackie and Katherine Luce. “Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives.” OCLC Research, 2010. It’s interesting that according to the survey, 74% of collections would have online finding aids if analog copies were converted and made available online.

[2] Santamaria, Daniel A. Extensible Processing for Archives and Special Collections: Reducing Processing Backlogs. Chicago: ALA Neal-Schuman, 2015, X