A very brief guide to deleting records with the ArchivesSpace API, from a total tyro

If you’ve ever used cURL before, you don’t need this.

Also, the videos and documentation that Hudson Molonglo put together are really stellar and recommended to anyone starting with this.

This guide is a true project-pad of my notes of how I did this. It might also be useful for those of us who never had formal training with scripting, but are in charge of the archival data in our repositories and appreciate power tools. Obviously, the problem with power tools is that you can cut your arm off. Use this carefully. Use in test/dev. Ask someone to check your work if you’re doing something truly crazy.

Here’s what I did

This came up for me because I had done a failed test migration (we think there’s a weird timestamp problem in the accessions table) and I wanted to delete the repository and all records in the repository in ASpace before trying again. As far as I can tell, there isn’t a great way to delete thousands of records in the user interface. So, the API seemed the way to go.

I figured this out by watching the video and reading the documentation on GitHub, and then doing a little extra googling around to learn more about curl options.

If you’re using a Mac, just fire up the terminal and get on with your life. I use a Windows PC at work, so I use Cygwin as a Unix emulator. The internet gave me good advice about how to add curl.exe.

Note: you won’t be able to do any of this unless you have admin access.

Let’s start with “Hello, World!”

$ curl 'http://test-aspace.yourenvironment.org:port/'

In this example, the url before the colon should be your ASpace instance (use test/dev!) and “port” should be your port. The response you get should basically just tell you that yes, you have communicated with this server.

Connect to the server

$ curl -F password='your password' 'http://test-aspace.yourenvironment.org:port/users/admin/login'

Here, you’re logging on as admin. The server will respond with a chunk of JSON that includes a session token. Go ahead and copy the token and make it a variable, so you don’t have to keep track of it.

$ export TOKEN=cc0984b7bfa0718bd5c831b419cb8353c7545edb63b62319a69cdd29ea5775fa
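If copying and pasting the token by hand gets old, something like the following might save a step. This is only a sketch: `extract_session` is a helper name I made up, and it assumes the login response is JSON with the token stored as a quoted string under a "session" key.

```shell
# Pull the value of the "session" key out of the login response on stdin.
# Assumes the token is a quoted string with no embedded quotes.
extract_session() {
  sed -n 's/.*"session" *: *"\([^"]*\)".*/\1/p'
}

# Hypothetical usage (not run here; host, port, and password are placeholders):
#   export TOKEN=$(curl -s -F password='your password' \
#     'http://test-aspace.yourenvironment.org:port/users/admin/login' | extract_session)
```

The -s flag just keeps curl from printing its progress meter into the pipe.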

Delete the records

Here, you definitely want to check the API documentation on GitHub. Basically, this tells you how to format the URI and the command to use. For instance, below, I wanted to delete an entire repository. I found out, though, that I couldn’t delete the repository if it had records that belonged to it. Since agents and subjects exist in ASpace without belonging to a repository, and since accessions and digital records hadn’t successfully migrated, I only needed to delete resource records.

$ curl -H "X-ArchivesSpace-Session: $TOKEN" -X "DELETE" 'http://test-aspace.yourenvironment.org:port/repositories/3/resources/[278-1693]'

So, I passed something to the header that gave my token ID, then I sent a command to delete some records. But which ones?

Let’s parse this URI. The first part is my ASpace test server, the port is my port.

The next thing to understand is that each repository, resource, accession, agent, whatever, has a numeric ID. URIs are formatted according to the record type and the ID. So, I go to repositories/3, because the resources I want to delete are in a particular repository, and that repository has the numeric ID of “3”. In order to find this out, you can look in the ASpace interface, or you can send a GET request (with your session token in the header) to yoururl/repositories, which will give you a JSON response with ID (and other) information about all of the repositories on your server.
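The JSON that comes back from yoururl/repositories is a lot to read by eye. Here is a rough sketch for pulling out just the numeric IDs. `repo_ids` is a name I made up, and it assumes each repository in the response carries a "uri" value that looks like "/repositories/3".

```shell
# Print the numeric ID of every repository found in the JSON on stdin.
# Assumes each repository record includes "uri": "/repositories/<number>".
repo_ids() {
  grep -o '"uri" *: *"/repositories/[0-9][0-9]*"' | grep -o '[0-9][0-9]*'
}

# Hypothetical usage (not run here; host and port are placeholders):
#   curl -s -H "X-ArchivesSpace-Session: $TOKEN" \
#     'http://test-aspace.yourenvironment.org:port/repositories' | repo_ids
```

You would still want to look at the full response to match each ID to the right repository name before deleting anything.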

After that, I tell curl which resource records I want to delete. There’s probably a better way, but I figured this out by sorting resources by date created, both ascending and descending, to find out what the first and last IDs are. I’d imagine, though, that if I didn’t want to look that up and I just asked for

'http://test-aspace.yourenvironment.org:port/repositories/3/resources/[1-2000]'

I would probably be okay, because it’s only deleting resource records in repository 3 and I want to get rid of all of those anyway. I’d get an error for resources that don’t exist in that repository, but it wouldn’t break anything. I had wondered if there are wildcards for curl, so that I could get ANY number after resources, but (according to some brief googling) it doesn’t look like there are. What curl does give you is numeric ranges in square brackets, like the one above, and comma-separated lists in curly braces, and it sends one request for each value.
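One possible "better way" than guessing a range: as far as I can tell from the API documentation, asking for the resources list with ?all_ids=true returns a JSON array of just the IDs, and you can loop over those. The sketch below assumes that behavior. The function names and the DRY_RUN switch are my own inventions, and it defaults to printing what it would delete rather than deleting, because power tools.

```shell
# Turn a JSON array of integers such as [278,279,280] into one ID per line.
ids_from_json() {
  grep -o '[0-9][0-9]*'
}

# Read resource IDs from stdin and delete each one from the given repository.
# Arguments: base URL (no trailing slash), numeric repository ID.
# DRY_RUN defaults to 1, which only prints the requests; set DRY_RUN=0 to send them.
delete_resources() {
  base_url=$1
  repo_id=$2
  while read -r id; do
    url="$base_url/repositories/$repo_id/resources/$id"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "DELETE $url"
    else
      curl -s -H "X-ArchivesSpace-Session: $TOKEN" -X DELETE "$url"
    fi
  done
}

# Hypothetical usage (not run here; host and port are placeholders):
#   BASE='http://test-aspace.yourenvironment.org:port'
#   curl -s -H "X-ArchivesSpace-Session: $TOKEN" \
#     "$BASE/repositories/3/resources?all_ids=true" | ids_from_json | delete_resources "$BASE" 3
```

Run it once as a dry run, read the list, and only then run it again with DRY_RUN=0 in front of the last command in the pipe.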

What does this all mean?

Uh, I don’t know? I mean, the API is obviously very powerful and amazing, and I’m glad I didn’t have to figure out a way to delete those records in the interface. But I’m really just starting to dip my toe into the potential of this. I’m sure you can look forward to more updates.
