Round-tripping EAD, updated — how to not lose those barcodes

Several weeks ago, Bonnie published a hugely-useful checklist of things to think about when trying to round-trip data from Archivists’ Toolkit to EAD and back to Archivists’ Toolkit to help minimize data loss.

Don’t lose those barcodes!

In a comment, my colleague Mark gave a very sensible suggestion for how to not lose barcodes during that trip. When you export an EAD from AT, the container information looks something like this:

<c id="ref21" level="file">
    <did>
        <unittitle>Notecards</unittitle>
        <container id="cid1063028" type="Box" label="Mixed Materials (39002042652603)">7</container>
        <unitdate normal="1930/1940" type="inclusive">circa 1935-1950</unitdate>
    </did>
</c>

What I see here is that two separate fields in the AT database have been jammed together for EAD export. “Mixed materials” lives in the ArchDescriptionInstances table in AT as instanceType. 39002042652603 lives in the ArchDescriptionInstances table as barcode. During export, these are brought together as container @label.

So, the trick is to get them back into the fields in tables where they belong. By default, AT imports container @label to ArchDescriptionInstances.instanceType. You want to get the barcode information into barcode. Here’s how.

I sometimes have bad luck getting AT to re-import if container @label=”Mixed Materials (39002042652603)” — I do a big find/replace to make that look like @label=”39002042652603″

Then, I import into AT.

Getting the barcodes where they belong is pretty easy, but it has to happen in SQL in the back-end database. If you haven’t written SQL before, you might want to run this by a database administrator, just to have another set of eyes on it. You DEFINITELY want to do this in the test or dev version of your database first.

Step 1: copy over barcodes from instanceType to barcode (this assumes 14-digit barcodes)

UPDATE ArchDescriptionInstances 
SET barcode = instanceType
WHERE instanceType REGEXP '^[0-9]{14}$';

Step 2: replace barcodes in instanceType with “Mixed materials” (this assumes that you want everything to be “Mixed Materials”. I have a rant in the works about how inappropriate it is to have this label on containers anyway, but I’ll save that for another day).

UPDATE ArchDescriptionInstances 
SET instanceType = 'Mixed materials'
WHERE instanceType REGEXP '^[0-9]{14}$';

Congratulations! You haven’t lost your barcodes! You can now delete the previous resource record and enjoy your new-and-improved one.

But that’s not all…

Extra super-secret tips for folks keeping Voyager item information in Archivists’ Toolkit.

So, where I work, this wouldn’t be enough. We have a whole other set of data in the ArchDescriptionInstances table that isn’t serialized in EAD. Container type, location code, Voyager bibid, container restriction and a Boolean of whether this information has been exported to Voyager are kept in the ArchDescriptionInstances table in user defined fields. We then have a background program that sends this information to Voyager, where holdings/item records are created.

This means that if I round-trip EAD, this item information that exists is disassociated with the descriptive information. This doesn’t work.

Note: If you’re also using the plug-in that Yale developed to keep Voyager holdings information in AT, could you please drop me a line? We’re starting to think about migration paths to ArchivesSpace.

But I had a brainwave — if I round-trip EAD with barcodes, I now have a key. The same information associated with a barcode which is now blank (what’s the Voyager bib/holdings ID? was it exported?), is present in a different record with the same barcode from when the information was entered before, before the EAD was round-tripped.

This means that no one has to re-enter the same information or re-export to Voyager. Check out my SQL update below — same warnings and caveats apply.

 

start transaction;

UPDATE schema.ArchDescriptionInstances AS ArchDesc

JOIN (
SELECT
barcode,
userDefinedString1,
userDefinedString2,
locationId,
userDefinedBoolean1,
userDefinedBoolean2
FROM schema.ArchDescriptionInstances
WHERE userDefinedString1 = "{{{bibid}}}"
) AS newdata ON newdata.barcode=ArchDesc.barcode
SET
ArchDesc.userDefinedString1 = newdata.UserDefinedString1,
ArchDesc.userDefinedString2 = newdata.UserDefinedString2,
ArchDesc.locationId = newdata.locationId,
ArchDesc.userDefinedBoolean1 = newdata.userDefinedBoolean1,
ArchDesc.userDefinedBoolean2 = newdata.userDefinedBoolean2

WHERE
ArchDesc.userDefinedString1 = ""
AND ArchDesc.userDefinedString2 = "";

commit;

3 thoughts on “Round-tripping EAD, updated — how to not lose those barcodes

  1. Maureen, thanks for the timely post. I needed to import some barcodes into AT for the first time today.

    At Duke, we’ve stored our barcodes in EAD in any number of ways over the years and I needed to import them all into AT (in preparation for migrating to ArchivesSpace).

    I wrote some XSLT to move all the barcode values to container/@label as you suggested, imported the EAD into AT, and then used Navicat to move the barcode values from the instanceType field to the barcode field in the SQL database. It worked nicely. I highly recommend Navicat if you need to work directly with the AT SQL database, but aren’t super comfortable writing SQL statements. Navicat provides a nice GUI for viewing the database tables and sorting, filtering, and batch updating cells. It then writes the corresponding SQL statements in the background and executes them for you. You can also preview your updates before committing them

    Thanks again for the great post.

  2. Pingback: Chaos —> Order | On Containers

Leave a reply to Noah Huffman Cancel reply