Converting Preliminary Inventories to Tables with Macros: Folder Titles and Numbers

In my last post, I talked about how we have used macros at Maryland to start converting preliminary inventories into more useful formats to prepare for eventual importation to ArchivesSpace and create EAD finding aids. Using another macro, I have replaced folder title rows containing “(2f)” or “(3f)” with multiple rows. Using these two macros together, each single line in our tables will represent a single folder. To complete the process, I used a formula that quickly creates folder numbers. Folder numbers will allow you to sort the spreadsheet data while preserving the physical order and location of each folder.

In a Word document, the 2f or 3f shorthand might help preserve space in an inventory or finding aid, or it might seem more logical or aesthetically pleasing. But in a table, we want an accurate representation of each unit (the folder). In the Athletic Media Relations example that I discussed in my last post, there were 67 rows that contained either 2f, 3f,or 4f:

Untitled-2

Copying these lines one by one would have taken an excessive amount of time. Instead of copying a folder title over and over again, I automated the task by using this macro:

Sub two_f()
' Sub and End Sub are commands that say the macro will start and end
'Do is a command that marks the beginning of a loop
Do
Columns("B:B").EntireColumn.Select
'Search the B column for 2f instances
Selection.Find(What:=", (2f)", After:=ActiveCell, LookIn:=xlFormulas, _
LookAt:=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, _
MatchCase:=False, SearchFormat:=False).Activate
'Create a new line and copy down
ActiveCell.Offset(1, 0).Rows("1:1").EntireRow.Select
Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
Selection.FillDown
'replace the “2f” from the line you copied and line you pasted
ActiveCell.Offset(-1, 1).Range("A1:A2").Select
Selection.Replace What:=", (2f)", Replacement:="", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
Loop Until Cells.findnext is nothing
Range("A1").Select
MsgBox "Conversion complete!"
End Sub

This macro can be adjusted for instances of “(3f)”, “(4f)”, etc. The following code shows the appropriate adjustments:

Sub three_f()
' Sub and End Sub are commands that say the macro will start and end
'Do is a command that marks the beginning of a loop
Do
Columns("B:B").EntireColumn.Select
'Search the B column for 3f instances
Selection.Find(What:=", (3f)", After:=ActiveCell, LookIn:=xlFormulas, _
LookAt:=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, _
MatchCase:=False, SearchFormat:=False).Activate
'Create two new lines and copy down each time
ActiveCell.Offset(1, 0).Rows("1:1").EntireRow.Select
Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
Selection.FillDown
ActiveCell.Offset(1, 0).Rows("1:1").EntireRow.Select
Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
Selection.FillDown
'replace the “3f” from the line you copied and two lines you pasted
ActiveCell.Offset(-2, 1).Range("A1:A3").Select
Selection.Replace What:=", (3f)", Replacement:="", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
Loop Until Cells.findnext is nothing
Range("A1").Select
MsgBox "Conversion complete!"
End Sub

The code could also be adjusted accordingly to accommodate instances of 4f, 5f, etc. The macros will make the following adjustments to our example:

Untitled-3

Within a few seconds, the macro created about 200 lines in this Athletic Media Relations spreadsheet.

As with any macro that you might copy and paste into your VBA Project Window, save frequently and make sure this fits your needs exactly to prevent any unintended results. For example, the sample code above searches for and replaces “(3f)”, but your inventories might not use parentheses.

Now that each line in this table represents a single folder in this collection, we should create a folder number to preserve the order inside each box.

Formulas have been used elsewhere on this blog. Like macros, formulas are extremely powerful for data cleanup and this one will hopefully give you some inspiration. As Cassie has written earlier, the IF formula can be very useful for testing a condition of your choice for a column of data. It can also be used for creating new data. In this case, I want to create a folder number.

This formula uses the same If/Then/Else logic as Cassie’s previous example. The =IF command requires three statements separated by commas. The first (“If”) is the true or false condition you are testing. The second (“Then”) is the result you want if the condition has been met; the third (“Else”) is the result that you want if the condition was not met.

In this formula, we want to see if a folder is in the same box as the previous folder. If it is, we want to continue a sequence of folder numbers (1 to 2, 2 to 3, etc.). If this folder is not in the same box, we want to reset the sequence of folder numbers and start over with the number one (“1”).

In the Excel IF formula, this logical expression looks like this:

=IF(A2=A1,B1+1,1)

The one problem here is that this formula will not work for your very first folder, since it has no previous line to refer back to. To resolve this, just put the number “1” as your first folder number and put the formula in the cell for the second folder number.

Untitled-4

After you hit enter, fill down and don’t forget to copy the cells, and Paste Special your values!

As with macros, help with formulas is easy to find through Google, especially on sites such as MrExcel.com or StackOverflow.

Advertisements

Converting Preliminary Inventories to Tables with Macros: Moving Box Numbers

At University of Maryland we have lots of folder- (or item-) level inventories for processed, partially processed, or unprocessed collections. Preliminary inventories were mostly created using Microsoft Word, which is meant to create written works on printed pages, not the tabular data that we need to operate between systems. Complicating matters further, these files were each structured quite differently. My challenge this summer was to bring some of these Word documents into a standardized table format. Getting this data into Excel is a first step before we can use it in an EAD finding aid/ArchivesSpace. This has required quite a bit of cleanup and conversion. By using macros and formulas, we can make the conversion process a little easier.

I started the conversion process using many of the same regular expressions that Bonnie has described here on the blog. I especially looked out for tabs and line breaks, which you can locate in Word by searching for ^p or ^t (as well as ^13 and ^9, if you are using wildcards). To convert the inventory into a table, I had to make sure that each folder title was separated by one line break, with no line breaks in between. This would allow me to copy and paste the contents of the Word document into Excel and create a clean table.

Cleaning up a Word document with regular expressions was only part of the conversion process. I was still left with a number of redundant tasks moving around standardized data in Excel. First among them was to eliminate lines reporting a box number and move that number into a separate column. This is where I started introducing macros into our conversion work.

A Microsoft Office macro is essentially a process that automatically executes a task or series of tasks. You can run one by clicking an option on a list or even pressing a custom-made hotkey. Microsoft Office macros use Visual Basic for Applications (VBA) programming language. VBA is intended to be a simple, forgiving language, but it still takes some time to learn. I will point out a few simple steps you can take to get started with macros and even how to customize each code.

The simplest way to create a macro is to record it. Microsoft Word and Excel can “record” your actions by translating your activity in the program into a written code. You can then save these actions and re-create them later by “playing” the macro.

To record a macro, go to the View tab, click on the Macros list, and click “Record Macro.” When you are finished recording, go back to the Macros list and click “Stop Recording.” To replay the macro (re-run the same operations you recorded) or edit the source code, go to the Macro list and click View Macro, then “Run” or “Edit”.

In some instances, you may already have the code for a macro but do not have it stored into your copy of Microsoft Excel. To insert the pre-written code into Excel, type Alt+F11 to open the VBA Project Window, then Insert -> Module. Paste your code into the new window and press the Play button.

The example that I will use is from one of our Athletic Media Relations accessions. If I were to move the words “Box 5” manually, I would perform a series of steps: search for “Box 5”, copy the cell contents, delete the row, find an adjacent empty cell to paste the value “5”, and fill “5” down. I wanted to turn this:

Excel Sample Before Macro Conversion

into this:

Untitled-2

This particular inventory only had thirteen boxes, and could be converted in a minute or two manually, but that could take a very long time if you have a preliminary inventory with hundreds of boxes. So far I have applied the macro to about thirty inventories comprising several hundred boxes and saved hours of work. I used the following code to automate the process:

Sub Macro1()
' Sub and End Sub are commands that say the macro will start and end
'Create a new A column for the box number
Columns("A:A").Select
Selection.Insert Shift:=xlToRight, CopyOrigin:=xlFormatFromLeftOrAbove
'Take the value Box 1 from Cell B1 to Cell A1
ActiveCell.Offset(0, 1).Range("A1").Select
Selection.Cut
ActiveCell.Offset(0, -1).Range("A1").Select
ActiveSheet.Paste
'Fill the box number down
Selection.Copy
Range(Selection, Selection.End(xlDown)).Select
ActiveSheet.Paste
'Delete the row that initially had Box 1 in it
Rows(1).Delete
Columns("B:B").Select
'Do is a command that marks the beginning of a loop
Do
'Selection.Find searches within the selection (the B column) for a cell with a box number
Selection.Find(What:="box *", After:=ActiveCell, LookIn:=xlFormulas, _
LookAt:=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, _
MatchCase:=False, SearchFormat:=False).Activate
ActiveCell.Select
Selection.Cut
ActiveCell.Offset(0, -1).Select
ActiveSheet.Paste
'Take out the word box in the A column so it is just a number
Columns(1).Cells.Replace What:="Box ", Replacement:="", LookAt:=xlPart, _
SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
ReplaceFormat:=False
Selection.Copy
Range(Selection, Selection.End(xlDown)).Select
ActiveSheet.Paste
ActiveCell.Rows("1:1").EntireRow.Select
Application.CutCopyMode = False
Selection.Delete Shift:=xlUp
Columns("B:B").Select
'This statement causes the commands to repeat until there are no more cells with the word Box in it
Loop Until Cells.FindNext Is Nothing
'These last five lines clean up the leftover box number from the last fill down
Selection.End(xlDown).Select
ActiveCell.Offset(1, -1).Select
Range(Selection, Selection.End(xlDown)).Select
Selection.ClearContents
Range("A1").Select
'End Sub is a required command that ends the macro
End Sub

Before you press “play,” a few words of warning. First, make sure you save a backup copy. You cannot undo the changes that are made through a macro! Any existing VBA code may need minor changes and tweaks to work with a new document, so be careful to review your code and make sure it fits your needs. Also, check to see if the macro will produce any unintended results when it runs.

The above sample macro identifies the “Box x” field by searching for the word “box” followed by a space and more characters. So make sure that there are no folder titles with the word “box” in it! (For example, in the sample screenshots above, “Box Scores” would need to be changed to read “Box-Scores”.)

For additional macro and VBA tips and workarounds, search Google for using “Excel”, “macro” and “VBA” as search terms. MrExcel.com and StackOverflow are especially useful sites.

In the next blog post, I will introduce a few other Excel macros and functions we have used at Maryland to continue this conversion process.