Now that we’ve eliminated most of our duplicate bulk dates, let’s take a look at the plethora of date formats in our accession records. Does your repository have local definitions for how to record collection dates? My guess is most would have something somewhere, even if not a formal written policy. We have a small section of our processing manual detailing preferred date formats and unacceptable date formats. It is supposed to apply to collection dates in accession records and in finding aids. Do people follow it? Sometimes. Usually not. Or they follow some of it, but not all of it. We also have dates created before the processing manual existed. The samples below are just from a portion of our accession records, so we might have additional formats yet to be discovered, but you’ll get the idea.
Our date fields could contain three elements: year only, month and year, or month, day, and year. The type might be a single date, multiple single dates, range, multiple ranges, or a combination of these (although that isn’t specified). For dates in accession records I have already gone ahead and removed any variation of the word “circa”. There’s also a healthy amount of “unknown” and “undated” speckled throughout.
| Element, type | unitdateinclusive (Beast field) |
|---|---|
| Year, multiple single | 1913, 1963 |
| | 1945 or 1946 |
| | 1953, 1961, 1969, 1994 |
| | 1954, 1956, 1966-1967, 1971 |
| | 1958, 1960, 1962 |
| | 1966, 1967, 1969 |
| | 1967, 1968, 1969 |
| | 1995, 2000, undated |
| | 1921-1981 and undated |
| | 2000-2001 (FY 2001) |
| | late 1980s-early 1990s |
| Year, multiple range | 1920s, 1969-1975 |
| Year, single and range | 1928; 1938-1962 and undated |
| Month Year, single | November 1962 |
| Month Year, range | January 1977- November 1981 |
| | Otober 1920-Marh 1921 |
| Month, Day, Year, single | 11/9/1911 |
| | June 14, 1924 |
| | Marh 8, 2006 |
| | Otober 26, 1963 |
| Month, Day, Year, multiple single | 12/19/2005; 4/4/2006 |
| | January 5, 2000, July 12, 2000 |
| | 9/19 & 9/20/2007 |
| Month, Day, Year, range | 10/24-10/26/2008 |
| | January 30, 2011-February 2, 2011 |
| | Marh 22-24, 2001 |
| | Otober 13, 1987-Deember 7, 1987 |
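
If you want a rough tally of how many values fall under each element (year only, month and year, or month, day, and year), a small Python sketch along these lines might help. This is not what we actually ran: the function name `classify_element` and the patterns are assumptions for illustration, and the month pattern deliberately tolerates the missing "c" seen in the samples above.

```python
import re

# Month names; the optional "c" tolerates misspellings such as "Otober",
# "Marh", and "Deember". This list is an assumption based on the samples.
MONTH = (r"(?:January|February|Marc?h|April|May|June|July|August|"
         r"September|Oc?tober|November|Dec?ember)")

# Slash-style dates such as 11/9/1911, or the bare 9/19 in "9/19 & 9/20/2007".
NUMERIC_DAY = re.compile(r"\b\d{1,2}/\d{1,2}(?:/\d{4})?\b")
# A month name followed by a one- or two-digit day, e.g. "June 14" or "Marh 22".
NAMED_DAY = re.compile(rf"\b{MONTH}\s+\d{{1,2}}\b", re.IGNORECASE)
# A month name followed directly by a four-digit year, e.g. "November 1962".
MONTH_YEAR = re.compile(rf"\b{MONTH}\s+\d{{4}}\b", re.IGNORECASE)
# A four-digit year or a decade such as "1920s".
YEAR = re.compile(r"\b\d{4}s?\b")


def classify_element(value: str) -> str:
    """Return the most specific element found in a date string."""
    if NUMERIC_DAY.search(value) or NAMED_DAY.search(value):
        return "month, day, year"
    if MONTH_YEAR.search(value):
        return "month, year"
    if YEAR.search(value):
        return "year"
    return "unknown"  # e.g. "undated" or "unknown"


if __name__ == "__main__":
    for sample in ["1913, 1963", "Otober 1920-Marh 1921", "9/19 & 9/20/2007"]:
        print(sample, "->", classify_element(sample))
```

It only identifies the element, not the type (single, range, multiple), and anything it labels "unknown" still needs a human to look at it.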
Here’s a summary of the issues:
- Punctuation is not standard. Multiple dates may be separated with a period, comma, semi-colon, ampersand, or the word “and”.
- We used a variety of methods to convey we were unsure of the date, such as ?, (?), [ ], [?], (approx.) in addition to all the circa variations. I’m guessing there are other dates we weren’t sure of, but we didn’t specify that.
- Spacing isn’t consistent. Sometimes there are no spaces around punctuation, other times one, two, or more spaces.
- Spelling. Sometimes we just couldn’t spell October or March (the most popular offenders, apparently).
- Formats are all over the place, even comparing the same element and type. Ex: March 22-24, 2001 compared to March 22, 2001-March 24, 2001.
- Use of decades was a common practice.
- Providing single dates instead of ranges. Do we really need to say “1966, 1967, 1969” instead of “1966-1969” if we’re only missing 1968?
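A couple of these issues, spacing and uncertainty markers in particular, are mechanical enough to sketch in code. The Python below is only an illustration under my own assumptions, not the rules we settled on: `tidy_spacing` and `flag_uncertain` are hypothetical names, the choice of "no spaces around hyphens, one space after commas and semi-colons" is an assumed target format, and I'm assuming square brackets wrap a date we weren't sure of.

```python
import re

# Markers that suggest an uncertain date: ?, (?), [?], [ ... ], (approx.).
# Treating any bracketed text as "uncertain" is an assumption.
UNCERTAIN = re.compile(r"\?|\[[^\]]*\]|\(approx\.?\)", re.IGNORECASE)


def flag_uncertain(value: str) -> bool:
    """Return True if the date string carries an uncertainty marker."""
    return bool(UNCERTAIN.search(value))


def tidy_spacing(value: str) -> str:
    """Normalize whitespace around hyphens, commas, and semi-colons."""
    text = re.sub(r"\s+", " ", value).strip()         # collapse runs of spaces
    text = re.sub(r"\s*-\s*", "-", text)              # no spaces around hyphens
    text = re.sub(r"\s*([,;])\s*", r"\1 ", text)      # one space after , and ;
    return text.strip()


if __name__ == "__main__":
    print(tidy_spacing("January 1977- November 1981"))
    print(tidy_spacing("1966 ,1967,   1969"))
    print(flag_uncertain("1945 (?)"))
```

Neither function touches the harder questions in the list, such as whether “1966, 1967, 1969” should become a range. Those are judgment calls rather than find-and-replace jobs.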
Next post we’ll talk about the instructions and rules we’re developing for cleaning this up and how we go about executing those decisions.