Friday, September 30, 2011

EAD and Microsoft Access

Our collection database and database of record, KIDB, is an Access database. We also have most of our folder lists in an Access database. This means that virtually all the metadata needed to construct EAD guides for our collections already exists in Access. So instead of having a collection cataloged in MARC, extracting EAD from the MARC record using Terry Reese’s MarcEdit, tagging the container list with Ead McTaggart, merging the two resulting documents, and finally doing a fair amount of editing, I wanted push-button EAD.

I am not including all the code for doing this here. There is just too much of it; in reality it is most of the KIDB database. I will include the code that actually writes the front matter portion of the EAD document. You can find the code for the container list in an earlier post about EAD McTaggart. First, a little bit about how to get ready to push the EAD button.

When I started this, KIDB already had many of the elements needed to create an EAD guide: the collection title, creator, collection number, extent, etc. Some of the other material is pretty much boilerplate: contact information, repository, restrictions, citation, etc. What was missing were the descriptive elements like biography, abstract, organizational history, related collections, and subjects. Step one was to build into KIDB a way to hold those elements and keep track of them. I did this with four tables. One, tblFrontMatter, has fields for the collection ID number, abstract, scope and content note, biography, organizational history, related collections, and subjects. A second, tblEntities, has fields for contact information, user information, and repository. The third keeps track of who added what to tblFrontMatter for which collection and when. The fourth records who changed any of the data in tblFrontMatter and when.
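
As a rough sketch, the four tables could be created with Access DDL along these lines. Only tblFrontMatter and tblEntities are names that come from KIDB. The two log table names and every field name below are assumptions made for illustration, not the actual KIDB schema.

    Public Sub CreateFrontMatterTables()
        ' Hypothetical schema: field names and the two log table names are assumed.
        ' Running this twice raises an error because the tables already exist.
        Dim db As DAO.Database
        Set db = CurrentDb

        ' Descriptive elements, one row per collection
        db.Execute "CREATE TABLE tblFrontMatter (" & _
            "CollID LONG CONSTRAINT pkFrontMatter PRIMARY KEY, " & _
            "Abstract MEMO, ScopeContent MEMO, Biography MEMO, " & _
            "OrgHistory MEMO, RelatedColls MEMO, Subjects MEMO)", dbFailOnError

        ' Boilerplate shared by every guide
        db.Execute "CREATE TABLE tblEntities (" & _
            "ContactInfo MEMO, UserInfo MEMO, Repository MEMO)", dbFailOnError

        ' Who added which element for which collection, and when
        db.Execute "CREATE TABLE tblFrontMatterAdded (" & _
            "CollID LONG, FieldName TEXT(50), " & _
            "AddedBy TEXT(50), AddedOn DATETIME)", dbFailOnError

        ' Who changed which element, and when
        db.Execute "CREATE TABLE tblFrontMatterChanged (" & _
            "CollID LONG, FieldName TEXT(50), " & _
            "ChangedBy TEXT(50), ChangedOn DATETIME)", dbFailOnError
    End Sub

Memo fields are used for the descriptive elements because notes such as a biography can easily run past the 255-character limit of an Access text field.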

With all the necessary data now contained in one database, it was simply a matter of writing the code to tag it properly. The entire tagged front matter is compiled in one variable, which is then added to a table designed to hold it. Then Ead McTaggart is called and he adds the tagged container list to the table, one row for each folder. When that is done, a report pulls out the data row by row and exports it as an XML file. The table is then emptied.
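
The staging and clean-up steps might look something like the following. This is only a sketch: tblEadOutput and its EadText memo field are hypothetical names standing in for the holding table, and the call to Ead McTaggart and the report export are left out.

    Public Sub StageFrontMatter(ByVal strText As String)
        ' tblEadOutput / EadText are assumed names for the holding table and its field.
        Dim db As DAO.Database
        Dim rs As DAO.Recordset

        Set db = CurrentDb
        Set rs = db.OpenRecordset("tblEadOutput", dbOpenDynaset)

        ' The front matter goes in as a single row, ahead of the folder rows.
        rs.AddNew
        rs!EadText = strText
        rs.Update

        rs.Close
        Set rs = Nothing
        Set db = Nothing
    End Sub

    Public Sub ClearStagingTable()
        ' Empty the holding table once the XML file has been exported.
        CurrentDb.Execute "DELETE FROM tblEadOutput", dbFailOnError
    End Sub

Writing through a recordset rather than an INSERT statement avoids having to escape the quotation marks and apostrophes that the tagged text is full of.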

Here is the code for tagging the front matter:

    ' XML declaration, stylesheet link, and EAD 2002 DOCTYPE
    strText = "<" & Chr(63) & "xml version=" & Chr(34) & "1.0" & Chr(34) & " encoding=" & Chr(34) & "utf-8" & Chr(34) & Chr(63) & ">"
    strText = strText & "<" & Chr(63) & "xml-stylesheet type=" & Chr(34) & "text/xsl" & Chr(34) & " href=" & Chr(34) & "../styles/style.xsl" & Chr(34) & Chr(63) & ">" & Chr(10)
    strText = strText & "<" & Chr(33) & "DOCTYPE ead PUBLIC " & Chr(34) & Chr(43) & "//ISBN 1-931666-00-8//DTD ead.dtd "
    strText = strText & "(Encoded Archival Description (EAD) Version 2002)//EN" & Chr(34) & " " & Chr(34) & "../dtds/ead.dtd" & Chr(34) & ">" & Chr(10)
    strText = strText & "<ead>" & Chr(10) & "<eadheader repositoryencoding=" & Chr(34) & "iso15511"
    strText = strText & Chr(34) & " relatedencoding=" & Chr(34) & "MARC21" & Chr(34) & " countryencoding="
    strText = strText & Chr(34) & "iso3166-1" & Chr(34) & " scriptencoding=" & Chr(34) & "iso15924" & Chr(34)
    strText = strText & " dateencoding=" & Chr(34) & "iso8601" & Chr(34) & " langencoding=" & Chr(34) & "iso639-2b" & Chr(34) & ">" & Chr(10)
    strText = strText & "<eadid mainagencycode=" & Chr(34) & "nic" & Chr(34) & " countrycode="
    strText = strText & Chr(34) & "us" & Chr(34) & " publicid=" & Chr(34) & "-//Cornell University::"
    strText = strText & "Cornell University Library::Kheel Center for Labor-Management Documentation and Archives//"
    strText = strText & "TEXT(US::NIC::KCL0" & strCollNum & "::" & strCollTitle & ".)//EN" & Chr(34) & ">"
    strText = strText & "KCL0" & strPathNum & ".xml</eadid>" & Chr(10)
    ' File description: title, author, and publication statements
    strText = strText & "<filedesc>" & Chr(10)
    strText = strText & "<titlestmt>" & Chr(10)
    strText = strText & "<titleproper>Guide to " & strCollTitle & "<date> " & strDate & "</date></titleproper>" & Chr(10)
    strText = strText & "<titleproper type=" & Chr(34) & "sort" & Chr(34) & ">" & strCollTitle & "</titleproper>" & Chr(10)
    strText = strText & "<author>Compiled by " & strProcessor & "</author>" & Chr(10)
    strText = strText & "</titlestmt>" & Chr(10)
    strText = strText & "<publicationstmt>" & Chr(10)
    strText = strText & "<publisher>Kheel Center for Labor-Management Documentation and Archives, Cornell University Library</publisher>" & Chr(10)
    strText = strText & "<date>" & Format(dteDate, "MMMM dd, yyyy") & "</date>" & Chr(10)
    strText = strText & "</publicationstmt>" & Chr(10)
    strText = strText & "<notestmt>" & Chr(10)
    strText = strText & "<note audience=" & Chr(34) & "internal" & Chr(34) & ">" & Chr(10)
    strText = strText & "<p><subject>Labor</subject></p>" & Chr(10)
    strText = strText & "</note>" & Chr(10)
    strText = strText & "</notestmt>" & Chr(10)
    strText = strText & "</filedesc>" & Chr(10)
    strText = strText & "<profiledesc>" & Chr(10)
    strText = strText & "<creation>Finding aid encoded by KIDB, Ead McTaggart, and " & strEncoder & ", <date>" & Format(Date, "MMMM dd, yyyy") & "</date></creation>" & Chr(10)
    strText = strText & "</profiledesc>" & Chr(10)
    strText = strText & "</eadheader>" & Chr(10)
    ' Title page for the front matter
    strText = strText & "<frontmatter>" & Chr(10) & "<titlepage>"
    strText = strText & "<titleproper>Guide to the " & strCollTitle & "<lb/></titleproper>" & Chr(10)
    strText = strText & "<num>Collection Number: " & strCollNum & "</num>" & Chr(10)
    ' strAddress arrives already tagged; it presumably opens the <list> that is closed below
    strText = strText & strAddress
    strText = strText & "<defitem>"
    strText = strText & "<label>Compiled by:</label>"
    strText = strText & "<item>" & strProcessor & "</item>"
    strText = strText & "</defitem>"
    strText = strText & "<defitem>"
    strText = strText & "<label>EAD encoding:</label>"
    strText = strText & "<item>" & strEncoder & ", " & Format(Date, "MMMM dd, yyyy") & "</item>"
    strText = strText & "</defitem>"
    strText = strText & "</list>"
    strText = strText & "<date>© " & DatePart("yyyy", Date) & " Kheel Center for Labor-Management Documentation and Archives, Cornell University Library </date>" & Chr(10)
    strText = strText & "</titlepage>" & Chr(10) & "</frontmatter>" & Chr(10)
    ' Collection-level description starts here
    strText = strText & "<archdesc level=" & Chr(34) & "collection" & Chr(34) & ">" & Chr(10) & Chr(9) & Chr(9) & "<did>" & Chr(10)
    strText = strText & "<head id=" & Chr(34) & "a1" & Chr(34) & ">DESCRIPTIVE SUMMARY</head>" & Chr(10)
    strText = strText & "<unittitle label=" & Chr(34) & "Title:" & Chr(34) & " encodinganalog=" & Chr(34) & "MARC 245$a" & Chr(34) & ">" & strCollTitle & "," & Chr(10)
    strText = strText & "<unitdate encodinganalog=" & Chr(34) & "MARC 245$f" & Chr(34) & ">" & strDate & "</unitdate>" & Chr(10)
    strText = strText & "</unittitle>" & Chr(10)
    strText = strText & "<unitid label=" & Chr(34) & "Collection Number:" & Chr(34) & ">" & strCollNum & "</unitid>" & Chr(10)
    strText = strText & "<origination label=" & Chr(34) & "Creator:" & Chr(34) & ">" & Chr(10)
    strText = strText & "<persname encodinganalog=" & Chr(34) & "MARC 100" & Chr(34) & " role=" & Chr(34) & "creator" & Chr(34) & ">" & strCollCreator & "</persname>" & Chr(10)
    strText = strText & "</origination>" & Chr(10)
    strText = strText & "<physdesc label=" & Chr(34) & "Quantity:" & Chr(34) & " encodinganalog=" & Chr(34) & "MARC 300" & Chr(34) & ">" & dblLinear & " linear ft.</physdesc>" & Chr(10)
    strText = strText & "<physdesc label=" & Chr(34) & "Forms of Material:" & Chr(34) & ">Articles, reprints, pamphlets, correspondence, photographs.</physdesc>" & Chr(10)
    strText = strText & strRepository & Chr(10)
    strText = strText & strAbstract
    strText = strText & "<langmaterial label=" & Chr(34) & "Language:" & Chr(34) & ">Collection material in <language encodinganalog=" & Chr(34) & "MARC 041" & Chr(34) & " langcode=" & Chr(34) & "eng" & Chr(34) & ">English</language>" & Chr(10)
    strText = strText & "</langmaterial>" & Chr(10)
    strText = strText & "</did>" & Chr(10)
    ' Descriptive blocks, tagged before being appended here
    strText = strText & strTopOrgHist
    strText = strText & strBio
    strText = strText & strOrgHist
    strText = strText & strScope
    strText = strText & strSubjects
    ' Boilerplate access, use, and citation statements
    strText = strText & "<descgrp><head id=" & Chr(34) & "a10" & Chr(34) & ">INFORMATION FOR USERS</head>"
    strText = strText & "<accessrestrict><head>Access Restrictions:</head>"
    strText = strText & "<p>Access to the collections in the Kheel Center is restricted. Please contact a reference archivist for access to these materials.</p>"
    strText = strText & "</accessrestrict><userestrict><head>Restrictions on Use:</head>"
    strText = strText & "<p>This collection must be used in keeping with the Kheel Center Information Sheet and Procedures for Document Use.</p>"
    strText = strText & "</userestrict><prefercite><head>Cite As:</head>"
    strText = strText & "<p>" & strCollTitle & " #" & strCollNum & ". Kheel Center for Labor-Management Documentation and Archives, Cornell University Library.</p>"
    strText = strText & "</prefercite></descgrp>"
    strText = strText & strRelated
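
The repeated Chr(34) calls make lines like these hard to read, and values such as strCollTitle go into the XML without any escaping, so a title containing an ampersand would produce a file that does not parse. A pair of small helpers could take care of both. XmlEscape and XmlAttr are hypothetical names and are not part of KIDB; treat this as one possible sketch rather than the code that is actually in use.

    ' Escape the characters that are special in XML text and attribute values.
    Public Function XmlEscape(ByVal strValue As String) As String
        strValue = Replace(strValue, "&", "&amp;")      ' must run first
        strValue = Replace(strValue, "<", "&lt;")
        strValue = Replace(strValue, ">", "&gt;")
        strValue = Replace(strValue, Chr(34), "&quot;")
        XmlEscape = strValue
    End Function

    ' Build  name="value"  with a leading space so attributes can be chained.
    Public Function XmlAttr(ByVal strName As String, ByVal strValue As String) As String
        XmlAttr = " " & strName & "=" & Chr(34) & XmlEscape(strValue) & Chr(34)
    End Function

With these in place, the unitid line could be written as:

    strText = strText & "<unitid" & XmlAttr("label", "Collection Number:") & ">" & XmlEscape(strCollNum) & "</unitid>" & Chr(10)

XmlEscape should only be applied to plain values. Variables that already hold tagged text, such as strAbstract or strRepository, would have their tags escaped as well.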



A few notes on some of the variables. The collection number is in two variables: strCollNum has the collection number as it appears in KIDB (e.g. 5619/043 AV), and strPathNum has the collection number formatted to work as a valid file name (e.g. 5619-043av). How you choose to store multiple-value fields will determine how you populate variables like strCollCreator, strSubjects, and strRelated. I list creators in one field, delimited with a semicolon. It is the same with related collections. When I pull the data I use the Split function to separate out the values, format and tag them for EAD, then reassemble them in a single variable. Subjects are stored with a line break between them, each one beginning with the MARC field code (600: for a person and so on). For subjects I split on the line break (Chr(10)) and use the MARC field to set the tags (persname, corpname, etc.) and the attributes.
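
To make the two transformations concrete, here is a minimal sketch of each. MakePathNum and TagSubjects are hypothetical names, and the mapping from MARC fields to EAD tags and the "MARC nnn" form of encodinganalog are assumptions modeled on the listing, not the actual KIDB code.

    ' Turn a collection number into a file-name-safe form:
    ' slashes become hyphens, spaces are dropped, letters are lowercased.
    Public Function MakePathNum(ByVal strCollNum As String) As String
        MakePathNum = LCase(Replace(Replace(strCollNum, "/", "-"), " ", ""))
    End Function

    ' Tag subjects stored one per line as "600: term", "650: term", and so on.
    Public Function TagSubjects(ByVal strRaw As String) As String
        Dim varLines As Variant, varLine As Variant
        Dim strCode As String, strTerm As String, strTag As String
        Dim lngPos As Long
        Dim strOut As String

        ' Access often stores line breaks as CR+LF; drop the CR before splitting on LF.
        varLines = Split(Replace(strRaw, Chr(13), ""), Chr(10))

        For Each varLine In varLines
            lngPos = InStr(varLine, ":")
            If lngPos > 0 Then
                strCode = Trim(Left(varLine, lngPos - 1))
                strTerm = Trim(Mid(varLine, lngPos + 1))
                ' Assumed mapping from MARC field to EAD element
                Select Case strCode
                    Case "600": strTag = "persname"
                    Case "610", "611": strTag = "corpname"
                    Case "651": strTag = "geogname"
                    Case "655": strTag = "genreform"
                    Case Else: strTag = "subject"
                End Select
                strOut = strOut & "<" & strTag & " encodinganalog=" & Chr(34) & "MARC " & strCode & Chr(34) & ">" & strTerm & "</" & strTag & ">" & Chr(10)
            End If
        Next varLine

        TagSubjects = strOut
    End Function

Lines without a colon are skipped, so a stray blank line in the subjects field does not produce an empty element. The same Split-and-loop pattern, with a semicolon as the delimiter, would apply to creators and related collections.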
