Monday, March 14, 2011

Re-Formatting Finding Aids for Conversion to EAD: Scope Content Notes

Many of our legacy finding aids here at the Kheel Center are either Microsoft Word documents or scans of hard copy documents. Since I use Ead McTaggart to tag container lists these need to be reformatted as Microsoft Excel documents. If a simple copy and paste does not convert the text to columns that can be easily done using the “Text to Columns” function in Excel. What is a little more difficult is when the title and scope/content notes are parse into separate lines rather than separate columns:


A B C
Box Folder Title
1 1 Folder One Title
Folder One Scope Content
1 2 Folder Two Title
Folder Two Scope Content


What I need to do in this case is move the scope/content notes from C3 to D2 and C5 to D4. You can drag and drop, but there is a quicker way: in cell D2 enter this formula:

=If(A3>0,””,C3)
This translates to “If the value in cell A3 is greater than zero (that is it contains a box number) set the value of cell D2 to a zero length string, if the value of cell A3 is not greater than zero (that is it has no box number) set the value of cell D2 equal to the value of cell C3.”

Moving cursor to the handle at the lower right corner of cell D2, changes it from a large white cross to a small black cross. Double clicking will fill the column down with the formula. The column will fill down until it reaches an empty cell in column C. This may, or may not, be the bottom of column C. Using “control, down arrow” to test this, the cell selection will move to the bottom of the column. If that is also the bottom of column C repeat the fill down process until the bottom of column C is reached.

The spreadsheet now looks this:


A B C D
Box Folder Title ScopeContent
1 1 Folder One Title Folder One Scope Content
Folder One Scope Content
1 2 Folder Two Title Folder Two Scope Content
Folder Two Scope Content

But, column D actually contains only formulas, not text. Highlight column D and copy, then right click the column and select “Paste Special”, then select “Values”. I now have text in column D.

At this point I insert a new column, called “Index”, between the Title and ScopeContent columns. Enter “1” into cell D2 and “2” into cell D3, then select both D2 and D3 and fill down. This gives me an index I can use to maintain or re-establish the proper order of the data.


A B C D E
Box Folder Title Index ScopeContent
1 1 Folder One Title 1 Folder One Scope Content


Folder One Scope Content 2
1 2 Folder Two Title 3 Folder Two Scope Content


Folder Two Scope Content 4

Now sort the table on column A. This will group all the lines with no box numbers together at the bottom of the spreadsheet. I can now delete those rows and resort on column D, Index:


A B C D E
Box Folder Title Index ScopeContent
1 1 Folder One Title 1 Folder One Scope Content
1 2 Folder Two Title 3 Folder Two Scope Content

At this point I re-index so that there are no missing numbers in the Index column. Not really necessary, but I do it anyway.

The finding is now formatted in Excel for conversion by Ead McTaggart. Except that the dates are probably part of the title.

Next week: Re-Formatting Finding Aids for Conversion to EAD: Parsing the Date From the Title

No comments:

Post a Comment