Thanks to recent grants, we have started the next phase of development for the IMLS- and NSF-funded Digital Index of North American Archaeology (DINAA) project.
DINAA aims to aggregate datasets curated by US state government offices to build an open gazetteer of North American archaeological and historical sites. As we prepare additional state datasets for DINAA’s expansion, we’re also exploring ways to link DINAA with other key information sources on the Web.
Published by Open Context, each site record in DINAA has a unique and stable Web address that also serves as an identifier (a URI). This enables DINAA to support “Linked Open Data” (LOD) applications. Linked Open Data methods help us link DINAA data to rich information resources curated by museums, researchers, other digital repositories, and government sources. Open Context can index some of these relationships, which lets us offer map-based search functions for discovering information about archaeological sites that may otherwise be scattered across several different websites. Here’s a link to archaeological sites in DINAA that relate to content curated by other websites.
We recently cross-referenced the current DINAA dataset with the Federal Register. The Federal Register provides notifications of decisions and other news relating to the administration of laws and regulations. Regulatory processes greatly impact and shape the practice of archaeology in the US, and the Federal Register offers a key information resource for understanding governance of the archaeological past.
However, to gain archaeologically meaningful insights from the Federal Register, the register’s notifications need further context. A key element of that context can come from DINAA, via Linked Open Data methods. Archaeological sites in Federal Register notifications are typically listed with “Smithsonian Trinomials”, a kind of alphanumeric identifier assigned to sites by government agencies. By themselves, these trinomials are just strings of letters and numbers with very little meaning. However, DINAA curates Smithsonian Trinomial identifiers along with rich geospatial (at low levels of precision to protect site security), chronological, and other metadata. By matching Smithsonian Trinomial identifiers in Federal Register documents with DINAA, we’re able to add spatial, chronological, and other metadata to government documents. This added context helps make the Federal Register a more meaningful window onto how we regulate the archaeological past.
Methods
While the full source code for relating the Federal Register to DINAA records is here, the key steps are outlined below:
Search and Retrieve Relevant Documents: The Federal Register has a powerful and well-documented API (Application Programming Interface) that allowed us to search and harvest relevant notifications. We used its keyword search functions to find combinations of state names along with the following search terms: archaeology, archaeological, archeology, archeological, NAGPRA.
Find Trinomials in Documents: Once we retrieved documents from the searches defined above, we searched through each document to find instances of Smithsonian Trinomials that also appear in the DINAA corpus. For each matching trinomial, we recorded a simple linked data assertion as:
a given DINAA record -> "Is Referenced By" (a Dublin Core Terms property) -> the URI to the Federal Register document.
Index Relationships: After recording how DINAA records are “referenced by” different Federal Register documents, we re-indexed the DINAA records. This enables Open Context to power search and browse functions and to show map-based visualizations of the spatial “footprint” of documents in the Federal Register.
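As a rough illustration of the first two steps above, here is a minimal Python sketch. It is not the project's actual code (that is in the linked source); it only shows the general shape of the approach. Several things in it are assumptions made for illustration: the Federal Register endpoint and its `conditions[term]` and `per_page` parameters as we understand them from the public API documentation, the regular expression for trinomials (state number, county letters, site number, with optional hyphens), the normalization rule, and the `example.org` URIs, which are placeholders rather than real DINAA or Federal Register addresses. The function names are hypothetical.

```python
import re
from urllib.parse import urlencode

# Assumed endpoint and parameter names for the Federal Register search API.
FR_API = "https://www.federalregister.gov/api/v1/documents.json"
SEARCH_TERMS = ["archaeology", "archaeological", "archeology",
                "archeological", "NAGPRA"]

# Dublin Core Terms property used for the linked data assertion.
IS_REFERENCED_BY = "http://purl.org/dc/terms/isReferencedBy"

# Assumed trinomial shape: state number, county letters, site number,
# optionally separated by hyphens (e.g. "41TV1" or "9-Ch-112").
TRINOMIAL_RE = re.compile(r"\b(\d{1,2})-?([A-Za-z]{2,3})-?(\d{1,5})\b")


def build_search_urls(state_names):
    """Build one search URL per (state name, search term) combination."""
    urls = []
    for state in state_names:
        for term in SEARCH_TERMS:
            params = {"conditions[term]": f"{state} {term}", "per_page": 100}
            urls.append(f"{FR_API}?{urlencode(params)}")
    return urls


def normalize(state, county, site):
    """Drop separators and leading zeros, and uppercase the county code."""
    return f"{int(state)}{county.upper()}{int(site)}"


def find_assertions(text, document_uri, known_trinomials):
    """Return (site URI, property, document URI) triples for each trinomial
    in the text that is also present in the known (DINAA) lookup table."""
    assertions = set()
    for match in TRINOMIAL_RE.finditer(text):
        key = normalize(*match.groups())
        site_uri = known_trinomials.get(key)
        if site_uri is not None:  # ignore strings that only look like trinomials
            assertions.add((site_uri, IS_REFERENCED_BY, document_uri))
    return sorted(assertions)


# Hypothetical lookup table: normalized trinomial -> placeholder site URI.
known = {
    "41TV1": "https://example.org/dinaa/site-a",
    "9CH112": "https://example.org/dinaa/site-b",
}
notice = "Human remains were removed from site 41TV1 and site 9-Ch-112 in 1975."
for triple in find_assertions(notice, "https://example.org/fr/doc-1", known):
    print(triple)
```

Checking candidates against the known list, rather than trusting the pattern alone, matters because many short strings in a government notice can look like trinomials. Only identifiers already curated in DINAA produce an assertion.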
As DINAA expands, we will periodically update the Federal Register link to reflect the increasing number of archaeological sites related to Federal Register documents. We are working on many other similar relationships, such as with publications in JSTOR that use Smithsonian Trinomials, the National Register of Historic Places, the Canadian Archaeological Radiocarbon Database (CARD), and others. We will announce these on this site as they become available and will report on how these Linked Open Data services are being used in research.