We interrupt our vacation for a short blog post.
We are very pleased to report that the National Endowment for the Humanities just awarded a Digital Humanities Implementation grant to the Alexandria Archive Institute in support of our efforts to develop data publishing services with Open Context. In collaboration with a team of archaeologists working in the Mediterranean region, our project will further develop workflows to publish archaeological datasets as “Linked (Open) Data”, so that they can be used and integrated with many diverse data sources available on the Web to address important research topics.
We are grateful to the NEH for their generous support and to our colleagues who bring energy, ideas, and talent to this collaborative effort. We’ll make further announcements about this and other projects in the upcoming weeks. Until then, it’s back to our previously scheduled vacation.
Posted in Events, Projects.
– July 27, 2012
We are very pleased to announce the online publication of the second installment of the Upper Tigris Archaeological Research Project’s excavation data, images, and documentation.
The Upper Tigris Archaeological Research Project (UTARP), under the direction of Bradley J. Parker (University of Utah), was active in the Upper Tigris River Region of southeastern Turkey between 1998 and 2011. The online data publication in Open Context aims to be a complete accounting of all of the excavation records as well as all of the records of the subsequent analyses from the Upper Tigris Archaeological Research Project’s excavations at the site of Kenan Tepe in southeastern Turkey (www.utarp.org).
This data publication includes records and analyses from Areas A, B, C, D, E, G, H and I, with findings from Ubaid, Late Chalcolithic, Early Bronze Age, Middle Bronze Age, and Iron Age contexts. With the publication of these data we can now say that approximately 95% of the data from Kenan Tepe are published. The database includes more than 30,000 images, 1,900 journal entries and 43,000 records of contexts and finds. These data complement other datasets from several other sites in the Near East also published in Open Context.
A final installment of the last 5% of the data, which includes issues that remain to be resolved and an accounting of analyses that are still underway, will be forthcoming later this fall. All the UTARP data are available free-of-charge, under liberal Creative Commons Attribution licensing conditions, at:
Bulk downloads of the data can be obtained through Open Context’s GitHub repository. Additional options for downloading data tables as comma-separated values (CSV) are forthcoming.
Posted in Data publications, News, Publication.
– July 3, 2012
We’re really pleased to collaborate with some amazing people and projects that have a deep understanding and appreciation for the Web. The Pelagios project is working out some very low-barrier-to-entry approaches to share references to ancient places (curated by the Pleiades Gazetteer). Since place is such an important concept for ancient studies, shared notions of place represent a potentially powerful and useful way to link data in multiple collections. Pelagios is doing just that, and they have successfully brought together 15+ different Web-based collections to collaborate and share place metadata. In bringing these collections together, Pelagios has developed an “API” (Application Programming Interface) that lets Web developers build upon a powerful index of multiple collections, all cross-referenced by links to ancient places.
We’re happy to note that Pelagios now cross-references some Open Context data. I say “some” because Open Context has rather more Neolithic and Bronze Age material than the Classics emphasis of Pelagios. But as we publish more data from Classical Archaeology (see the “Rough Cilicia” example), Open Context will have more overlap with the other collections indexed by Pelagios.
In related developments, the NEH-funded “Linked Ancient World Data Institute” (LAWDI) will soon meet in New York. Sebastian Heath and Nick Rabinowitz worked on a powerful new tool for making use of all sorts of linked data now shared by various online collections (including the many partners brought together by Pelagios). The awld.js library calls up useful data referenced by Web URIs (stable hyperlinks) and displays it as rich contextual information for users. For example, a reference to “Antioch” may mean a city in Northern California, an ancient town in Cilicia, or a more prominent ancient city further east. The linked data helps to remove this sort of ambiguity, and the awld.js library presents this contextualizing information clearly to users. Here’s a great example of the awld.js library in action, and here are two examples from Open Context:
Together, these developments represent an important advance in data sharing methods for archaeology. No longer are we just talking about putting data on the Web. Now we’re talking about making our data part of the Web. The rich and highly meaningful interlinking between various Web-based collections helps to better reflect the deep contextualization of knowledge of the past. Moreover, this approach is inherently collaborative. The growth and success of another group’s project has immediate benefits to all other projects that exchange linked data. It’s all part of developing a growing and dynamic information ecosystem, and it’s very exciting to help cultivate.
Posted in Projects.
– May 14, 2012
Wondering how to publish your data with Open Context? We have news for you!
With help from our Editorial Board, we have released the first version of Open Context’s Editorial Policies & Author Guidelines. This document contains essential information about publishing with Open Context, including:
- Open Context’s open access and copyright policies,
- what you should expect of the editorial processes involved with data publication,
- a list of key information that should accompany your data publication, and
- tips on how to clean up your content in preparation for digital publication.
The Guidelines also include more specific guidance for a few sub-fields of archaeology (zooarchaeology, GIS, human osteology). These sections are under review and will see frequent updates, and we plan to add guidance for additional sub-fields.
A primary aim of our work is to make the data publication process more streamlined so that publishing data becomes an expected and regular part of scholarly communication. We hope the Guidelines move us a step or two closer to this goal, and we welcome your comments!
Posted in News, Publication.
– May 1, 2012
We’re delighted to announce the publication of “Other People’s Data: A Demonstration of the Imperative of Publishing Primary Data” in the Journal of Archaeological Method and Theory. The lead author is Prof. Levent Atici (UNLV), a member of the Open Context Editorial Board. The “online first” version of the paper can be accessed here. The authors will also share an Open Access pre-print (allowed by Springer) of the final version of the paper in the coming week.
This paper is an outcome of an AAI project funded by an NEH/IMLS Advancing Knowledge grant exploring user needs in archaeological data sharing. This paper’s co-authors (Levent Atici, Justin Lev-Tov, Sarah Whitcher Kansa and Eric Kansa) all participated in the NEH/IMLS study. They recognized that “data reuse” in archaeology is an area that is in critical need of more exploration. This paper reflects the co-authors’ attempts to grapple with this topic by documenting their reuse of data collected by another researcher. The results of their collaborative study highlight implications for data sharing, archiving and publishing programs.
Abstract: This study explores issues in using data generated by other analysts. Three researchers independently analyzed an orphaned, decades-old zooarchaeological dataset and then compared their analytical approaches and results. Although they took a similar initial approach to determine the dataset’s suitability for analysis, the three researchers generated markedly different interpretive conclusions. In examining how researchers use legacy data, this paper highlights interpretive issues, data integrity concerns, and data documentation needs. In order to meet these needs, we propose greater professional recognition for data dissemination, favoring models of “data publication” over “data sharing” or “data archiving.”
Posted in News, Publication.
– April 16, 2012
We’re very pleased to announce the publication of a significant portion of the Kenan Tepe excavations. Excavations at Kenan Tepe, directed by Bradley Parker (University of Utah) and co-directed by Lynn Swartz Dodd (University of Southern California), represent part of the investigations of the Upper Tigris Archaeological Research Project (UTARP). UTARP organized major excavation and survey programs aimed at defining archaeological correlates of ancient imperialism, colonialism and culture contact in an area that was, for much of Mesopotamian history, a frontier zone between the centralized states of Mesopotamia and the much less centralized cultures of its Anatolian periphery.
This initial release of Kenan Tepe data in Open Context represents the first installment of data and includes all Area F records, where UTARP team members excavated twenty-two trenches of various sizes and depths in an effort to illuminate remains dating to the Late Chalcolithic period and Early Bronze Age at the site. Excavation records from further areas will be added to Open Context in the near future and will be followed by the print publication of several final report volumes in the next few years.
UTARP’s Area F data from Kenan Tepe can be accessed at the Alexandria Archive Institute’s Open Context website at:
Because the UTARP team had excellent data management, it was possible to more fully use many of Open Context’s features not commonly used in other projects. Archaeological documentation draws upon diverse structured data (esp. tabular data), less structured texts (diaries, journals), and media (drawings, photos, and other media types). The UTARP team kept excellent records and had very clear file-naming conventions that allowed us to link all of these different types of documentation together. This makes it easier to organize and navigate this large body of content. For example, one can follow links from top-plans to see day-to-day progress in excavation. See this example:
Posted in Data publications, News, Projects.
– March 26, 2012
For the most part, we’re able to upgrade and update Open Context without major disruption. Today, however, is an exception. We’ve made some significant updates to the data structure behind the site so that Open Context can better support additional query features, especially using standard units of measure. We’re also upgrading Apache-Solr (the software that powers Open Context’s faceted search and other query services). The new query services will support searches according to geospatial facets. This will make Open Context able to support better interfaces for querying geospatial data, especially data from archaeological surveys.
If all goes well (fingers crossed!), we’ll be back online by next Monday. Unfortunately, many of the new capabilities won’t be immediately apparent. However, they will make it easier for us to implement some great user interface ideas recommended by Phoebe France, an expert interaction designer. In the next few months, Open Context’s overall design and user experience should improve greatly, in part because of these upgrades to the back-end systems.
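To give a rough sense of what a “geospatial facet” means, here is a minimal Python sketch that groups find locations into grid cells and counts the records in each cell, which is the kind of summary a map-based search interface can display. This is only an illustration under our own assumptions: the function name `grid_facets`, the one-degree cell size, and the sample coordinates are all hypothetical, and this is not how Solr or Open Context actually implements geospatial queries.

```python
from collections import Counter
import math


def grid_facets(points, cell_degrees=1.0):
    """Count (lat, lon) points per grid cell.

    Each cell is identified by a pair of integer indices:
    (floor(lat / cell_degrees), floor(lon / cell_degrees)).
    This is an illustrative scheme, not Solr's own.
    """
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees))
        counts[cell] += 1
    return dict(counts)


# Hypothetical coordinates, used only for illustration.
sample_points = [(37.8, 40.7), (37.2, 40.1), (36.5, 32.9)]
facets = grid_facets(sample_points)
for cell, count in sorted(facets.items()):
    print(cell, count)
```

A smaller `cell_degrees` value gives finer cells, which is roughly what happens when a user zooms in on a faceted map.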
Posted in Uncategorized.
– March 22, 2012
We’ve recently completed exporting the majority of the data from Open Context to GitHub. For most data in Open Context, we link directly into the GitHub repository where the version history of the XML representation can be seen. Here’s an example: A coin from Domuztepe (the GitHub link follows the thumbnails).
GitHub is mainly a code repository for software projects. However, it’s seeing more use for other applications that need robust version control and transparency in development processes. GitHub now serves “open government” applications, including a project that actively tracks changes in US legal code. GitHub also serves many “open science” purposes, mainly source code for scientific analysis software, but also, increasingly, datasets. In fact, we already found some archaeological data together with analytic methods in GitHub, published by Thomas Dye.
GitHub has some fascinating potential for sharing archaeological data. GitHub provides robust version control. Changes are tracked and documented so they can be reviewed, and accepted or rejected by collaborators. This provides more transparency into data manipulations. This is a great feature, since we had a problem with our initial XML dump that we used to populate the repository in GitHub. Some of our documents did not have proper UTF-8 character encoding (needed to properly represent non-Latin characters). We fixed the output problem and we’re committing updated, better data to the repository.
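For readers curious about the kind of check that catches the encoding problem described above, here is a minimal Python sketch that reports where a document stops being valid UTF-8. It is an illustration only, not the script we used; the function names and the idea of scanning a local folder of `.xml` files are assumptions made for the example.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None if valid."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


def find_invalid_files(folder: str) -> List[Tuple[str, int]]:
    """Scan a folder (hypothetical layout) for .xml files that are not valid UTF-8."""
    problems = []
    for path in sorted(Path(folder).rglob("*.xml")):
        offset = first_invalid_utf8_offset(path.read_bytes())
        if offset is not None:
            problems.append((str(path), offset))
    return problems
```

Running a check like this before each commit would flag badly encoded documents early, and the version history in the repository then shows exactly which files changed when the corrected output was committed.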
To us, one of GitHub’s greatest advantages is in allowing datasets to be easily “forked” (i.e. duplicated and taken in a new direction). This gives people the freedom to take a dataset, work with it independently, and transform it to meet their needs. The provenance and history of forking is retained. We’ve made data portability a priority in Open Context with lots of emphasis on Web-services and machine-readable data. GitHub works towards these goals by providing a fantastic community and collaborative space to work with data in new ways.
Please note, Open Context does not rely upon GitHub for long-term archiving and data preservation. For that, Open Context works with the California Digital Library’s Merritt repository. Open Context uses GitHub mainly to encourage collaboration and transparency, not for data preservation.
Posted in Data publications, News, Projects.
– March 12, 2012
This week (March 5-10) is Open Education Week, raising awareness of the open education movement and its impact on teaching and learning worldwide. Today, over 200 universities worldwide put open educational content online, a concept that was only in its infancy a decade ago. The Open Education movement has been spurred by support from organizations such as the William and Flora Hewlett Foundation, which funded dozens of grassroots efforts to democratize educational resources (huge kudos to Hewlett for these investments). The AAI was a grantee in the Hewlett Foundation’s Open Educational Resources program, and a series of grants led to the development of Open Context.
Along with advances in Open Education tools and resources over the past decade, the concept of “Open Data” has also gained traction. Open Context’s vision is to enable scholarship through Open Data. This means that data are shared openly with links that resolve to useful information so that they can be associated in all sorts of ways across a vast, open Web. We work hard to align Open Context to this ideal of openness, while working to ensure that datasets meet professional expectations. Our current efforts try to align Open Data with professional needs. To that end we’re working to develop processes for editorial oversight and we’re working with the library community to better support citation, archiving, and use metrics.
Ten years ago, when the AAI was a recently-incorporated nonprofit, we held a fundraising event in which we talked about “putting together puzzle pieces of the past.” We scattered puzzle pieces around the tables and talked about how a person with one puzzle piece can only understand her piece in a limited way. But if she could access the pieces held by scholars at the other tables, she’d have a greater understanding of how her piece fit the larger picture. At that time, the term “Open Data” didn’t see much use. But that’s what we were talking about—tools and approaches to making data on the Web open, discoverable, and reusable. That has been the vision of Open Context since its inception.
With the ever-increasing amount of information available worldwide, the next ten years are sure to see exciting developments in the areas of Open Education and Open Data. The vision articulated by Cathy Casserly (ED of Creative Commons) for the Open Education movement applies also to Open Data: the information increasingly is out there, but the challenge now is to improve and build upon it. For scholarly data, we envision a future where open data publication is an expected outcome of research, and tools are available to easily access data across the Web, combine them, and use them in new research. Open Data are essential for this vision of broad and seamless access to information to be realized. Tim Berners-Lee, inventor of the World Wide Web, explains the power of linked data in a widely referenced 2009 TED talk. Essentially, the more (Linked) Open Data that we can connect, the more powerful the data become.
Posted in Events, News.
– March 8, 2012