Comments on OSTP Open Data Policy

Today, Open Context’s Eric Kansa spoke (via phone) at the meeting on Public Access to Federally-Supported Research and Development Data and Publications: Data, hosted by the National Research Council of the National Academies. The meeting, taking place May 16-17, is hearing invited and public comments on the White House OSTP memo on expanding access to data resulting from federally-funded research, with the aim of informing agencies as they develop policies in response to the memo.

The NRC has posted a video of the meeting online. In addition, you can read the AAI’s comments in a document that includes responses from various individuals/organizations, some of whom also spoke at the meeting. In the meantime, here are Eric’s comments:

My name is Eric Kansa, and I manage and direct Open Context, an open access, open licensed data publication venue for archaeology and related fields. I’ve also participated in text-mining in the digital humanities. Text-mining really shows that the boundaries between text and data are increasingly blurred, and that texts (publications) increasingly share many of the open intellectual property requirements critical to the reusability of data.

While we focus on editorial and peer review services on data contributions, we work closely with colleagues at the University of California, California Digital Library, an institution that provides us with essential digital repository & persistent identity services. With Open Context, we are grateful for grant support from the National Endowment for the Humanities, particularly the Office of Digital Humanities, the National Science Foundation (see current work), and private foundations. We’re one example of how the lines between the humanities and sciences are increasingly blurred, and that’s a good thing.

Because we receive support from multiple federal agencies, I think coordination across agencies is vital. Research suffers when stove-piped in artificial silos. Similarly, other agencies also support and even mandate research, especially to enforce laws in historical preservation and environmental protection. Data practices relating to compliance-oriented research also need to be harmonized with those of agencies that mainly support academically oriented research.

Based on over 10 years’ experience promoting greater data openness and professionalism in archaeology, I think it is critical for policy-making to promote dynamism and innovation in the management of data. Data needs are diverse and ever evolving. We need to encourage that dynamism by welcoming new entrants with new ideas and approaches to data management, preservation, dissemination, and reuse.

There’s often a tacit assumption that data are a “residue” of research, and that a researcher’s responsibility with respect to data centers mainly on preservation. I think that is limiting, and in some circumstances, data can and should be valued as a primary outcome of research. To borrow a phrase from my colleagues at the California Digital Library, data can also be a “first class citizen” of scholarly production. Data can also play a central role in new modes of scholarly communications, with approaches like “data sharing as publication”, or exhibition, or even data sharing as a kind of open-source release cycle. The point is, data can play many and expanding roles in researcher communications. Policy should not assume that data should only play the role of a secondary, supplemental outcome to research.

The need to foster dynamism also needs to inform thinking about financial sustainability. Public policy needs to recognize that the sustainability of particular organizations and practices in the research endeavor is only a means to an end in promoting the public good. Sustainability of particular interests should not be an end in itself. “Resiliency” may be a better term, since it may better capture our obligations for data and knowledge stewardship without lock-in to a particular set of institutions or practices.

In other words, notions of data “openness” need to expand beyond technical and licensing concerns to include the organizations and people participating in the research community’s information ecosystem, especially the next generation of students, who will have their own needs and priorities with respect to data. True resiliency will require real funding, an issue where the OSTP policy memo falls short. And I urge agencies to work with the research community, libraries, and others to honestly understand funding requirements. We need this to make a clear case to the American public about investing in unlocking the richness of research data.

Earlier this week, the NRC sponsored a related meeting to hear comments on the other part of the OSTP memo, relating to public access to publications resulting from federally-funded research. The AAI submitted comments for this meeting as well, which you can read here in a PDF containing all mail-in responses.

