Today, I’m happy to give you a “guest post” by Robert K. Englund, director of the excellent UCLA-Max Planck Institute for the History of Science Cuneiform Digital Library (CDLI) project, which I reviewed earlier.
In a highly interesting article by Azhideh Moqaddam describing the recent Jiroft discoveries (“Ancient geometry and ‘*Proto-Iranian’ scripts, South Konar Sandal mound inscriptions, Jiroft,” Fs. Kreyenbroek 53-103), the author cites (p. 54 n. 12) a page from our CDLI domain in reference to the current state of a proto-Elamite sign list.
The reader, however, will be hard-pressed to find the reference “http://cdli.ucla.edu/wiki/index.php?title=Proto-Elamite&redirect=no, p. 4” on the web. It might refer to what is currently “http://cdli.ucla.edu/wiki/doku.php/proto-elamite,” a CDLI wiki page in which Jacob Dahl offers a quick overview of the growing electronic resources for proto-Elamite research.
Since this is just one of a number of citations of CDLI URLs in recent paper publications, it may be timely to make a statement about the purpose and reliability of web resources such as ours when they are used in hard publications, in particular to underscore the distinct persistence of only three types of CDLI URLs: 1) the lead domain address itself, “http://cdli.ucla.edu/”; 2) the journals page (http://cdli.ucla.edu/pub.html) and the individual contribution URLs of CDLJ, CDLB and CDLN; 3) the addresses of individual cuneiform text artifacts of the form “http://cdli.ucla.edu/P115925.”
As is obvious to users, web research and communication have many strengths that slow-moving, analog resources such as bound books and journal volumes cannot match. For instance, web dissemination of information radically expands the pool of potential readers and responders–and includes in the readership whole regions and demographics that would otherwise never be exposed to the A[rchiv] f[ür] O[rientforschung]s and C[uneiform] M[onograph]s of the Assyriological community, with their hefty price tags and often years-long production schedules, nor certainly to the raw file documentation of very dispersed artifact collections. Then too, hyperlinked resources compress to a few seconds the reference checks that otherwise occupy an afternoon, if the proofer is fortunate enough to work in Berlin or Chicago. These hyperlinks in academic publications, among other advantages, finally offer our footnote geniuses the opportunity to embed note in note, ad infinitum–flights of resource access that can transport established professors back to their heady days of discovery, seated at a table decked with “many a tome.”
Well-developed data creation and dissemination strategies look to text and image file format standardizations that protect data from generational loss, and in open access platforms they endeavor to facilitate the harvest, aggregation and re-use of core data and their annotation by experts, thus leading to a certain “cloud security” of important data sets. But the grave problem of simple URL decay remains. This is not just a matter of this or that website leaving the internet, its funding gone or its director incapacitated; nor of the rollback of data access following the activities of intellectual property demons; but perhaps more importantly it points to the inherent instability of internal pages and their content within a given domain.
CDLI is assuredly not alone in its understanding of its domain addresses as in part stable, in part unstable. There is much pressure to improve the usability of project web pages up and down the line; at the same time, everyone wants to build good research resources for long-term use. Administrators of small digital libraries know all too well how painful is the stage of converting operations to persistent, versioned data sets online–that is, at the point in the chart where the data persistence line moving up crosses the data production and improvement line moving down. This is in fact the stage where custodians of data persistence–librarians–enter our work and, by backing up archival files to versioned and permanent repositories, protect our data from ourselves and our various destinies. Thus CDLI is currently collaborating with UCLA’s Digital Library Program to register image and text files under the so-called “archival resource keys” assigned by the California Digital Library of the University of California; such keys–unique alphanumeric strings–establish permanent URLs for all processed archival files associated with some discrete cuneiform text artifact. Such artifacts will continue to carry the internally generated “P numbers” that identify entries in CDLI, but will have the added protection of a state institution–the University of California–that will enjoy a longer life than most humanities projects.
We have created a convenient short URL for each cuneiform inscription in our files, for instance “http://cdli.ucla.edu/P361694” pointing to the web page documenting an Old Assyrian tablet in the recently digitized Rosicrucian Egyptian Museum collection published by M. Larsen as Old Assyrian Archives 1 (PIHANS 96; Leiden 2002) no. 51. Aside from the top-level domain address http://cdli.ucla.edu/ and the addresses of our online journal contributions (“http://cdli.ucla.edu/pubs/cdlj/2009/cdlj2009_007.html” will, with high certainty, always lead to R. McC. Adams, “Old Babylonian Networks of Urban Notables,” CDLJ 2009:7), these individual text addresses are the only URLs in CDLI that can be confidently cited in hard-print publications, though obviously still with less confidence than a reference to some printed resource deposited in a library. Should, one day, some other public institution agree to assume full responsibility for an ongoing CDLI, a simple redirect will take care of the permanence of these current CDLI URLs found within the UCLA domain.
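For authors or editors who want to check a bibliography before it goes to print, the three persistent address types described in this post can be tested mechanically. The following Python sketch is only an illustration: the function name and the path patterns are my own assumptions, inferred from the example URLs quoted here (six-digit P numbers; journal contributions under “/pubs/cdlj/”, with CDLB and CDLN assumed to follow the same layout), and they are not an official CDLI specification.

```python
import re
from urllib.parse import urlsplit

CDLI_HOST = "cdli.ucla.edu"

# Patterns inferred from the example URLs in this post (assumptions,
# not a CDLI specification).
ARTIFACT_PATH = re.compile(r"^/P\d{6}$")  # e.g. /P115925
JOURNAL_PATH = re.compile(r"^/pubs/(cdlj|cdlb|cdln)/\d{4}/\w+\.html$")


def classify_cdli_url(url: str) -> str:
    """Return 'domain', 'journal', 'artifact', 'unstable', or 'not-cdli'.

    Only the first three classes are the address types described in
    the post as safe to cite in print.
    """
    parts = urlsplit(url)
    if parts.hostname != CDLI_HOST:
        return "not-cdli"
    if parts.query:
        # Query-driven pages (such as the old wiki address) are treated
        # as internal, and therefore unstable.
        return "unstable"
    path = parts.path
    if path in ("", "/"):
        return "domain"
    if path == "/pub.html" or JOURNAL_PATH.match(path):
        return "journal"
    if ARTIFACT_PATH.match(path):
        return "artifact"
    return "unstable"


if __name__ == "__main__":
    for candidate in (
        "http://cdli.ucla.edu/P361694",
        "http://cdli.ucla.edu/wiki/doku.php/proto-elamite",
    ):
        print(candidate, "->", classify_cdli_url(candidate))
```

A check of this kind says nothing about whether a page is actually reachable; it only flags citations that fall outside the address forms for which some permanence is claimed.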
I personally would not, for the time being, cite in print any resource in CDLI that makes no claim to at least the level of permanency offered by the individual text addresses and the top-level links to CDLI itself, and to its online journals. All else–and this includes transliteration content–is subject to eventual renaming or decay, or, as should be clear, is a moving target whose content improvements, given our resources, cannot at present be properly time-stamped for purposes of reliable print citation.