
Monday, March 28, 2011
Siva Vaidhyanathan, The Googlization of Everything

Thursday, December 2, 2010
Improving Web Statistics

I've been looking into ways to increase search ranking and web stats for my digital archive. One really helpful presentation I found:
"Search Engine Optimization for Digital Collections" by Kenning Arlitsch, Patrick OBrien, and Sandra McIntyre.
The authors discuss the unique problems of digital libraries and ways to solve them. They explain how to use Google Webmaster Tools to check for webcrawler errors (definitely worth checking out if you run your own website!) and various technical ways to improve indexing.
We've also been brainstorming ways to get more use out of our institution's Twitter and Facebook accounts. They're both pretty active, which is great, but we're mainly just posting "This Day in Cold War History" links. I'd like to get us interacting more with our users, in the hope that doing so will increase our followers and visibility, and thereby the number of people who click through to our actual website.
Related Links:
- Andy Woodworth just put up an interesting post about using Facebook ads: Selling Myself. Literally.
- bit.ly also just held an API creation contest, which lists many cool tools, including Your Twitter Trending Topics. It compares your number of bit.ly clickthroughs with the words in your tweets, helping you see which topics your followers click on the most.
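To make the idea behind that last tool concrete, here is a rough Python sketch of how clickthroughs could be compared with the words in tweets. This is not the tool's actual code, and it does not call the bit.ly API: the sample tweets, click counts, stopword list, and the function name `average_clicks_per_word` are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical sample data: (tweet text, clickthroughs on its shortened link).
tweets = [
    ("This Day in Cold War History: the Berlin Wall goes up", 42),
    ("New scans from the Berlin archives are online", 30),
    ("This Day in Cold War History: Sputnik launches", 12),
]

# Small, made-up stopword list so filler words don't dominate the results.
STOPWORDS = {"this", "in", "the", "from", "are", "a", "of", "and", "to"}


def average_clicks_per_word(tweets, stopwords=STOPWORDS):
    """Return {word: mean clicks of the tweets that contain that word}."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for text, clicks in tweets:
        # Count each word once per tweet, ignoring case and stopwords.
        words = set(re.findall(r"[a-z']+", text.lower())) - stopwords
        for word in words:
            totals[word] += clicks
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


scores = average_clicks_per_word(tweets)

# Show the five words associated with the most clicks on average.
for word, avg in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:5]:
    print(f"{word}: {avg:.1f}")
```

With only a handful of tweets the averages are noisy, so a real version would probably want a minimum number of tweets per word before trusting a score.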
Wednesday, November 24, 2010
'Cause Tonight is the Night

My digital archive uses Dublin Core and I’ve been looking into best practices. This led to the realization that we currently violate one of the central tenets of Dublin Core, the One-to-One Principle:
In general Dublin Core metadata describes one manifestation or version of a resource, rather than assuming that manifestations stand in for one another. For instance, a jpeg image of the Mona Lisa has much in common with the original painting, but it is not the same as the painting. As such the digital image should be described as itself, most likely with the creator of the digital image included as a Creator or Contributor, rather than just the painter of the original Mona Lisa.
Like many cultural heritage projects, my digital archive has cheerfully ignored the One-to-One Principle for years, combining metadata about both the digital file and physical original in a single record. I’m not planning to change this because--abstract principles aside--mixed records make more sense for both our users and our local situation.
In an article on current practice and the One-to-One Principle, Steven Miller of the University of Wisconsin gets to the heart of the problem for me:
…many practitioners, including those who are well aware of the One-to-One principle, come to their digital collection projects with the intent to create records only for their digital resources. They are creating metadata for an online collection of digital resources, not a database or catalog of both their analog holdings and their digitized files.
My archive doesn't even have real physical material (all of our documents are photocopies or scans from other archives), so why go to the trouble of creating two separate records for each item? Not to mention, double records would be a headache if we ever exposed our metadata for aggregators.
In the same article, Miller recommends a compromise solution:
- Follow the One-to-One Principle as much as possible, with the bulk of a record focusing on either the digital or the original.
- Use the source field to explain the relationship between the digital and original versions (e.g., “Digital reproduction of photographic print in the So-and-so Collection, located in the Such-and-such Archive.”).
*One caveat: I’m not crazy about some of Miller’s DC mappings in his examples. For instance, in one he uses the "Contributor" field for the name of the institution holding the original physical document, which I don’t think is right. The holding institution makes much more sense in the Publisher or Relation field. See Arwen Hutt and Jenn Riley, “Semantics and Syntax of Dublin Core Usage in Open Archives Initiative Data Providers of Cultural Heritage Materials,” p. 6.
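Here is a minimal sketch of what a record following that compromise might look like, built with Python's standard library. The record mostly describes the digital file, and the source field carries the relationship to the original. The title and the other field values are placeholders I made up, the bare `<record>` wrapper and the `build_record` helper are hypothetical, and none of this is taken from Miller's article; only the Dublin Core element names and namespace are real.

```python
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Build a flat <record> whose children are simple Dublin Core elements."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return record


# Placeholder values: most fields describe the digital file itself,
# while dc:source explains where the original lives.
fields = [
    ("title", "Photograph of a street scene (digital reproduction)"),
    ("format", "image/jpeg"),
    ("type", "Image"),
    ("source", "Digital reproduction of photographic print in the "
               "So-and-so Collection, located in the Such-and-such Archive."),
]

record = build_record(fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping everything in one record like this also means there is only a single record per item to hand over if the metadata is ever exposed to aggregators.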
Monday, November 22, 2010
PDF/A Link Dump
PDF/A is a new(ish) file format. It’s a long-term archival version of the classic PDF format we all know and love. Basically, it’s the same as regular old PDF, but it’s designed to look exactly the same years from now when you open it on your holographic iPhone, because everything needed to display it (fonts, for example) has to be embedded in the file itself. It should be super easy to implement since the scanning software we currently use, Adobe Acrobat Pro, already has settings for scanning/converting to PDF/A.
- Report from Ohio State University Library that discusses different options for converting documents using Microsoft Word and Adobe Acrobat Pro.
- Great Adobe Acrobat Pro tutorial that explains exactly which features are and aren't PDF/A compliant. (Note: The narrator has a very soothing accent.)
- Uses email to verify attached PDF/A documents.
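For a quick sanity check after converting, here is a small Python sketch that looks for the PDF/A declaration a file makes about itself. PDF/A files record their version in their XMP metadata using the `pdfaid:part` and `pdfaid:conformance` properties, and the sketch simply searches the raw bytes for those. It assumes the metadata is stored as uncompressed, plainly encoded text, and it only reports what the file claims to be; it does not validate anything, so a real validator is still needed. The function name `declared_pdfa_level` and the sample bytes are made up for illustration.

```python
import re

# XMP can write a property as an element (<pdfaid:part>1</pdfaid:part>)
# or as an attribute (pdfaid:part="1"); these patterns accept both forms.
PART_RE = re.compile(rb"pdfaid:part(?:>|\s*=\s*[\"'])\s*(\d)")
CONF_RE = re.compile(rb"pdfaid:conformance(?:>|\s*=\s*[\"'])\s*([A-Za-z])")


def declared_pdfa_level(data: bytes):
    """Return the declared level (e.g. 'PDF/A-1b'), or None if none is found.

    This reads the file's own claim only; it is not a conformance check.
    """
    part = PART_RE.search(data)
    if part is None:
        return None
    level = part.group(1).decode("ascii")
    conformance = CONF_RE.search(data)
    if conformance:
        level += conformance.group(1).decode("ascii").lower()
    return f"PDF/A-{level}"


# Made-up byte snippets standing in for real files.
element_form = (b"<pdfaid:part>1</pdfaid:part>"
                b"<pdfaid:conformance>B</pdfaid:conformance>")
attribute_form = b'<rdf:Description pdfaid:part="2" pdfaid:conformance="U"/>'
plain_pdf = b"%PDF-1.4 ordinary file with no PDF/A declaration"

print(declared_pdfa_level(element_form))
print(declared_pdfa_level(attribute_form))
print(declared_pdfa_level(plain_pdf))
```

To try it on an actual scan, read the file in binary mode and pass the bytes in, for example `declared_pdfa_level(open("scan.pdf", "rb").read())`, where `scan.pdf` is a hypothetical file name.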