Wednesday, September 10, 2008

Week 3 Readings

Identifiers and Their Role in Networked Information Applications

The information on URNs was new for me, but I have encountered PURLs many times when accessing articles from digital libraries. PURLs are very useful for ensuring citations remain valid. I wonder if OCLC is solely responsible for hosting PURLs or if digital libraries are hosting their own PURLs. For instance, when I see a PURL for an EBSCOHost article, is it related to OCLC or to EBSCOHost?


I was also unaware that new Internet protocols will supersede HTTP. Is this happening now?


DOIs present some troubling issues for academic scholarship. Lynch wrote that the advent of DOIs is "likely to mean that the author of the citing work will need to obtain the DOI of the work that he or she wishes to cite either from the owner of the cited work or from some third party, and accessing a citation would then involve interaction with the DOI resolution service, raising privacy and control issues." I imagine this could discourage citations because of the time and effort required on the part of the citing individual. What if the cited author cannot be reached or found? Consider what this would mean for students making routine citations of authors' works. I agree with Lynch that "the act of reference should not rely upon proprietary databases or services."


Digital Object Identifier System

The structure of DOIs seems very well organized. The ingenious lack of rules about DOI length should allow its use far into the future, whereas the finite number of ISBNs and ISSNs is problematic.
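To make the structure concrete for myself, here is a minimal sketch of how a DOI breaks into its two parts: a prefix beginning with "10." and a suffix of any length, separated by the first slash. The function name split_doi and the sample DOI strings are my own illustrations, not something taken from the article.

```python
from typing import Tuple


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash.

    Assumes the basic shape "10.<registrant>/<suffix>". The suffix may be
    any length and may itself contain slashes.
    """
    prefix, sep, suffix = doi.strip().partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError("not a DOI of the form 10.<registrant>/<suffix>: %r" % doi)
    return prefix, suffix


# Made-up DOI used purely for illustration.
prefix, suffix = split_doi("10.1000/a-suffix-of-any-length/with.more.parts")
print(prefix)  # the registrant's prefix
print(suffix)  # everything after the first slash
```

Because nothing in the sketch checks the suffix length, a short suffix and a very long one are handled the same way, which is the open-ended quality I found appealing.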

This article does not mention any of Lynch's concerns. Paskin wrote very recently; perhaps Lynch's issues have since been resolved.


Arms, Chapter 9: Text

Arms wrote in 1999, "Optical character recognition remains an inexact process." This still troubles digital library users today. I read several journal articles a week for my classes in the MLIS program, and it is common to find a handful of mistakes in each article due to poor OCR. Often the context allows me to correct the mistake, but sometimes mistakes lead to confusion and a lack of understanding. I am surprised that articles that must undergo such stringent peer review and editorial scrutiny can then be posted with flaws in expensive subscription databases. It is very interesting that outsourced manual typing has been cheaper than OCR combined with proofreading. I have heard that non-native English speakers are sometimes more accurate at English data entry than native English speakers, because they must pay close attention to each unknown character.

Arms speaks about three approaches to page description: TeX, PostScript and PDF. PDF seems to have cornered the market now; most digital libraries offer articles in PDF format. Have things changed drastically since 1999, or are TeX and PostScript still being used?

1 comment:

Coral said...

I feel like PostScript is on its way out, but that's just my impression... As far as I know, TeX is used only by the most particular of folks; a coworker of mine used to write his papers that way and compile them straight to PDF. The resulting file looked so much nicer than an MS Word-to-PDF conversion that I was actually pretty sold on that approach, myself.

I never got awesome at using it, though.