Oh No, Not You Again!
Oh dear. Yesterday’s post “Using ISO URNs” was way off the mark. I don’t know. I thought that walk after lunch had cleared my mind. But apparently not. I guess I was fixated on eyeballing the result in RDF/N3 rather than on the logic needed to arrive at that result.
There are three namespace cases (and I was only wrong in two out of the three, I think):
I was originally going to suggest the use of “data:” for the PDF information dictionary terms here but then lunged at using an HTTP URI (the URI of the page for the PDF Reference manual on the Adobe site) for regular orthodox conformancy and good churchgoing:
@prefix pdf: <http://www.adobe.com/devnet/pdf/pdf_reference.html> .
This was wrong on two counts:
a) Afaik no such use for this URI as a namespace has ever been made by Adobe. And it is in the gift of the DNS tenant (elsewhere called “owner”) to mint URIs under that namespace and to ascribe meanings to those URIs.
b) Also the URI is not best suited to a role as namespace URI since RDF namespaces typically end in “/” or “#” to make the division between namespace and term clearer. (In XML it doesn’t make a blind bit of difference as XML namespaces are just a scoping mechanism.) So to have a property URI as
does the job but looks pretty rough and more importantly precludes (at least, complicates) the possibility of dereferencing the URI to return a page with human or machine readable semantics. Better in RDF terms is one of the following:
In the absence of any published namespace from Adobe for these terms, I think it would have been more prudent to fall back on “data:” URIs. So
@prefix pdf: <data:,> .
This is correct (afaict) and merely provides a URI representation for bare strings.
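For what it's worth, here is a minimal Python sketch of that string-to-URI mapping. The helper names `to_data_uri` and `from_data_uri` are mine (hypothetical, not taken from any spec or library), and I'm assuming the simplest form of “data:” URI: empty media type, no base64, with the string percent-encoded.

```python
from urllib.parse import quote, unquote

DATA_PREFIX = "data:,"  # empty media type, no ";base64" flag


def to_data_uri(term: str) -> str:
    """Wrap a bare string in a 'data:' URI (hypothetical helper)."""
    # Percent-encode everything outside the unreserved set so that
    # spaces, '#', '/' and friends cannot be misread as URI syntax.
    return DATA_PREFIX + quote(term, safe="")


def from_data_uri(uri: str) -> str:
    """Recover the bare string from a URI built by to_data_uri."""
    if not uri.startswith(DATA_PREFIX):
        raise ValueError(f"not a plain data: URI: {uri!r}")
    return unquote(uri[len(DATA_PREFIX):])


print(to_data_uri("Author"))             # data:,Author
print(to_data_uri("Apag_PDFX_Checkup"))  # data:,Apag_PDFX_Checkup
print(from_data_uri("data:,Trapped"))    # Trapped
```

Percent-encoding “#” sidesteps the fragment question for now, at the cost of not being able to use fragments at all.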
Had we wanted to relate those terms to the PDF Reference we might have tried something like:
And if we had wanted to make those truly secondary RDF resources related to a primary RDF resource for the “namespace” we could have attempted something like:
Note though that the “data:” specification is not clear about the implications of using “#”. (Is it allowed, or isn’t it?) We must suspect that it is not allowed, but see this mail from Chris Lilley (W3C), which is most insightful.
The example was just for demo purposes, but (as per 1a above) it is incumbent on the namespace authority (here ISO) to publish a URI for the term to be used. Anyhow, the namespace URI I cited
@prefix pdfx: <urn:iso:std:iso-iec:15930:-1:2001> .
would not have been correct and would have led to these mangled URIs:
It should have been something closer to
@prefix pdfx: <urn:iso:std:iso-iec:15930:-1:2001:> .
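To see why the trailing colon matters: a prefixed name in N3 is expanded by plain concatenation of the namespace and the term. A small Python sketch makes the difference visible. The function name `expand` is just illustrative, and whether ISO would ever mint a URN of this shape for the term is a separate question.

```python
def expand(namespace: str, term: str) -> str:
    """Expand a prefixed name the way N3 does: plain concatenation."""
    return namespace + term


TERM = "GTS_PDFXVersion"

# Namespace as cited yesterday, with no trailing separator.
print(expand("urn:iso:std:iso-iec:15930:-1:2001", TERM))
# urn:iso:std:iso-iec:15930:-1:2001GTS_PDFXVersion

# Namespace with a trailing ':' so the term becomes its own URN segment.
print(expand("urn:iso:std:iso-iec:15930:-1:2001:", TERM))
# urn:iso:std:iso-iec:15930:-1:2001:GTS_PDFXVersion
```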
This was the one correct call in yesterday’s post.
@prefix _usr: <data:,> .
The only problem here would be to differentiate these terms from the terms listed in the PDF Reference manual, although the PDF information dictionary makes no such distinction itself.
To sum up, perhaps the best way of rendering the PDF information dictionary keys in RDF would be to use “data:” URIs for all (i.e. a methodology for URI-ifying strings) and to bear in mind that at some point ISO might publish URNs for the PDF/X-mandated keys ‘GTS_PDFXVersion’ and ‘GTS_PDFXConformance’. So,
# document infodict (object 58: 476983):
@prefix pdfx: <data:,> .
@prefix pdf: <data:,> .
@prefix _usr: <data:,> .
<> _usr:Apag_PDFX_Checkup "1.3";
pdf:Author "Scott B. Tully";
pdf:Producer "Acrobat Distiller 4.05 for Macintosh";
pdf:Subject "A document from our PDF archive. ";
pdf:Title "Tully Talk November 2001";
pdf:Trapped "False" .
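If you wanted to generate that kind of N3 from an information dictionary rather than type it by hand, a rough Python sketch might look like the following. The function name `infodict_to_n3` is hypothetical, it binds a single prefix to “data:,”, and it assumes every key is already a legal N3 local name (true for the keys above, not guaranteed in general).

```python
def infodict_to_n3(info: dict, prefix: str = "pdf") -> str:
    """Render a flat dict of info dictionary keys as N3 (hypothetical helper)."""

    def literal(value: str) -> str:
        # Escape backslashes first, then double quotes, for an N3 string.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    # One prefix declaration, bound to the bare-string namespace.
    lines = [f"@prefix {prefix}: <data:,> ."]
    # One predicate/object pair per key, all hanging off the document <>.
    pairs = [f"{prefix}:{key} {literal(value)}" for key, value in info.items()]
    lines.append("<> " + " ;\n    ".join(pairs) + " .")
    return "\n".join(lines)


print(infodict_to_n3({
    "Author": "Scott B. Tully",
    "Title": "Tully Talk November 2001",
    "Trapped": "False",
}))
```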