Re: Library of Congress Subject Headings as SKOS Linked Data from Mark Birbeck on 2008-06-17 (public-lod@w3.org from June 2008)

From: Mark Birbeck <mark.birbeck@webbackplane.com>
Date: Tue, 17 Jun 2008 20:18:21 +0100
To: "Richard Cyganiak" <richard@cyganiak.de>
Cc: "Hausenblas, Michael" <michael.hausenblas@joanneum.at>, "Ed Summers" <ehs@pobox.com>, "SWD Working SWD" <public-swd-wg@w3.org>, public-lod@w3.org, "RDFa mailing list" <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <ed77aa9f0806171218h1d99ecedoab356604691c7558@mail.gmail.com>
Hi Richard,

This is something that comes up a lot in 'compound' documents, such as
those containing XForms and SVG.

Many moons ago I did some work on this, and suggested using profile in
the way you have, but letting the value carry the 'hasFeature' strings
used in the DOM. For example, saying a UA supports XForms 1.0 and any
version of RDFa would look like this:

  application/xhtml+xml;profile=xforms 1.0 rdfa;q=1.0

I've never got round to writing it up though, but as you can see, it's
pretty much the same as the conclusion you have come to.
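
A minimal sketch of how a server might read such a header, purely for illustration. The helper names parse_accept and profile_features are hypothetical, and the rule that a number following a feature name is its version is an assumption taken from the example above, not from any spec. Note that on the wire a parameter value containing spaces has to be sent as a quoted string, so the sketch uses profile="xforms 1.0 rdfa".

```python
import re

# Matches a separator only when it is outside double quotes
# (i.e. followed by an even number of quote characters).
_OUTSIDE_QUOTES = r'(?=(?:[^"]*"[^"]*")*[^"]*$)'


def parse_accept(header):
    """Split an Accept header into (media_type, params) pairs."""
    entries = []
    for part in re.split("," + _OUTSIDE_QUOTES, header):
        pieces = [p.strip() for p in re.split(";" + _OUTSIDE_QUOTES, part)]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        params = {}
        for piece in pieces[1:]:
            name, _, value = piece.partition("=")
            params[name.strip().lower()] = value.strip().strip('"')
        entries.append((media_type, params))
    return entries


def profile_features(params):
    """Return the profile tokens as (feature, version-or-None) pairs."""
    tokens = params.get("profile", "").split()
    features = []
    i = 0
    while i < len(tokens):
        name, version = tokens[i].lower(), None
        # Assumption: a token that looks like a version number
        # belongs to the feature name just before it.
        if i + 1 < len(tokens) and re.fullmatch(r"\d+(\.\d+)*", tokens[i + 1]):
            version = tokens[i + 1]
            i += 1
        features.append((name, version))
        i += 1
    return features


header = ('application/xhtml+xml;profile="xforms 1.0 rdfa";q=1.0, '
          'application/rdf+xml;q=0.8')
for media_type, params in parse_accept(header):
    print(media_type, profile_features(params), params.get("q", "1.0"))
```

A feature with no version, like rdfa here, is read as "any version", which matches the hasFeature convention of passing no version string.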

The Compound Document Format Working Group takes a slightly different
approach: they define fixed profiles that contain a prescribed set of
features, and a UA that supports the entire profile announces it with
a URI.

This is introduced here:

  <http://www.w3.org/TR/2007/CR-CDR-20070718/#identification>

and a full example is here:

  <http://www.w3.org/TR/WICDFull/#identification>

Obviously I prefer 'our' approach, since I just hate the idea of
having centralised profiles that you have to wait for someone to
administer.

But either way (using full URIs or something like the 'hasFeature'
strings), neither of the approaches includes an identifier for RDFa,
so it's something we should think about getting down somewhere.
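
To make the stakes concrete, here is a rough sketch of the server-side decision Richard describes in the quoted message below, assuming the identifier ends up being the token "rdfa" in a profile parameter. That token, the function names, and the exact tie-breaking policy are all assumptions for illustration; nothing here is specified anywhere. The parser is deliberately naive (no wildcards, no ';' or ',' inside quoted values).

```python
XHTML = "application/xhtml+xml"
RDFXML = "application/rdf+xml"


def parse_entry(entry):
    """Parse one Accept entry into (media_type, q, feature_set)."""
    pieces = [p.strip() for p in entry.split(";")]
    q, features = 1.0, set()
    for piece in pieces[1:]:
        name, _, value = piece.partition("=")
        name, value = name.strip().lower(), value.strip().strip('"')
        if name == "q":
            q = float(value)
        elif name == "profile":
            features = {token.lower() for token in value.split()}
    return pieces[0].lower(), q, features


def choose_variant(accept_header):
    """Pick between an XHTML+RDFa page and an RDF/XML document."""
    rdfa_q = rdf_q = 0.0
    for entry in accept_header.split(","):
        if not entry.strip():
            continue
        media_type, q, features = parse_entry(entry)
        if media_type == XHTML and "rdfa" in features:
            rdfa_q = max(rdfa_q, q)
        elif media_type == RDFXML:
            rdf_q = max(rdf_q, q)
    # A client that announces RDFa gets the styled page with embedded data.
    if rdfa_q > 0 and rdfa_q >= rdf_q:
        return XHTML
    # Otherwise bias towards data: any RDF/XML support wins over plain XHTML.
    if rdf_q > 0:
        return RDFXML
    return XHTML


# Tabulator's header as quoted below, then Richard's proposed header.
print(choose_variant("application/xhtml+xml;q=1.0, application/rdf+xml;q=0.8"))
print(choose_variant("application/xhtml+xml;profile=rdfa;q=1.0, "
                     "application/rdf+xml;q=0.8"))
```

Without some agreed token the first branch can never fire, which is exactly why RDFa-capable clients would end up with RDF/XML.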

Regards,

Mark

On Tue, Jun 17, 2008 at 4:03 PM, Richard Cyganiak <richard@cyganiak.de> wrote:
>
> Michael,
>
> On 11 Jun 2008, at 09:25, Hausenblas, Michael wrote:
>>>
>>> (I'm ignoring the availability of RDFa in my argument --
>>> unfortunately there is no way for a client to indicate that it supports
>>> RDFa AFAIK, so it cannot really be factored into the content
>>> negotiation equation.)
>>
>> As you may guess I have a comment regarding this one ;)
>>
>> I'm not sure if I understand this correctly ('no way for a client to
>> indicate that it supports RDFa') but, though RDFa uses
>> 'application/xhtml+xml' MIME type, there are several ways to
>> declare/detect RDFa content.
>
> Yes, I know. The problem I was getting at is a different one: There is no
> way for a *client* to *announce* that it supports RDFa.
>
> When doing content negotiation, a server has to look at the Accept header
> sent by the client, and make a decision about which variant to send.
> Unfortunately, there is no way for the server to distinguish a plain old Web
> browser from an RDFa-capable client.
>
> What is a server to do if it has some really nice RDF data, but also a
> simple XHTML rendering of the RDF with embedded RDFa? If the client is
> RDFa-aware, then sending the XHTML page is probably best, because it gives
> the client the best of two worlds. The client can decide whether to display
> the styled XHTML or whether to treat it as data and e.g. show it in a
> tabular/faceted fashion.
>
> But if the client is a Tabulator-style browser without RDFa support, then
> the server should send the RDF/XML to let the client get all the data
> goodness.
>
> Tabulator sends this in its Accept header:
> application/xhtml+xml;q=1.0, application/rdf+xml;q=0.8
>
> That's reasonable. It's a Web browser, so it can deal perfectly with XHTML.
> It also includes a data browser which is not quite as sophisticated, so a
> preference of 0.8 for RDF/XML is fine. So our hypothetical server should
> send RDF/XML in this case.
>
> How would a client announce that it is a Web browser (perfect XHTML
> support), that it also can deal reasonably well with RDF/XML, but that it
> would really prefer to get the best of both worlds if the server has RDFa?
> Which Accept header should make our server respond with RDFa?
>
> I don't know, because there is no way to announce RDFa support in Accept
> headers.
>
> Hence, RDFa-capable browsers will probably often get served RDF/XML, because
> they don't tell the server that they would prefer an RDFa-enabled styled
> HTML page over an unstyled RDF/XML page.
>
> I might propose something like
>
> application/xhtml+xml;profile=rdfa;q=1.0
>
> in the Accept header, but am not sure about unintended side effects.
>
> Best,
> Richard
>
>
>
>> All my statements are based on the latest
>> RDFa syntax CR document at [1].
>>
>> First (not preferred by some people ;) you could/should use the type
>> declaration (<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
>> "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">). Further, people are
>> encouraged to use the according @profile (<head
>> profile="http://www.w3.org/1999/xhtml/vocab">) and/or the version
>> attribute (<html xmlns="http://www.w3.org/1999/xhtml"
>> version="XHTML+RDFa 1.0" ... />).
>>
>> Finally, the W3C TAG also addresses this issue, see [2].
>>
>> Please feel free to discuss these issues also at our RDFa community Wiki
>> (e.g. at [3]).
>>
>> Cheers,
>>        Michael
>>
>> [1] http://www.w3.org/MarkUp/2008/CR-rdfa-syntax-20080612/#s_conformance
>> [2] http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html#UsingRDFa
>> [3] http://rdfa.info/wiki/Tutorials#Good_practice
>>
>> ----------------------------------------------------------
>> Michael Hausenblas, MSc.
>> Institute of Information Systems & Information Management
>> JOANNEUM RESEARCH Forschungsgesellschaft mbH
>>
>> http://www.joanneum.at/iis/
>> ----------------------------------------------------------
>>
>>
>>> -----Original Message-----
>>> From: public-swd-wg-request@w3.org
>>> [mailto:public-swd-wg-request@w3.org] On Behalf Of Richard Cyganiak
>>> Sent: Wednesday, June 11, 2008 1:18 AM
>>> To: Ed Summers
>>> Cc: SWD Working SWD; public-lod@w3.org
>>> Subject: Re: Library of Congress Subject Headings as SKOS Linked Data
>>>
>>>
>>> Ed,
>>>
>>> A very cool service, and exemplary attention to detail!
>>>
>>> Of course, I still have a few suggestions! I haven't read through the
>>> entire thread, so apologies if some of this was mentioned already.
>>>
>>> (I saw 303s being mentioned in the thread -- you are doing things the
>>> right way, there's no need to do 303s at <sh95000541>. It is an
>>> information resource and therefore 200 is fine. The concept is
>>> <sh95000541#concept>, a URI that cannot be directly dereferenced via
>>> HTTP, so you are consistent with httpRange-14, as explained in the
>>> Cool URIs document. This is one of the nice things about hash URIs.)
>>>
>>> 1. The content-negotiated URI should send a "Vary: Accept" header.
>>> This helps caches to deal correctly with content-negotiated resources.
>>>
>>> 2. The correct MIME type for N3 is "text/rdf+n3;charset=utf-8", not
>>> "text/n3". (I think the spec used to recommend text/n3, but has been
>>> changed some time ago.)
>>>
>>> 3. I would suggest adding a few triples to the RDF/XML and N3
>>> versions, to link the generic document to its variants, and the
>>> generic document to the concept. Example (choose your own favourite
>>> properties):
>>>
>>> <sh95000541> foaf:primaryTopic <sh95000541#concept> .
>>> <sh95000541> dcterms:hasFormat <sh95000541.rdf> .
>>> <sh95000541> dcterms:hasFormat <sh95000541.n3> .
>>> <sh95000541> dcterms:hasFormat <sh95000541.json> .
>>> <sh95000541> dcterms:hasFormat <sh95000541.html> .
>>>
>>> This helps RDF browsers to relate all those resources.
>>>
>>> 4. The content negotiation could benefit from a little bit of
>>> tweaking. You correctly handle q values, which is great. It would be
>>> even better if there was a slight bias towards the non-HTML formats. I
>>> would argue that the data variants are quite a bit more useful than
>>> the HTML variant, as RDF-aware clients can do all sorts of cool stuff
>>> with the RDF that is not possible with HTML. Therefore, a client that
>>> indicates identical preference for HTML and RDF/XML should be served
>>> RDF/XML. FWIW, Tabulator has a preference of 1.0 for XHTML and 0.8 for
>>> RDF/XML. It would be great if your algorithm would return RDF/XML in
>>> this case.
>>>
>>> (I'm ignoring the availability of RDFa in my argument --
>>> unfortunately there is no way for a client to indicate that it supports
>>> RDFa AFAIK, so it cannot really be factored into the content
>>> negotiation equation.)
>>>
>>> 5. Ideally, you would add the skos:prefLabels of all related concepts
>>> to the RDF output. This would support navigation in RDF browsers.
>>>
>>> Again, great work!
>>>
>>> Best,
>>> Richard
>>>
>>>
>>> On 9 Jun 2008, at 14:54, Ed Summers wrote:
>>>
>>>>
>>>> I'd like to announce an experimental linked-data, SKOS representation
>>>> of the Library of Congress Subject Headings (LCSH) [1] ... and also
>>>> ask for some help.
>>>>
>>>> The Library of Congress has been participating in the W3C Semantic Web
>>>> Deployment Working Group, and has converted LCSH from the MARC21 data
>>>> format [2] to SKOS. LCSH is a controlled vocabulary used to index
>>>> materials that have been added to the collections at the Library of
>>>> Congress. It has been in active development since 1898, and was first
>>>> published in 1914 so that other libraries and bibliographic utilities
>>>> could use and adapt it. The lcsh.info service makes 266,857 subject
>>>> headings available as SKOS concepts, which amounts to 2,441,494
>>>> triples that are separately downloadable [3] (since there isn't a
>>>> SPARQL endpoint just yet).
>>>>
>>>> At the last SWDWG telecon some questions came up about the way
>>>> concepts are identified, and made available via HTTP. Since we're
>>>> hoping lcsh.info can serve as an implementation of SKOS for the W3C
>>>> recommendation process we want to make sure we do this right. So I was
>>>> hoping interested members of the linked-data and SKOS communities
>>>> could take a look and make sure the implementation looks correct.
>>>>
>>>> Each concept is identified with a URI like:
>>>>
>>>> http://lcsh.info/sh95000541#concept
>>>>
>>>> When responding to requests for concept URIs, the server content
>>>> negotiates to determine which representation of the concept to return:
>>>>
>>>> - application/xhtml+xml
>>>> - application/json
>>>> - text/n3
>>>> - application/rdf+xml
>>>>
>>>> This is basically the pattern that Cool URIs for the Semantic Web
>>>> discusses as the Hash URI with Content Negotiation [4]. An additional
>>>> point that is worth mentioning is that the XHTML representation
>>>> includes RDFa, that also describes the concept.
>>>>
>>>> At the moment the LCSH/SKOS data is only linked to itself, through
>>>> assertions that involve skos:broader, skos:narrower, and skos:related.
>>>> But the hope is that minting URIs for LCSH will allow it to be mapped
>>>> and/or linked to concepts in other vocabularies: dbpedia, geonames,
>>>> etc.
>>>>
>>>> Any feedback, criticisms, ideas are welcome on either the
>>>> public-lod [5] or public-swd-wg [6] discussion lists.
>>>>
>>>> Thanks for reading this far!
>>>> //Ed
>>>>
>>>> [1] http://lcsh.info
>>>> [2] http://www.loc.gov/marc/
>>>> [3] http://lcsh.info/static/lcsh.nt
>>>> [4] http://www.w3.org/TR/cooluris/#hashuri
>>>> [5] http://lists.w3.org/Archives/Public/public-lod/
>>>> [6] http://lists.w3.org/Archives/Public/public-swd-wg/
>>>>

-- 
Mark Birbeck, webBackplane

mark.birbeck@webBackplane.com

http://webBackplane.com/mark-birbeck

webBackplane is a trading name of Backplane Ltd. (company number
05972288, registered office: 2nd Floor, 69/85 Tabernacle Street,
London, EC2A 4RR)
Received on Tuesday, 17 June 2008 19:19:04 UTC