Re: Library of Congress Subject Headings as SKOS Linked Data from Richard Cyganiak on 2008-06-17 (public-swd-wg@w3.org from June 2008)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Tue, 17 Jun 2008 11:03:18 -0400
To: "Hausenblas, Michael" <michael.hausenblas@joanneum.at>
Cc: "Ed Summers" <ehs@pobox.com>, "SWD Working SWD" <public-swd-wg@w3.org>, <public-lod@w3.org>, "RDFa mailing list" <public-rdf-in-xhtml-tf@w3.org>
Message-Id: <B75D4AA5-031A-4BB0-A34E-CC1DEF19E4F3@cyganiak.de>
Michael,

On 11 Jun 2008, at 09:25, Hausenblas, Michael wrote:
>> (I'm ignoring the availability of RDFa in my argument --
>> unfortunately there is no way for a client to indicate that it
>> supports RDFa AFAIK, so it cannot really be factored into the
>> content negotiation equation.)
>
> As you may guess I have a comment regarding this one ;)
>
> I'm not sure if I understand this correctly ('no way for a client to
> indicate that it supports RDFa') but, though RDFa uses
> 'application/xhtml+xml' MIME type, there are several ways to
> declare/detect RDFa content.

Yes, I know. The problem I was getting at is a different one: There is  
no way for a *client* to *announce* that it supports RDFa.

When doing content negotiation, a server has to look at the Accept  
header sent by the client, and make a decision about which variant to  
send. Unfortunately, there is no way for the server to distinguish a  
plain old Web browser from an RDFa-capable client.

What is a server to do if it has some really nice RDF data, but also a  
simple XHTML rendering of the RDF with embedded RDFa? If the client is  
RDFa-aware, then sending the XHTML page is probably best, because it  
gives the client the best of both worlds. The client can decide whether
to display the styled XHTML or whether to treat it as data and e.g. show
it in a tabular/faceted fashion.

But if the client is a Tabulator-style browser without RDFa support,  
then the server should send the RDF/XML to let the client get all the  
data goodness.

Tabulator sends this in its Accept header:
application/xhtml+xml;q=1.0, application/rdf+xml;q=0.8

That's reasonable. It's a Web browser, so it can deal perfectly with  
XHTML. It also includes a data browser which is not quite as  
sophisticated, so a preference of 0.8 for RDF/XML is fine. So our  
hypothetical server should send RDF/XML in this case.
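
To illustrate the decision above, here is a minimal sketch of server-side negotiation with a slight bias towards the data variant. The function names and the server-side quality values are assumptions made for this example, not taken from any real server, and wildcard ranges such as */* are ignored for brevity.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_type, q) pairs."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs.append((media_type, q))
    return prefs


# Hypothetical server-side quality factors: the RDF/XML variant is
# favoured slightly over the XHTML rendering.
SERVER_QUALITY = {
    "application/rdf+xml": 1.0,
    "application/xhtml+xml": 0.7,
}


def choose_variant(header, server_quality=SERVER_QUALITY):
    """Return the variant with the highest client q * server quality."""
    client = dict(parse_accept(header))
    best, best_score = None, 0.0
    for media_type, quality in server_quality.items():
        score = client.get(media_type, 0.0) * quality
        if score > best_score:
            best, best_score = media_type, score
    return best
```

With Tabulator's header, XHTML scores 1.0 * 0.7 and RDF/XML scores 0.8 * 1.0, so this sketch picks RDF/XML. The 0.7 is an arbitrary choice; any server-side factor below 0.8 for XHTML gives the same result.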

How would a client announce that it is a Web browser (perfect XHTML  
support), that it also can deal reasonably well with RDF/XML, but that  
it would really prefer to get the best of both worlds if the server  
has RDFa? Which Accept header should make our server respond with RDFa?

I don't know, because there is no way to announce RDFa support in  
Accept headers.

Hence, RDFa-capable browsers will probably often get served RDF/XML,  
because they don't tell the server that they would prefer an RDFa- 
enabled styled HTML page over an unstyled RDF/XML page.

I might propose something like

application/xhtml+xml;profile=rdfa;q=1.0

in the Accept header, but am not sure about unintended side effects.
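
As a rough sketch of how a server might act on such a header: the helper below checks whether any application/xhtml+xml range carries the proposed profile=rdfa parameter. That parameter is only the suggestion made above and is not a registered or standardised media type parameter; the function name is likewise hypothetical.

```python
def accepts_rdfa(header):
    """Return True if an application/xhtml+xml range carries profile=rdfa."""
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if fields[0].lower() != "application/xhtml+xml":
            continue
        for field in fields[1:]:
            name, _, value = field.partition("=")
            # "profile=rdfa" is the hypothetical parameter proposed in this mail.
            if (name.strip().lower() == "profile"
                    and value.strip().strip('"').lower() == "rdfa"):
                return True
    return False
```

A server could then send the RDFa-enabled XHTML page when accepts_rdfa(...) is true and fall back to ordinary q-value negotiation otherwise.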

Best,
Richard



> All my statements are based on the latest
> RDFa syntax CR document at [1].
>
> First (not preferred by some people ;) you could/should use the type
> declaration (<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
> "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">). Further, people are
> encouraged to use the corresponding @profile (<head
> profile="http://www.w3.org/1999/xhtml/vocab">) and/or the version
> attribute (<html xmlns="http://www.w3.org/1999/xhtml"
> version="XHTML+RDFa 1.0" ... />).
>
> Finally, the W3C TAG also addresses this issue, see [2].
>
> Please feel free to discuss these issues also at our RDFa community
> Wiki (e.g. at [3]).
>
> Cheers,
> 	Michael
>
> [1] http://www.w3.org/MarkUp/2008/CR-rdfa-syntax-20080612/#s_conformance
> [2] http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html#UsingRDFa
> [3] http://rdfa.info/wiki/Tutorials#Good_practice
>
> ----------------------------------------------------------
> Michael Hausenblas, MSc.
> Institute of Information Systems & Information Management
> JOANNEUM RESEARCH Forschungsgesellschaft mbH
>
> http://www.joanneum.at/iis/
> ----------------------------------------------------------
>
>
>> -----Original Message-----
>> From: public-swd-wg-request@w3.org
>> [mailto:public-swd-wg-request@w3.org] On Behalf Of Richard Cyganiak
>> Sent: Wednesday, June 11, 2008 1:18 AM
>> To: Ed Summers
>> Cc: SWD Working SWD; public-lod@w3.org
>> Subject: Re: Library of Congress Subject Headings as SKOS Linked Data
>>
>>
>> Ed,
>>
>> A very cool service, and exemplary attention to detail!
>>
>> Of course, I still have a few suggestions! I haven't read through the
>> entire thread, so apologies if some of this was mentioned already.
>>
>> (I saw 303s being mentioned in the thread -- you are doing things the
>> right way, there's no need to do 303s at <sh95000541>. It is an
>> information resource and therefore 200 is fine. The concept is
>> <sh95000541#concept>, a URI that cannot be directly dereferenced via
>> HTTP, so you are consistent with httpRange-14, as explained in the
>> Cool URIs document. This is one of the nice things about hash URIs.)
>>
>> 1. The content-negotiated URI should send a "Vary: Accept" header.
>> This helps caches to deal correctly with content-negotiated resources.
>>
>> 2. The correct MIME type for N3 is "text/rdf+n3;charset=utf-8", not
>> "text/n3". (I think the spec used to recommend text/n3, but has been
>> changed some time ago.)
>>
>> 3. I would suggest adding a few triples to the RDF/XML and N3
>> versions, to link the generic document to its variants, and the
>> generic document to the concept. Example (choose your own favourite
>> properties):
>>
>> <sh95000541> foaf:primaryTopic <sh95000541#concept> .
>> <sh95000541> dcterms:format <sh95000541.rdf> .
>> <sh95000541> dcterms:format <sh95000541.n3> .
>> <sh95000541> dcterms:format <sh95000541.json> .
>> <sh95000541> dcterms:format <sh95000541.html> .
>>
>> This helps RDF browsers to relate all those resources.
>>
>> 4. The content negotiation could benefit from a little bit of
>> tweaking. You correctly handle q values, which is great. It would be
>> even better if there was a slight bias towards the non-HTML formats. I
>> would argue that the data variants are quite a bit more useful than
>> the HTML variant, as RDF-aware clients can do all sorts of cool stuff
>> with the RDF that is not possible with HTML. Therefore, a client that
>> indicates identical preference for HTML and RDF/XML should be served
>> RDF/XML. FWIW, Tabulator has a preference of 1.0 for XHTML and 0.8 for
>> RDF/XML. It would be great if your algorithm would return RDF/XML in
>> this case.
>>
>> (I'm ignoring the availability of RDFa in my argument --
>> unfortunately there is no way for a client to indicate that it
>> supports RDFa AFAIK, so it cannot really be factored into the
>> content negotiation equation.)
>>
>> 5. Ideally, you would add the skos:prefLabels of all related concepts
>> to the RDF output. This would support navigation in RDF browsers.
>>
>> Again, great work!
>>
>> Best,
>> Richard
>>
>>
>> On 9 Jun 2008, at 14:54, Ed Summers wrote:
>>
>>>
>>> I'd like to announce an experimental linked-data, SKOS representation
>>> of the Library of Congress Subject Headings (LCSH) [1] ... and also
>>> ask for some help.
>>>
>>> The Library of Congress has been participating in the W3C Semantic Web
>>> Deployment Working Group, and has converted LCSH from the MARC21 data
>>> format [2] to SKOS. LCSH is a controlled vocabulary used to index
>>> materials that have been added to the collections at the Library of
>>> Congress. It has been in active development since 1898, and was first
>>> published in 1914 so that other libraries and bibliographic utilities
>>> could use and adapt it. The lcsh.info service makes 266,857 subject
>>> headings available as SKOS concepts, which amounts to 2,441,494
>>> triples that are separately downloadable [3] (since there isn't a
>>> SPARQL endpoint just yet).
>>>
>>> At the last SWDWG telecon some questions came up about the way
>>> concepts are identified, and made available via HTTP. Since we're
>>> hoping lcsh.info can serve as an implementation of SKOS for the W3C
>>> recommendation process we want to make sure we do this right. So I was
>>> hoping interested members of the linked-data and SKOS communities
>>> could take a look and make sure the implementation looks correct.
>>>
>>> Each concept is identified with a URI like:
>>>
>>> http://lcsh.info/sh95000541#concept
>>>
>>> When responding to requests for concept URIs, the server content
>>> negotiates to determine which representation of the concept to return:
>>>
>>> - application/xhtml+xml
>>> - application/json
>>> - text/n3
>>> - application/rdf+xml
>>>
>>> This is basically the pattern that Cool URIs for the Semantic Web
>>> discusses as the Hash URI with Content Negotiation [4]. An additional
>>> point that is worth mentioning is that the XHTML representation
>>> includes RDFa, that also describes the concept.
>>>
>>> At the moment the LCSH/SKOS data is only linked to itself, through
>>> assertions that involve skos:broader, skos:narrower, and skos:related.
>>> But the hope is that minting URIs for LCSH will allow it to be mapped
>>> and/or linked to concepts in other vocabularies: dbpedia, geonames,
>>> etc.
>>>
>>> Any feedback, criticisms, ideas are welcome on either the
>>> public-lod [5] or public-swd-wg [6] discussion lists.
>>>
>>> Thanks for reading this far!
>>> //Ed
>>>
>>> [1] http://lcsh.info
>>> [2] http://www.loc.gov/marc/
>>> [3] http://lcsh.info/static/lcsh.nt
>>> [4] http://www.w3.org/TR/cooluris/#hashuri
>>> [5] http://lists.w3.org/Archives/Public/public-lod/
>>> [6] http://lists.w3.org/Archives/Public/public-swd-wg/
>>>
>>
>>
>>
Received on Tuesday, 17 June 2008 15:04:04 UTC