Re: Library of Congress Subject Headings as SKOS Linked Data from Ivan Herman on 2008-06-18 (public-lod@w3.org from June 2008)

From: Ivan Herman <ivan@w3.org>
Date: Wed, 18 Jun 2008 06:54:11 +0200
To: Richard Cyganiak <richard@cyganiak.de>
CC: "Hausenblas, Michael" <michael.hausenblas@joanneum.at>, Ed Summers <ehs@pobox.com>, SWD Working SWD <public-swd-wg@w3.org>, public-lod@w3.org, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <485894F3.8060909@w3.org>
I know it is not an ideal solution, but I use an Apache setup pattern 
for my XHTML+RDFa documents:

http://www.w3.org/QA/2008/05/using_rdfa_to_add_information.html

It was done as an answer to Tim's Tabulator issues...

Ivan

Richard Cyganiak wrote:
> 
> Michael,
> 
> On 11 Jun 2008, at 09:25, Hausenblas, Michael wrote:
>>> (I'm ignoring the availability of RDFa in my argument --
>>> unfortunately there is no way for a client to indicate that it
>>> supports RDFa AFAIK, so it cannot really be factored into the
>>> content negotiation equation.)
>>
>> As you may guess I have a comment regarding this one ;)
>>
>> I'm not sure if I understand this correctly ('no way for a client to
>> indicate that it supports RDFa') but, though RDFa uses
>> 'application/xhtml+xml' MIME type, there are several ways to
>> declare/detect RDFa content.
> 
> Yes, I know. The problem I was getting at is a different one: There is 
> no way for a *client* to *announce* that it supports RDFa.
> 
> When doing content negotiation, a server has to look at the Accept 
> header sent by the client, and make a decision about which variant to 
> send. Unfortunately, there is no way for the server to distinguish a 
> plain old Web browser from an RDFa-capable client.
> 
> What is a server to do if it has some really nice RDF data, but also a 
> simple XHTML rendering of the RDF with embedded RDFa? If the client is 
> RDFa-aware, then sending the XHTML page is probably best, because it 
> gives the client the best of two worlds. The client can decide whether to 
> display the styled XHTML or whether to treat it as data and e.g. show it 
> in a tabular/faceted fashion.
> 
> But if the client is a Tabulator-style browser without RDFa support, 
> then the server should send the RDF/XML to let the client get all the 
> data goodness.
> 
> Tabulator sends this in its Accept header:
> application/xhtml+xml;q=1.0, application/rdf+xml;q=0.8
> 
> That's reasonable. It's a Web browser, so it can deal perfectly with 
> XHTML. It also includes a data browser which is not quite as 
> sophisticated, so a preference of 0.8 for RDF/XML is fine. So our 
> hypothetical server should send RDF/XML in this case.
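A minimal sketch of the server-side decision described above, assuming
a server that multiplies each client q value by its own per-variant
weight (similar in spirit to Apache's "qs" source quality). The names
SERVER_WEIGHTS and choose_variant and the weight 0.7 are hypothetical,
chosen only so that the Tabulator header quoted above yields RDF/XML:

```python
def parse_accept(header):
    """Return a list of (media_range, q) pairs from an Accept header value."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1.0 when the client omits it
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_range, q))
    return prefs


# Hypothetical server-side weights: the data variant is weighted
# slightly higher than the XHTML variant.
SERVER_WEIGHTS = {
    "application/rdf+xml": 1.0,
    "application/xhtml+xml": 0.7,
}


def client_q(media_type, prefs):
    """Client q for one media type; the most specific matching range wins."""
    best_specificity, best_q = -1, 0.0
    major = media_type.split("/")[0]
    for media_range, q in prefs:
        if media_range == media_type:
            specificity = 2
        elif media_range == major + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def choose_variant(accept_header, weights=SERVER_WEIGHTS):
    """Pick the variant with the highest (client q * server weight), or None."""
    prefs = parse_accept(accept_header)
    scored = [(client_q(mt, prefs) * w, mt) for mt, w in weights.items()]
    score, media_type = max(scored)
    return media_type if score > 0 else None
```

With the Tabulator header, XHTML scores 1.0 * 0.7 = 0.7 and RDF/XML
scores 0.8 * 1.0 = 0.8, so RDF/XML is chosen. A client that lists only
HTML types still gets the XHTML page. The response should also carry
"Vary: Accept", as suggested in point 1 of the quoted message below.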
> 
> How would a client announce that it is a Web browser (perfect XHTML 
> support), that it also can deal reasonably well with RDF/XML, but that 
> it would really prefer to get the best of both worlds if the server has 
> RDFa? Which Accept header should make our server respond with RDFa?
> 
> I don't know, because there is no way to announce RDFa support in Accept 
> headers.
> 
> Hence, RDFa-capable browsers will probably often get served RDF/XML, 
> because they don't tell the server that they would prefer an 
> RDFa-enabled styled HTML page over an unstyled RDF/XML page.
> 
> I might propose something like
> 
> application/xhtml+xml;profile=rdfa;q=1.0
> 
> in the Accept header, but am not sure about unintended side effects.
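A rough sketch of how a server might detect such an announcement,
assuming clients actually sent the proposed parameter. The value
"profile=rdfa" is only the tentative proposal above, not something
clients are known to send, and the function names are hypothetical:

```python
def parse_media_ranges(header):
    """Split an Accept header into (media_type, params) pairs.

    The q value, if present, is kept in params like any other parameter.
    """
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";") if p.strip()]
        if not parts:
            continue
        params = {}
        for param in parts[1:]:
            name, _, value = param.partition("=")
            params[name.strip().lower()] = value.strip().strip('"')
        ranges.append((parts[0].lower(), params))
    return ranges


def announces_rdfa(accept_header):
    """True if the client lists application/xhtml+xml with profile=rdfa and q > 0."""
    for media_type, params in parse_media_ranges(accept_header):
        if media_type != "application/xhtml+xml":
            continue
        if params.get("profile", "").lower() != "rdfa":
            continue
        try:
            q = float(params.get("q", "1.0"))
        except ValueError:
            q = 0.0
        if q > 0:
            return True
    return False
```

A server could call announces_rdfa first and serve the XHTML+RDFa page
when it returns True, falling back to ordinary q-value negotiation
otherwise. Whether existing servers would treat the extra parameter
gracefully is exactly the open question about side effects.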
> 
> Best,
> Richard
> 
> 
> 
>> All my statements are based on the latest
>> RDFa syntax CR document at [1].
>>
>> First (not preferred by some people ;) you could/should use the type
>> declaration (<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
>> "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">). Further, people are
>> encouraged to use the corresponding @profile (<head
>> profile="http://www.w3.org/1999/xhtml/vocab">) and/or the version
>> attribute (<html xmlns="http://www.w3.org/1999/xhtml"
>> version="XHTML+RDFa 1.0" ... />).
>>
>> Finally, the W3C TAG also addresses this issue, see [2].
>>
>> Please feel free to discuss these issues also at our RDFa community Wiki
>> (e.g. at [3]).
>>
>> Cheers,
>>     Michael
>>
>> [1] http://www.w3.org/MarkUp/2008/CR-rdfa-syntax-20080612/#s_conformance
>> [2]
>> http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html#UsingRDFa
>> [3] http://rdfa.info/wiki/Tutorials#Good_practice
>>
>> ----------------------------------------------------------
>> Michael Hausenblas, MSc.
>> Institute of Information Systems & Information Management
>> JOANNEUM RESEARCH Forschungsgesellschaft mbH
>>
>> http://www.joanneum.at/iis/
>> ----------------------------------------------------------
>>
>>
>>> -----Original Message-----
>>> From: public-swd-wg-request@w3.org
>>> [mailto:public-swd-wg-request@w3.org] On Behalf Of Richard Cyganiak
>>> Sent: Wednesday, June 11, 2008 1:18 AM
>>> To: Ed Summers
>>> Cc: SWD Working SWD; public-lod@w3.org
>>> Subject: Re: Library of Congress Subject Headings as SKOS Linked Data
>>>
>>>
>>> Ed,
>>>
>>> A very cool service, and exemplary attention to detail!
>>>
>>> Of course, I still have a few suggestions! I haven't read through the
>>> entire thread, so apologies if some of this was mentioned already.
>>>
>>> (I saw 303s being mentioned in the thread -- you are doing things the
>>> right way, there's no need to do 303s at <sh95000541>. It is an
>>> information resource and therefore 200 is fine. The concept is
>>> <sh95000541#concept>, a URI that cannot be directly dereferenced via
>>> HTTP, so you are consistent with httpRange-14, as explained in the
>>> Cool URIs document. This is one of the nice things about hash URIs.)
>>>
>>> 1. The content-negotiated URI should send a "Vary: Accept" header.
>>> This helps caches to deal correctly with content-negotiated resources.
>>>
>>> 2. The correct MIME type for N3 is "text/rdf+n3;charset=utf-8", not
>>> "text/n3". (I think the spec used to recommend text/n3, but this was
>>> changed some time ago.)
>>>
>>> 3. I would suggest adding a few triples to the RDF/XML and N3
>>> versions, to link the generic document to its variants, and the
>>> generic document to the concept. Example (choose your own favourite
>>> properties):
>>>
>>> <sh95000541> foaf:primaryTopic <sh95000541#concept> .
>>> <sh95000541> dcterms:format <sh95000541.rdf> .
>>> <sh95000541> dcterms:format <sh95000541.n3> .
>>> <sh95000541> dcterms:format <sh95000541.json> .
>>> <sh95000541> dcterms:format <sh95000541.html> .
>>>
>>> This helps RDF browsers to relate all those resources.
>>>
>>> 4. The content negotiation could benefit from a little bit of
>>> tweaking. You correctly handle q values, which is great. It would be
>>> even better if there was a slight bias towards the non-HTML formats.
>>> I would argue that the data variants are quite a bit more useful than
>>> the HTML variant, as RDF-aware clients can do all sorts of cool stuff
>>> with the RDF that is not possible with the HTML. Therefore, a client
>>> that indicates identical preference for HTML and RDF/XML should be
>>> served RDF/XML. FWIW, Tabulator has a preference of 1.0 for XHTML and
>>> 0.8 for RDF/XML. It would be great if your algorithm would return
>>> RDF/XML in this case.
>>>
>>> (I'm ignoring the availability of RDFa in my argument --
>>> unfortunately there is no way for a client to indicate that it
>>> supports RDFa AFAIK, so it cannot really be factored into the
>>> content negotiation equation.)
>>>
>>> 5. Ideally, you would add the skos:prefLabels of all related concepts
>>> to the RDF output. This would support navigation in RDF browsers.
>>>
>>> Again, great work!
>>>
>>> Best,
>>> Richard
>>>
>>>
>>> On 9 Jun 2008, at 14:54, Ed Summers wrote:
>>>
>>>>
>>>> I'd like to announce an experimental linked-data, SKOS representation
>>>> of the Library of Congress Subject Headings (LCSH) [1] ... and also
>>>> ask for some help.
>>>>
>>>> The Library of Congress has been participating in the W3C Semantic Web
>>>> Deployment Working Group, and has converted LCSH from the MARC21 data
>>>> format [2] to SKOS. LCSH is a controlled vocabulary used to index
>>>> materials that have been added to the collections at the Library of
>>>> Congress. It has been in active development since 1898, and was first
>>>> published in 1914 so that other libraries and bibliographic utilities
>>>> could use and adapt it. The lcsh.info service makes 266,857 subject
>>>> headings available as SKOS concepts, which amounts to 2,441,494
>>>> triples that are separately downloadable [3] (since there isn't a
>>>> SPARQL endpoint just yet).
>>>>
>>>> At the last SWDWG telecon some questions came up about the way
>>>> concepts are identified, and made available via HTTP. Since we're
>>>> hoping lcsh.info can serve as an implementation of SKOS for the W3C
>>>> recommendation process we want to make sure we do this right. So I was
>>>> hoping interested members of the linked-data and SKOS communities
>>>> could take a look and make sure the implementation looks correct.
>>>>
>>>> Each concept is identified with a URI like:
>>>>
>>>> http://lcsh.info/sh95000541#concept
>>>>
>>>> When responding to requests for concept URIs, the server content
>>>> negotiates to determine which representation of the concept to return:
>>>>
>>>> - application/xhtml+xml
>>>> - application/json
>>>> - text/n3
>>>> - application/rdf+xml
>>>>
>>>> This is basically the pattern that Cool URIs for the Semantic Web
>>>> discusses as the Hash URI with Content Negotiation [4]. An additional
>>>> point that is worth mentioning is that the XHTML representation
>>>> includes RDFa that also describes the concept.
>>>>
>>>> At the moment the LCSH/SKOS data is only linked to itself, through
>>>> assertions that involve skos:broader, skos:narrower, and skos:related.
>>>> But the hope is that minting URIs for LCSH will allow it to be mapped
>>>> and/or linked to concepts in other vocabularies: dbpedia, geonames,
>>>> etc.
>>>>
>>>> Any feedback, criticisms, ideas are welcome on either the
>>>> public-lod [5] or public-swd-wg [6] discussion lists.
>>>>
>>>> Thanks for reading this far!
>>>> //Ed
>>>>
>>>> [1] http://lcsh.info
>>>> [2] http://www.loc.gov/marc/
>>>> [3] http://lcsh.info/static/lcsh.nt
>>>> [4] http://www.w3.org/TR/cooluris/#hashuri
>>>> [5] http://lists.w3.org/Archives/Public/public-lod/
>>>> [6] http://lists.w3.org/Archives/Public/public-swd-wg/
>>>>
>>>
>>>
>>>
> 
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Wednesday, 18 June 2008 04:54:55 UTC