Re: 200 OK with Content-Location might work from Kingsley Idehen on 2010-11-07 (public-lod@w3.org from November 2010)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sun, 07 Nov 2010 16:53:30 -0500
To: Phil Archer <phila@w3.org>
CC: public-lod@w3.org
Message-ID: <4CD71FDA.9020203@openlinksw.com>
On 11/7/10 2:35 PM, Phil Archer wrote:
> I share John's unease here. And I remain uneasy about the 200 C-L 
> solution.
>
> I know I sound like a fundamentalist in a discussion where we're 
> trying to find a practical, workable solution, but is a description of 
> a toucan a representation of a toucan?

Put differently, a Toucan has a structured description, expressed using 
triples, contained in an RDF resource (that has an Address), that's 
transmissible in a variety of formats (representation) to user agents 
over HTTP. The discernible attributes of the Entity (Thing) named: 
Toucan, are expressed in a digital representation of its description, by 
an observer.

You have 3 vital components (trinity) that constitute a structured 
description:

1. Toucan -- the real world object and description subject
2. URI -- identifier that has Toucan as referent
3. Resource -- the container of triples used to express description of 
the subject (Toucan).
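
As a rough illustration of the three components above, here is a minimal Python sketch. The /doc/ URI, the "ex:" property names, and the Resource class are hypothetical, chosen only for illustration; they are not drawn from any particular vocabulary or library.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers, for illustration only.
TOUCAN_NAME = "http://example.com/id/toucan"   # URI (Name) whose referent is the Toucan
TOUCAN_DOC = "http://example.com/doc/toucan"   # Address of the describing resource


@dataclass
class Resource:
    """A container of triples, retrievable at an Address."""
    address: str
    triples: list = field(default_factory=list)

    def subjects(self):
        # Subjects named by the data itself, in first-seen order.
        seen = []
        for subject, _, _ in self.triples:
            if subject not in seen:
                seen.append(subject)
        return seen


doc = Resource(
    address=TOUCAN_DOC,
    triples=[
        (TOUCAN_NAME, "ex:type", "ex:Bird"),
        (TOUCAN_NAME, "ex:billColour", "orange"),
    ],
)

# The real-world Toucan never appears here: only its Name and its
# description (held in a resource with its own Address) do.
print(doc.subjects())
print(doc.address != TOUCAN_NAME)
```

The point of the sketch is only that the subject of the description is named inside the data, and that this Name is a different URI from the Address of the container.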

> IMO, it's not. Sure, one can imagine an HTTP response returning a very 
> rich data stream that conveys the entire experience of having a toucan 
> on your desk - but the toucan ain't actually there.

Correct. It's just a referent of the URI. The URI may be a Name or an 
Address. The data -- fully self-describing -- determines the subject of 
the description by Name (via triples). Thus, it overrides the ambiguities
of HTTP as experienced by user agents.

If you break the trinity outlined above, and you overload on the term 
"Resource", all of this comes through as absolute gobbledygook to people 
who don't speak or comprehend fluent "Resource Moniker Overloading".
>
> I've been toying with the idea of including a substitution rule in a 
> 200 header.
>
> Following the practice of using /id/ for NIRs and /doc/ for their 
> descriptions, suppose a GET on http://example.com/id/toucan returned:
>
> 200 OK
> Apply-URI-substitution: s/id/doc/
> Content-type: text/html
> Blah blah...

Not required, we just have to make up our minds about when we stop 
overloading on "Resource". It's taken close to 12 years to accept that 
RDF/XML is gobbledygook outside the Semantic Web community. RDFa is 
sorta accepted reluctantly re. normative format, but it still isn't 
deemed normative. I have no idea how long it's going to take for
"Resource" overloading to dissipate, but I do know that EAV + Resolvable 
Identifiers will take shape elsewhere, and Linked Data (the concept) 
will be understood without today's syntax + model conflation alongside 
"Resource" terminology overload distractions.

>
> In just one trip, user agents would then be able to interpret this as 
> a document whose URI can be derived by performing the substitution, 
> knowing that the returned data describes the thing identified by the 
> original URI.
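
For readers who want to see what the client side of the proposal quoted above might involve, here is a minimal Python sketch. "Apply-URI-substitution" is the hypothetical header from Phil's example, not an existing HTTP header, and the function name is made up for illustration. The proposal does not say how the rule should be matched, so the sketch assumes it names a whole path segment and that only the first matching segment is replaced.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def apply_uri_substitution(request_uri, rule):
    """Apply a sed-style rule such as 's/id/doc/' to the path of a URI.

    Assumption: the pattern is a literal path segment, and only the
    first matching segment is replaced.
    """
    match = re.fullmatch(r"s/([^/]+)/([^/]*)/", rule.strip())
    if match is None:
        raise ValueError(f"unsupported substitution rule: {rule!r}")
    pattern, replacement = match.groups()

    parts = urlsplit(request_uri)
    segments = parts.path.split("/")
    for index, segment in enumerate(segments):
        if segment == pattern:
            segments[index] = replacement
            break
    return urlunsplit(parts._replace(path="/".join(segments)))


# Headers as they might arrive with the 200 response in the example.
headers = {"Apply-URI-substitution": "s/id/doc/", "Content-type": "text/html"}
document_uri = apply_uri_substitution(
    "http://example.com/id/toucan", headers["Apply-URI-substitution"]
)
print(document_uri)
```

Restricting the match to path segments is a design choice of this sketch: a plain textual replacement of "id" could also hit the host name or a later part of the path.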

Linked Data aware user agents should be able to use 'Content-Location' 
header values to determine if they are dealing with a Name or an 
Address, by processing the EAV graph exposed by a resource (which 
includes RDF resources).  Nothing stops resources exposed by 
'Content-Location' from being a container of EAV model data expressed
using: OData+Atom, OData+JSON, GData, or any other markup/syntax. User 
Agents should negotiate data representation formats and then process the 
data using underlying data model semantics.
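
A minimal Python sketch of that client-side check follows. The function name, the return labels, and the triple layout are assumptions made for illustration; the headers are passed in as a plain dict and the graph is taken to be already parsed into (entity, attribute, value) tuples, whatever the wire format was.

```python
from urllib.parse import urljoin


def classify_request_uri(request_uri, headers, triples):
    """Guess whether request_uri is acting as a Name or an Address.

    Returns (label, document_uri), where label is one of
    "address", "name", or "unverified".
    """
    # HTTP header names are case-insensitive.
    lowered = {key.lower(): value for key, value in headers.items()}
    content_location = lowered.get("content-location")
    if content_location is None:
        # Nothing signals a separate document; treat the URI as an Address.
        return ("address", request_uri)

    # Content-Location may be relative to the request URI.
    document_uri = urljoin(request_uri, content_location)
    if document_uri == request_uri:
        return ("address", request_uri)

    # The header alone is only an assertion; look for confirmation in the
    # data, i.e. triples whose subject is the request URI.
    subjects = {entity for entity, _, _ in triples}
    if request_uri in subjects:
        return ("name", document_uri)
    return ("unverified", document_uri)


graph = [("http://example.com/id/toucan", "ex:type", "ex:Bird")]
print(classify_request_uri(
    "http://example.com/id/toucan",
    {"Content-Location": "/doc/toucan"},
    graph,
))
```

The "unverified" label reflects the caveat that a Content-Location value by itself cannot be trusted without corroboration from elsewhere, which here means the data.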

Don't want to beat a dead horse, but RDF != Linked Data (the concept). 
TimBL posted a note about how to effectively publish (actually inject) 
Linked Data into the World Wide Web. That doesn't in any way confine the
concept of Linked Data to RDF or even HTTP.
>
> This approach, and C-L approach, both require client side 
> implementation of course.
>
> My worry is that any 200-based solution is going to be poorly 
> implemented in the real world by both browsers and LOD publishers 
> (Talis excepted of course!) so that IRs and NIRs will be 
> indistinguishable 'in the wild'.
>
> 303 works already, and that is still the one that feels right to me. 
> I'm happy that the discussion here is centred on adding a new method 
> cf. replacing 303, especially as the HTTP-Bis group seems to have made 
> its use for LOD an explicit part of the definition.

This is not about replacing 303's re. slash terminated HTTP URIs used as 
Entity Names. It's simply another option for handling Name / Address 
disambiguation for this type of URI via the data itself.

Links:

1. http://www.slideshare.net/kidehen/understanding-linked-data-via-eav-model-based-structured-descriptions
-- demystification of Linked Data using EAV model & HTTP URIs
2. http://www.slideshare.net/kidehen/iss-1 --  covers Syntax and 
Conceptual Schema (model) separation
3. http://ontolog.cim3.net/forum/ontolog-forum/2010-09/msg00318.html -- 
post from ontolog forum by John F. Sowa that explains "Entities" and 
other related matters
4. http://www.w3.org/People/Connolly/9703-web-apps-essay.html -- 1997 
article by DanC, that also sheds valuable insight into HTTP roots 
(Objective-C & Distributed Objects) and by implication the 
superficiality of the "Resource" moniker.


Kingsley
>
> Phil.
>
> On 07/11/2010 15:07, John Sheridan wrote:
>> Niklas,
>>
>> In general I am supportive of your and Ian's thinking. 200 OK with
>> Content-Location might work.
>>
>> However, three points from my perspective:
>>
>> 1) debating fundamental issues like this is very destabilising for those
>> of us looking to expand the LOD community and introduce new people and
>> organisations to Linked Data. To outsiders, it makes LOD seem like it's
>> not ready for adoption and use - which is deadly. This is at best the
>> 11th hour for making such a change in approach (perhaps even 5 minutes
>> to midnight?).
>>
>> 2) the 303 pattern isn't *that* hard to understand for newbies and maybe
>> even helps them grasp LOD. Making the difference between NIRs and IRs so
>> apparent, I have found to be (counter-intuitively) a big selling point
>> for LOD, when introducing new people to the paradigm. Let's not be too
>> harsh on 303 - it does make an important distinction very clear for new
>> adopters and, in my experience, it seems to be an approach new people
>> grok quite quickly and easily.
>>
>> 3) I see much to commend in what Ian suggests, in practical terms. If
>> the community is going to move in that direction what we need is a clear
>> roadmap. An alternative pattern (say, 200 OK plus Content-Location)
>> needs to be (*very* quickly) alighted upon and then used in practice. We
>> would have to reconcile ourselves to the 303 pattern and the
>> alternative, operating side-by-side, for some period of time (years?).
>> Only once there is some breadth of usage, should the community seek to
>> deprecate the use of 303s. If this is a pattern the community wishes to
>> change, we have to gradually evolve our way to something different. We
>> can't just leap.
>>
>> Hope these thoughts help,
>>
>> John.
>>
>> On Sun, 2010-11-07 at 14:42 +0000, John Sheridan wrote:
>>> One use-case that we have with the Linked Data work for UK Government,
>>> is where we have a URI for a NIR at one (notionally more stable) domain
>>> which 303s to an IR at a different (less stable, organisationally
>>> orientated) domain.
>>>
>>> Often the NIR URI is something like
>>> http://{something}.data.gov.uk/id/something whereas the IR is on an
>>> organisation's own website.
>>>
>>> We do this because organisations in the public sector are unstable and
>>> subject to continual change (creation, merger, abolition) whereas the
>>> government as a whole is very stable.
>>>
>>> To give an example, the Open Government Licence (for the NIR of the
>>> licence) is http://reference.data.gov.uk/id/open-government-licence
>>> which 303s to
>>> http://www.nationalarchives.gov.uk/doc/open-government-licence/ (the IR
>>> of the current licence text, currently published by The National
>>> Archives, with HTML and RDF representations selected through conneg)
>>>
>>> We are looking at a similar pattern for local authorities. Each Council
>>> would have a NIR URI at (something like)
>>> local.data.gov.uk/id/{local-council-identifier} which would 303 to IR
>>> about that Council on the Council's own website.
>>>
>>> Our thinking is that the {something}.data.gov.uk URI is more likely to
>>> survive machinery of government changes, but the organisation
>>> responsible for (say) the Open Government Licence is always going to
>>> want to publish the IR about that on its own website, and should be
>>> encouraged to do so.
>>>
>>> The 303 pattern helps enable this pattern, which fits well in general
>>> with some of the challenges on Linked Data in the public sector.
>>>
>>> I would like to understand a little better how Ian's proposal maps to
>>> this use case.
>>>
>>> Grateful for comments,
>>>
>>> John.
>>>
>>> On Sun, 2010-11-07 at 12:11 +0100, Niklas Lindström wrote:
>>>> +1 indeed. Content-Location has definitely been overlooked. During
>>>> conneg, it is used to differentiate between a resource and its
>>>> representation(s), which are obviously different resources (well, not
>>>> necessarily the same). This distinction could certainly be enough to
>>>> remove the fundamental need for 303:ing on NIR:s (provided consensus
>>>> and some formal resolution).
>>>>
>>>> (I pondered on a similar issue in
>>>> <http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2010Feb/0007.html>,
>>>> regarding the identity of fragments. Perhaps that discussion would be
>>>> worth revisiting again in light of this?)
>>>>
>>>> Best regards,
>>>> Niklas
>>>>
>>>>
>>>>
>>>> On Fri, Nov 5, 2010 at 5:55 PM, Nathan <nathan@webr3.org> wrote:
>>>>> Mike Kelly wrote:
>>>>>>
>>>>>> http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-12#page-14 
>>>>>>
>>>>>
>>>>> snipped and fuller version inserted:
>>>>>
>>>>>    4.  If the response has a Content-Location header field, and that
>>>>>        URI is not the same as the effective request URI, then the
>>>>>        response asserts that its payload is a representation of the
>>>>>        resource identified by the Content-Location URI.  However,
>>>>>        such an assertion cannot be trusted unless it can be verified
>>>>>        by other means (not defined by HTTP).
>>>>>
>>>>>> If a client wants to make a statement  about the specific document
>>>>>> then a response that includes a content-location is giving you the
>>>>>> information necessary to do that correctly. It's complemented and
>>>>>> further clarified in the entity body itself through something like
>>>>>> isDescribedBy.
>>>>>
>>>>> I stand corrected, think there's something in this, and it could maybe
>>>>> possibly provide the semantic indirection needed when Content-Location
>>>>> is there, and different to the effective request uri, and complemented
>>>>> by some statements (perhaps RDF in the body, or Link header, or html
>>>>> link element) to assert the same.
>>>>>
>>>>> Covers a few use-cases, might have legs (once HTTP-bis is a standard?).
>>>>>
>>>>> Nicely caught Mike!
>>>>>
>>>>> Best,
>>>>>
>>>>> Nathan
>>>>>
>>>>>
>>>>
>>>
>>
>


-- 

Regards,

Kingsley Idehen 
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
Received on Sunday, 7 November 2010 21:54:01 UTC