Re: Creation of Containers from Wilde, Erik on 2012-11-08 (public-ldp-wg@w3.org from November 2012)

From: Wilde, Erik <Erik.Wilde@emc.com>
Date: Thu, 8 Nov 2012 14:22:08 -0500
To: "public-ldp-wg@w3.org" <public-ldp-wg@w3.org>
CC: "nathan@webr3.org" <nathan@webr3.org>, Niclas Hoyer <niclas@verbugt.de>, Richard Cyganiak <richard@cyganiak.de>
Message-ID: <CCC13BA6.BBBA%erik.wilde@emc.com>
hello richard.

On 2012-11-08 8:30 , "Richard Cyganiak" <richard@cyganiak.de> wrote:
>HTML does just fine with a single media type. Why not RDF?

because in HTML, humans drive the interactions, and they understand the
"amazon order form media type" and can make decisions whether to click the
"POST order confirmation" or "POST order cancellation" button. HTML has
two layers of meaning:

- raw HTML, the HTML media type: go GET the next web page, go GET an
image, go GET a thumbnail, go POST form fields to some URI. this is good
enough to drive crawlers, and thus hugely useful. it doesn't really cover
the "true meaning" of the interactions by humans, though.

- human-readable labels that allow people to choose which links to follow.
this is not covered by the HTML media type and doesn't need to be, because
you're not executing stuff on this level in a crawler. if you do, things
get tricky because you either need to do some language processing, infer
meaning through some other extraction means, or rely on "profiles" such as
microformats/RDFa, which essentially specialize the media type into a more
expressive one.

the HTML web works because you have smart human operators hitting the
right buttons.

>>>2. In the RDF world, the semantics of the message is not communicated
>>> in a media type but in the RDF vocabularies used within the graph.
>>> RDF terms are globally unique, so this is unambiguous, unlike in say
>>> JSON where you need media types to distinguish your format from
>>> someone else's.
>> 
>> are you sure about that one? i think this is what started this thread:
>> the inability to distinguish whether a server should interpret something
>> as an opaque RDF graph (store this set of triples), or actually do
>> something based on the interaction semantics.
>That seems to be a non-issue for GET, PUT and DELETE. I can kind of see
>where you're coming from in the case of POST. But even there, the
>distinction between “take this set of triples, ignoring their semantics”
>and “take this set of triples, taking their semantics into account” still
>doesn't seem to call for a different media type. Again, the semantics is
>in the vocabularies. The fact that in some situations, one may want to
>exchange RDF graphs while ignoring their semantics doesn't change that.

even for GET, it's not a non-issue. if you GET an HTML page as text/plain,
your browser (should) display it as source, because it is not supposed to
sniff anything. all it GETs is plain text, and HTML (also) is plain text.
same with scripting and styling. without labeling content, there are no
rules for how to act on it. unless you sniff, which in practice is what many
do, but opens the doors to all sorts of security issues and
interoperability problems.
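
a minimal sketch of that point, assuming a hypothetical client function
render_mode (the name and the returned mode strings are invented for
illustration): the handling is chosen from the declared media type alone,
and the body is never inspected.

```python
def render_mode(content_type):
    """Pick a handling mode from the declared media type only (no sniffing)."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/html":
        return "render-as-html"
    if media_type == "text/plain":
        return "show-as-source"
    return "download"


# The same bytes would be handled differently under different labels.
print(render_mode("text/html; charset=utf-8"))
print(render_mode("text/plain"))
```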

>>that is exactly what media types give
>> you, and what RDF by definition can not do just by itself, since it is
>> only a data format: talking about the interaction semantics of what you
>> expect to happen when you exchange certain representations.
>But you can define RDF vocabulary that specifies the interaction
>semantics. Or are you somehow disputing that this is possible?

i am not saying this is not possible, it certainly is; the question is
whether you're describing it in the right place. what i am saying is that
on the web, the label that triggers processing rules is the media type.
let's say we have 42 container interaction protocols, and somebody sends a
client a link to a container. how would that client tell the server that
it supports container interaction protocols 12 and 27, and then the server
can provide interaction affordances according to one of those protocols?
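
as a rough sketch of how media types could answer that question: the
client lists the protocols it supports in an Accept header, and the server
picks the first one it also supports. the vnd.example media type names and
both function names are assumptions made up for this example, not anything
LDP defines, and the Accept parsing is deliberately simplified (q-values
only, no wildcards).

```python
def parse_accept(header):
    """Return media types from an Accept header, highest q-value first."""
    entries = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:
            # Sort by descending q, then by original position.
            entries.append((-q, position, parts[0].lower()))
    return [media_type for _, _, media_type in sorted(entries)]


def negotiate(accept_header, server_supported):
    """Return the client's most preferred media type the server supports."""
    for media_type in parse_accept(accept_header):
        if media_type in server_supported:
            return media_type
    return None


# Hypothetical labels for container interaction protocols 12 and 27.
client_accept = (
    "application/vnd.example.container-12+turtle, "
    "application/vnd.example.container-27+turtle;q=0.8"
)
# Hypothetical server that implements protocols 3, 27 and 41.
server_supported = {
    "application/vnd.example.container-3+turtle",
    "application/vnd.example.container-27+turtle",
    "application/vnd.example.container-41+turtle",
}
print(negotiate(client_accept, server_supported))
```

without such a label in the request, the server in this sketch would have
nothing to match on except the payload itself.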

>>>4. This somewhat parallels the situation in HTML, where the interaction
>>> semantics are not in the media type but described in the payload --
>>> hyperlinks and forms. Although unlike in HTML, our "forms" probably
>>> only need to cover a few hardcoded kinds of actions -- create a new
>>> resource in this container, go to the next page, stuff like that.
>> 
>> interaction semantics in HTML are in the media type,
>Well, fair enough, but my point is that you don't need to introduce new
>media types each time you build a new service with an HTML front-end,
>because the interaction semantics in HTML are rich enough and generic
>enough to work for all sorts of services. The same can work for RDF, with
>the addition of a few vocabularies.

see above, the fundamental difference is that HTML is driven by human
operators.

>>what you refer to are
>> the "human-oriented semantics" that are represented by anchor text and
>> so forth. if interaction semantics weren't part of HTML itself, the web
>> as we know it (and particularly any agents that crawl and index) would
>> not exist.
>I didn't claim that HTML has no interaction semantics. I claimed that the
>semantics of RDF representations are in the vocabularies used within the
>graph, and that it is possible to define vocabularies that specify
>interaction semantics, and that therefore one doesn't need to introduce
>new media types in order to enable RESTful interactions on the web. One
>needs new vocabularies. (Such as the terms that LDP introduces.)

vocabularies are important, but vocabularies are just about data.
protocols are more than data; they are based on exchanging data, but they
add a layer of interactions, and rules governing those, and peers engaging
in conversations governed by those rules. conversations need to carry that
context, or you have those situations like the one that started this
thread: is a certain bunch of data to be interpreted according to one set
of rules, or another set? maybe this scenario helps:

i am creating a blog post about AtomPub. i want it to contain sample code.
i can POST an <entry ..../> labeling it as application/atom+xml to the
collection for creating the actual blog post, and the server will take
this as an "entry resource" and interpret the XML for populating some
metadata fields. after that, i can POST an <entry ..../> labeling it as
application/xml to the same collection for creating something that has the
exact same content as the blog post itself, but is not interpreted as
having the same interaction semantics, it is just a "media resource". no
amount of sniffing in the world would help to make this interaction
possible if i hadn't labeled my requests by the proper conversational
context ("i am POSTing an entry resource", "i am POSTing a media resource
that just happens to look like an entry resource, but please don't get
confused by that, just treat it as random XML"). my apologies for using
XML here, but there really isn't anything in this scenario that's in any
way specific to XML; it's all about establishing and communicating
conversational context.
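
the scenario can be sketched from the server's side, assuming a
hypothetical collection handler classify_post (the function name and the
sample entry are invented for illustration; a real AtomPub server does far
more than this). the two requests carry identical bodies and differ only
in their Content-Type label.

```python
def classify_post(content_type):
    """Decide how a POST to the collection is treated, from the label only."""
    # Drop parameters such as ";type=entry" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "application/atom+xml":
        # Interpret the XML and populate metadata fields from it.
        return "entry resource"
    # Anything else is stored as-is, even if it happens to look like an entry.
    return "media resource"


# Illustrative body; the same bytes are sent in both requests.
body = '<entry xmlns="http://www.w3.org/2005/Atom"><title>sample</title></entry>'

for label in ("application/atom+xml", "application/xml"):
    print(label, "->", classify_post(label))
```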

cheers,

dret.
Received on Thursday, 8 November 2012 19:23:07 UTC