
Re: Creation of Containers

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Thu, 08 Nov 2012 15:41:49 -0500
Message-ID: <509C190D.3060206@openlinksw.com>
To: "Wilde, Erik" <Erik.Wilde@emc.com>
CC: "public-ldp-wg@w3.org" <public-ldp-wg@w3.org>, "nathan@webr3.org" <nathan@webr3.org>, Niclas Hoyer <niclas@verbugt.de>, Richard Cyganiak <richard@cyganiak.de>
On 11/8/12 2:22 PM, Wilde, Erik wrote:
> hello richard.
>
> On 2012-11-08 8:30 , "Richard Cyganiak" <richard@cyganiak.de> wrote:
>> HTML does just fine with a single media type. Why not RDF?
> because in HTML, humans drive the interactions, and they understand the
> "amazon order form media type" and can make decisions whether to click the
> "POST order confirmation" or "POST order cancellation" button. HTML has
> two layers of meaning:
>
> - raw HTML, the HTML media type: go GET the next web page, go GET an
> image, go GET a thumbnail, go POST form fields to some URI. this is good
> enough to drive crawlers, and thus hugely useful. it doesn't really cover
> the "true meaning" of the interactions by humans, though.
>
> - human-readable labels that allow people to choose which links to follow.
> this is not covered by the HTML media type and doesn't need to be, because
> you're not executing stuff on this level in a crawler. if you do, things
> get tricky because you either need to do some language processing, infer
> meaning through some other extraction means, or rely on "profiles" such as
> microformats/RDFa, which essentially specialize the media type into a more
> expressive one.
>
> the HTML web works because you have smart human operators hitting the
> right buttons.

Good!

Remember, RDF is about enabling a Web dimension that works for smart
agents that aren't strictly human. These agents surpass humans in the
following areas:

1. speed
2. endurance
3. immunity to emotional distraction -- Star Trek got the android Data
spot on.

Media types enable these smart agents to determine what kind of content
bears the structured data they ingest. Once ingested, they exploit entity
relationship graphs and relationship semantics en route to upping their
smarts. This process is iterative and exponential. The denser the Web
becomes, the smarter the agents. Humans can then spend more time putting
these agents to work instead of fighting a losing battle against:

1. Data volume
2. Data velocity
3. Data variety
4. Data veracity -- whether the data is fit to yield insight.

An RDF data packet is self-describing. Every agent participating in this
system subscribes to that understanding. This is ground zero.
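That self-description can be sketched in plain Python (the data and
vocabulary URIs below are purely illustrative, not any real dataset):
every predicate in an RDF graph is a global URI, so an agent can discover
which vocabularies -- and hence which semantics -- a payload uses without
any out-of-band schema.

```python
# Hypothetical triples: each is a (subject, predicate, object) tuple whose
# predicate is a global URI naming its own vocabulary term.
triples = [
    ("http://example.org/order/1",
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://example.org/vocab#Order"),
    ("http://example.org/order/1",
     "http://example.org/vocab#total",
     "42.00"),
]

def vocabularies_used(graph):
    """Collect the namespaces an agent could dereference to learn semantics.

    Assumes hash-style vocabulary URIs, purely for brevity.
    """
    return {p.rsplit("#", 1)[0] + "#" for _, p, _ in graph}
```

The payload alone tells the agent it is dealing with the RDF syntax
vocabulary plus one domain vocabulary; no contract outside the data is
needed.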
>
>>>> 2. In the RDF world, the semantics of the message is not communicated in
>>>> a media type but in the RDF vocabularies used within the graph. RDF
>>>> terms
>>>> are globally unique, so this is unambiguous, unlike in say JSON where
>>>> you
>>>> need media types to distinguish your format from someone else's.
>>> are you sure about that one? i think this is what started this thread:
>>> the
>>> inability to distinguish whether a server should interpret something as
>>> an
>>> opaque RDF graph (store this set of triples), or actually do something
>>> based on the interaction semantics.
>> That seems to be a non-issue for GET, PUT and DELETE. I can kind of see
>> where you're coming from in the case of POST. But even there, the
>> distinction between "take this set of triples, ignoring their semantics"
>> and "take this set of triples, taking their semantics into account" still
>> doesn't seem to call for a different media type. Again, the semantics is
>> in the vocabularies. The fact that in some situations, one may want to
>> exchange RDF graphs while  ignoring their semantics doesn't change that.
> even for GET, it's not a non-issue. if you GET an HTML page as text/plain,
> your browser (should) display it as source, because it is not supposed to
> sniff anything. 

Yes, and you can add plugins to browsers to make them handle other media
types, especially those associated with RDF. Extensions have been doing
this for years.

> all it GETs is plain text, and HTML (also) is plain text.
> same with scripting and styling. without labeling content, there are no
> rules how to act on it. unless you sniff, which in practice is what many
> do, but opens the doors to all sorts of security issues and
> interoperability problems.
>
>>> that is exactly what media types give
>>> you, and what RDF by definition can not do just by itself, since it is
>>> only a data format: talking about the interaction semantics of what you
>>> expect to happen when you exchange certain representations.
>> But you can define RDF vocabulary that specifies the interaction
>> semantics. Or are you somehow disputing that this is possible?
> i am not saying this is not possible, it certainly is; the question is
> whether you're describing it in the right place. 

That is the right place, because those are the rules of the system. Media
types simply enable agents to determine the format in which an RDF graph
has been represented.
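That division of labor can be sketched as follows (the parser names are
stand-ins, not a real library's API): the media type selects a
serialization parser and nothing more; the semantics travel inside the
parsed graph.

```python
# Stand-in parsers -- a real agent would use an actual RDF library here.
def parse_turtle(body):
    return ("turtle-graph", body)

def parse_rdfxml(body):
    return ("rdfxml-graph", body)

# The media type maps to a serialization, not to interaction semantics.
PARSERS = {
    "text/turtle": parse_turtle,
    "application/rdf+xml": parse_rdfxml,
}

def ingest(content_type, body):
    # Strip media-type parameters, e.g. "text/turtle; charset=utf-8".
    media_type = content_type.split(";")[0].strip()
    return PARSERS[media_type](body)
```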

> what i am saying is that
> on the web, the label that triggers processing rules is the media type.

Yes, and on a Web of RDF-based Linked Data it goes further, i.e., to
entity relationship graphs and relationship semantics, as per my comments
above.

> let's say we have 42 container interaction protocols, and somebody sends a
> client a link to a container. how would that client tell the server that
> it supports container interaction protocols 12 and 27, and then the server
> can provide interaction affordances according to one of those protocols?

It's about the data, not the mechanism for marshaling the data. An RDF
agent understands the rules for processing RDF data.
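One way to picture this (a sketch with illustrative logic, not the LDP
spec's actual processing model): instead of one media type per interaction
protocol, the agent looks for an rdf:type triple in the payload itself and
applies the matching rules.

```python
# Real LDP namespace URIs; the dispatch logic below is hypothetical.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
LDP_CONTAINER = "http://www.w3.org/ns/ldp#Container"

def handle(resource_uri, triples):
    """Decide processing rules from the data, not from the media type."""
    types = {o for s, p, o in triples if s == resource_uri and p == RDF_TYPE}
    if LDP_CONTAINER in types:
        return "apply container interaction rules"
    return "store as an opaque graph"
```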

>
>>>> 4. This somewhat parallels the situation in HTML, where the interaction
>>>> semantics are not in the media type but described in the payload --
>>>> hyperlinks and forms. Although unlike in HTML, our "forms" probably
>>>> only
>>>> need to cover a few hardcoded kinds of actions -- create a new resource
>>>> in this container, go to the next page, stuff like that.
>>> interaction semantics in HTML are in the media type,
>> Well, fair enough, but my point is that you don't need to introduce new
>> media types each time you build a new service with an HTML front-end,
>> because the interaction semantics in HTML are rich enough and generic
>> enough to work for all sorts of services. The same can work for RDF, with
>> the addition of a few vocabularies.
> see above, the fundamental difference is that HTML is driven by human
> operators.

See my comments above. We are going beyond humans to smart agents that
work *intelligently* on behalf of humans.

>
>>> what you refer to are
>>> the "human-oriented semantics" that are represented by anchor text and
>>> so
>>> forth. if interaction semantics weren't part of HTML itself, the web as
>>> we
>>> know it (and particularly any agents that crawl and index) would not
>>> exist.
>> I didn't claim that HTML has no interaction semantics. I claimed that the
>> semantics of RDF representations are in the vocabularies used within the
>> graph, and that it is possible to define vocabularies that specify
>> interaction semantics, and that therefore one doesn't need to introduce
>> new media types in order to enable RESTful interactions on the web. One
>> needs new vocabularies. (Such as the terms that LDP introduces.)
> vocabularies are important, but vocabularies are just about data.

They are all about the data and the semantics used to make sense of the
data.

> protocols are more than data; they are based on exchanging data, but they
> add a layer of interactions, and rules governing those, and peers engaging
> in conversations governed by those rules. conversations need to carry that
> context, or you have those situations like the one that started this
> thread: is a certain bunch of data to be interpreted according to one set
> of rules, or another set. maybe this scenario helps:
>
> i am creating a blog post about AtomPub. i want it to contain sample code.
> i can POST an <entry ..../> labeling it as application/atom+xml to the
> collection for creating the actual blog post, and the server will take
> this as a "entry resource" and interpret the XML for populating some
> metadata fields. after that, i can POST an <entry ..../> labeling it as
> application/xml to the same collection for creating something that has the
> exact same content as the blog post itself, but is not interpreted as
> having the same interaction semantics, it is just a "media resource". no
> amount of sniffing in the world would help to make this interaction
> possible if i hadn't labeled my requests by the proper conversational
> context ("i am POSTing an entry resource", "i am POSTing a media resource
> that just happens to look like an entry resource, but please don't get
> confused by that, just treat it as random XML"). my apologies for using
> XML here, but there really isn't anything in this scenario that's in any
> way specific to XML; it's all about establishing and communicating
> conversational context.

Very Web 2.0, circa 2000 :-)

XML as your example isn't the problem. You are not homing in on the fact
that a data packet can deliver an entity-relationship-graph-based
payload. This payload is endowed with machine-discernible entity
relationship semantics. This is fundamentally what RDF is about; Linked
Data principles, when applied to RDF, just make it webby (via hyperlinks
denoting anything rather than just documents), such that the range of
data access scales to the expanse of the Web :-)
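For contrast, Erik's AtomPub scenario above can be sketched like this
(hypothetical server-side logic; the function and return strings are
illustrative): identical bytes are handled differently depending only on
the label the client attached.

```python
# The same Atom entry body is POSTed twice; only the Content-Type differs.
ENTRY = '<entry xmlns="http://www.w3.org/2005/Atom">sample code</entry>'

def post_to_collection(content_type, body):
    """Hypothetical AtomPub collection handler: dispatch on the label."""
    if content_type == "application/atom+xml":
        return "create entry resource; interpret Atom metadata"
    # Anything else, including application/xml, is an opaque media resource.
    return "create media resource; treat body as opaque"
```

No amount of sniffing the body could distinguish the two requests; the
conversational context lives entirely in the label.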


Kingsley
>
> cheers,
>
> dret.
>


-- 

Regards,

Kingsley Idehen	      
Founder & CEO 
OpenLink Software     
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen







Received on Thursday, 8 November 2012 20:42:13 UTC
