Re: Links and graphs

Hi Erik,

On 20/12/2012 17:28, Erik Wilde wrote:
> hello graham.
>
> i really appreciate the time you're taking for this discussion! and first, just
> a side note, before getting to the interesting use cases, because i think we
> really have to start from the right place to get this right.

Thank you for your constructive engagement here.  I'm finding it very useful.  :)

(I'm dropping the PROV list from this side-discussion.)

>
> On 2012-12-17 16:47 , Graham Klyne wrote:
>> On 17/12/2012 17:17, Erik Wilde wrote:
>>> there seems to be this lingering feeling that "RDF is different"
>>> because "it's semantics, and not a format".
>> I wouldn't put it that way, but RDF is different from most of the
>> XML-based formats used.  It is a common format that can carry (and
>> merge) arbitrary semantics.  That's not true of most other data formats
>> on the web.  I don't claim it's unique, and certainly there could be
>> other formats with similar capabilities, and they too could be used in a
>> similar way to RDF provided that there are media types to distinguish
>> the base format.
>
> i absolutely don't want to say that RDF doesn't have some very strong abilities
> in areas where previous formats fell short. it's great as schema-less and
> merge-friendly data format, something that XML can be used for, but it gets so
> complex that nobody (i know of) ever went all the way of actually implementing
> it in the most generic way. but while RDF's mechanics are much better designed
> for doing this kind of thing, they are not really different. just better designed.

I think we may have to agree to disagree about some of the details of RDF vs XML 
(more later), but from what you say I think we can agree on some broad 
properties of RDF vs XML that pertain to the service description discussion. Let 
me try:

1. RDF makes it relatively easy to create open-ended service descriptions in 
ways that one would not usually choose to attempt with "bare" XML.

2. RDF has no exclusivity in this respect; one could use XML in alternative ways 
(and I'm thinking this probably means alternative XML-based formats - Atom might 
be a candidate) to achieve the same effect.   To this extent, RDF is just an 
existence proof of the possibility.

I also very recently discovered your "profile" draft 
(http://tools.ietf.org/html/draft-wilde-profile-link) which suggests to me a 
similar line of thinking.  (I recall you mentioned "profile" earlier in our 
discussion, but I didn't realize then that there was a specific proposal.)  A 
question this begs is the extent to which a profile attribution is necessary, or 
can be left implicit in the content - I can see possible advantages either way.

I think there's a basis here we can use to progress our service description 
discussion:  I'm not claiming there aren't other ways to achieve what RDF 
achieves, but that of what's currently available, RDF lends itself particularly 
to an open-ended style of description.

...

And so to some specifics that I believe don't really affect the above:

>>> ... pretty much all data formats (apart from those
>>> using ad-hoc syntaxes) work exactly that way, so there really is no
>>> need to deviate from what has been working in the past 20 years.
>>> application/atom+xml is atom semantics using xml syntax,
>> I would argue that, technically, XML is *not* a syntax.  It's a family
>> of syntaxes.  It's a syntactic framework.  Each XML-based document type,
>> defined by DTD, XSD, RelaxNG or other means is a syntax, which may have
>> more or less associated semantics.  RDF/XML is an XML-based syntax that
>> provides a semantic framework for conveying arbitrary semantics; i.e.
>> descriptions of arbitrary things.
>
> you're right that technically speaking, XML actually is not all that relevant
> anymore, pretty much every relevant "XML technology" out there today is an
> infoset or XDM technology, which are the two existing equivalents of the RDF
> abstract model.

I agree that XML infoset serves the same purpose as the RDF abstract model, and 
that alternative surface syntaxes can yield the same underlying infoset.  Thus 
far, XML and RDF are similar as you say.  (I wouldn't say XML is "not relevant" 
- the infoset is clearly deeply rooted in XML qua syntax.)

> ... XML is just one syntax for that (EXI is another one), as is
> RDF/XML or turtle for RDF. in the same way as looking at random well-formed XML
> data doesn't tell you anything about the data other than its tree shape (the
> second XML in http://dret.net/lectures/xml-fall10/basics#%286%29 is what i use
> as illustration in XML intros), looking at random RDF data doesn't tell you
> anything about the data other than its graph shape.

Yes, so far...

>>> ... personally, i don't think providing multiple
>>> syntaxes is a very good idea, but that's just my personal opinion. but
>>> as a matter of fact, pretty much all structured data services today use
>>> the exact same setup of "parse the data based on some general-purpose
>>> underlying syntax", and then "start processing the data based on some
>>> assumption what kind of vocabulary is expressed within that syntax." RDF
>>> is doing exactly the same as XML or JSON, just with a different syntax
>>> and metamodel.
>> I disagree that RDF is doing the same as XML.  It doesn't have any
>> notion of syntactic constraints (like DTD, XML schema, etc.)  JSON is a
>> more interesting case, as it doesn't (yet) have a way to impose schema
>> constraints.
>
> well, if you don't want validation, just use well-formed XML and happily work
> with unconstrained XML trees. the fact that RDF does not have a validation model
> (so far) is not really a feature, it's more of a problem, because it makes it
> much harder to declaratively expose the preconditions of a service (POST *this*,
> and you're fine, otherwise, you're hosed).

I think this is where we start to diverge, if only slightly.

With XML, one has tools to impose further syntactic constraints on the tree, and 
these are generally needed if one wants to associate meaningful semantics with 
an XML document.  (I don't claim it's necessary to have further syntactic 
constraints - Peter Patel-Schneider et al made a proposal some years ago to 
provide an RDF-style semantics for bare XML, but as far as I'm aware it gained 
no traction.)

With RDF, one has a basic semantic framework provided, and one can refine the 
use of RDF as defined by applying additional semantic constraints associated 
with the vocabulary terms used, from which flow "valid" inferences.

Some of the required outcomes of syntactic validation of XML can be achieved by 
semantic inference over RDF.  For example, one may need to test for sufficiency 
of information for a given purpose (such as a usable service description).  With 
XML, one often achieves this by imposing syntactic validity constraints.  With 
RDF one can use inference, coupled with a local closed world assumption,  which 
(if sound and complete) can tell whether sufficient information is present. 
Practically speaking, the effect is very similar, but the underlying mechanisms 
are somewhat different.

One consequence of the difference is that one can never use inference to decide 
that there is too much information present, which you can do with syntactic 
validation.
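
A minimal sketch of that sufficiency test, offered as an illustration rather
than a real reasoner.  It assumes triples are plain Python tuples, uses a single
RDFS-style subproperty rule as a stand-in for "inference", and treats absence
from the one graph in hand as "not stated" (the local closed world assumption).
All the ex: names and the REQUIRED set are hypothetical.

```python
SUBPROP = "rdfs:subPropertyOf"

# Hypothetical: properties a "usable" service description must state.
REQUIRED = {"ex:endpoint", "ex:mediaType"}


def infer(graph):
    """Close the graph under one rule: (p subPropertyOf q), (s p o) => (s q o)."""
    closed = set(graph)
    while True:
        subprops = {(p, q) for (p, pred, q) in closed if pred == SUBPROP}
        new = {(s, q, o)
               for (s, p, o) in closed
               for (sp, q) in subprops
               if p == sp}
        if new <= closed:  # fixpoint reached, nothing more to add
            return closed
        closed |= new


def missing_properties(graph, subject, required=REQUIRED):
    """Required properties with no value for `subject`, after inference.

    Only this one graph is consulted: that is the local closed world step.
    """
    present = {p for (s, p, _o) in infer(graph) if s == subject}
    return set(required) - present


vocabulary = {("ex:sparqlEndpoint", SUBPROP, "ex:endpoint")}
description = {
    ("ex:svc", "ex:sparqlEndpoint", "http://example.org/sparql"),
    ("ex:svc", "ex:mediaType", "text/turtle"),
}
extra = {("ex:svc", "ex:comment", "anything else at all")}

# Sufficient once the vocabulary is merged in.
print(missing_properties(description | vocabulary, "ex:svc"))
# Extra statements never make the description "invalid".
print(missing_properties(description | vocabulary | extra, "ex:svc"))
# Without the vocabulary, ex:endpoint cannot be inferred.
print(missing_properties(description, "ex:svc"))
```

Note that there is no way, in this style, to reject the graph for containing
the ex:comment triple; a schema validator could reject an unexpected element.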

>> And RDF *is* different from both in that it does have a (minimal) formal
>> semantics, which neither XML nor JSON have.
>
> you'd be surprised how much you could, if you wanted to, place in the XSD type
> layer, starting from the minimal built-in type layer of simple types and their
> derived types, and then continuing on to complex types and their derivation
> system. it's all horribly designed and not all that well supported in popular
> XML toolkits, so you don't really see many apps out there basing all their
> processing off the XSD type model, but you could do that, and then it would be
> the exact same thing (but only trees, of course, and with all the random other
> limitations XSD has). the point being: RDF is much better designed here, and
> thus people feel much more comfortable building their processing on that level,
> which is great and certainly a win. but it's not a unique feature of RDF; it's
> just that XSD has designed this level in a way that nobody wants to use it.

I agree that XSD has some associated semantics, but XSD is not XML.  Indeed, RDF 
semantics borrows from XSD semantics.  But, IIRC, there are no truth-value 
semantics here, but rather denotations of values.  The semantics of XSD elements 
used in an XML document don't lead to predictable semantics for the document.

So while, as you say, you *could* use XSD semantics to build a semantics around 
some defined XML constructs, you'd need additional elements to make a 
truth-value semantics, and I don't see how you could make it apply to 
arbitrary, open-ended XML.

>> It's insufficient semantics
>> to describe useful things in the world, but it's enough to provide some
>> basic ground rules for preserving any meaning that is conveyed, e.g.
>> when merging RDF from independent sources.
>
> it's the same thing with XML. if you were an XSD nazi and would always
> type-annotate everything and would only pass around things this way, you could
> build the exact same machinery. only you would claw your eyes out several times
> each day because of how hard it is to do, and because of running into all kinds
> of random limitations.

Maybe that's the difference:  with XML/XSD you'd have to type-annotate 
everything to end up with some comparable machinery.  With RDF, the machinery 
exists and becomes increasingly potent as additional type annotations are 
introduced.  Whether you see this as a difference in degree or fundamental 
nature is maybe a matter of taste?
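
To make "increasingly potent" concrete, here is a small sketch under stated
assumptions: graphs are sets of tuples with no blank nodes (so merging is plain
set union), and the only inference applied is the RDFS domain rule.  The ex:
terms are hypothetical.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def merge(*graphs):
    """Merge ground RDF graphs (no blank nodes): plain set union."""
    return set().union(*graphs)


def infer_types(graph):
    """RDFS domain rule: (p rdfs:domain C) and (s p o) entail (s rdf:type C)."""
    domains = {(p, c) for (p, pred, c) in graph if pred == RDFS_DOMAIN}
    inferred = {(s, RDF_TYPE, c)
                for (s, p, _o) in graph
                for (dp, c) in domains
                if p == dp}
    return set(graph) | inferred


# Two independently published graphs (hypothetical content).
data = {("ex:svc", "ex:endpoint", "http://example.org/prov")}
vocabulary = {("ex:endpoint", RDFS_DOMAIN, "ex:Service")}

before = infer_types(data)                    # no annotations, nothing new
after = infer_types(merge(data, vocabulary))  # one annotation, one new fact

print(sorted(after - merge(data, vocabulary)))
```

The data graph is usable on its own; merging in one domain annotation adds an
inference without any change to the data or to the machinery.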

> sorry for this divergence into the good old RDF vs. XML debate, but it really
> helps to look at things this way, because then we can learn where we can reuse
> established patterns of web architecture from a technology that has some design
> pain points (XML for schema-less merging) to one that's doing a much better job
> for these kinds of scenarios (RDF). just let's not assume we're doing something
> new here, where we should be applying design patterns in the same way that we
> have applied them before.

I find the debate is interesting, and it makes me realize that while I see 
differences, they are maybe finer distinctions than I first assumed.  And I 
completely agree with your point of using established and successful patterns. 
(Indeed, that's really why I'm engaging in this, to better understand what those 
established patterns really are, and how they apply.  I think I've come on a 
fair way, and I don't think I'm proposing to discard anything that is already 
deployed and delivering value.)

...

A diversion.

Since our earlier exchanges, I've become far more comfortable with NOT using Link: 
headers as an alternative to content-types.  Somewhere along the line, you 
mentioned late binding, and I think that's key, though I didn't immediately 
realize why - I just assumed that Link: headers could be dynamically generated too.

The example that has finally convinced me of your approach is considering a 
three-way scenario in which:

1. Client C requests a resource from S1
2. S1 provides a link to an information service (e.g. provenance) provided by S2
3. C then uses the service at S2

I had been focusing on client-server coupling.  Since one might regard 
information returned by a server as part of its content, using the Link: header 
relation type didn't seem to increase that coupling.  But what I initially 
overlooked was the coupling between S1 and S2:  like C, S1 should only know that 
S2 has some resource, and should not provide information about how to process it.
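
The scenario can be sketched roughly as follows.  This is a simulation with no
network access; the URLs, the "provenance" relation name, the media types and
the handler table are all illustrative assumptions, and the Link: parsing is
deliberately simplistic rather than a full RFC 5988 parser.

```python
import re

# Simulated response from S1: it only says "a related resource lives at S2".
S1_RESPONSE = {
    "headers": {
        "Content-Type": "text/html",
        "Link": '<http://s2.example.org/prov/doc1>; rel="provenance"',
    },
    "body": "<html>...</html>",
}

# Simulated S2: it alone decides what representation to serve.
S2_RESPONSES = {
    "http://s2.example.org/prov/doc1": {
        "headers": {"Content-Type": "text/turtle; charset=utf-8"},
        "body": "<http://s1.example.org/doc1> a <http://example.org/Document> .",
    },
}


def parse_links(header):
    """Return {rel: target} for a simple Link: header value."""
    pairs = re.findall(r'<([^>]*)>\s*;\s*rel="?([^";,\s]+)"?', header)
    return {rel: target for target, rel in pairs}


# Client-side handlers, keyed by the media type S2 actually returns.
HANDLERS = {
    "text/turtle": lambda body: ("rdf", body),
    "application/atom+xml": lambda body: ("atom", body),
}


def follow(rel, response, fetch):
    """Follow `rel` from `response`; choose processing only after S2 replies."""
    target = parse_links(response["headers"].get("Link", ""))[rel]
    reply = fetch(target)
    media_type = reply["headers"]["Content-Type"].split(";")[0].strip()
    return HANDLERS[media_type](reply["body"])  # late binding on media type


kind, body = follow("provenance", S1_RESPONSE, S2_RESPONSES.__getitem__)
print(kind)
```

If S2 later switches to serving Atom, only the entry in S2_RESPONSES changes;
nothing S1 emits needs to be touched, which is the S1/S2 decoupling in question.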

I expect none of this is new to you, but I thought it might help to point out 
the scenario as an exposition of possible coupling effects.

#g
--

Received on Saturday, 22 December 2012 12:36:04 UTC