Fwd: Re: [RSS-DEV] To RDF, or not to RDF? That is the question! from Joseph Reagle on 2002-07-25 (www-archive@w3.org from July 2002)

From: Joseph Reagle <reagle@w3.org>
Date: Thu, 25 Jul 2002 13:10:52 -0400
To: www-archive@w3.org
Message-Id: <200207251310.52580.reagle@w3.org>
----------  Forwarded Message  ----------

Subject: Re: [RSS-DEV] To RDF, or not to RDF?  That is the question!
Date: Thu, 25 Jul 2002 13:06:36 -0400
From: Joseph Reagle <reagle@mit.edu>
To: rss-dev@yahoogroups.com

On Wednesday 24 July 2002 07:37 pm, burton@openprivacy.org wrote:
> I am having the same problem with the modules I am working on right now.
> When I use all of them at once (and they are RDF) the format is NOT
> simple (far from it).  However when I simplify the format, make it
> nested, simple, and more 'XML friendly' the format is easy to use but
> ceases to be RDF.

I've been through the XML v. RDF wringer a few times, and I've been thinking
about this very issue since I'd like to see RSS become more widely
(interoperably) used. In my understanding, the issues fall into three
categories: the model, the syntax, and the application. I'll give some
examples from my experience with XML Signature (xmldsig).

THE MODEL: How do you take your pain?

One of the difficulties with many applications, particularly XML, is that
people tend to "write syntax first, argue semantics later." The nice thing
about XML is I can quickly write some syntax, get about 80% of the
semantics/processing out there quickly, and worry about the other stuff
later. Ah, but that "other stuff" can end up also causing you 80% of your
headaches [1], particularly in the security space. In xmldsig, I thought
about the KeyInfo model and had a decent understanding of the semantics
which informed the spec, but wasn't explicitly represented. KeyInfo was
supposed to contain the key (and other optional information) used to
validate a signature. For instance, I could have:
  <KeyInfo><KeyName/><KeyValue/></KeyInfo>
I know there is some key, used to validate the signature, and it has
properties (e.g., names) and a value. However, after a while folks asked, "if
there is only one validating key, I can understand it having more than one
name associated with it, but there should be only one KeyValue, why does
the schema permit more than one?" This was an error in the schema on my
part, but it then invited debate between those that wanted to preserve the
existing "singular semantic" of the text and those that wanted to continue
to use the liberal schema definition in which KeyInfo is a bucket of key
stuff with no necessary correspondence between any of the children. This
was an unfortunate argument to have that late in the game. *So* the model
(be it in RDF, UML, or some other thing) is important to the hygiene,
interoperability, and reuse of your application. Of course, you have to
balance the upfront cost of designing a perfect model against the later cost
of snafus like this one [3]. I think RDF itself is similar. There's a global
benefit of "semantic interoperability" [4] (once the network effect kicks
in) but with a localized and upfront comprehension cost.
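
To make the two readings of KeyInfo concrete, here is a minimal sketch in
Python. It is not part of xmldsig: the function names are made up for
illustration, the namespace is omitted as in the snippet above, and the
"singular" rule is just one way to state the constraint that the text
implied but the schema did not enforce.

```python
import xml.etree.ElementTree as ET

def key_values(keyinfo_xml):
    """Return the KeyValue children of a KeyInfo element."""
    root = ET.fromstring(keyinfo_xml)
    return root.findall("KeyValue")

def valid_as_bucket(keyinfo_xml):
    # Liberal reading: KeyInfo is a bucket of key stuff, so any mix of
    # children is acceptable as long as the XML parses.
    ET.fromstring(keyinfo_xml)
    return True

def valid_as_single_key(keyinfo_xml):
    # Singular reading: every child describes the one validating key, so
    # several names are fine but at most one KeyValue is allowed.
    return len(key_values(keyinfo_xml)) <= 1

one_value = ("<KeyInfo><KeyName>a</KeyName><KeyName>b</KeyName>"
             "<KeyValue>k1</KeyValue></KeyInfo>")
two_values = ("<KeyInfo><KeyName>a</KeyName>"
              "<KeyValue>k1</KeyValue><KeyValue>k2</KeyValue></KeyInfo>")

for doc in (one_value, two_values):
    print(valid_as_bucket(doc), valid_as_single_key(doc))
```

Both documents pass the bucket check, but only the first passes the
single-key check. An explicit model would have settled which of the two
checks applies before the schema was written.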

[1] http://goatee.net/2002/07#_11th
[2] http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001JanMar/0117
[3] http://goatee.net/2002/07#_18mo
[4] http://www.w3.org/TR/md-policy-design#_Applications_and_Semantics

THE SYNTAX: Sugar or Salt

While I now would've preferred a better and more widely understood data
model for xmldsig, I unfortunately approached RDF just like most approach
XML: I started fidgeting with the syntax first! If you haven't given much
thought to the model, the RDF syntax is unmotivated and ugly: syntactic
salt. For instance, in xmldsig we had a <SignedInfo> element which includes
digests of the actual things being signed and the canonicalization and
signature method. I was never sure how to model this because the reason
this syntax existed was for processing and security purposes: you need to
create a bucket of bits to sign, including the signature method, to prevent
attacks. This wasn't so much about modeling an assertion, but marshaling a
bunch of syntax to be hashed. However, I think xmldsig (and encryption even
more so) are the exceptions. They are really octet and XML processing specs
-- less so applications written in XML. To those that are unconcerned about
the model or those that aren't terribly good at it -- I count myself in
this group -- the salt can be minimized. I thought RSS 1.0 did this well.
Additionally, things can be sweetened up a bit. As many folks have already
suggested, I feel RSS 1.0 could've done a better job explaining what those
constructs mean to someone who doesn't give a fig about RDF.
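
As a rough illustration of why SignedInfo is "a bucket of bits to sign"
rather than an assertion, here is a sketch in Python. It is only a toy: the
flat text serialization stands in for a canonicalized <SignedInfo>, the
method names are placeholder strings, and HMAC with SHA-256 is an assumption
chosen for brevity, not what the spec mandates.

```python
import hashlib
import hmac

def digest(data: bytes) -> str:
    """Digest of one referenced item (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()

def signed_info_bytes(c14n_method, signature_method, referenced):
    # Hypothetical flat serialization standing in for a canonicalized
    # <SignedInfo>; real xmldsig canonicalizes XML instead.
    lines = ["c14n=" + c14n_method, "sigmethod=" + signature_method]
    for name, data in referenced:
        lines.append("ref=" + name + ";digest=" + digest(data))
    return "\n".join(lines).encode("utf-8")

def sign(key: bytes, info: bytes) -> str:
    # The signature covers the whole bucket, method names included.
    return hmac.new(key, info, hashlib.sha256).hexdigest()

key = b"shared-secret"
items = [("item1", b"<item>hello</item>")]

info = signed_info_bytes("example-c14n", "hmac-sha256", items)
signature = sign(key, info)

# An attacker who swaps the declared signature method changes the signed
# bytes, so the original signature no longer matches.
tampered = signed_info_bytes("example-c14n", "weaker-method", items)
unchanged_ok = hmac.compare_digest(signature, sign(key, info))
tampered_ok = hmac.compare_digest(signature, sign(key, tampered))
print(unchanged_ok, tampered_ok)
```

The point is that the method names sit inside the signed bytes purely for
processing and security reasons, which is why the structure resists being
modeled as a statement about anything.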

THE APPLICATIONS: How's it taste?

This is the critical bit, of course: forgetting historical and philosophical
arguments, do the content syndication applications work and interoperate?
The better your model, the better you are served -- but nothing is perfect.
To those that don't like it bitter, the brew can be sweetened. And I believe
a requirement on future work should be that a future RSS can be easily used
by both "fast custom crafted XML" and "flexible generic RDF" applications.
<smile/>

--

Regards,          http://www.mit.edu/~reagle/
Joseph Reagle     E0 D5 B2 05 B6 12 DA 65  BE 4D E3 C1 6A 66 25 4E

* This email is from an independent academic account and is
  not necessarily representative of my affiliations.

-------------------------------------------------------

-- 

Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature/
W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/
Received on Thursday, 25 July 2002 13:10:53 UTC