
Re: RDF in XHTML

From: Murray Altheim <altheim@eng.sun.com>
Date: Tue, 17 Apr 2001 17:17:48 -0700
Message-ID: <3ADCDD2C.B2CA63C6@eng.sun.com>
To: Seth Russell <seth@robustai.net>
CC: Dan Brickley <danbri@w3.org>, Joshua Allen <joshuaa@microsoft.com>, "Sean B. Palmer" <sean@mysterylights.com>, Danny Ayers <danny@panlanka.net>, RDF Interest <www-rdf-interest@w3.org>
Seth Russell wrote:
> 
> From: "Murray Altheim" <altheim@eng.sun.com>
> 
> > I've never offered to solve world hunger, even for RDF flesh. I don't know
> > how to solve that one. The *only* way I can imagine (that wouldn't involve
> > an act of Congress) would be to have CDATA section nodes containing RDF be
> > notation-marked as RDF, such that they get passed off to an RDF schema
> > processor for *appropriate* processing. This isn't technically all that
> > difficult, but it's religiously and politically unlikely. IMO.
> 
> Actually this solution is rather attractive to me ... it has the smell of
> the right way to do it.  But I'm a babe in the woods where it comes to
> swimming in these waters.  Could you (or somebody) sketch the theological
> implications for us?

Theologically, it seems to me that few people in the W3C like the SGML 
approach to dealing with non-XML content, which would be to use notations.
Their way would be using XML namespaces, which unfortunately don't provide
the features that XML notations do. You'd use XML Schema datatypes, which
is a mite more complex.

The XML Schema approach is also markedly different: its aim is to *validate*
the content. Notation-based approaches simply indicate which previously
declared notation a specific entity is considered to be, "entity" in our
case being a CDATA-wrapped DOM node.

XML 1.0 got halfway there in supporting SGML notations, in that one can 
indicate the notation of element content, but one cannot do this for 
attribute content. Given that most theologians believe that element and
attribute content are both "document content", this was an unfortunate
oversight; covering attributes too would have let DTDs compete with XML
Schemas on a more level playing field. I'd like to see any update of XML
include notations on attributes, but I'm a bit cynical given the W3C's
dislike of DTDs.

But for our purposes here, what we have will do just fine. Check out the
following if you want to follow along in the XML spec as to what I'm
talking about:

   http://www.xml.com/axml/target.html#Notations

Basically, we can't simply put any markup that contains angle brackets
into any XML document without breaking validity. XHTML is not special
in this regard. But we can wrap such markup (such as RDF) in a CDATA
section. This means that it doesn't get well-formedness checking, which
would have to occur in the processor that receives the CDATA section
DOM node. But if this was an understood part of the process, we could
proceed.
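
A minimal sketch of that point, in Python with only the standard library.
The sample document and the is_well_formed() helper are made up for
illustration. The outer parser accepts the document even though the
CDATA-wrapped markup is broken, and only a second parse of the payload
catches the problem:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the markup inside the CDATA section is deliberately
# broken (unclosed elements, undeclared rdf: prefix).
doc = "<metadata type='rdf'><![CDATA[<rdf:RDF><oops>]]></metadata>"

# The outer parse succeeds: CDATA content is plain character data to it.
root = ET.fromstring(doc)
payload = root.text


def is_well_formed(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(repr(payload))            # the raw markup, untouched by the outer parser
print(len(root))                # no child elements were created from it
print(is_well_formed(payload))  # the receiving processor must check this itself
```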

In the DTD we'd have something akin to:

   <!NOTATION dc PUBLIC 
       "-//DCMI//NOTATION Dublin Core Metadata Element Set V1.0//EN"
       "http://dublincore.org/">
   <!NOTATION rdf SYSTEM "http://www.w3.org/1999/02/22-rdf-syntax-ns#">  
   <!NOTATION blat PUBLIC "-//doctypes.org//NOTATION Blat 1.0//EN"
       "http://www.doctypes.org/blat/1.0/">
   ...
   <!ELEMENT  metadata  ( #PCDATA ) >  <!-- really, a CDATA section -->
   <!ATTLIST  metadata
       type  NOTATION  (dc|rdf|blat)
   >
   ]><!-- end of DTD -->
   ...
   <head>
   <metadata type="rdf">
   <![CDATA[
     {rdf content}
   ]]></metadata>

The "(dc|rdf|blat)" list unfortunately can't be an open-ended list.
Each of the tokens in the list must be a declared NOTATION in the DTD,
otherwise you'll get an error. But given the purpose of DTDs is to create
constraints, this isn't too bad; this is essentially a contract to 
everyone on what we'd all accept in terms of available public notations.
We could add an empty parameter entity to allow it to be extended in a
document's internal subset for custom or development use, such as:

   <!ENTITY % Metadata.ext "">
   <!ATTLIST  metadata
       type  NOTATION  (dc|rdf|blat %Metadata.ext;)
   >

Then, as I mentioned above, the CDATA section DOM node (i.e., the content
of the metadata element) would be passed off to a processor which would
strip off the CDATA section wrapper and pass it to another XML parser
process, which would first well-formedness check it before sending it
off to the RDF processor. If the RDF was of a particular known application,
it would then process the content appropriately. Any decent engineer could
whip this up in fairly short order from commonly available tools. [Okay, 
so I've outlined the process...]
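
A rough sketch of that process in Python (standard library only), offered
as one possible shape rather than a spec. The handler registry, the
function names and the sample document are all hypothetical, and the
"RDF processor" is a stand-in that merely lists rdf:about values. Note
that ElementTree does not act on NOTATION declarations, so the sketch
dispatches on the value of the type attribute instead:

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def handle_rdf(payload):
    """Well-formedness check first, then hand off to the 'RDF processor'."""
    tree = ET.fromstring(payload)  # raises ET.ParseError if not well-formed
    # Stand-in for real RDF processing: collect the described resources.
    return [d.get("{%s}about" % RDF_NS)
            for d in tree.iter("{%s}Description" % RDF_NS)]


# Hypothetical registry mapping notation names to their processors.
NOTATION_HANDLERS = {"rdf": handle_rdf}


def process_metadata(document):
    """Pass the content of each <metadata> element to its notation handler."""
    results = []
    root = ET.fromstring(document)
    for elem in root.iter("metadata"):
        handler = NOTATION_HANDLERS.get(elem.get("type"))
        if handler is None:
            continue  # no processor registered for this notation
        # The outer parser has already removed the CDATA wrapper;
        # elem.text is the raw, unparsed content.
        results.append(handler((elem.text or "").strip()))
    return results


SAMPLE = """<html><head>
<metadata type="rdf"><![CDATA[
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.org/doc"/>
</rdf:RDF>
]]></metadata>
</head></html>"""

print(process_metadata(SAMPLE))
```

A real implementation would want a validating parser in front of this, so
that the type attribute is actually checked against the declared notations.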

> > Why do you need to have the RDF be *in* the XHTML file? Honestly, without
> > trying to sell you an XTM solution, this is precisely what XTM is good
> > for: mapping resources within XML files. [..good stuff snipped...]
> 
> Well for the general solution to describing resources with RDF we need to be
> able to read and write it ... were working on the writing ... the reading
> (first level) will need to be just as simple.  Ideally a browser plug-in or
> the browser itself can pop up a surfable user friendly window of the
> metadata.  Knowing the XTM tags and retrieving other resources is going to
> complicate that application an order of magnitude.  Imho, it's a deal
> breaker.

I wasn't trying to make it a bar to entry, only pointing out a possible
solution. RDF could be used for this too, but one would need a specific
application of RDF to standardize the semantics of the mapping, and you'd
have something *like* XTM then. And I'm not particularly interested in 
creating a custom RDF language to use for mapping web documents. I'll
bet somebody else is, and I'm not one to rain on anyone's parade here.
 
> > As has been mentioned in other threads and by other people, creating
> > external documents means more document management, less document
> > portability, the likelihood of metadata-document mismatch, etc. People
> > already spend too much time managing (or not managing) their document
> > sets. I'd hate to add to their burden.
> 
> I agree.  I seem to smell a consensus that the embedded way is the best ...
> it's just that there is this theological problem with it,  and it's
> currently politically incorrect.   Well shucks .. we stopped the war, didn't
> we, so this should be an easy piece.

Embedding DC won't be too bad. Embedding RDF will also require a software
solution that's a tad custom currently in the XML world: actually
paying attention to notation declarations and passing content off to
notation processors. Oh, and having the RDF notation processor know to
cast off the CDATA wrapper prior to beginning processing. There's no
spec that spells out this process; perhaps there should be. But I do
maintain that the solution I've described above is likely unpalatable
to the W3C. I'd be happy to be shown wrong, though.

Murray

PS. I'm cc'ing myself as I just did a substantial amount of the DTD work. :-)
BTW, isn't everyone on the cc list also on RDF Interest? Why not kill the cc
and rely on the listserver? I'm getting an awful lot of duplicates.
...........................................................................
Murray Altheim, SGML/XML Grease Monkey     <mailto:altheim@eng.sun.com>
XML Technology Center
Sun Microsystems, 1601 Willow Rd., MS UMPK17-102, Menlo Park, CA 94025

      the wood louse sits on a splinter and sings to the rising sap
      ain't it awful how winter lingers in springtimes lap -- archy
Received on Tuesday, 17 April 2001 20:16:14 GMT
