RE: Internal DTD Examples Invalidate the RDF/XML Documents

From: Dennis E. Hamilton <dennis.hamilton@acm.org>
Date: Thu, 2 Oct 2003 17:01:51 -0700
To: "Frank Manola" <fmanola@acm.org>
Cc: <www-rdf-comments@w3.org>
Message-ID: <FFEPLLNFAHGBKNENFGPAEELDDCAA.dennis.hamilton@acm.org>

Hi Frank,

When I saw that the OWL RDF Schema uses exactly the same technique in a published RDF, I dug into this more deeply and asked a related question at 

	<http://lists.w3.org/Archives/Public/public-webont-comments/2003Oct/0001.html>.

Here's my understanding:

1.	RDF/XML requires that the XML be well-formed.  Yes.  And the presence of the XML declaration seems to be an assertion that what follows will indeed be well-formed XML (including having a single root element).

2.	I did not know that a Document Type Declaration was required for RDF/XML.  I haven't checked the latest specification.  That is a surprising requirement, since it is not a requirement for XML 1.0 and in particular for XML 1.0 documents that are only intended to be well-formed.

It is also a surprising requirement since a Document Type Declaration can be taken as an assertion of validity.  A non-validating processor is not required to confirm it, but I had always taken it as a form of promise.   

3.	I think that is the disconnect for me. 

It is simply very peculiar to have a practice that involves using a Document Type Declaration that establishes a DTD for which there are no XML valid documents.

SOMEHOW, THE PRACTICE NEEDS TO BE MADE EXPLICIT.  It is weird to think that someone won't try the technique in the examples (I did), and it is even more startling to have an XML editor complain when fed the OWL RDF Schema. 
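
To make the situation concrete, here is a minimal sketch using the non-validating parser in Python's standard library. The document, the ex: namespace, and the helper name datatype_of_count are hypothetical and only follow the pattern of the Primer examples. The internal DTD subset declares one entity and no element types, so a non-validating processor expands &xsd; without complaint, while a validating processor would have no declaration for rdf:RDF or anything beneath it.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/terms#"  # hypothetical vocabulary namespace

# Hypothetical document in the style of the Primer examples: the internal
# DTD subset declares a single entity and no element types at all.
DOC = """<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [<!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/item">
    <ex:count rdf:datatype="&xsd;integer">27</ex:count>
  </rdf:Description>
</rdf:RDF>
"""


def datatype_of_count(xml_text):
    """Parse without validation and return the expanded rdf:datatype value."""
    root = ET.fromstring(xml_text)  # expat checks well-formedness only
    prop = root.find("{%s}Description/{%s}count" % (RDF_NS, EX_NS))
    return prop.get("{%s}datatype" % RDF_NS)


# The entity reference has been replaced by its declared value.
print(datatype_of_count(DOC))
```

The parse succeeds because the document is well-formed. Nothing here checks validity, which is exactly the behavior the examples silently rely on.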

4.	There are XML editors and other tools that validate whenever a Document Type Declaration is present.  And there are other processors, such as IE 6.0, that will not even display the XML if it specifies a Document Type Declaration and is not valid.  Something in the implementation of IE 6.0 keeps it from failing on an RDF, though.  (It will display the OWL RDF Schema at <http://www.w3.org/2002/07/owl> with no problem.  I'm afraid to ask what the XSL instruction is for, though.)

So I have to deal with all of the error messages if I don't make the Document Type Declaration one for which the RDF is valid XML.  Since validation comes before content, it is a hack to notice that the XML is for an RDF and programmatically operate differently with regard to validation.  I haven't checked to see what the various DOM processors do, but this has to be a stumbling block for some of them.

(By the way, I successfully did external DTDs for two RDFs I just wrote for a class.  Once I got the hang of it, it wasn't too difficult.  But each external DTD is clearly specific to the particular RDF that I built.  I could do a generic DTD that would work across multiple RDFs on the pattern that I am willing to use, but it doesn't work for arbitrary RDFs and it is a pretty "loose" DTD since I used <!ELEMENT rdf:Description ANY> as a way to get around a lot of customization fuss.)

-- Dennis

-----Original Message-----
at <http://lists.w3.org/Archives/Public/www-rdf-comments/2003OctDec/0006.html>
From: Frank Manola [mailto:fmanola@acm.org]
Sent: Thursday, October 02, 2003 07:28
To: dennis.hamilton@acm.org
Cc: www-rdf-comments@w3.org
Subject: Re: Internal DTD Examples Invalidate the RDF/XML Documents


Hi Dennis--

The use of internal DTD subsets in these Primer examples was merely to 
illustrate the use of entities as an abbreviation mechanism. [ ... ]
 (the Primer examples are valid RDF/XML according to the W3C RDF 
validator).  Moreover, it's perfectly OK to use internal DTD subsets to 
define entities in XML you don't intend to validate (there's a 
well-formedness constraint in the XML spec that covers this situation) 

<dhnote>Granted</dhnote>

and, technically, RDF/XML *without* a document type declaration isn't 
valid, so it isn't exactly the introduction of an internal DTD subset 
that causes the RDF/XML to be invalid.

<dhnote>
Why?  Where did that come from?  
XML doesn't require there to be a document type declaration.
In light of the following, it is screwy to require it, and screwier not
to visibly declare a convention that a non-validating XML processor must 
be used for RDF.
</dhnote>

It's been known for some time that the current grammar of RDF/XML isn't 
amenable to description in a DTD (or an XML Schema).  Producing such a 
grammar was considered by the RDF Core WG, but it was decided that the 
changes would be so extensive as to be outside the current WG's charter.
This has been added to the RDF Core postponed issues list at:

http://www.w3.org/2000/03/rdf-tracking/#rdfms-validating-embedded-rdf

<dhnote>
Right.  You will never get a generic DTD to handle the case where a tag QName
is understood to be a shorthand for a URI based on a concatenation assumption 
concerning namespaces.  I don't expect that XML Schema will get far with it
either.  [This observation is not strictly true, in working with the basic,
common vocabularies, but it certainly holds as a practical matter.]
</dhnote>
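
A minimal sketch of the concatenation assumption mentioned in the note above, again with Python's standard library. The two one-element documents and the helper name element_uri are hypothetical. Both documents bind different prefixes to the same namespace, so the namespace name plus local name gives the same URI for both, yet a DTD matches on the literal prefixed name and would need a separate declaration for each spelling.

```python
import xml.etree.ElementTree as ET

# Same namespace, two different (arbitrary) prefixes.
DOC_A = '<a:RDF xmlns:a="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>'
DOC_B = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>'


def element_uri(xml_text):
    """Return namespace name + local name for the root element."""
    tag = ET.fromstring(xml_text).tag  # ElementTree reports '{namespace}local'
    namespace, local = tag[1:].split("}", 1)
    return namespace + local


# Both spellings resolve to one URI, although "a:RDF" and "rdf:RDF"
# are different element type names as far as a DTD is concerned.
print(element_uri(DOC_A))
print(element_uri(DOC_B))
```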

We also received related Last Call comments on this subject labeled as 
xmlsch-10 and xmlsch-12, which can be found at
http://www.w3.org/2001/sw/RDFCore/20030123-issues/

<dhnote>
Although I have great sympathy for xmlsch-10, that is not the concern that I raise.  Satisfying the concerns expressed in xmlsch-10 would obviate my concern, but I was assuming that the RDF/XML notation for abbreviating mappings to URIs and triples is a foregone conclusion.  What I am saying is that if you are not going to operate in the XML stack according to what other users of XML processors might expect when they see an XML declaration, you need to say so in a very clear way.
	Another way to put it in xmlsch-12 terms is, if for some reason RDF/XML is not going to be consistent with Colloquial XML, then you must say so.  My sense is that it is not, and the examples and practices are not that innocent.  It gives pause that XML 1.0 validators will claim deficiencies in XML documents that RDF processors will assert are [RDF] valid.
	I think the interoperability issues around the "stack" up through Web Services to the Semantic Web require a clear statement.  I would treat that as separate from the xmlsch-10 and xmlsch-12 comments.
</dhnote>

Do you feel this issue needs additional discussion (in the Primer or 
some other RDF spec)?
<dhnote>Yes</dhnote>

--Frank


Dennis E. Hamilton wrote:
in <http://lists.w3.org/Archives/Public/www-rdf-comments/2003OctDec/0005.html>

> I have been reading over the current RDF Primer Working Document, 
> 
> 	<http://www.w3.org/TR/2003/WD-rdf-primer-20030905/>
> 
> And I notice that the introduction of internal DTD subsets to provide entity definitions (e.g., for &xsd;) results in the XML document being [DTD] invalid.
> 
[ ... ]
> 
> Dennis E. Hamilton
> ------------------
> AIIM DMware Technical Coordinator
> mailto:Dennis.Hamilton@acm.org | gsm:+1-206.779.9430
> http://DMware.info
>    ODMA Support: http://ODMA.info
> OpenPGP public key fingerprint BFE5 EFB8 CB51 8781 5274  C056 D80D 0C3F A393 27EC
Received on Thursday, 2 October 2003 20:04:59 GMT