- From: Dave Beckett <dave.beckett@bristol.ac.uk>
- Date: Tue, 22 Jan 2002 11:50:17 +0000
- To: Sjoerd Visscher <sjoerd@w3future.com>
- Cc: www-rdf-comments <www-rdf-comments@w3.org>
Sorry for the delay in replying.

>>>Sjoerd Visscher said:
> I'm sorry to mail to you directly about this, but the www-rdf-comments lists
> seems to contain only spam.

Sure; I'm reading that list anyway.

> I like the direction the new RDF/XML Syntax Specification is going. However,
> I had the feeling the intermediate SAX-like model was a step too much. So I
> tried to express the RDF/XML Grammar directly in Infoset terms, using
> http://www.w3.org/2001/04/infoset

I did consider using infoset items directly and I'm trying to think
exactly what my reasoning was then. It was something to do with the
fact that 90+% of the RDF/XML parsers use streaming SAX events, and
familiar forms of grammars (BNFs etc.) are a serialisation, so it seemed
much more natural, when presenting to parser writers, to give a grammar
based on concepts very close to the software. i.e. if you have SAX
events, you can skip most of the detail and just look at what to do
with sequences of them.

[Although we are being SAX-orientated, this is not a required parsing
method; any other that produces the same output for the same input and
passes the test cases is also OK.]

I was looking at the XPath nodeset as a starting basis and although it
was fine for XPath, for this problem it needed new nodes (Identifier)
and some node properties (identifier-type) as described in
http://www.w3.org/TR/2001/WD-rdf-syntax-grammar-20011218/#section-Data-Model

>
> It turned out to be quite straight forward, it looks a lot like the Relax NG
> Schema. You'll find it attached. I tried to use the same notation as the
> Spec uses, i.e. classname(propertyname restriction [,...]) Properties that
> have no restrictions aren't shown, f.e.:
>
> nodeElement = Element(
>   attributes=set((idAttr | aboutAttr)?, bagIdAttr?, propertyAttr*),
>   children=propertyEltList)
>
> Which means that the namespaceName and localName are unrestricted.
> Unrestricted means unrestricted from the RDF Grammar point of view,
> properties like 'parent' are already restricted by the infoset
> specification.

I think, although precise, for a parser app taking in SAX events one
by one, it is too much to expect it to match big lumps of XML like the
above. It isn't clear, given a particular event, what to do then or
when the next one arrives - i.e. the state machine.

Not sure what you mean by unrestricted. Some things are allowed, some
are restricted, some are forbidden (like non-namespaced prefixed
attributes). Does omitting the namespaceName and localName mean that
any values are allowed? No. We need to be more precise than that.

> Some specific features:
>
> ws = Character(
>   elementContentWhitespace=Boolean.true)
>
> Boolean.true is defined in the rdf version of the infoset.
>
> parseTypeLiteralPropertyElt = Element(
>   attributes=set(idAttr?, parseLiteral))
>
> Here the children property is unrestricted.

This area - parseType literal - is still under consideration, so I
can't really make the definitive change here until we have decided
what is allowed inside it. There are *lots* of issues here for
embedded XML - namespaces, xml:base, XML Canonicalisation, ... and we
aren't yet at a stage to resolve it.

> It is also easy to check what Infoset features remain unrestricted:
> prefixes, namespace declarations, if attributes are specified in a DTD or
> not, CDATA, etc. And which are restricted: PI's or comments are not allowed,
> except inside the parseTypeLiteralPropertyElt, and no document type
> declaration. But maybe that is a too strict translation.

RDF/XML doesn't care about/use such things - in the next grammar I will
add that other infoset items are ignored (some of these don't have SAX
events).
I expect the list will be as follows:

  Processing Instruction
  Unexpanded Entity Reference
  Comment
  Document Type Declaration
  Unparsed Entity
  Notation

-- reading http://www.w3.org/TR/xml-infoset/#infoitem

>
> Kind regards,
>
> Sjoerd Visscher
> w3future.com

Original attachment removed; see
http://lists.w3.org/Archives/Public/www-rdf-comments/2001OctDec/att-0391/01-infoset2rdf.txt

Thanks for the feedback

Dave
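
The point made in the message about handling SAX events one at a time, and about ignoring items such as processing instructions, can be illustrated with a rough sketch. This is not the grammar from the Working Draft: the class name `StripingHandler`, the helper `classify`, the sample document and the simple root/node/property alternation are all assumptions made for illustration, and parseType, property attributes and error handling are left out entirely. It uses Python's standard `xml.sax` module.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class StripingHandler(ContentHandler):
    """Hypothetical handler: each event is handled using only a small stack."""

    def __init__(self):
        super().__init__()
        self.stack = []    # one entry per open element: "root", "node" or "property"
        self.seen = []     # (kind, namespaceName, localName) in document order
        self.ignored = []  # targets of processing instructions that were skipped

    def startElementNS(self, name, qname, attrs):
        ns, local = name
        if not self.stack:
            # Outermost element: rdf:RDF wrapper, otherwise a bare node element.
            kind = "root" if (ns, local) == (RDF_NS, "RDF") else "node"
        elif self.stack[-1] in ("root", "property"):
            kind = "node"
        else:
            kind = "property"
        self.stack.append(kind)
        self.seen.append((kind, ns, local))

    def endElementNS(self, name, qname):
        self.stack.pop()

    def processingInstruction(self, target, data):
        # Record and otherwise ignore; it does not change the state.
        self.ignored.append(target)


def classify(xml_bytes):
    """Parse xml_bytes with namespace processing on and return the handler."""
    handler = StripingHandler()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler


SAMPLE = b"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <?hint ignore-me?>
  <!-- comments never reach a plain ContentHandler -->
  <rdf:Description rdf:about="http://example.org/a">
    <ex:knows>
      <rdf:Description rdf:about="http://example.org/b"/>
    </ex:knows>
  </rdf:Description>
</rdf:RDF>"""

if __name__ == "__main__":
    result = classify(SAMPLE)
    for kind, ns, local in result.seen:
        print(kind, ns, local)
    print("ignored PIs:", result.ignored)
```

The handler never looks ahead or matches a whole element subtree; the decision for each start-element event depends only on the state at the top of the stack. The comment in the sample produces no callback at all on a plain `ContentHandler`, which is one case of an infoset item with no corresponding SAX event here.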
Received on Tuesday, 22 January 2002 06:50:27 UTC