- From: Jimmy Cerra <jimbobbs@hotmail.com>
- Date: Mon, 16 Jun 2003 00:58:49 -0400
- To: "'Sean B. Palmer'" <sean@mysterylights.com>
- Cc: <www-rdf-interest@w3.org>
>> I don't want to build a tokenizer for parsing
>> apostrophes, white spaces, and other strings on top of
>> another tokenizer, the XML processor.
>
> Why not?

It's another level of abstraction that I have to deal with. With only
XML, the character encoding, string parsing, entity normalization, and
even sometimes file IO are all handled by a third-party processor. I
don't have to worry about them, and I can concentrate on the model
extracted from the document. However, text-node processing has to be
done on top of all that - and it must be written by me (since XML
processors are general-purpose machines) - in order for the XENT model
to be extracted. With SAX that's sometimes trivial; however, it becomes
a real liability with DOM or XSLT processing. Also, incompatibilities
between the XML processor and the text-node processor have to be
tested, worked around, etcetera.

> > IMHO, using two or more different escaping methods
> > really mucks up the language.
>
> I'm not actually sure what you mean by this--could you expand, please?
> Actually, escaping should've been listed as a TODO in my original
> announcement. But since XENT uses XML/RDF, it's basically going to
> have to use entity escaping *instead* of the Python-esque \uHHHH
> method. No big deal, and not an issue that would get in the way of any
> standards track work on the format, IMO.

For instance, I suspect that &quot; and &apos; won't work: The XML
processor will normalize them to " and ' before the text-node parser
sees them, and the parser will puke on the misplaced characters. So you
must use a different escaping method - \', \", and \\ (or \u0027,
\u0022, and \u005C) - and that's butt ugly in XML (two separate
methods... <<shiver>> ).

> Hmm. I think that BCC is better since this thread is liable to go off
> topic...

Too late. :-)

--
Jimmy Cerra

] "I have learned these days, never to limit
] anyone else due to my own limited
] imagination." - Dr. Mae C. Jemison

> -----Original Message-----
> From: Sean B. Palmer [mailto:sean@mysterylights.com]
> Sent: Sunday, June 15, 2003 11:25 AM
> To: jimbobbs@hotmail.com
> Cc: www-rdf-interest@w3.org
> Subject: Re: XML Enriched N-Triples (XENT)
>
> > [...] the more experimentation, then the more likely some
> > good variants will be created.
>
> Quite. Whilst RDFCore are well past overdue going by their chartered
> timeline, and whilst the RDF Syntax specification is still not at CR,
> RDF/XML is only undergoing a bug fix from the 1999 original; it's a
> half-decade old technology coming to fruition. It may be too widely
> deployed now for any alternate serialization to seriously challenge
> it, and that, like it or not, is going to put off a lot of people from
> using RDF. It's a shame that the main barrier to alternate
> serializations, and hence RDF's adoption, is an historical/political
> accident.
>
> With any format like RPV or XENT, or your own hypothetical YARS (Yet
> Another...), the suitability of the language--how well it fits the
> requirements of those who use RDF--is unfortunately a small factor.
> The XML and RDF communities are full of a lot of people who have very
> strong opinions about lots of things--by necessity, though it tends to
> lead to some quite obsessive and heated likes/dislikes of various
> constructs.
>
> For example, as was noted to me on #rdfig [1], XENT itself is pretty
> much a mix of constructs from Notation3/N-Triples and RDF/XML mashed
> together into one proposal. I tried, of course, to take the best
> features from each approach, but the problem is that proponents of RDF
> serializations are usually very passionately one-sided about which
> method they prefer (talking from my possibly incorrect experience
> here). In other words, people tend to favor one serialization very
> much over the others.
> So whilst you'd think that a compromise between
> them would be a good idea, it'll probably just end up with almost
> every established member of the RDF community snubbing it :-)
>
> I note that even though QNames are used heavily in communications
> about SW vocabularies, they're viewed as harmful in the motivation
> section of RPV. We need them; we may as well deploy them.
>
> > On abbreviating element names for URIs, there has been
> > controversy [2].
>
> A better reference is:-
>
> http://www.w3.org/2001/tag/doc/qnameids-2002-07-15
>
> There's a lot of FUD surrounding QNames, but at the end of the day
> they just map prefixes to namespaces. For RDF, it's too handy an
> abbreviation mechanism for URIs to pass up. RDF/XML and Notation3
> would be lost without them, and N-Triples is basically too difficult
> to write because it doesn't have them.
>
> The TAG finding says that parsing costs of QNames in PCDATA may be
> high. I've proved that, for Python and SAX at least, the opposite is
> true. I suspect that this will be the case in many other languages
> too. The TAG, in the finding above, say that since the approach is
> widely deployed to good effect, it's "reasonable to use QNames in this
> way".
>
> > I don't want to build a tokenizer for parsing apostrophes,
> > white spaces, and other strings on top of another tokenizer,
> > the XML processor.
>
> Why not?
>
> > IMHO, using two or more different escaping methods
> > really mucks up the language.
>
> I'm not actually sure what you mean by this--could you expand, please?
> Actually, escaping should've been listed as a TODO in my original
> announcement. But since XENT uses XML/RDF, it's basically going to
> have to use entity escaping *instead* of the Python-esque \uHHHH
> method. No big deal, and not an issue that would get in the way of any
> standards track work on the format, IMO.
>
> > I like your ideas; [...]
>
> Thanks. Your feedback is appreciated.
>
> > CC: Tim Bray - as with the original message
>
> Hmm. I think that BCC is better since this thread is liable to go off
> topic...
>
> Cheers,
>
> --
> Sean B. Palmer, <http://purl.org/net/sbp/>
> "phenomicity by the bucketful" - http://miscoranda.com/
Received on Monday, 16 June 2003 00:58:58 UTC
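
The escaping concern raised in the reply above can be illustrated with a rough sketch. It is not code from either author, and it does not use real XENT syntax: the <stmt> element, the triple-like text, and the split_quoted helper are all hypothetical, chosen only to show that a conforming XML processor resolves &apos; before any text-node tokenizer runs, whereas a backslash escape passes through untouched.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: the <stmt> element and the triple-like text are
# illustrative assumptions, not taken from the XENT proposal.
entity_doc = "<stmt>ex:s ex:p 'It&apos;s here' .</stmt>"
backslash_doc = r"<stmt>ex:s ex:p 'It\'s here' .</stmt>"

# The XML processor replaces &apos; with a literal apostrophe while parsing.
entity_text = ET.fromstring(entity_doc).text
# A backslash escape means nothing to XML, so it survives as written.
backslash_text = ET.fromstring(backslash_doc).text


def split_quoted(s):
    """Naive tokenizer (assumed for illustration): split on apostrophes."""
    return s.split("'")


print(entity_text)                # the entity is already gone
print(split_quoted(entity_text))  # the literal is cut in two at the apostrophe
print(backslash_text)             # the backslash is still there to be handled
```

In the first case the text-node tokenizer sees three apostrophes and has no way to tell which one was escaped, which is why a second, text-level escaping scheme ends up being needed.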