Hi Noah,

noah_mendelsohn@us.ibm.com <noah_mendelsohn@us.ibm.com> writes:

> This seems to me a slightly odd way of splitting things.

In the implementation of the parser that I used as an example (Xerces-C++), XML parsing and validation against the schema are handled in separate places so in effect every 'data' character that is part of a value that needs validation is traversed twice: first by the XML parser code then by the validation code. The whole point of this mental exercise was to show that content validation must be a lot cheaper than structure validation.

> Indeed, the whole
> point of our earlier-referenced XML Screamer work was to make sure you can
> come as close as possible to touching each such character no more than
> once.

That must have been some pretty tight integration of XML parsing and schema-based validation. For example when you validate, say a float, as an element value then you have to look for both legal float characters as well as '<'. If this float is a value of an attribute then you must watch for '"' instead of '<'. Or maybe there is a better way (I haven't gone through all the material you sent in your other email). Also I tend to believe that most existing parsers don't have this architecture.

-boris

--
Boris Kolpackov
Code Synthesis Tools CC
http://www.codesynthesis.com
Open-Source, Cross-Platform C++ XML Data Binding

Received on Thursday, 19 October 2006 19:17:55 GMT
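
The single-pass idea discussed in the message above (checking for legal float characters and for the terminating '<' or '"' in the same loop) could be sketched roughly as follows. This is only an illustration under simplifying assumptions, not how Xerces-C++ or XML Screamer actually do it: the names `ScanResult` and `scan_float` are hypothetical, and the sketch ignores INF/NaN, surrounding whitespace, and entity or character references, all of which a real parser would have to handle.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch.
struct ScanResult {
    bool valid;            // value matched the simplified float grammar and was terminated
    bool terminated;       // the delimiter was found
    std::size_t consumed;  // index of the delimiter, or the input length if none was found
};

// Scan a float lexical value in one pass, stopping at `delim`
// ('<' for element content, '"' for an attribute value).
// Each character is examined exactly once: the loop condition checks for
// the delimiter and the switch advances a small float-grammar state machine.
ScanResult scan_float(const std::string& in, char delim) {
    enum State { Start, Sign, Int, Dot, IntDot, Frac,
                 ExpMark, ExpSign, ExpDigits, Error };
    State s = Start;
    std::size_t i = 0;
    for (; i < in.size() && in[i] != delim; ++i) {
        const char c = in[i];
        const bool digit = (c >= '0' && c <= '9');
        const bool sign  = (c == '+' || c == '-');
        const bool expo  = (c == 'e' || c == 'E');
        switch (s) {
        case Start:     s = sign ? Sign : digit ? Int : c == '.' ? Dot : Error; break;
        case Sign:      s = digit ? Int : c == '.' ? Dot : Error; break;
        case Int:       s = digit ? Int : c == '.' ? IntDot : expo ? ExpMark : Error; break;
        case Dot:       s = digit ? Frac : Error; break;
        case IntDot:
        case Frac:      s = digit ? Frac : expo ? ExpMark : Error; break;
        case ExpMark:   s = sign ? ExpSign : digit ? ExpDigits : Error; break;
        case ExpSign:
        case ExpDigits: s = digit ? ExpDigits : Error; break;
        case Error:     break;  // keep scanning so the delimiter is still located
        }
    }
    const bool terminated = i < in.size();
    const bool accepting =
        (s == Int || s == IntDot || s == Frac || s == ExpDigits);
    return ScanResult{accepting && terminated, terminated, i};
}
```

With this shape, the same routine serves both contexts by passing a different delimiter, for example `scan_float("3.14</x>", '<')` for element content and `scan_float("-1e10\" b=\"2\"", '"')` for an attribute value. A two-pass design would instead first locate the delimiter and then hand the extracted substring to a separate validator.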