Following the discussion about whether validity is an aspect of accessibility, I would like to share my thoughts on this topic:

Parsing documents

* In XML, the concept of well-formedness allows a parser to parse a document that uses an unknown vocabulary (elements, attributes).
* In non-XML SGML there is no such concept. Parsers should ignore elements they do not know and try to parse the document somehow.

If a non-XML SGML document is valid, it can easily be parsed. An invalid document may not be parsed the way the author intended. The outcome of parsing an invalid document is undefined.

So, to avoid confusion from sloppy markup nesting etc., an XML document has to be at least well-formed, and a non-XML SGML document should be valid.

Extracting information

When an application wants to extract information from a markup document (XML or non-XML) and present it to the user, the vocabulary used must be known. This requires the document to be valid, not just against some homebrewed grammar but against a published and accepted one. This grammar is the interface between the information provider and the information extractor.

-- 
Johannes Koch - Competence Center BIKA
Fraunhofer Institute for Applied Information Technology (FIT.LIFE)
Schloss Birlinghoven, D-53757 Sankt Augustin, Germany
Phone: +49-2241-142628

Received on Thursday, 23 June 2005 15:28:17 GMT
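
The point about well-formedness can be illustrated with a minimal Python sketch. It assumes the standard library's non-validating parser (`xml.etree.ElementTree`); the `recipe` and `step` element names, the two sample documents, and the `parse_outline` helper are hypothetical and serve only as an example of a vocabulary the parser has never seen.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents in a made-up vocabulary (assumption, not from the post).
WELL_FORMED = "<recipe><step n='1'>Mix</step><step n='2'>Bake</step></recipe>"
ILL_FORMED = "<recipe><step>Mix<b></step></b></recipe>"  # overlapping tags


def parse_outline(text):
    """Return a list of (tag, depth) pairs, or None if text is not well-formed."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Sloppy nesting: the parser refuses the document instead of guessing.
        return None

    outline = []

    def walk(elem, depth):
        # Record the element, then descend into its children in document order.
        outline.append((elem.tag, depth))
        for child in elem:
            walk(child, depth + 1)

    walk(root, 0)
    return outline


if __name__ == "__main__":
    print(parse_outline(WELL_FORMED))
    print(parse_outline(ILL_FORMED))
```

The parser recovers the element tree of the well-formed document without knowing what `recipe` or `step` mean, and rejects the ill-formed one. Interpreting those elements for a user would still require a published grammar that both sides agree on.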