- From: Johannes Koch <johannes.koch@fit.fraunhofer.de>
- Date: Thu, 23 Jun 2005 17:28:08 +0200
- To: w3c-wai-gl@w3.org
Following the discussion about whether validity is an aspect of accessibility, I would like to share my thoughts about this topic:

Parsing documents

* In XML the concept of well-formedness allows for parsing a document that uses an unknown vocabulary (elements, attributes).
* In non-XML SGML there is no such concept. Parsers should ignore elements they do not know and try to parse the document somehow. If a non-XML SGML document is valid, it can easily be parsed. An invalid document may not be parsed the way the author intended. The outcome of parsing an invalid document is undefined.

So, to avoid confusion from sloppy markup nesting and the like, an XML document has to be at least well-formed, and a non-XML SGML document should be valid.

Extracting information

When an application wants to extract information from a markup document (XML or non-XML) and present it to the user, the vocabulary used must be known. This requires the document to be valid, not just against some home-brewed grammar, but against a published and accepted one. This grammar is the interface between the information provider and the information extractor.

--
Johannes Koch - Competence Center BIKA
Fraunhofer Institute for Applied Information Technology (FIT.LIFE)
Schloss Birlinghoven, D-53757 Sankt Augustin, Germany
Phone: +49-2241-142628
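To illustrate the XML half of the parsing point above, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element names (recipe, step) and the helper is_well_formed are made up for the illustration and are not part of the original message. It only checks well-formedness. It does not validate against any grammar, so it says nothing about the "Extracting information" point.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# A vocabulary the parser has never seen; names are invented for this sketch.
unknown_vocabulary = "<recipe><step n='1'>Mix</step></recipe>"

# Sloppy nesting: the end tags close in the wrong order.
sloppy_nesting = "<b><i>text</b></i>"

# The well-formed document yields a tree even though no grammar is known.
root = ET.fromstring(unknown_vocabulary)
tags = [element.tag for element in root.iter()]

print(tags)
print(is_well_formed(unknown_vocabulary))
print(is_well_formed(sloppy_nesting))
```

The parser builds a tree for the unknown vocabulary, but it cannot tell the application what recipe or step mean. That knowledge has to come from a published grammar, as argued above.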
Received on Thursday, 23 June 2005 15:28:17 UTC