- From: Francois Daoust <fd@w3.org>
- Date: Mon, 03 May 2010 12:27:00 +0200
- To: public-mobileok-checker <public-mobileok-checker@w3.org>
Replying to myself...

Francois Daoust wrote:
> Internal changes
> -----
> The main change is that Saxon's TinyTree's DOM implementation is used to
> parse the document under test with line numbering activated. The line
> number is then added to the moki serialization (see methods
> XhtmlContent.parse and XhtmlContent.toMokiNode).
>
> The use of Saxon's DOM implementation triggered a couple of bugs related
> to the fact that instances of DOM nodes are created on the fly by Saxon
> when needed, and cannot be compared with "==". They must be compared
> with the DOM "Node.isSameNode" method (see e.g. changes in
> ObjectResourceExtractor).

I found and fixed a few other bugs introduced by that change, for instance in the counting of extraneous characters.

I also found and reported a bug in Saxon that affects the use of an entity resolver, and thus the possibility to use a local catalog of DTDs, when a Document is created in a specific way (which happens to be precisely the way we need...). Michael Kay suggested a workaround, which I implemented today:
https://sourceforge.net/mailarchive/message.php?msg_name=209D7731E68043DC8F6695AF79CD6397@Sealion

> Notes
> -----
> - Newer versions of Saxon would also allow to preserve the column, but
> we cannot switch to newer versions for licensing reasons (the mobileOK
> Checker uses extension functions which are not included in Saxon-HE,
> AFAICT).

Actually, we can switch to Saxon-B version 9.1, which adds that functionality under the same license. Any reason not to?

> - The line number seems to stay accurate when the source is tidied: the
> library the Checker uses to tidy up the source does not seem to add or
> remove lines. This shouldn't be relied upon, though. The column number
> would also not stay accurate.

In practice, the library did add or remove lines from time to time, but I should now have fixed most of the cases where this happens. The returned line position is "as close as possible" to the original line position.
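Coming back to the node comparison issue quoted above, here is a minimal sketch of the kind of change involved. The class and method names (NodeIdentity, sameNode, indexOfNode) are made up for illustration and are not taken from the Checker code. The sketch runs against the JDK's built-in DOM parser, where "==" happens to work as well; the difference only shows up with implementations such as Saxon's that create node objects on the fly.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the mobileOK Checker.
class NodeIdentity {

    // Identity test that does not rely on Java object identity ("==").
    // Two distinct Java objects may represent the same underlying node.
    public static boolean sameNode(Node a, Node b) {
        return a != null && b != null && a.isSameNode(b);
    }

    // Returns the position of target in candidates, or -1 if absent.
    public static int indexOfNode(NodeList candidates, Node target) {
        for (int i = 0; i < candidates.getLength(); i++) {
            if (sameNode(candidates.item(i), target)) {
                return i;
            }
        }
        return -1;
    }

    // Parses an XML string with the JDK's default DOM parser.
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<html><body><object/><object/></body></html>");
        NodeList objects = doc.getElementsByTagName("object");
        // Reach the second <object> through a different navigation path.
        Node second = doc.getDocumentElement().getFirstChild().getLastChild();
        System.out.println(indexOfNode(objects, second));
    }
}
```

Any code that looks up a node in a list, or uses nodes as keys, needs the same treatment, since collections that rely on object identity would miss matches for the same reason.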

> - In the moki, the introduction of the "line" attribute in HTML elements
> triggers the definition of a "ns0" prefix for the moki namespace defined
> in the HTML root, e.g.:
> <html xmlns="http://www.w3.org/1999/xhtml" lang="en"
> xmlns:ns0="http://www.w3.org/2007/05/moki" ns0:line="2">
> That's technically correct, although visually ugly. I would have
> preferred to control the serialization and generate a "moki" (or "m")
> prefix, but I could not figure out any easy way to do that in Java.

Having run into weird namespace issues when the DOM tree was serialized, I eventually resorted to an additional XSL stylesheet that forces the use of an "m" prefix.

Note that I checked and re-generated the whole test suite. Let me know if you find anything strange.

Francois.

>
> Related "bugs"
> -----
> 5006: Does a "tidied" element or attribute exist?
> 6962: Code extracts: closing tag and tag content are often useless
> 9538: Improve code references
> 9583: Return code position consistently across the tests that output
> code extracts
> These bugs are visible using:
> http://www.w3.org/Bugs/Public/show_bug.cgi
>
> Francois.
>

Received on Monday, 3 May 2010 10:27:29 UTC