- From: Johannes Koch <johannes.koch@fit.fraunhofer.de>
- Date: Wed, 26 Apr 2006 09:54:05 +0200
- To: "public-wai-ert@w3.org" <public-wai-ert@w3.org>
Following my action item from
<http://www.w3.org/2006/04/05-er-minutes#item02>, here are my comments on
byteOffset with snippet. The proposal was:

  <earl:snippet>
    <earl:Snippet>
      <earl:content rdf:parseType="Literal"
          xmlns:x="chickens"><x:a>chickens</x:a></earl:content>
      <earl:byteOffset>100</earl:byteOffset>
    </earl:Snippet>
  </earl:snippet>

or

  <earl:snippet>
    <earl:Snippet>
      <earl:content><![CDATA[
        <div>chickens<img src="chicken.gif"></div>
      ]]></earl:content>
      <earl:byteOffset>15</earl:byteOffset>
    </earl:Snippet>
  </earl:snippet>

The snippet type points to a snippet, and a byteOffset then points to the
start of the error within the snippet - you cannot point to a range with
this.

As I tried to point out during the telecon, the earl:content property
contains _characters_, while the earl:byteOffset property is a _byte_
offset relative to the contents of the snippet. That is inconsistent: it
mixes levels. One is on the character level, the other is on the byte
level. How does an EARL-reading tool know which character encoding to use
to encode the characters in the snippet, so that it can then apply the
byte offset and find the byte that marks the error?

When the subject is text and the snippet contains text content, a
character offset makes sense. When the subject is binary, a byte offset
makes sense. In that case the byte sequence put into the snippet has to
be encoded (e.g. Base64), because EARL is a text format. The encoding
must be recorded so that an EARL-reading tool can transform the snippet
content back into the original byte sequence and apply the byte offset.

-- 
Johannes Koch - Competence Center BIKA
Fraunhofer Institute for Applied Information Technology (FIT.LIFE)
Schloss Birlinghoven, D-53757 Sankt Augustin, Germany
Phone: +49-2241-142628
Received on Wednesday, 26 April 2006 10:04:07 UTC