- From: <bugzilla@wiggum.w3.org>
- Date: Thu, 01 Feb 2007 21:57:37 +0000
- To: public-qt-comments@w3.org
- CC:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=3939

jim.melton@acm.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

------- Comment #1 from jim.melton@acm.org  2007-02-01 21:57 -------
The Task Force has agreed to provide such an example. In Section 4.1, Tokenization, immediately prior to section 4.1.1, we will insert a paragraph that reads:

For some languages, some tokenizers may identify overlapping tokens. For example, the German word "Donaudampfschifffahrtskapitaensmuetzen" might be tokenized into the following tokens: Donaudampfschifffahrtskapitaensmuetzen, Donau, dampf, schiff, dampfschiff, kapitaen, muetzen, kapitaensmuetzen, schifffahrt, dampfschifffahrt, and perhaps others.
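
The overlapping tokens described in the paragraph above can be illustrated with a rough Python sketch. It is not taken from the specification or from any particular tokenizer: the function name overlapping_tokens and the small LEXICON set are hypothetical, chosen only to reproduce the tokens listed in the example. The sketch reports every substring of the word that appears in the lexicon, together with the whole word, so the resulting token spans may overlap.

```python
from typing import List, Set, Tuple

# Hypothetical mini-lexicon, for illustration only. A real tokenizer would
# rely on a full dictionary or a morphological analyzer instead.
LEXICON: Set[str] = {
    "donau", "dampf", "schiff", "dampfschiff", "kapitaen", "muetzen",
    "kapitaensmuetzen", "schifffahrt", "dampfschifffahrt",
}


def overlapping_tokens(word: str, lexicon: Set[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, text) for the whole word and every lexicon substring.

    Matching is case-insensitive via str.lower(), which is assumed not to
    change the length of the input (true for the ASCII word used here).
    """
    if not word:
        return []
    lowered = word.lower()
    spans = {(0, len(word))}  # the full word always counts as a token
    for start in range(len(lowered)):
        for end in range(start + 1, len(lowered) + 1):
            if lowered[start:end] in lexicon:
                spans.add((start, end))
    # Slice the original word so each token keeps its original casing.
    return [(s, e, word[s:e]) for s, e in sorted(spans)]


if __name__ == "__main__":
    for start, end, text in overlapping_tokens(
        "Donaudampfschifffahrtskapitaensmuetzen", LEXICON
    ):
        print(start, end, text)
```

The brute-force scan over all substrings is quadratic in the word length, which is acceptable for single words; it is used here only because it makes the overlap between tokens easy to see.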
Received on Thursday, 1 February 2007 21:57:46 UTC