- From: <bugzilla@wiggum.w3.org>
- Date: Thu, 01 Feb 2007 21:57:37 +0000
- To: public-qt-comments@w3.org
- CC:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=3939
jim.melton@acm.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
------- Comment #1 from jim.melton@acm.org 2007-02-01 21:57 -------
The Task Force has agreed to provide such an example. In Section 4.1,
Tokenization, immediately before Section 4.1.1, we will insert a paragraph
that reads:
For some languages, some tokenizers may identify overlapping tokens. For
example, the German word "Donaudampfschifffahrtskapitaensmuetzen" might be
tokenized into the following tokens: Donaudampfschifffahrtskapitaensmuetzen,
Donau, dampf, schiff, dampfschiff, kapitaen, muetzen, kapitaensmuetzen,
schifffahrt, dampfschifffahrt, and perhaps others.
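
To make the idea of overlapping tokens concrete, here is a minimal Python sketch. It is not taken from the specification and is not how any particular implementation tokenizes. It assumes a hypothetical lexicon (LEXICON) that happens to contain exactly the tokens listed above, and a hypothetical helper (overlapping_tokens) that reports every substring of the word found in that lexicon, together with its character offsets.

```python
def overlapping_tokens(word, lexicon):
    """Return every (start, end, text) span of `word` that appears in `lexicon`.

    Spans are allowed to overlap or nest; nothing is filtered out.
    """
    lowered = word.lower()
    spans = []
    for start in range(len(lowered)):
        for end in range(start + 1, len(lowered) + 1):
            piece = lowered[start:end]
            if piece in lexicon:
                spans.append((start, end, piece))
    return spans


# Hypothetical lexicon: just the tokens named in the paragraph above.
LEXICON = {
    "donaudampfschifffahrtskapitaensmuetzen",
    "donau", "dampf", "schiff", "dampfschiff", "kapitaen", "muetzen",
    "kapitaensmuetzen", "schifffahrt", "dampfschifffahrt",
}

WORD = "Donaudampfschifffahrtskapitaensmuetzen"
tokens = overlapping_tokens(WORD, LEXICON)

for start, end, text in tokens:
    print(start, end, text)
```

In this sketch "dampfschiff" covers characters 5 to 16 and "schifffahrt" covers characters 10 to 21, so the two tokens share the characters of "schiff". A real tokenizer would more likely use compound-splitting rules than an exhaustive substring lookup; the lookup is used here only because it is short.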
Received on Thursday, 1 February 2007 21:57:46 UTC