- From: Al Gilman <asgilman@iamdigex.net>
- Date: Fri, 29 Oct 1999 10:45:53 -0500
- To: "Chris Ridpath" <chris.ridpath@utoronto.ca>, "WAI ER IG List" <w3c-wai-er-ig@w3.org>
At 10:27 AM 10/29/99 -0400, Chris Ridpath wrote:
>I was hoping that we could take a look at this (seemingly) simple
>technique next.
>http://www.w3.org/WAI/ER/IG/ert/#Technique4.1.A
>
>The WAI checkpoint states:
>"Clearly identify changes in the natural language of a document's text
>and any text equivalents (e.g., captions)"
>
>Is it practically possible to detect when the author has used words
>that are not in the document's primary language?
>
>If we can't detect this then should every document (that has a BODY and
>any text) trigger a warning about this?
>
>Chris

The method I have been contemplating is dictionary-based. It parses the content into word-candidates and attempts to validate the presence of those words in a dictionary. This is similar to spell-checking, and current commodity spell-checkers raise exception events when words don't check out.

Al

PS: The actual scenario this comes from has to do with preparing TTS-ready texts. It covers markup of smileys, acronyms, and anything else that matches the lexing rules natural-language words match but lacks a stable, widely known pronunciation. Language changes would get caught in such a net.
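A minimal sketch of such a dictionary-based scan in Python, assuming a plain word-list file such as /usr/share/dict/words (the path and function names here are illustrative, not part of any existing tool):

    import re

    def load_dictionary(path):
        # Assumption: a plain word list, one word per line, as commonly
        # shipped on Unix systems at /usr/share/dict/words.
        with open(path, encoding="utf-8") as f:
            return {line.strip().lower() for line in f if line.strip()}

    def find_language_exceptions(text, dictionary):
        # Lex the content into word-candidates, then raise an exception
        # event (here, a report entry) for each candidate the dictionary
        # fails to validate -- the same mechanism a spell-checker uses.
        exceptions = []
        for match in re.finditer(r"[A-Za-z']+", text):
            if match.group().lower() not in dictionary:
                exceptions.append((match.start(), match.group()))
        return exceptions

    if __name__ == "__main__":
        words = load_dictionary("/usr/share/dict/words")
        sample = "The menu offered hors d'oeuvres and a bowl of gazpacho."
        for offset, word in find_language_exceptions(sample, words):
            print("offset %d: '%s' not in dictionary -- possible language "
                  "change or unpronounceable token" % (offset, word))

A word the scan rejects is only a candidate, not a verdict: possessives, proper nouns, smileys, and acronyms will all trip the same net, so a human (or a second dictionary for the suspected language) still has to decide whether a flagged span is a genuine language change.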
Received on Friday, 29 October 1999 10:41:42 UTC