
genxScrubText(...) does not scrub text

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Fri, 19 Mar 2004 08:37:03 +0100
To: tbray@textuality.com
Cc: www-archive@w3.org
Message-ID: <4065a066.478281322@smtp.bjoern.hoehrmann.de>

Hi Tim,

  genxScrubText(...) does not work as advertised: it never skips an
invalid octet sequence, because it neither advances nor resets the 'last'
pointer after rejecting one. I guess it is actually supposed to do
something like

  ...
  while (*in)
  {
    int c = genxNextUnicodeChar(&in);
    if (c == -1 || !isXMLChar(w, c))
    {
      problems++;
      last = in; /* <-- move 'last' past the rejected bytes */
      continue;
    }

    while (last < in)
      *out++ = *last++;
  }
  ...
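
To make the effect of moving 'last' easier to try out, here is a minimal
standalone sketch of the same loop. It does not use genx: next_char,
is_xml_char and scrub_text are hypothetical stand-ins written for this
illustration (a simplified UTF-8 decoder, the XML 1.0 Char ranges, and the
loop above with a terminator and return value added as an assumption).

```c
#include <stddef.h>

/* Hypothetical stand-in for genxNextUnicodeChar: decodes one UTF-8
   character, advances *sp past the bytes it consumed, and returns -1
   for a malformed or overlong sequence. */
static int next_char(const unsigned char **sp)
{
  const unsigned char *s = *sp;
  int c = *s++;
  int need, min;

  if (c < 0x80)                    { need = 0; min = 0; }
  else if (c >= 0xC2 && c <= 0xDF) { need = 1; min = 0x80;    c &= 0x1F; }
  else if (c >= 0xE0 && c <= 0xEF) { need = 2; min = 0x800;   c &= 0x0F; }
  else if (c >= 0xF0 && c <= 0xF4) { need = 3; min = 0x10000; c &= 0x07; }
  else { *sp = s; return -1; }     /* stray continuation or bad lead byte */

  while (need-- > 0)
  {
    /* A non-continuation byte (including the terminating 0) ends the
       sequence early; leave it unconsumed so it is decoded next. */
    if ((*s & 0xC0) != 0x80) { *sp = s; return -1; }
    c = (c << 6) | (*s++ & 0x3F);
  }
  *sp = s;
  return c < min ? -1 : c;         /* reject overlong encodings */
}

/* Hypothetical stand-in for isXMLChar: the XML 1.0 Char production. */
static int is_xml_char(int c)
{
  return c == 0x9 || c == 0xA || c == 0xD
      || (c >= 0x20 && c <= 0xD7FF)
      || (c >= 0xE000 && c <= 0xFFFD)
      || (c >= 0x10000 && c <= 0x10FFFF);
}

/* Copies 'in' to 'out', dropping malformed sequences and non-XML
   characters. Returns the number of sequences dropped. */
static int scrub_text(const unsigned char *in, unsigned char *out)
{
  int problems = 0;
  const unsigned char *last = in;

  while (*in)
  {
    int c = next_char(&in);
    if (c == -1 || !is_xml_char(c))
    {
      problems++;
      last = in;                   /* skip the rejected bytes */
      continue;
    }

    while (last < in)              /* copy the bytes of the accepted char */
      *out++ = *last++;
  }
  *out = 0;                        /* assumption: output is 0-terminated */
  return problems;
}
```

With this sketch, scrubbing the bytes 'a', 0x01, 'b', 0xFF, 'c' should
leave "abc" in the output and report two problems. The output buffer has
to be at least as large as the input, including the terminating zero.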

regards.
Received on Friday, 19 March 2004 02:37:30 GMT
