
[Bug 5031] Doctype detection fails if root element includes non "word" character

From: <bugzilla@wiggum.w3.org>
Date: Tue, 11 Sep 2007 06:35:56 +0000
CC:
To: www-validator-cvs@w3.org
Message-Id: <E1IUzLo-0006x2-18@wiggum.w3.org>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=5031

           Summary: Doctype detection fails if root element includes non
                    "word" character
           Product: Validator
           Version: 0.8.1
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P3
         Component: check
        AssignedTo: ot@w3.org
        ReportedBy: ot@w3.org
         QAContact: www-validator-cvs@w3.org


The doctype detection routine in preparse_doctype() uses the following regexp
to detect the formal public identifier (FPI) and system identifier (SI):

m(<!DOCTYPE\s+(\w+)\s+(?:PUBLIC|SYSTEM)\s+...
The first (\w+) captures the name of the document type, which has to match the
root element
(ref: http://www.w3.org/TR/xml/#vc-roottype ).
However, \w+ is too restrictive, because the root element name can (among
other characters) contain a dash or a dot
(ref: http://www.w3.org/TR/xml/#IDANQDS ).

This partially breaks doctype detection for languages whose root element name
includes characters other than Perl "word" characters (alphanumerics plus _).
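
The failure can be reproduced outside the validator with a small sketch. The
snippet below is written in Python, not the validator's Perl, and is only an
illustration: the names NARROW, RELAXED, NAME and doctype_name(), as well as
the sample root element "my-root.v2" and "example.dtd", are hypothetical. NAME
is an ASCII-only approximation of an XML Name; the XML specification allows
many more characters, so this is not a proposed patch.

```python
import re

# Mirrors the reported pattern: the doctype name is matched with \w+ only.
NARROW = re.compile(r'<!DOCTYPE\s+(\w+)\s+(?:PUBLIC|SYSTEM)\s+')

# Hypothetical relaxed pattern: ASCII approximation of an XML Name
# (start: letter, "_" or ":"; then also digits, "-" and ".").
NAME = r'[A-Za-z_:][A-Za-z0-9_:.\-]*'
RELAXED = re.compile(r'<!DOCTYPE\s+(' + NAME + r')\s+(?:PUBLIC|SYSTEM)\s+')


def doctype_name(pattern, text):
    """Return the captured doctype name, or None if the pattern fails."""
    match = pattern.search(text)
    return match.group(1) if match else None


# A root element made only of word characters works with both patterns.
plain = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "x.dtd">'
# A hypothetical root element containing a dash and a dot.
dashed = '<!DOCTYPE my-root.v2 SYSTEM "example.dtd">'

print(doctype_name(NARROW, plain))    # name found
print(doctype_name(NARROW, dashed))   # no match: \w+ stops at the dash
print(doctype_name(RELAXED, dashed))  # full name captured
```

With the narrow pattern the second declaration is not recognised at all,
because \w+ stops at the dash and the following \s+ cannot match.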
Received on Tuesday, 11 September 2007 06:36:01 UTC
