[Bug 27257] New: anyURI_b006 seems to be valid

https://www.w3.org/Bugs/Public/show_bug.cgi?id=27257

            Bug ID: 27257
           Summary: anyURI_b006 seems to be valid
           Product: XML Schema Test Suite
           Version: 2006-11-06
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Microsoft tests
          Assignee: cmsmcq@blackmesatech.com
          Reporter: georgiy.rakov@oracle.com
        QA Contact: public-xml-schema-testsuite@w3.org

Bug 4048 [1] resulted in the expected result for the anyURI_b006 test being
marked "invalid", because "//" (double slash) was considered an invalid URI.
However, according to the reading of RFC 2396 [2] presented below, a double
slash should be considered a valid URI.

Section "5. Relative URI References" from rfc2396.txt [2] states that:

   A relative reference beginning with two slash characters is termed a
   network-path reference, as defined by <net_path> in Section 3.  

Section "3. URI Syntactic Components" from rfc2396 [2] states:

      net_path      = "//" authority [ abs_path ]

Section "3.2. Authority Component" from rfc2396 [2] states:

      authority     = server | reg_name

So if the 'server' component can be empty, then '//' should be considered a
valid URI. By the following reasoning, the 'server' component can be empty.

Section "3.2.2. Server-based Naming Authority" from rfc2396 [2] states:

      server        = [ [ userinfo "@" ] hostport ]

That is, according to the BNF rules above, the 'server' component is allowed
to be empty (and 'abs_path' is optional), so '//' can be considered a relative
network-path reference with an empty authority.
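
The derivation above can be checked mechanically. What follows is only a
sketch: a hand transcription of the relevant RFC 2396 productions into Python
regular expressions. The names (NET_PATH, is_net_path, and so on) are my own
and the transcription is an assumption, not something taken from the test
suite or from any schema processor.

```python
import re

# Character classes from RFC 2396, sections 2.3 and 2.4.1.
ALPHANUM = r"[A-Za-z0-9]"
UNRESERVED = r"[A-Za-z0-9\-_.!~*'()]"
ESCAPED = r"%[0-9A-Fa-f]{2}"

# Section 3.2.2: server = [ [ userinfo "@" ] hostport ]
USERINFO = rf"(?:{UNRESERVED}|{ESCAPED}|[;:&=+$,])*"
DOMAINLABEL = rf"(?:{ALPHANUM}(?:(?:{ALPHANUM}|-)*{ALPHANUM})?)"
TOPLABEL = rf"(?:[A-Za-z](?:(?:{ALPHANUM}|-)*{ALPHANUM})?)"
HOSTNAME = rf"(?:(?:{DOMAINLABEL}\.)*{TOPLABEL}\.?)"
IPV4 = r"(?:[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)"
HOST = rf"(?:{HOSTNAME}|{IPV4})"
HOSTPORT = rf"(?:{HOST}(?::[0-9]*)?)"
# The outer brackets of the production make the whole server optional.
SERVER = rf"(?:(?:{USERINFO}@)?{HOSTPORT})?"

# Sections 3.2 and 3.2.1: authority = server | reg_name
REG_NAME = rf"(?:{UNRESERVED}|{ESCAPED}|[$,;:@&=+])+"
AUTHORITY = rf"(?:{SERVER}|{REG_NAME})"

# Section 3.3: abs_path = "/" path_segments
PCHAR = rf"(?:{UNRESERVED}|{ESCAPED}|[:@&=+$,])"
SEGMENT = rf"(?:{PCHAR}*(?:;{PCHAR}*)*)"
ABS_PATH = rf"(?:/{SEGMENT}(?:/{SEGMENT})*)"

# Section 3: net_path = "//" authority [ abs_path ]
NET_PATH = re.compile(rf"//{AUTHORITY}(?:{ABS_PATH})?")


def is_net_path(text):
    """Return True if text as a whole matches the net_path production."""
    return NET_PATH.fullmatch(text) is not None


print(is_net_path("//"))               # empty server, no abs_path
print(is_net_path("//example.com/a"))  # ordinary network-path reference
print(is_net_path("/"))                # an abs_path, not a net_path
```

Under this transcription, "//" is accepted only because of the outer square
brackets in the 'server' production.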

I understand that section 3.2.2 of RFC 2396 [2] begins by stating:

   URL schemes that involve the direct use of an IP-based protocol to a
   specified server on the Internet use a common syntax for the server
   component of the URI's scheme-specific data:

      <userinfo>@<host>:<port>

   where <userinfo> may consist of a user name and, optionally, scheme-
   specific information about how to gain authorization to access the
   server. The parts "<userinfo>@" and ":<port>" may be omitted.

Thus it might appear that from:
1. the definition '<userinfo>@<host>:<port>', and
2. the excerpt above, 'The parts "<userinfo>@" and ":<port>" may be omitted',
it follows that the '<host>' part is obligatory. However, section "1.6. Syntax
Notation and Common Elements" states:

   This document uses two conventions to describe and define the syntax
   for URI.  The first, called the layout form, is a general description
   of the order of components and component separators, as in

      <first>/<second>;<third>?<fourth>

   The component names are enclosed in angle-brackets and any characters
   outside angle-brackets are literal separators.  Whitespace should be
   ignored.  These descriptions are used informally and do not define
   the syntax requirements.

Note the last sentence: "These descriptions are used informally and do not
define the syntax requirements." Hence I believe no conclusions about syntax
should be drawn from the layout-form definition '<userinfo>@<host>:<port>' of
the 'server' component.
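
To make the difference between the two readings concrete, here is a small
sketch that compares them on the empty string. The HOST and USERINFO patterns
are deliberately simplified stand-ins of my own (an assumption, not the full
RFC 2396 productions); only the optionality of the whole 'server' differs
between the two patterns.

```python
import re

# Simplified stand-ins, NOT the full RFC 2396 productions (assumption).
HOST = r"[A-Za-z0-9.\-]+"
USERINFO = r"[A-Za-z0-9._~!*'()%;:&=+$,\-]*"
HOSTPORT = rf"{HOST}(?::[0-9]*)?"

# BNF reading: server = [ [ userinfo "@" ] hostport ]
SERVER_BNF = re.compile(rf"(?:(?:{USERINFO}@)?{HOSTPORT})?")

# Layout-form reading: <userinfo>@<host>:<port> with <host> obligatory.
SERVER_LAYOUT = re.compile(rf"(?:{USERINFO}@)?{HOSTPORT}")


def server_accepts(pattern, text):
    """Return True if text as a whole matches the given server pattern."""
    return pattern.fullmatch(text) is not None


print(server_accepts(SERVER_BNF, ""))     # empty server under the BNF reading
print(server_accepts(SERVER_LAYOUT, ""))  # empty server under the layout reading
```

The two patterns agree on a non-empty server such as "example.com:80"; they
differ only on the empty string, which is exactly the case that decides
whether '//' is accepted.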

[1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=4048
[2] http://www.ietf.org/rfc/rfc2396.txt

Received on Thursday, 6 November 2014 12:19:11 UTC