Re: ACTION 2023-05-09-c: Steven to produce new sample grammars for issue #139

> ACTION 2023-05-09-c: Steven to produce new sample grammars for issue
> #139 for discussion in June.



Late, I know, but here is a proposal.


IRIs: absoluteIRI**nl, nl?.
-nl: cr?, lf.
-cr: -#d.
-lf: -#a.


absoluteIRI: scheme, -"://", user?, host?, port?, path?, query?, fragment?.


scheme: letter, letgit*.
-letter: ["a"-"z"; "A"-"Z"].
-letgit: ["a"-"z"; "A"-"Z"; "0"-"9"; "+.-"].
                         {Example: http}


user:   uch*, -"@".
-uch:    enc ; iletter ; punct.
                         {Example: user05:pw12345@}


host:   domain++-"." ;
        -"[", ipv6, -"]".


domain: (iletter+)++"-".
                         {A domain may contain hyphens,
                          but not start or end with one.}
                         {Example: www.w3.org}
                         {Note that, like the published IRI grammar,
                          this also accepts 192.168.0.0 etc.}


ipv6: h4++-":", (-":", ipv4)?;
      head, zeros, tail.
                         {Example: 2001:db8:1::8a2e:370:7334}


ipv4: d3, -".", d3, -".", d3, -".", d3.
                         {Example: 192.168.0.1}


-head: h4**-":".
-tail: ipv4;
       h4++-":", (-":", ipv4)?;
       .
zeros: -"::".


h4: h, (h, (h, h?)?)?.
-h: ["0"-"9"; "a"-"f"; "A"-"F"].
d3: d;
    d, d;
    ["01"], d, d;
    "2", ["01234"], d;
    "25", ["012345"].
-d: ["0"-"9"].


port:   -":", d*.
                         {Example: :80}
path:   segment+.
segment: -"/", pch*.
-pch:    enc ; iletter ; punct ; "@".
                         {Example: /2002/xforms/index.xhtml}
query:  -"?", qfch*.
-qfch:   enc ; iletter ; punct ; ["/?@"].
                         {Example: ?q=test}
fragment: -"#", qfch*.
                         {Example: #toc}
-iletter: ["a"-"z"; "A"-"Z"; "0"-"9"; #A0-#EFFFD].
-enc:    "%", ["0"-"9"; "A"-"F"], ["0"-"9"; "A"-"F"].
-punct:  [".!$&'()*+,;=:_~-"].
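
As a rough cross-check of the d3 rule (not part of the grammar), here is a Python sketch that transcribes its five alternatives into a regular expression and confirms that it accepts exactly the one-to-three-digit strings whose value is at most 255, leading zeros included. The names D3, is_d3 and mismatches are made up for the illustration.

```python
import re
from itertools import product

# One regex alternative per alternative of the d3 rule
# (a hypothetical transcription, not generated from the grammar).
D3 = re.compile(
    r"[0-9]"              # d
    r"|[0-9][0-9]"        # d, d
    r"|[01][0-9][0-9]"    # ["01"], d, d
    r"|2[0-4][0-9]"       # "2", ["01234"], d
    r"|25[0-5]"           # "25", ["012345"]
)


def is_d3(text: str) -> bool:
    """True if the whole of text matches the transcribed d3 rule."""
    return D3.fullmatch(text) is not None


def mismatches() -> list:
    """Digit strings of length 1 to 3 where is_d3 disagrees with 'value <= 255'."""
    bad = []
    for length in (1, 2, 3):
        for digits in product("0123456789", repeat=length):
            text = "".join(digits)
            if is_d3(text) != (int(text) <= 255):
                bad.append(text)
    return bad


if __name__ == "__main__":
    print(mismatches())
```

So forms such as 007 are accepted, while 256 and anything longer than three digits are not.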


Input examples


http://www.W3.org/
http://www.w3.org
http://www.w3.org/2002/xforms
irc://irc.w3.org:6665/#forms
http://search.example.org?q=a
ssh://user@host.example.com:2222
ftp://anonymous@example.net:4916/;type=d
file:///test.txt
http://example/my%20file
http://example-com.abc-def.com/
http://user05:pw12345@[2001:db8:1::8a2e:370:7334]:80/2002/xforms/index.xhtml?q=test#toc
http://[::]/
http://[::1]/
http://[1::]/
https://[::FFFF:192.168.0.128]:8080/
http://[2001::7334]/
http://192.168.0.1/
http://192.168.0.1:/
https://école.fr.example.org/élève.xhtml
https://zh.wikipedia.org/wiki/Wikipedia:关于中文维基百科/en
https://www.石川.日本/雅康#mimasa
http://search.example.org?q=☺
http://http://http://@http://http://?http://#http://
http://




Comments:


Like the original, it doesn't treat IPv4 specially in a host that isn't a bracketed IPv6 address: 192.168.0.1 is treated just as 192.168.0.com (a valid domain name) would be:
      <host>
         <domain>192</domain>
         <domain>168</domain>
         <domain>0</domain>
         <domain>1</domain>
      </host>
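
To illustrate why that happens, here is a small Python sketch of the host and domain rules. It assumes the host text has already been isolated, and the names ILETTER, DOMAIN and host_domains are hypothetical. Since digits are in iletter, each dotted number is simply a domain label.

```python
import re

# iletter from the grammar: ASCII letters and digits plus #A0-#EFFFD.
ILETTER = r"[a-zA-Z0-9\u00A0-\U000EFFFD]"
# domain: (iletter+)++"-"  i.e. runs of iletter separated by single hyphens.
DOMAIN = re.compile(rf"{ILETTER}+(?:-{ILETTER}+)*")


def host_domains(host: str):
    """Return the list of domain labels, or None if any label breaks the rule."""
    labels = host.split(".")
    if all(DOMAIN.fullmatch(label) for label in labels):
        return labels
    return None


if __name__ == "__main__":
    print(host_domains("192.168.0.1"))
    print(host_domains("192.168.0.com"))
    print(host_domains("-abc.com"))
```

Both of the first two hosts come back as four labels, and the third is rejected because of the leading hyphen.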


It doesn't separate the user and password: user05:pw12345@ yields a single user element containing user05:pw12345.
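
If a consumer of the output needs the two parts, one possible approach (an assumption on my part, not something the grammar does) is to split the text of the user element at its first colon. The function name split_userinfo is hypothetical.

```python
def split_userinfo(user: str):
    """Split the text of a user element at its first colon.

    Returns (name, password); password is None when there is no colon.
    """
    name, colon, password = user.partition(":")
    return name, (password if colon else None)


if __name__ == "__main__":
    print(split_userinfo("user05:pw12345"))
    print(split_userinfo("anonymous"))
```

Splitting at the first colon means any further colons end up in the password part.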


It doesn't accept file:/test.txt, but does accept file:///test.txt, because "://" is required after the scheme.
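
The reason is the mandatory "://" in absoluteIRI. A minimal Python sketch of just that prefix (scheme followed by "://") shows the difference; PREFIX and has_required_prefix are names invented for the illustration, and the rest of the IRI is not checked here.

```python
import re

# scheme: letter, letgit*.  followed by the mandatory "://" from absoluteIRI.
PREFIX = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*://")


def has_required_prefix(iri: str) -> bool:
    """True if iri starts with a scheme immediately followed by '://'."""
    return PREFIX.match(iri) is not None


if __name__ == "__main__":
    print(has_required_prefix("file:///test.txt"))
    print(has_required_prefix("file:/test.txt"))
```

In file:///test.txt the user, host and port are all absent, so the third slash starts the path.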


In an IPv6 address with an IPv4 ending, like [::FFFF:192.168.0.128], it does this:


      <host>
         <ipv6>
            <zeros/>
            <h4>FFFF</h4>
            <ipv4>
               <d3>192</d3>
               <d3>168</d3>
               <d3>0</d3>
               <d3>128</d3>
            </ipv4>
         </ipv6>
      </host>
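
For anyone who wants to experiment with the shape of the ipv6 rule outside an ixml processor, here is a rough Python sketch of its two alternatives. It takes the text between the brackets and, like the grammar, does not count the number of groups. The function names and the dictionary keys are my own invention and only approximate the XML structure.

```python
import re

H4 = re.compile(r"[0-9a-fA-F]{1,4}")
D3 = r"(?:25[0-5]|2[0-4][0-9]|[01][0-9][0-9]|[0-9][0-9]|[0-9])"
IPV4 = re.compile(rf"{D3}(?:\.{D3}){{3}}")


def split_groups(part: str):
    """Split 'h4:h4:...[:ipv4]' into (h4 list, ipv4 octets or None).

    Returns None if any piece is neither an h4 group nor a trailing ipv4.
    """
    if part == "":
        return [], None
    pieces = part.split(":")
    ipv4 = None
    if "." in pieces[-1]:
        last = pieces.pop()
        if not IPV4.fullmatch(last):
            return None
        ipv4 = last.split(".")
    if not all(H4.fullmatch(piece) for piece in pieces):
        return None
    return pieces, ipv4


def parse_ipv6(text: str):
    """Mirror 'ipv6: h4++-":", (-":", ipv4)?; head, zeros, tail.'"""
    head, zeros, tail = text.partition("::")
    if not zeros:
        # First alternative: at least one h4, optionally followed by an ipv4.
        groups = split_groups(text)
        if groups is None or not groups[0]:
            return None
        return {"head": [], "zeros": False, "h4": groups[0], "ipv4": groups[1]}
    # Second alternative: head (h4 groups only), zeros, tail.
    before = split_groups(head)
    after = split_groups(tail)
    if before is None or after is None or before[1] is not None:
        return None
    return {"head": before[0], "zeros": True, "h4": after[0], "ipv4": after[1]}


if __name__ == "__main__":
    print(parse_ipv6("::FFFF:192.168.0.128"))
```

For the example above this gives an empty head, the zeros marker, one h4 group and the four octets, which corresponds to the XML shown.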


Steven

Received on Tuesday, 13 June 2023 13:13:30 UTC