- From: olivier Thereaux <ot@w3.org>
- Date: Mon, 25 Jun 2007 15:55:40 +0900
- To: "www-validator@w3.org Community" <www-validator@w3.org>
- Cc: Martin Duerst <duerst@it.aoyama.ac.jp>
On Jun 25, 2007, at 14:18 , olivier Thereaux wrote:
> Hence, when allowing no space around equal sign:
>
> /^<\?xml [\x20|\x9|\xD|\xA]+ version
> [\x20|\x9|\xD|\xA]* = [\x20|\x9|\xD|\xA]*
> ("1.0"|"1.1"|'1.0'|'1.1')
> ([\x20|\x9|\xD|\xA]+ encoding
> [\x20|\x9|\xD|\xA]* = [\x20|\x9|\xD|\xA]*
> ("[A-Za-z][a-zA-Z0-9-_]+"|'[A-Za-z][a-zA-Z0-9_]+')
> )?
> ([\x20|\x9|\xD|\xA]+)+ standalone
> [\x20|\x9|\xD|\xA]* = [\x20|\x9|\xD|\xA]*
> ("yes"|"no"|'yes'|'no')
> )?
> [\x20|\x9|\xD|\xA]* \?>
> /x

And after checking that only the BOM (i.e. no whitespace, no comment) may
exist before the XMLdecl, and adding some comments (to be formatted a
little nicer in the final code), it gives us:

/^(\xEF\xBB\xBF)?                     # we may have a BOM at the beginning before <?xml, nothing else
<\?xml [\x20\x9\xD\xA]+ version       # for documents, version info is mandatory
[\x20\x9\xD\xA]* = [\x20\x9\xD\xA]*   # x20, x9, xD and xA are the allowed "xml white space"
("1\.0"|"1\.1"|'1\.0'|'1\.1')         # hardcoding the existing XML versions. Maybe we should use \d\.\d
([\x20\x9\xD\xA]+ encoding            # encoding info is optional
[\x20\x9\xD\xA]* = [\x20\x9\xD\xA]*
("[A-Za-z][A-Za-z0-9._-]*"|'[A-Za-z][A-Za-z0-9._-]*')
)?
([\x20\x9\xD\xA]+ standalone          # ditto standalone info, optional
[\x20\x9\xD\xA]* = [\x20\x9\xD\xA]*
("yes"|"no"|'yes'|'no')
)?
[\x20\x9\xD\xA]* \?>
/x

-- 
olivier
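
A quick way to sanity-check an expression like this is to run it against a handful of declarations. The following is only a sketch, and it is written in Python rather than the validator's Perl so that it can be run standalone; the names `XML_DECL` and `parse_xml_decl` are made up for illustration. It assumes the input is the raw leading bytes of the document. In this sketch the BOM is matched as the three-byte sequence EF BB BF, the white space class holds only the four XML white space characters, and the encoding name follows the EncName production of the XML specification.

```python
import re

# XML white space (production S): space, tab, carriage return, line feed.
# Python needs two hex digits after \x, hence \x09 rather than \x9.
S = r"[\x20\x09\x0D\x0A]"

# Hypothetical name; the pattern is compiled on bytes so the BOM can be
# matched as a raw three-byte sequence.
XML_DECL = re.compile(
    rf"""
    ^(?:\xEF\xBB\xBF)?                 # optional UTF-8 BOM, nothing else before <?xml
    <\?xml
    {S}+ version {S}* = {S}*           # version info is mandatory
    ("1\.0"|"1\.1"|'1\.0'|'1\.1')
    (?:{S}+ encoding {S}* = {S}*       # encoding info is optional
       ("[A-Za-z][A-Za-z0-9._-]*"|'[A-Za-z][A-Za-z0-9._-]*')
    )?
    (?:{S}+ standalone {S}* = {S}*     # standalone info is optional
       ("yes"|"no"|'yes'|'no')
    )?
    {S}* \?>
    """.encode("ascii"),
    re.VERBOSE,
)


def parse_xml_decl(head: bytes):
    """Return the declared values as a dict, or None if there is no valid XMLDecl."""
    m = XML_DECL.match(head)
    if m is None:
        return None
    # Each captured value still carries its quotes; strip them.
    version, encoding, standalone = (
        g[1:-1].decode("ascii") if g is not None else None for g in m.groups()
    )
    return {"version": version, "encoding": encoding, "standalone": standalone}


if __name__ == "__main__":
    samples = [
        b'<?xml version="1.0" encoding="UTF-8"?>',
        b"\xef\xbb\xbf<?xml version='1.1' standalone=\"yes\" ?>",
        b' <?xml version="1.0"?>',      # leading white space: rejected
        b'<?xml encoding="UTF-8"?>',    # no version info: rejected
    ]
    for sample in samples:
        print(sample, "->", parse_xml_decl(sample))
```

Rejected inputs come back as `None`, which makes it easy to see whether a change to the pattern loosens or tightens what is accepted.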
Received on Monday, 25 June 2007 06:55:50 UTC