RE: [Fwd: XML Schema Part 2 should provide BNF for all primitive types.]

Hi,

These are the regular expressions that we use within XML Spy, which were
created in accordance with the information in Part 2.

They are still at CR level, but I'd be perfectly willing to share them as
a starting point for possibly adding them to Part 2 in the REC version. For CR
you did much the same thing with the EBNF that we had written for regular
expressions, which was added to the corresponding chapter in Part 2.

	// primitive W3C Schema DataTypes
	true|false
// DT_boolean,
	[\\-\\+]?(INF|NaN|(\\d*(\\.\\d*)?([eE]\\-?\\d+)?))
// DT_float,			DT_r4 = DT_float,
	[\\-\\+]?(INF|NaN|(\\d*(\\.\\d*)?([eE]\\-?\\d+)?))
// DT_double,			DT_r8 = DT_double,
	[\\-\\+]?\\d*(\\.\\d*)?
// DT_decimal,
	[\\-\\+]?P(\\d+Y)?(\\d+M)?(\\d+D)?(T(\\d+H)?(\\d+M)?(\\d+S)?)?
// DT_timeDuration,
	
	\\-?(\\d{2,4}|-)-(\\d{2}|-)-(\\d{2}|-)T(\\d{2}|-):(\\d{2}|-):(\\d{2}(\\.\\d+)?|-)(Z|([\\-\\+]\\d{2}:\\d{2}))?
// DT_recurringDuration,
	
// DT_binary,
	
	(([a-zA-Z][0-9a-zA-Z+\\-\\.]*:)?/{0,2}[0-9a-zA-Z;/?:@&=+$\\.\\-_!~*'()]+)?(#[0-9a-zA-Z;/?:@&=+$\\.\\-_!~*'()]+)?
// DT_uri_reference,
	[\\p{L}_][\\p{L}\\d\\.\\-_]*
// DT_ID,
	[\\p{L}_][\\p{L}\\d\\.\\-_]*
// DT_IDREF,
	[\\p{L}_][\\p{L}\\d\\.\\-_]*
// DT_ENTITY,
	([\\p{L}_][\\p{L}\\d\\.\\-_]*:)?[\\p{L}_][\\p{L}\\d\\.\\-_]*
// DT_QName,
    
	// derived W3C Schema DataTypes
	[^\\n\\r\\t]*
// DT_CDATA,	// new: CR 10/24/00
	([^\\n\\r\\t ]+)( [^\\n\\r\\t ]+)*
// DT_token,	// new: CR 10/24/00
	([a-zA-Z]{2}|[iI]-[a-zA-Z]+|[xX]-[a-zA-Z]+)(-[a-zA-Z]+)*
// DT_language,
	([\\p{L}_][\\p{L}\\d\\.\\-_]*)( [\\p{L}_][\\p{L}\\d\\.\\-_]*)*
// DT_IDREFS,
	([\\p{L}_][\\p{L}\\d\\.\\-_]*)( [\\p{L}_][\\p{L}\\d\\.\\-_]*)*
// DT_ENTITIES,
	[\\p{L}\\d\\.\\-_:]+
// DT_NMTOKEN,
	([\\p{L}\\d\\.\\-_:]+)( [\\p{L}\\d\\.\\-_:]+)*
// DT_NMTOKENS,
	[\\p{L}_:][\\p{L}\\d\\.\\-_:]*
// DT_Name,
	[\\p{L}_][\\p{L}\\d\\.\\-_]*
// DT_NCName,
	[\\p{L}_:][\\p{L}\\d\\.\\-_:]*
// DT_NOTATION,
	[\\-\\+]?\\d*
// DT_integer,
	0+|-\\d+
// DT_non_positive_integer,
	-\\d+
// DT_negative_integer,
	[\\-\\+]?\\d*
// DT_long,
	[\\-\\+]?\\d*
// DT_int,				DT_i4 = DT_int,
	[\\-\\+]?\\d*
// DT_short,			DT_i2 = DT_short,
	[\\-\\+]?\\d*
// DT_byte,				DT_i1 = DT_byte,
	\\+?\\d*
// DT_non_negative_integer,
	\\+?\\d*
// DT_unsigned_long,
	\\+?\\d*
// DT_unsigned_int,		DT_ui4 = DT_unsigned_int,
	\\+?\\d*
// DT_unsigned_short,	DT_ui2 = DT_unsigned_short,
	\\+?\\d*
// DT_unsigned_byte,	DT_ui1 = DT_unsigned_byte,
	\\+?\\d*
// DT_positive_integer,
	
	\\-?\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}(\\.\\d+)?(Z|([\\-\\+]\\d{2}:\\d{2}))?
// DT_timePeriod,
	\\-?\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}(\\.\\d+)?(Z|([\\-\\+]\\d{2}:\\d{2}))?
// DT_timeInstant,
	\\-?\\d{4}-\\d{2}
// DT_month,
	\\-?\\d{4}
// DT_year,
	\\-?\\d{2}
// DT_century,
	-?-?\\d{2}-\\d{2}
// DT_recurringDate,
	-?-?-?\\d{2}
// DT_recurringDay
	(\\d{2,4})?(-(\\d{2})?(-(\\d{2}))?)?
// DT_date,
	\\d{2}:\\d{2}(:\\d{2}(\\.\\d+)?)?(Z|([\\-\\+]\\d{2}:\\d{2}))?
// DT_time,

Please let me know if you find any problems with these or think they should
be different. Again, they reflect the CR status and have not yet been updated
to PR.
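
As a quick way to look for such problems, the patterns can be run against a
few sample values. The following is a minimal Python sketch, not how XML Spy
applies them. It assumes that the doubled backslashes in the list above are
source-code string escaping, so each `\\` stands for a single `\` in the
regex, and that a value is valid only if the whole string matches. The names
`POSTED`, `unescape` and `accepts` are made up for illustration. Only patterns
without `\p{L}` are included, because Python's built-in `re` module does not
support Unicode property classes.

```python
import re

# Patterns copied from the list above, doubled backslashes kept as posted.
POSTED = {
    "DT_decimal": r"[\\-\\+]?\\d*(\\.\\d*)?",
    "DT_integer": r"[\\-\\+]?\\d*",
    "DT_negative_integer": r"-\\d+",
    "DT_year": r"\\-?\\d{4}",
}


def unescape(posted: str) -> str:
    """Collapse each doubled backslash into one (assumed string-literal escaping)."""
    return posted.replace("\\\\", "\\")


def accepts(type_name: str, value: str) -> bool:
    """Return True if the whole value matches the pattern posted for type_name."""
    return re.fullmatch(unescape(POSTED[type_name]), value) is not None
```

With these definitions, `accepts("DT_decimal", "-12.5")` is true, but so are
`accepts("DT_decimal", "")` and `accepts("DT_integer", "+")`, because `\d*`
also matches zero digits. Whether such values should be rejected depends on
the lexical rules in Part 2; replacing `\d*` with `\d+` would be one possible
tightening.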

Alexander

... Alexander Falk
... President & CEO of Altova, Inc. - The XML Spy Company
... Member of the W3C Advisory Committee
... Member of the W3C XML Schema Working Group




-----Original Message-----
From: Asir S Vedamuthu [mailto:asirv@webmethods.com]
Sent: Tuesday, March 27, 2001 14:57
To: Noah_Mendelsohn@lotus.com; Ashok Malhotra
Cc: jjc@jclark.com; w3c-xml-schema-ig@w3.org;
www-xml-schema-comments@w3.org
Subject: Re: [Fwd: XML Schema Part 2 should provide BNF for all
primitive types.]


> grammar for the lexical forms, but formal mappings to the value space.  In
> other words, show the polynomial that gives you the integer value, for
> example.

I don't see a *reason* why we have to go this far.

There are 19 primitive types. Formal descriptions for some of them can be
found in related recommendations and standards, for example [1]. I am sure we
can re-use these descriptions. For some, say anyURI, we do not have to
provide a formal description. And it is relatively easy to write a BNF or
RegEx for some datatypes, say 'boolean'.
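
For a type such as boolean, the pattern is little more than a list of
literals. A minimal Python sketch, assuming the only lexical forms are `true`
and `false` (the pattern XML Spy uses earlier in this thread, which may not be
the complete set in the final Recommendation) and that the whole value must
match. `BOOLEAN` and `is_boolean` are hypothetical names.

```python
import re

# Assumed lexical space: exactly the two literals "true" and "false".
BOOLEAN = re.compile(r"true|false")


def is_boolean(value: str) -> bool:
    """Return True if the entire string is one of the assumed boolean literals."""
    # fullmatch anchors the whole alternation, which avoids the
    # "^true|false$" precedence mistake where only one branch is anchored.
    return BOOLEAN.fullmatch(value) is not None
```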

The big question is how long it would take to produce this. Maybe we can
give up the Part 2 re-org.

[1] http://www.w3.org/TR/1999/REC-xml-names-19990114/#ns-qualnames

Regards, Asir
----- Original Message -----
From: <Noah_Mendelsohn@lotus.com>
To: "Ashok Malhotra" <ashokma@microsoft.com>
Cc: <jjc@jclark.com>; <w3c-xml-schema-ig@w3.org>;
<www-xml-schema-comments@w3.org>
Sent: Tuesday, March 27, 2001 1:44 PM
Subject: RE: [Fwd: XML Schema Part 2 should provide BNF for all primitive
types.]


> Does it make any sense to do regex's or BNF as non-normative for 1.0,
> normative for 1.1?  This might, editors' time permitting, let us get
> something out, and still have the opportunity to fix edge cases before we
> make it normative.  I've thought for a long time we need not only a formal
> grammar for the lexical forms, but formal mappings to the value space.  In
> other words, show the polynomial that gives you the integer value, for
> example.
>
> ------------------------------------------------------------------------
> Noah Mendelsohn                                    Voice: 1-617-693-4036
> Lotus Development Corp.                            Fax: 1-617-693-8676
> One Rogers Street
> Cambridge, MA 02142
> ------------------------------------------------------------------------
>
>
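
The lexical-to-value mapping suggested above can be illustrated for integers:
a digit string d1 d2 ... dn denotes d1*10^(n-1) + ... + dn*10^0, negated if
the string starts with a minus sign. A minimal Python sketch, assuming a
lexical form of an optional `+` or `-` followed by one or more ASCII digits
(an assumption for illustration, not text from Part 2). `INTEGER` and
`integer_value` are hypothetical names.

```python
import re

# Assumed lexical form: optional sign, then one or more ASCII digits.
INTEGER = re.compile(r"([+-]?)([0-9]+)")


def integer_value(lexical: str) -> int:
    """Map an integer lexical form to its value by evaluating the digit polynomial."""
    match = INTEGER.fullmatch(lexical)
    if match is None:
        raise ValueError(f"not an integer lexical form: {lexical!r}")
    sign, digits = match.groups()
    value = 0
    for ch in digits:
        # Horner's rule for d1*10^(n-1) + ... + dn*10^0.
        value = value * 10 + (ord(ch) - ord("0"))
    return -value if sign == "-" else value
```

Several lexical forms, such as `42`, `+42` and `042`, map to the same value,
which is the kind of edge case a formal mapping makes explicit.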

Received on Tuesday, 27 March 2001 15:41:05 UTC