Re: Comments on regular expressions

Hi,

I would also very much like to see a compact EBNF description of the
regular expressions in the final draft. For the time being I have created
my own condensed version for use in our own development, which I'll gladly
share with you:

regExp			::= branch ('|' branch)*
>Regular Expression (branch|branch|...)
branch$			::= piece+
>Branch (piece+)
piece$			::= atom quantifier?
>Piece (atom quantifier?)
quantifier$		::= [?*+] | ( '{' quantity '}' )
>Piece quantifier (? | * | + | {quantity})
quantity$		::= quantRange | quantMin | QuantExact
>Numeric quantity
quantRange$		::= QuantExact ',' QuantExact
>Quantity range {n,m}
quantMin$		::= QuantExact ','
>Minimum quantity {n,}
QuantExact$		::= [0-9]+
>Exact quantity {n}
atom$			::= Char | charClass | ( '(' regExp ')' )
>Atom (char | charclass | (regexp))
Char$			::= [^.\?*+()|#x5B#x5D]
>Normal character (any non-metacharacter)
charClass		::= charClassEsc | charClassExpr
>Character class (escape | expression)
charClassExpr$	::= '[' charGroup ']'
>Character class expression ( [charGroup] )
charGroup		::= negCharGroup | posCharGroup | charClassSub
>Character group
negCharGroup$	::= '^' posCharGroup
>Negative character group
charClassSub$	::= ( posCharGroup | negCharGroup ) '-' charClassExpr
>Character class subtraction
posCharGroup$	::= ( charRange | charClassEsc )+
>Positive character group (character range | character class escape)+
charRange$		::= seRange | XmlCharRef | XmlChar
>Character range (s-e range | character reference | XML character)
seRange$		::= charOrEsc '-' charOrEsc
>s-e character range
charOrEsc$		::= XmlChar | SingleCharEsc
>XML character or single-character escape
XmlChar$		::= [^\#x2D#x5B#x5D]
>XML character (all except \ - [ ])
XmlCharRef		::= ('&#' [0-9]+ ';') | ('&#x' [0-9a-fA-F]+ ';')
>Character-Reference (&#217; or &#xEA;)
charClassEsc	::= ( SingleCharEsc | MultiCharEsc | catEsc | complEsc )
>Character class escape
SingleCharEsc	::= '\' [nrt\.?*+()|{}#x2D#x5B#x5D#x5E]
>Single character escape
MultiCharEsc	::= '.' | ('\' [sSiIcCdDwW])
>Multi-character escape
catEsc$			::= '\p{' charProp '}'
>Category escape
complEsc$		::= '\P{' charProp '}'
>Category escape complement
charProp$		::= Letters | Marks | Numbers | Punctuation | Separators | Symbols | Other | IsBlock
>Unicode character property
IsBlock			::= 'Is' [a-zA-Z]+
>Unicode block name
Letters			::= 'L' [ultmo]?
>Unicode letters category
Marks			::= 'M' [nce]?
>Unicode marks category
Numbers			::= 'N' [dlo]?
>Unicode numbers category
Punctuation		::= 'P' [cdseifo]?
>Unicode punctuation category
Separators		::= 'Z' [slp]?
>Unicode separators category
Symbols			::= 'S' [mcko]?
>Unicode symbols category
Other			::= 'C' [cfson]?
>Unicode other category
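
For anyone who wants to sanity-check the terminal-level productions above,
here is a minimal Python sketch. It is only an illustration: the helper
names (is_quantifier, is_char_ref, is_char_prop) are my own and not part of
any draft, and the patterns are hand translations of the quantifier,
XmlCharRef and charProp rules exactly as written in this condensed grammar,
not of the specification itself.

```python
import re

# Hand translation of: quantity ::= quantRange | quantMin | QuantExact
QUANT_EXACT = r"[0-9]+"
QUANTITY = (
    "(?:" + QUANT_EXACT + "," + QUANT_EXACT   # quantRange {n,m}
    + "|" + QUANT_EXACT + ","                 # quantMin   {n,}
    + "|" + QUANT_EXACT + ")"                 # QuantExact {n}
)

# quantifier ::= [?*+] | ( '{' quantity '}' )
QUANTIFIER = re.compile(r"[?*+]|\{" + QUANTITY + r"\}")

# XmlCharRef ::= ('&#' [0-9]+ ';') | ('&#x' [0-9a-fA-F]+ ';')
XML_CHAR_REF = re.compile(r"&#[0-9]+;|&#x[0-9a-fA-F]+;")

# charProp ::= Letters | Marks | Numbers | Punctuation | Separators
#            | Symbols | Other | IsBlock
# Note: IsBlock mirrors the condensed grammar (letters only after 'Is').
CHAR_PROP = re.compile(
    r"L[ultmo]?|M[nce]?|N[dlo]?|P[cdseifo]?|Z[slp]?|S[mcko]?|C[cfson]?"
    r"|Is[a-zA-Z]+"
)


def is_quantifier(text):
    """True if text is a complete quantifier under the rules above."""
    return QUANTIFIER.fullmatch(text) is not None


def is_char_ref(text):
    """True if text is a complete decimal or hex character reference."""
    return XML_CHAR_REF.fullmatch(text) is not None


def is_char_prop(text):
    """True if text is a complete charProp name under the rules above."""
    return CHAR_PROP.fullmatch(text) is not None
```

Under these patterns "{2,5}", "{2,}" and "{3}" are accepted as quantifiers,
while "{,5}" is rejected because the grammar has no maximum-only form.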

Sincerely,

Alexander Falk

... Icon Information-Systems
... ALEXANDER FALK
... President, CEO
... http://www.icon-is.com/falk

Received on Wednesday, 12 April 2000 06:54:36 UTC