Issue (regex)

Dear Sir or Madam,

In WD-xsl-19981216 there is an issue about regular expression
capabilities in XSL:

  Issue (regex): Should XSL support regular expressions for matching 
  against any or all of pcdata content, attribute values, attribute 
  names, element type names?
                                        (section 2.6.2. [15])

I have the following suggestion:
XSL should support regular expressions for matching against all
of #PCDATA content and attribute values (= all possible text data
content), but not for element type names or attribute names.

Consider the following examples: 

- All trademark signs in a text have to be superscripted.
- An index element has an attribute holding the entry texts for
  that index marker. The attribute value can be a list of
  alternative entries (separated by ";"), each of the form:
          main entry text:subentry text [sort criteria]
  possibly also containing special character escapes.
  A two-level index has to be constructed from that.
- Day, month or year have to be extracted from a simple date element
  ("<date>1998/12/16</date>").
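
To make the first and third examples concrete, here is a minimal
sketch in Python (not XSL, since the working draft defines no regular
expression syntax). The function names, the "YYYY/MM/DD" pattern and
the <sup> wrapper are assumptions chosen only for illustration.

```python
import re

# Assumed date layout: four-digit year, two-digit month and day,
# separated by "/", as in the <date> example above.
DATE_RE = re.compile(r"(\d{4})/(\d{2})/(\d{2})")

def split_date(text):
    """Return (year, month, day) as strings, or None if text does not match."""
    m = DATE_RE.fullmatch(text.strip())
    return m.groups() if m else None

def superscript_trademarks(text):
    """Wrap every trademark sign (U+2122) in a hypothetical <sup> element."""
    return re.sub("\u2122", r"<sup>\g<0></sup>", text)

year, month, day = split_date("1998/12/16")
print(year, month, day)
print(superscript_trademarks("Acme\u2122 tools"))
```

Both tasks need to look inside the character data itself, which
matching on element or attribute names alone cannot do.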

Whenever text content in elements or attributes has additional
structure that is not marked up but has to be distinguished for
transformation or layout, additional functionality is needed to
extract parts of the content. Regular expressions are the tool of
choice for that task.
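
The index attribute from the second example shows the kind of
unmarked structure meant here. The following Python sketch splits
such a value into its parts. It assumes that ";" , ":" and the square
brackets never occur escaped inside an entry (the special character
escapes mentioned above are not handled), and all names are
hypothetical.

```python
import re

# One entry: main text, optional ":subentry", optional "[sort criteria]".
ENTRY_RE = re.compile(
    r"""\s*(?P<main>[^:;\[\]]+?)\s*          # main entry text
        (?: : \s*(?P<sub>[^;\[\]]+?)\s* )?    # optional subentry after a colon
        (?: \[ \s*(?P<sort>[^\]]*?)\s* \] )?  # optional sort criteria in brackets
        \s*
    """,
    re.VERBOSE,
)

def parse_index_attribute(value):
    """Split an attribute value into (main, sub, sort) tuples."""
    entries = []
    for raw in value.split(";"):
        m = ENTRY_RE.fullmatch(raw)
        if m:  # empty or malformed pieces are skipped
            entries.append((m.group("main"), m.group("sub"), m.group("sort")))
    return entries

def build_two_level_index(entries):
    """Group subentries under their main entry."""
    index = {}
    for main, sub, _sort in entries:
        subs = index.setdefault(main, [])
        if sub is not None:
            subs.append(sub)
    return index

parsed = parse_index_attribute("regular expression:matching [regex]; XSL")
print(parsed)
print(build_two_level_index(parsed))
```

The sort criteria are parsed but not used for ordering here; a real
index would sort on them where present.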

For information with predefined values, such as element type names
and attribute names, I see no need for regular expression matching.

I hope to have added a useful thought to the discussion.

Best regards,
  Christian Wetzel

Received on Tuesday, 12 January 1999 06:46:53 UTC