- From: Lou Burnard <lou@vax.ox.ac.uk>
- Date: Mon, 10 Mar 1997 15:14:02 +0000
- To: W3C-SGML-WG@w3.org
From: OXVAXD::LOU "Lou Burnard" 10-MAR-1997 14:51:18.60 To: MX%"Peter@ursus.demon.co.uk" CC: LOU Subj: Re: 4.5 TEI Extended Pointer Locators? |Fnd I am mystified by the example in A.1.1.13 (the 3rd, 4th 5th tokens |in "This is _not_ a very good idea" are apparently: |"not_", "a" "very" |The _ acts as a delimiter if it leads a token but not if it trails it? |(Any default use of _ as a delimiter will cause scientists a lot of |grief, I suspect).] The rule is that the first SGML namecharacter found starts a new token, and the first non namecharacter terminates it, with the proviso that any character whose status as a namecharacter is not defined in the current SGML declaration is treated as a non-name-char (i.e. as white space). See TEI P3 p 419, from which the XML spec has been derived). This means that the underscore is, by default, regarded as white space, so the location specification TOKEN (3 5) starts with the "not" of "not_". The example is perhaps a little confusing because it requests a *span* of tokens (from the 3rd i.e. "not" to the 5th i.e. "very"), and a span, naturally enough, includes all the characters from the start of its beginning to the end of its end, including therefore one of the underscores, but not the other. Seemed a good idea at the time! Lou
Received on Monday, 10 March 1997 10:14:31 UTC