- From: Steven J. DeRose <sjd@ebt.com>
- Date: Tue, 14 Jan 1997 14:47:55 -0500
- To: w3c-sgml-wg@www10.w3.org
These examples are taken from comments submitted by the TEI and some national bodies to the HyTime committee at the time of the Draft standard (I've updated them to match the HyTime final form, and hope I haven't let any typos in). I'm posting the examples because we've had *so* much discussion of TEI and HyTime location-specifying methods, but almost no concrete examples. *** The Example *** Task: find the destination by traversing in this order: a) the element with ID attribute "a25", b) within that element, the second child of type "CHAP" (assumed to also be the third child, due to a title element), c) within that element, the fourth child of type "SEC" (assumed to also be the fifth child, due to a title element), d) within that element, the fourth child of type "P" (assumed to also be the fourth child regardless of type), e) within that element, the 5th word-token. *** Old (P1) locator form *** This is obsolete, but is often quoted (some software supports it for backward compatibility). See Section 5.7.3 of the (1990) draft TEI Guidelines. ID a25 TPATH CHAP=2/SEC=4/P=4 TOKENS 5 *** Current (P3) locator form *** The final TEI guidelines ("P3", 1994), changed the syntax slightly and added features (to step sideways, filter by attributes, do regex matching ("DIV[3-5]"), and so on): ID (a25) CHILD (2 CHAP) (4 SEC) (4 P) TOKEN (5) Values like this are called "extended pointers" ("extended" because they can point to more than just an ID). They operate on SGML abstract syntax trees, which DSSSL and HyTime call part of groves. They address pcdata chunks the same way HyTime does (this differs slightly from DSSSL). Neither HyTime nor TEI said how PI or comments were addressed, though the intent was that TEI and HyTime work the same way. Doing the steps by element type is preferable to just using child numbers because you are more likely to catch errors (there is less likely to be a 4th SEC if you are in the wrong place, than just a 4th element of any type). But the GI parameter is not required, you could just have: ID (a25) CHILD (3) (5) (4) TOKEN (5) That's 37 bytes for the more robust form or 29 for the unfiltered form, and no values added to the ID name space. HyTime does not filter steps by GI or attribute value unless you resort to queries (now via HyQ, but after the TC, via DSSSL queries), so only the short form above has a direct HyTime equivalent. An equivalent HyTime link would point to a location ladder something like this (first, the almost-fully explicit HyTime form -- I've left out some of the more obscure attributes and any unnecessary recursion): <nameloc id="step1" hytime="nameloc" docorsub=""> <nmlist nametype="element"> a25 </nmlist> </nameloc> <treeloc id="step2" locsrc="step1" lextype="locsrc (IDREF, (s*, IDREF)+)" reflevel="1" hytime="treeloc"> <marklist HyTime="marklist">3 5 4</marklist> </treeloc> <dataloc id="step3" locsrc="step2" hytime="dataloc" quantum="word"> <dimlist HyTime="dimlist">5 1</dimlist> </dataloc> *** Here's the HyTime form, about as compact as I can see how to get it using lots of attribute defaulting and SHORTTAG: <nameloc id="step1" docorsub=""> <nmlist>a25 </></> <treeloc id="step2" locsrc="step1"> <marklist>3 5 4</></> <dataloc id="step3" locsrc="step2"> <dimlist>5 1</></> That's 135 characters (about 330 for the unminimized form) and 3 IDs. An advantage of splitting the rungs of location ladders into separate elements a la HyTime may be that you may be able to re-use portions. However, you can do that if you want in TEI since indirection is supported; and you can do it in plain SGML by judicious use of entities. The HyTime TC's new construct for "clink that uses non-IDREF notation to specify the destination" can serve as an escape hatch to TEI. This would make the whole link shorter. You get HyTime compatibility, but also TEI locator functionality, compactness, name space clearance, type-checking, and more existing implementations. *** From locators to links *** Extended pointers are typically placed in attributes on link elements. As one example, the P3 equivalent of HyTime's clink is the xref element, which takes an extended pointer. The pointer can be as brief as <xref target="ID (a25)"> or as complicated as necessary (bounded and unbounded indirection are also allowed, and there is a separate form for bare IDREFs). Out-of-line links in TEI are very like HyTime ilinks: they take a "targets" IDREFS attribute to point to a set of anchors (and will nearly always point to extended pointers in practice). Link collections are linkgrp elements, which are aggregates of links. That's a taste of the syntaxes we're discussing. As previously cited, a URL for info on TEI extended pointers is http://dynaweb.ebt.com:8080/usrbooks/teip3/26759 (I'd put a period there but Eudora would think it was part of the URL and not let you click your way to it). Lou Burnard has written an excellent tutorial on TEI locators, found at http://users.ox.ac.uk/~lou/wip/XR.htm Steve
Received on Tuesday, 14 January 1997 14:49:59 UTC