- From: David G. Durand <dgd@cs.bu.edu>
- Date: Sat, 11 Jan 1997 15:03:05 -0500
- To: w3c-sgml-wg@www10.w3.org
- Cc: w3c-sgml-wg@www10.w3.org
At 5:07 PM 1/10/97, Derek Denny-Brown wrote:
>That is not the issue I was worried/concerned about. One reason contextual
>links (html's <A>, TEI <XREF> & <XPTR>) are the primary hyperlink style
>implemented in today's software is that it is very easy to figure out what
>the start for the hyperlink is, trivial even. Using contextual links, you
>only need to traverse the document when the user explicitly asks for it, and
>thus a small delay is acceptable. Using ilink style links means that the
>document load also incurs a similar penalty _for_every_ilink_ just to figure
>out what parts of the document are anchors. The BOS defines which documents
>must be processed to ensure that all anchors have been located. As a
>simple example, if you have three documents, A, B, C, where you are loading
>document A, but B and C are in A's BOS, then all three documents must be
>processed in order to determine what parts of A are anchors. This is one of
>the reasons HyTime has not already replaced HTML. As Steven Newcomb is fond
>of saying, how does an anchor "know" it is an anchor? In order to answer
>that question, everything that _might_ point at it must be resolved and
>_all_ ilinks must be processed to determine if _any_ use that object as an
>anchor.

All _relevant_ ilinks must be examined. This (currently undefined) notion
of relevance is the key to defining what we want. We can define the set of
relevant ilinks by defining a set of relevant documents, and then saying
that ilinks in that set of documents must be processed with respect to a
given starting point.

We have several possible definitions of the set of relevant documents (the
XML BOS, as I think we might call it) on the table at the moment:

1. (Steve N.) The HyTime BOS. This means that all documents declared in
   entity declarations are potentially relevant (though some control can be
   explicitly provided to reduce this set). Indirection is supported via
   entity declarations. The user must explicitly

2. (Steve D., Jon Bosak) Some particular document mentioned as an associate
   document. I'm not sure if more than one "link database document" can be
   associated as relevant. Indirection in the creation of such sets is not
   supported.

3. (David D.) A particular set of documents declared by pulling in
   "associated" documents explicitly requested by the user. (The user must
   explicitly ask for a document to enter this set, but may use indirection
   to manage sets if desired).

4. (Martin Bryan). I'm least sure about what Martin wants but I think he'd
   prefer a mechanism that enforced indirection in specifying the set of
   relevant documents.

5. (Derek). The "relevant document set" is always just the current
   document. (no direction, or indirection).

>I like ilinks, but having worked on developing software to implement HyTime,
>they are not easy. Careful restrictions should be made to reduce overhead.

As I said, ilinks are definitely harder than simple 1-way embedded links!

The way to make implementing ilinks easy is to somehow guarantee that an
application has parsed every ilink before it encounters any of that ilink's
endpoints. Then one need only keep a dictionary of ilinks sorted by address,
and track addresses as you parse, checking each new address in the ilink
database as you go.

That said, I don't think this is actually a very good way to implement
ilinks. For one thing, the constraints on authors are hard to explain, and
for another, we will need to hard-wire the rules about when entity
references are resolved, in order to enable authors to tell if they are
making a forward reference or not.

The other way is simply to parse all the relevant documents, saving the
ilinks. Then one can bounce through the list of ilinks, applying any
endpoints found in the relevant document set to the application's internal
representation of the documents. I think this is, for a multiple document
model, the essentials of Lee's in-memory condition.
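
To make that second strategy concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a proposed design: the names (Address, ILink, Document, resolve_ilinks) are hypothetical, and an address is simplified to a (document name, node ID) pair rather than any of the location addresses under discussion. It parses nothing itself; it assumes every document in the relevant set is already in memory with its ilinks collected.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Address:
    """Hypothetical endpoint address: a node ID inside a named document."""
    doc: str
    node_id: str


@dataclass
class ILink:
    """Hypothetical ilink: a name plus the addresses of its endpoints."""
    name: str
    endpoints: list[Address]


@dataclass
class Document:
    """In-memory stand-in for a parsed document in the relevant set."""
    name: str
    node_ids: set[str]
    ilinks: list[ILink] = field(default_factory=list)
    # node ID -> names of the ilinks that use that node as an anchor
    anchors: dict[str, list[str]] = field(default_factory=dict)


def resolve_ilinks(docs: list[Document]) -> list[tuple[str, Address]]:
    """Apply every saved ilink to the in-memory documents.

    Returns the (ilink name, address) pairs that could not be applied,
    because the target document or node is not in the relevant set.
    """
    by_name = {d.name: d for d in docs}
    # Step 1: pool the ilinks saved while parsing each relevant document.
    pool = [link for d in docs for link in d.ilinks]
    unresolved = []
    # Step 2: bounce through the pool, marking each endpoint as an anchor.
    for link in pool:
        for ep in link.endpoints:
            target = by_name.get(ep.doc)
            if target is None or ep.node_id not in target.node_ids:
                unresolved.append((link.name, ep))
                continue
            target.anchors.setdefault(ep.node_id, []).append(link.name)
    return unresolved
```

With Derek's three documents, an ilink stored in B that points into A and C marks anchors in both once all three are in memory; nothing in A itself has to mention the link.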
Note that the latter strategy can combine with the first strategy -- you
need to delay processing an ilink only until you parse the document it
applies to. For a browser, this means that incremental displays may not show
all links immediately, depending on when the ilink is actually processed.
The only way to avoid this is to somehow require that all ilinks be
processed first.

Given the structural orientation of XML, and the kinds of location address
we are proposing, I think that it is not a hardship for applications to have
to keep a representation of their documents in memory.

So the second strategy makes the author's life much easier, and still has an
easy implementation strategy (process the ilink pool after bringing all
documents into memory), as well as a nicer but harder one (combining
on-the-fly ilink resolution with a final cleanup pass).

>I think I just managed to say the exact same thing twice, so I'll try again,
>as a proposal:
>
>An XML Hyperlinking processor should ("is required to"?) notify the
>application of all anchors of hyperlinks only if the anchor and the
>hyperlink are declared in the same XML document. External resources, as
>provided-by/restricted-by the BOS (and any other constraining mechanism we
>choose to define), may be used to locate the anchor.
>
>The problem with the above proposal is that it breaks the usefulness of
>ilinks for annotations. I am not very happy about this, and would be happy
>to hear a better solution.

I think we need to allow the processing of multiple documents at a time. I
don't see that this is hard for any but the simplest of applications... This
kind of simplification makes ilinks useful only for a few things. No
solution to the annotation problem is going to be based on single-document
parsing, and that could be one of the real selling points of XML (for people
other than Terry, who would rather not have this feature).
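
Going back to the "nicer but harder" combination mentioned above (resolve ilinks on the fly, then do a final cleanup pass), a rough Python sketch might look like the following. It is an assumption-laden illustration: IncrementalResolver and its methods are hypothetical names, documents are reduced to a set of node IDs, and ilinks to (name, endpoints) tuples. Endpoints whose target document is not yet loaded are parked, and applied as soon as that document arrives.

```python
from collections import defaultdict


class IncrementalResolver:
    """Hypothetical on-the-fly ilink resolver with a cleanup pass."""

    def __init__(self):
        self.loaded = {}                   # doc name -> set of node IDs
        self.anchors = defaultdict(list)   # (doc, node ID) -> ilink names
        self.pending = defaultdict(list)   # doc name -> [(ilink name, node ID)]
        self.dangling = []                 # endpoints naming a missing node

    def add_document(self, name, node_ids, ilinks):
        """Register a freshly parsed document.

        ilinks is a list of (link_name, [(doc, node_id), ...]) tuples.
        """
        self.loaded[name] = set(node_ids)
        # Apply endpoints that were waiting for this document to load.
        for link_name, node_id in self.pending.pop(name, []):
            self._apply(link_name, name, node_id)
        # Process this document's own ilinks: apply now or park for later.
        for link_name, endpoints in ilinks:
            for doc, node_id in endpoints:
                if doc in self.loaded:
                    self._apply(link_name, doc, node_id)
                else:
                    self.pending[doc].append((link_name, node_id))

    def _apply(self, link_name, doc, node_id):
        if node_id in self.loaded[doc]:
            self.anchors[(doc, node_id)].append(link_name)
        else:
            self.dangling.append((link_name, doc, node_id))

    def cleanup(self):
        """Final pass: report endpoints that never found a target."""
        waiting = [(link, doc, node)
                   for doc, items in self.pending.items()
                   for link, node in items]
        return self.dangling + waiting
```

A browser built this way could display A immediately and add the anchor marks when B's ilinks are processed, which is the "not all links shown immediately" behaviour described above.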
> I worry that an XML hyperlink processor will need
>to keep a complete representation of every document in the BOS around
>because any part of the BOS might use some other part as an anchor.

Yes, that is why we need to be careful about defining the XML BOS. I think
we should have indirection, but that only an explicit request by the
author/publisher should add a document to the XML BOS.

> Another
>possibility is to define the BOS as an ordered list of documents such that if
>a hyperlink processor were to process the documents in order it would not
>need to report anchors (though it should report a path to the anchor) which
>are located in documents earlier in the ordered list of documents (which is
>the BOS). Thus, once a document is processed, the hyperlink processor only
>needs to keep around some representation of locator/addressing elements
>which might be used by later documents. Hyperlinks and anchors in the
>earlier documents will have already been passed to the application (or
>stored somehow) and are not necessary for the hyperlink processor to
>process the later documents.

As I said above, I think this makes it too difficult to define which ilinks
take effect, and also will require us to make a much more constrained and
"procedural" definition of XML parsing strategies in order to support making
the author's life more difficult.

I don't think keeping a data structure around and updating it later is an
insurmountably difficult problem. Model/View/Controller and other update
strategies have been around for decades, and are easier than background
formatting and processing of documents anyway.

>*David* since you were the one who responded to my previous post, does this
>at least make sense to you? Have I explained why I want to restrict the
>power of ilinks?

I understand, but I had (in my own mind) already discarded that kind of
strategy as too limiting, and not enough easier to make the limitations
palatable. So I understand, but I'm not convinced yet.

   -- David

I am not a number. I am an undefined character.
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________

Received on Saturday, 11 January 1997 15:01:11 UTC