XML-LINK: EXTENDED and EMBED from Peter Murray-Rust on 1997-06-22 (w3c-sgml-wg@w3.org from June 1997)

From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
Date: Sun, 22 Jun 1997 10:33:03 GMT
To: w3c-sgml-wg@w3.org
Message-Id: <8357@ursus.demon.co.uk>
I have been asked by a WG member whether a general purpose XML tool such as 
JUMBO supports EXTENDED links, the idea being to use it in a complex hyperlinked
context.  When reviewing examples of the proposed input I was unclear about
whether certain constructs had well-accepted implied semantics.  If so, I think
it important that they appear in the spec.

I know the ERB are rightly concerned about not specifying behaviour, APIs, etc.
The draft says 'express *policies* rather than *mechanisms*'.  This implies
that there are no generic approaches to implementing XML-LINK.  However I 
believe that some of the components can be implemented in a generic manner
but that guidance as to which and (to some extent how) should be in the revision

If this is not done, then a multitude of different approaches spring 
up, often incompatible.  We have already seen this in XML-DEV which has seen
at least 5 public XML parsers emerge (and doubtless many more unpublished). 
The approaches were significantly different.  XML-DEV, catalysed by John Tigue,
is (hopefully) pulling together the basis of a parser API (both input and 
ouptut).

IMO XML-LINK is even more difficult than a parser API.  Without ERB guidance
in certain areas, we may  get confusion and diversity.  It will make it
difficult to get consensus on common constructs in XML-LINK if they behave
differently in different link-processors.  It may be inevitable, but if the 
first generation of XML-LINK documents contain:

'The links in this document can only be processed with Browser X' then
I think we have lost something.

[If this is just my confusion, there is no problem, but I suspect that the 
webhacker community may also have some problems in implementing XML-LINK 
consistently.]

The main question is

'are there generally accepted semantics/behaviour for XML-LINKs, or is 
everything application-dependent?  If there *are* GENERIC components, which 
are they and how are they to be interpreted?'

I accept that it may be possible to produce general purpose engines (perhaps
HyBrowse is one?) that are customisable by particular applications, but this
is not the primary purpose of my question.  The question is how far, if at all,
can independent JUMBO-like tools go without diverging in their interpretation
of a given XML document. Another way of looking at it is to ask what the 
major browser manufacturers might be able to include in XML-LINK-aware 
browsers without divergence or local customisation.

I will try to summarise what I believe a generic XML-LINK processor can do
without customisation.  I stress that not all 'display' is necessarily
browser-like and may not be graphical.

All 'USER' options are probably to be carried out by a human and I will assume
the browser metaphor.  (I suppose it is possible that 'USER' could be a robotic
command from another XML processor).

I will label by 'GENERIC', those commands that I think any XML-LINK processor
should be able to implement in a uniforma manner.  I shall assume that the 
evaluation of the HREF argument is one or more locations.  (location(s)) are 
addressable point(s) in the grove of an XML document and could be a node, a 
span, multiple nodes (TEI/ALL); alternatively it could be a single non-XML 
document (e.g."foo.gif") possibly using an XML NOTATION, but without internal 
XML-addressable structure.

SIMPLE:

USER/NEW (GENERIC)

This creates a new display space for the target (grove or non-XML) which can be
manipulated indendently of the current grove (i.e. where the SIMPLE link
was located.)  If the grove happens to be the same as the current, then a 
model/view might be used to manage them independently.  I would assume
that traversal of the target grove started as it was constructed and that
other XML-LINks (AUTO) were processed as soon as they were encountered (i.e. a 
'depth-first' strategy).

USER/REPLACE (GENERIC)

This removes the current grove from the display and starts processing the
target into that display space.  Again, if the target contains XML-LINKs (AUTO)
they should be processed in a depth-first strategy.  

AUTO/NEW (GENERIC)

When this is encountered in displaying the current grove, a new display space
is immediately created and processing switches to it.  When this processing is
(recursively) complete, processing of the current grove continues.

AUTO/REPLACE (GENERIC)

When this is encountered while traversing the current grove, the current
traversal is immediately halted and processing of the target begins, using the
current display space.  Any recursive XML-LINKs are treated on a depth-first
basis, similar to the above.

All links are treated as INLINE=TRUE and so have at least two ends.  It is
application-dependent as to whether this information is used, but if it is
it represents a (GENERIC) method of traversing the link in the reverse direction

EMBED.
I have problems with this that I think would benefit from clarification in the
spec.  My interpretation is as follows:

IF the target of an EMBED is a non-XML object, then that object (e.g. a GIF)
is 'physically' embedded in the document.  Thus 
<IMG SRC="foo.gif">
and
<IMG XML-LINK="SIMPLE" SHOW="EMBED" ACTUATE="AUTO">
carry out identical operations.  (There is an application-dependent or 
target-dependent problem as to how physically the target is embedded, and it
is possible that it might still have an actuation button for conciseness - e.g.
a thumbnail gif)

*** when the target of an EMBED is an XML-grove I have the main problem ***

Example:
WF document target.xml:
<!DOCTYPE target [
]>
<TARGET ID="T">
I am a target
</TARGET>

WF document current.xml
<!DOCTYPE current [
]>
<CURRENT ID="C">
I am the current document
<A ID="A" XML-LINK="SIMPLE" ACTUATE="USER" SHOW="EMBED" HREF="target.xml">
I am a resource</A>
</CURRENT>

*** is there a GENERIC way of processing this? ***

My understanding is that there are two separate groves (target and current) and
that there is a link from the <A> node in current to the <TARGET> in target.
In some way target is 'embedded' in current.  This is what I (and some of my 
fellow webhackers) do not understand.

One way of embedding would be to use an entity:

<!DOCTYPE current [
<!entity target SYSTEM "target.xml">
]>
<CURRENT ID="C">
I am the current document
&target;
</CURRENT>

but this clearly misses something.  

The USER/EMBED option suggests that the user has a document which is rendered
like:

[Current Icon]
I am the current document
[I am a link; Press me and I will embed target.xml]
I am a resource
[I have stopped being a link]
[I am no longer current]
[EOF]

The user presses! Do they get a new version of the displayed document? IMO 
this seems essential.  So the new document might look like:

[Current Icon - I obey the CURRENT DTD]
I am the current document
[I am a clicked link! target.xml is embedded here! You can now read it!]
[Target Icon - I obey the TARGET DTD!!]
I am a target
[I have finished being a target]
[end of embedding]
I am a resource
[I am no longer a link]
[I am no longer current]

[EOF]

(I am not clear whether the contents of <A> ("I am a resource") should be 
deleted.)

Alternatively, as I understand it from Eliot's posting, there is simply a 
link between A and T.  In which case why the word 'EMBED'?  If the link is
REQUIRED to be traversed (either AUTO or after the USER action) then we
have effectively got a larger grove (rather similar to the file above)
which may contain documents with different DTDs.

It gets harder with 
EMBED/AUTO
which is what my correspondent was enquiring about.  This can be potentially
extremely powerful if the target documents (recursively) describe sets of 
XML-LINKs.  But its value will be limited if the traversal is not GENERIC.

In conclusion, the XML-LINK spec should specify which bits are GENERIC and
which are application-dependent.  The ACTUATE/SHOW axes only make sense if
they describe a GENERIC mechanism (otherwise people would simply roll their 
own).  

Since I know don't know how to treat parts of SIMPLE, I shan't consider EXTENDED.
But it will be important to know whether there are GENERIC implementations to
any of these.  The long posting from MichaelSMcQ was very useful - I haven't
had time to work out whether it can be processed by a GENERIC mechanism or
not.

There are clearly a group of people on the ERB/WG who have a picture of 
how GENERIC solutions are to be implemented.  However, I think that the 
dirty webhackers like me will not pick them up.  [And XML-LINK won't be very
useful if there aren't any implementations or they all behave differently.]
Any clarity on WHICH aspects are GENERIC will be valuable, and for these
some additional implementation guidance, please.

	P.

(looking forward to July 1).

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
Received on Sunday, 22 June 1997 09:43:27 UTC