Re: XHTML & hyperlinking opinions (long, sorry) from Chris Lilley on 2002-10-07 (www-tag@w3.org from October 2002)

From: Chris Lilley <chris@w3.org>
Date: Mon, 7 Oct 2002 21:28:15 +0200
To: www-tag@w3.org, Tim Bray <tbray@textuality.com>
CC: "Champion, Mike" <Mike.Champion@SoftwareAG-USA.com>
Message-ID: <117222972046.20021007212815@w3.org>
On Saturday, October 5, 2002, 12:11:10 AM, Tim wrote:


TB> Champion, Mike wrote:
>>>I think disagreement should be accompanied by examples: 
>>>"here's a better way ...
>> 
>> 
>> In a situation where a particular solution is widely deployed and solving
>> real problems daily, I would totally agree that the burden of proof is on
>> those who disagree with it.  But in the current discussion, it's HTML that's
>> widely deployed and solving real problems

TB> Except for HTML doesn't have multi-ended or metadata-loaded or 
TB> out-of-band links.


Two out of three ain't bad. I suggest that HTML *does* have
metadata-loaded links:

http://www.w3.org/TR/html4/struct/links.html#h-12.2
 charset     %Charset;      #IMPLIED  -- char encoding of linked resource --
  type        %ContentType;  #IMPLIED  -- advisory content type --
  name        CDATA          #IMPLIED  -- named link end --
  href        %URI;          #IMPLIED  -- URI for linked resource --
  hreflang    %LanguageCode; #IMPLIED  -- language code --
  rel         %LinkTypes;    #IMPLIED  -- forward link types --
  rev         %LinkTypes;    #IMPLIED  -- reverse link types --
  accesskey   %Character;    #IMPLIED  -- accessibility key character --
  shape       %Shape;        rect      -- for use with client-side image maps --
  coords      %Coords;       #IMPLIED  -- for use with client-side image maps --
  tabindex    NUMBER         #IMPLIED  -- position in tabbing order --
  onfocus     %Script;       #IMPLIED  -- the element got the focus --
  onblur      %Script;       #IMPLIED  -- the element lost the focus --

That is a dozen attributes of link metadata besides the actual link
URI. Plus coreattrs, i18n, etc., one of which (title) should really be
an element, not an attribute, for internationalization reasons.
  
http://www.w3.org/TR/html4/struct/links.html#h-12.3
 charset     %Charset;      #IMPLIED  -- char encoding of linked resource --
  href        %URI;          #IMPLIED  -- URI for linked resource --
  hreflang    %LanguageCode; #IMPLIED  -- language code --
  type        %ContentType;  #IMPLIED  -- advisory content type --
  rel         %LinkTypes;    #IMPLIED  -- forward link types --
  rev         %LinkTypes;    #IMPLIED  -- reverse link types --
  media       %MediaDesc;    #IMPLIED  -- for rendering on these media --

Has (some of) the same metadata.

http://www.w3.org/TR/html4/struct/objects.html#h-13.2
<!ATTLIST IMG
  %attrs;                              -- %coreattrs, %i18n, %events --
  src         %URI;          #REQUIRED -- URI of image to embed --
  alt         %Text;         #REQUIRED -- short description --
  longdesc    %URI;          #IMPLIED  -- link to long description
                                          (complements alt) --
  name        CDATA          #IMPLIED  -- name of image for scripting --
  height      %Length;       #IMPLIED  -- override height --
  width       %Length;       #IMPLIED  -- override width --
  usemap      %URI;          #IMPLIED  -- use client-side image map --
  ismap       (ismap)        #IMPLIED  -- use server-side image map --
  >

Note that this has two URIs, one for the image graphical content and
one for the long description.

Now let's imagine that the HTML WG decides to harmonise some of its
link metadata constructs. Suppose they note that

- images can contain text and thus be in a particular language (and in
some formats, such as SVG, they can contain indexable, selectable
text)

- images are available in multiple formats

- images can have relationships, for example "illustrates"

So they add hreflang, type, rel and rev to the attribute list for
img.

Noting too that longdesc, being text, might benefit from some more
metadata they add charset to the attribute list for img as well.

But now there is an ambiguity, so they add human-readable prose
to the spec to clarify that charset is the charset of the longdesc,
not that of the image data (which, if the image is in SVG, is textual
and thus may itself have a charset), but that title, rel and rev
apply to the image data, not the long description.

Now, a little while later, they decide to harmonize some more. Plus,
image formats can be in XML and thus charset might be useful, but it
is already there and you can't have two of them. So they add
imgcharset and longdesctitle and longdescrel and longdescrev and
longdesctype.

<img
  src="foo.svg"
  type="image/svg+xml"
  title="my cat"
  imgcharset="utf-8"
  longdesc="foo-desc.tt"
  longdesctitle="about my cat"
  rel="picture-of"
  longdescrel="description-of"
  longdesctype="application/timed-text+xml"
  charset="utf-16">

In other words, because of the attempt to cram two *links* onto one
element, they would need to dig further and further into a hole of
special-purpose, attribute-grouping-by-magic-prefix syntax.
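As a rough sketch of what that hole looks like from the consuming
side, here is how a tool might have to regroup the flat attribute set
above into one set of metadata per link. The prefixed attribute names
are the invented ones from this thought experiment, not real HTML, and
split_links is a hypothetical helper written only for illustration.

```python
# Attributes of the hypothetical <img> element from the thought experiment.
# None of the prefixed names (imgcharset, longdesctitle, ...) exist in HTML.
IMG_ATTRS = {
    "src": "foo.svg",
    "type": "image/svg+xml",
    "title": "my cat",
    "imgcharset": "utf-8",
    "longdesc": "foo-desc.tt",
    "longdesctitle": "about my cat",
    "rel": "picture-of",
    "longdescrel": "description-of",
    "longdesctype": "application/timed-text+xml",
    "charset": "utf-16",
}


def split_links(attrs):
    """Regroup a flat attribute set into one metadata dict per link."""
    image, longdesc = {}, {}
    for name, value in attrs.items():
        if name == "src":
            image["href"] = value
        elif name == "longdesc":
            longdesc["href"] = value
        elif name == "charset":
            # Special case: the unprefixed charset describes the longdesc.
            longdesc["charset"] = value
        elif name == "imgcharset":
            # Special case: so the image charset needs its own prefix.
            image["charset"] = value
        elif name.startswith("longdesc"):
            # Magic prefix: strip it to recover the real property name.
            longdesc[name[len("longdesc"):]] = value
        else:
            image[name] = value
    return image, longdesc


image, longdesc = split_links(IMG_ATTRS)
print(image)
print(longdesc)
```

Every branch in that function is a rule that lives only in spec prose;
nothing in the markup itself says which link an attribute belongs to.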

Having two attributes that are of type URI (for example, namespace
declarations, or attributes that take qnames as attribute values, or
that use URIs to provide an open-ended list of values) on one element
is of course OK. (That is the question that is often asked and
answered. Who was it who said that the nice thing about asking the
wrong question is that you don't have to care about the answer?)

At which point (since this is a thought experiment, then like a film
director I can make the ending anything I like with a cunning plot
twist) they realize the error of their ways, make longdesc and alt
into real elements, put the longdesc metadata on the longdesc element
and the image metadata on the image element, use XLink, use the SMIL
and SVG switch syntax to indicate an actual choice between graphical
and textual media, and thus have something like:

<switch xmlns:xl="http://www.w3.org/1999/xlink">
  <img
    xl:href="foo.svg"
    type="image/svg+xml"
    charset="utf-8"
    xl:role="picture-of"
    some-test-attribute="images-on">
    <title>my <span class="example">marked-up</span> cat</title>
  </img>
  <longdesc
    xl:href="foo-desc.tt"
    type="application/timed-text+xml"
    charset="utf-16"
    xl:role="description-of"
    some-test-attribute="long-descriptions-on">
    <title>about my <span class="example">marked-up</span> cat</title>
  </longdesc>
  <alt><span class="example">marked-up</span> alternate text</alt>
</switch>

Notice that the same attributes (charset, etc) mean the same things,
rather than having special-purpose names to hedge around there being
only one attribute of a given name on an element.
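To illustrate the point, here is a minimal sketch that reads both links
with a single function, using Python's standard xml.etree.ElementTree.
It embeds a trimmed, well-formed copy of the example (the
some-test-attribute placeholders and the alt element are left out), and
link_metadata is a hypothetical helper, not part of any specification.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Trimmed, well-formed copy of the thought-experiment markup.
DOC = """\
<switch xmlns:xl="http://www.w3.org/1999/xlink">
  <img xl:href="foo.svg" type="image/svg+xml" charset="utf-8"
       xl:role="picture-of">
    <title>my <span class="example">marked-up</span> cat</title>
  </img>
  <longdesc xl:href="foo-desc.tt" type="application/timed-text+xml"
            charset="utf-16" xl:role="description-of">
    <title>about my <span class="example">marked-up</span> cat</title>
  </longdesc>
</switch>
"""


def link_metadata(element):
    """Read one link's metadata; the same names work on every link element."""
    title = element.find("title")
    return {
        "href": element.get(f"{{{XLINK}}}href"),
        "role": element.get(f"{{{XLINK}}}role"),
        "type": element.get("type"),
        "charset": element.get("charset"),
        # The title is element content, so it can carry markup of its own;
        # here it is flattened to plain text.
        "title": "".join(title.itertext()) if title is not None else None,
    }


root = ET.fromstring(DOC)
links = {child.tag: link_metadata(child) for child in root}
for name, metadata in links.items():
    print(name, metadata)
```

No per-link special cases are needed: charset on img is the charset of
the image, and charset on longdesc is the charset of the description.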

In practice of course the xlink declaration would be somewhere on the
root element and thus only appear once, and the common

<img
  src="foo.png"
  alt=""/>

would become

<img
  xl:href="foo.png"/>

Having two *links* on one element is only okay if you have no link
metadata, no intention of adding multi-ended links, and no intention
of addressing internationalization concerns over title attributes that
should be element content. But HTML already has link metadata, might
or might not add multi-ended links, and hopefully wants to do a better
job with titles in the future. Thus, XLink, which has as a design goal
the association of link metadata, is a good fit here.

The question is in some ways not whether XLink is a suitable fit for
XHTML 1.x, but rather whether XHTML 1.x syntax is a good fit for XHTML
2.x. It isn't, and moving to XLink would solve a bunch of those
problems. The lack of the bugwards-compatibility shackles on XHTML 2.x
makes this an ideal time to actually make this change.
  
TB> I'm suggesting that it would be a good idea to add
TB> them, and proposing XLink as a way to do it.

Yes. Because the 'attributes can't have attributes' and 'attributes
can't have child elements' arguments bite even more once you get to
multi-ended links.



-- 
 Chris                            mailto:chris@w3.org
Received on Monday, 7 October 2002 15:28:16 UTC