[whatwg] Feedback on <dfn>, <abbr>, and other elements related to cross-references

Nicholas Shanks writes:

> Ian, I think you have made a mistake.

The message of Ian's you replied to covered several different things (as
indicated by the subject line) and you didn't quote any of it.  Could
you be more specific on which bit you consider to be a mistake?

> We need to go through this more methodically before making a decision.

Ian appears to have looked at every mail sent to this (and other) lists
on relevant topics, reading and considering each one in order -- you
can't get much more methodical than that!

> Situations where expansions of abbreviations are needed:
> 1)	People unfamiliar with the topic being discussed.

Sure.  That can also happen with jargon which isn't an abbreviation.

> This can happen if you click a link to an anchor half-way down a page
> and miss the introduction, or you are reading about a topic new to
> you.

If you start partway through a document on an unfamiliar subject it
shouldn't be surprising if not everything makes immediate sense!
Navigating back to the start seems like a reasonable thing to do.

> It should not be required that the user screw around looking for the
> acronym with a dotted underline.

* Why is this feature needed for abbreviations, but not for other words
  people may not be familiar with?

* The current draft spec doesn't prohibit an author from marking every
  use of an abbreviation (or the first one in each section, or the first
  one after each anchor target, or whatever) with its expansion.

> 2)	Documents that exist as both a single page, and as multiple
> pages (like large web specifications). Should the expansion occur
> once per file? That would require additionally marking up every abbr
> at its first occurrence on a page when splitting the single-page
> version.

Sure.  What would you suggest as an alternative?

> 3)	Documents that use the same acronym to mean different things in  
> different contexts/sections.
> 	For example, take that both <abbr title="United States of  
> America">USA</abbr> and <abbr title="United Space Alliance">USA</abbr>  
> previously occurred in the document,

A document re-using an abbreviation like that would have to be written
carefully to avoid confusing anybody -- such as by providing enough
context that it's obvious which is meant or sometimes writing them out
in full.

> ... and you *don't* want, as an author, for every future use of either
> term to be expanded by default (so will not provide titles for all
> occurrences).

How does that "so" follow?

If you have multiple instances of "USA" and you want to disambiguate the
expansion of each one then surely you'll have to label each one with its
expansion.

But why would having this information available mean that the
abbreviations would be "expanded by default"?  Surely if the author had
wanted text to be read as "United Space Alliance" in full then the
author would have written that, and as such by default it should be read
as "USA".  The expansion is there for users who ask for it, but its
existence doesn't imply it will always be forced on users.

> I then jump into the middle of a page from somewhere else and see "The
> USA's fleet of Space Shuttles are refurbished by USA, LLC." and wonder
> what's going on!
> 	There's no way to tell which is which without heuristical
> analysis of  the language,

Sure.  But from reading that sentence I can tell from the context which
one is which.  Both "USA"s should be displayed the same; both should be
spoken the same.  Why do they need distinguishing?

> so the UA can't auto-expand based on a single previous occurrence,
> which I think is the behaviour you were expecting

What in the current spec or Ian's mail makes you think that?  My reading
of it was that nothing happens automatically: if you encounter a term
which you don't understand (whether an abbreviation or not) then you
navigate back in the document in the hope of finding an explanation of
it.

> by disallowing abbrs without titles and removing the referencing.

How would <abbr>s without titles help in the above situation anyway?
If you really feel the "USA"s need disambiguating what's deficient about
putting title attributes with their expansions on each one?

> 4)	Documents where the acronym and an identically spelled word appear.  
> For example your current system would *require ambiguity* in the
> admittedly somewhat unlikely newspaper headline:
> 	<h1><abbr title="British American Racing">BAR</abbr> RAISE THE
> BAR IN FORMULA ONE</h1>
> 	Is the second BAR an acronym, which is prohibited from being
> marked up, or not?

Firstly there's nothing prohibiting the second one from being marked up.

Secondly, again context makes it pretty obvious.

And thirdly, the headline should actually be marked up as:

  <h1><abbr title="British American Racing">BAR</abbr> raise the bar in
  Formula One</h1>

> No way to tell without heuristical analysis of the language. 
> Certainly not something most UAs will be doing, even for English. What 
> hope would Nahuatl have?
> 	At least with <abbr>BAR</abbr> you can tell that it *is* an  
> abbreviation, and can go look for the reference. Telling when a word  
> that's not marked up is or isn't an acronym (so deciding if the UA  
> should provide an expansion) is much harder.

A site may well have a house-style of putting headlines in all-caps, but
that should be achieved in CSS.  So speaking browsers would have no
problem in treating "bar" in my version above as the ordinary English
word.

> Ideally users need to have on-demand expansion of any abbreviation they 
> come across, in any situation,

Possibly.  Or possibly that's beyond the scope of HTML.  When browsing I
would also quite like to have other on-demand information:

* The 'OED' definition of any English words I come across.
* The postcode for any UK addresses I come across.
* The artist who sang any song whose title I come across.

But that doesn't mean that HTML 5 needs to provide mark-up for these.

> regardless of how competent the HTML author was.

Suppose a browser decided to implement a feature whereby the expansions
of all abbreviations defined in the current document could be displayed,
and there's a way of asking for an expansion, if any, of the 'word' under
the cursor.

If you want that for all documents, even those without co-operative
authors, then surely the look-up has to be available for abbreviations
which aren't marked up at all ("BAR")?  In which case marking one up
("<abbr>BAR</abbr>") wouldn't provide any additional functionality.
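
To make that concrete, here is a rough sketch of such a look-up, written
in Python purely for illustration.  The names AbbrCollector and
expansions_for are made up for this example, and it assumes the simplest
possible matching (exact text of the word); no browser is known to work
this way.  The look-up is keyed on the text alone, so it gives the same
answer whether or not the occurrence being asked about is marked up.

```python
from html.parser import HTMLParser


class AbbrCollector(HTMLParser):
    """Collect every expansion given via <abbr title="..."> in a document."""

    def __init__(self):
        super().__init__()
        self.expansions = {}    # abbreviation text -> list of distinct expansions
        self._in_abbr = False   # True while inside an <abbr> element
        self._title = None      # title of the <abbr> currently open, if any
        self._text = []         # text collected inside that <abbr>

    def handle_starttag(self, tag, attrs):
        if tag == "abbr":
            self._in_abbr = True
            self._title = dict(attrs).get("title")
            self._text = []

    def handle_data(self, data):
        if self._in_abbr:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "abbr" and self._in_abbr:
            word = "".join(self._text).strip()
            # Only <abbr> elements that carry a title contribute anything.
            if word and self._title:
                known = self.expansions.setdefault(word, [])
                if self._title not in known:
                    known.append(self._title)
            self._in_abbr = False


def expansions_for(document, word):
    """Return the expansions defined anywhere in `document` for `word`.

    `word` is whatever text is under the cursor (hypothetical UI); it
    need not itself be inside an <abbr>.
    """
    collector = AbbrCollector()
    collector.feed(document)
    collector.close()
    return collector.expansions.get(word, [])
```

Note that an <abbr> without a title adds nothing to the table, and that
an abbreviation used with two meanings simply yields two candidates for
the reader to choose between.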

> Erroneous expansion of non-abbreviations that match a previously
> defined abbreviation is I think the hardest thing to avoid.

Why would it matter?  The browser may not have human-level linguistics
heuristics to be able to tell whether "bar" is a word or an
abbreviation; but if a human is asking for an expansion then the reader
has obviously interpreted it as an abbreviation, so the browser can
expand it.

> Where should these expansions come from? The following fallback list
> seems reasonable:
> 1)	Inline with @title, the way it's currently done.
> 2)	By referencing, either automatically by the UA or explicitly
> marked  up, an expanded occurrence of the acronym.

This could be done with current HTML.  Are any browsers doing it?  Is
there a Firefox plug-in which provides it?

If there is significant demand for this then surely it would already be
happening; it doesn't require a new version of HTML.  If it hasn't
happened so far, why would HTML 5 make any difference?
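
For what it's worth, the fallback order quoted above is easy enough to
sketch against today's mark-up.  The following Python is only an
illustration under my own assumptions: AbbrResolver and resolve_abbrs
are invented names, and "referencing" is taken to mean "use the most
recent earlier occurrence of the same abbreviation that had a title".

```python
from html.parser import HTMLParser


class AbbrResolver(HTMLParser):
    """Resolve each <abbr>: inline title first, else an earlier expansion."""

    def __init__(self):
        super().__init__()
        self.resolved = []   # (abbreviation, expansion or None), in document order
        self._latest = {}    # abbreviation -> most recent titled expansion
        self._title = None   # title of the <abbr> currently open, if any
        self._text = None    # list of text chunks while inside <abbr>, else None

    def handle_starttag(self, tag, attrs):
        if tag == "abbr":
            self._title = dict(attrs).get("title")
            self._text = []

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "abbr" and self._text is not None:
            word = "".join(self._text).strip()
            if self._title:
                # Fallback (1): the expansion is given inline.
                self._latest[word] = self._title
            # Fallback (2): otherwise reuse the latest earlier expansion.
            self.resolved.append((word, self._title or self._latest.get(word)))
            self._text = None


def resolve_abbrs(document):
    """Return (abbreviation, expansion) pairs for every <abbr> in order."""
    resolver = AbbrResolver()
    resolver.feed(document)
    resolver.close()
    return resolver.resolved
```

Run over a document which uses "USA" in both senses, an untitled
<abbr>USA</abbr> just picks up whichever expansion came last, which is
exactly why an author who cares about the distinction would need to
label each one.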

> You are prohibiting (2) from being used ...  by disallowing the title
> attribute to be omitted you make things unnecessarily difficult for
> currently valid HTML4 to migrate to valid HTML5.

Can you link to examples of such webpages, which have <abbr> elements
without title attributes?  What does that mark-up currently achieve?

If it doesn't actually do anything then it seems reasonable for an HTML
5 validator to flag it as problematic; otherwise authors may misguidedly
continue to believe it has a purpose.

Smylers

Received on Monday, 21 April 2008 13:20:57 UTC