Re: HTML is a declarative mark-up language from Boris Zbarsky on 2009-02-02 (public-html@w3.org from February 2009)

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Sun, 01 Feb 2009 22:40:52 -0500
To: Leif Halvard Silli <lhs@malform.no>
CC: HTML WG <public-html@w3.org>
Message-ID: <49866B44.7020506@mit.edu>
Leif Halvard Silli wrote:
> Boris Zbarsky 2009-02-01 04.21:
>> and that in fact the Wikipedia article is EXACTLY the kind of poor 
>> authoring practice that the restrictions on the content model of <a> 
>> and the belief that the anchor target must be an <a> has spawned.
> 
> Yes, the content model must have been the reason why they did not place 
> <H2> inside <A>. We don't know why they picked <A> at all.

While this last is true, I will lay odds that my above characterization 
is correct.  Given the source, it's also possible they were aiming for 
maximum cross-browser compatibility, including browsers that don't 
implement the linking-to-id part of HTML4, of course.  I'm not sure we 
should be letting such browsers dictate HTML5.

> One functional benefit of their solution is it creates a margin between 
> the <h2> header and the window top, when the anchor is activated. While 
> the text of the H2 element will "bang its head" into the top of the 
> window if you use <h2 id="Anchoring_gear"> instead.

I just tested this in Opera, Safari, and Firefox.  Only Opera has the 
behavior you described.  In the other two, the point scrolled to is the 
same whether the id is on the <h2> or on the <a>.

Opera's behavior is due to either a bug in margin collapsing or possibly 
different values for the margins (e.g. if the bottom margin of the <p> 
in Opera is bigger than the top margin of the <h2>).

In all three cases, though, the result looks significantly better when 
scrolling directly to the <h2>, at least over here.

> But here is another use case: What if I want to list all/most pages where 
> a certain word occurs - in an index. Is there any better element than 
> the anchor element if I want to create links in the index to each and 
> every occurrence of a particular word?

I don't think so, no.

> I could use <span>, of course. 
> Span seems to be the universal cure ... But it would easily be much 
> simpler to just use <A>. E.g. I could use it to avoid mixing it up with 
> other purposes that I might use SPAN for.

OK, I almost buy this.  How about saying that <a> without @href is a 
link destination in text and should be used in cases when one wishes to 
link to a particular piece of text that is not otherwise marked up 
already.  Then we can make <a> without @href and without children 
non-conforming.  Further, we can make <a> without @href that has no 
leading or trailing non-whitespace text children non-conforming, right?
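
To make the two proposed rules concrete, here is a minimal sketch in
Python of how a checker might apply them. It is an illustration only,
not an existing validator: the class name DestinationAnchorChecker, the
helper check(), and the reason strings are all hypothetical, and the
second rule is implemented under one reading of the proposal (neither
the first nor the last child of the <a> is non-whitespace text).

```python
from html.parser import HTMLParser

# Elements that never take an end tag, so they must not change nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class DestinationAnchorChecker(HTMLParser):
    """Hypothetical checker for <a> elements that have no href attribute."""

    def __init__(self):
        super().__init__()
        self.problems = []  # list of (anchor id or name, reason)
        self._open = None   # state for the href-less <a> currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._open is None:
            if tag == "a" and "href" not in attrs:
                label = attrs.get("id") or attrs.get("name") or "?"
                self._open = {"label": label, "depth": 0, "children": []}
            return
        if self._open["depth"] == 0:
            # A direct element child of the anchor.
            self._open["children"].append("element")
        if tag not in VOID:
            self._open["depth"] += 1

    def handle_endtag(self, tag):
        if self._open is None or tag in VOID:
            return
        if self._open["depth"] > 0:
            self._open["depth"] -= 1
        elif tag == "a":
            self._finish()

    def handle_data(self, data):
        # Only non-whitespace text directly inside the anchor counts.
        if self._open is not None and self._open["depth"] == 0 and data.strip():
            self._open["children"].append("text")

    def _finish(self):
        kids = self._open["children"]
        if not kids:
            self.problems.append((self._open["label"], "no children"))
        elif kids[0] != "text" and kids[-1] != "text":
            self.problems.append(
                (self._open["label"], "no leading or trailing text"))
        self._open = None


def check(html):
    """Return the list of problems found in an HTML fragment."""
    parser = DestinationAnchorChecker()
    parser.feed(html)
    parser.close()
    return parser.problems
```

Under this reading, <a id="w1">gear</a> passes, while <a id="x"></a>
and <a id="y"><h2>text</h2></a> are both reported.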

> Furthermore, a more fundamental question - more related to the general 
> use cases rather than Wikipedia: Why should it be considered more 
> legitimate to /not/ add a phrasing element around the hypertext link 
> source anchor, than around the destination anchor? If the source anchor 
> doesn't need it, why should the destination anchor?

Right.  In other words, if you're linking to text, not to an element, 
how should it work?  I think the above proposal covers that case.

>> I do appreciate you finding such an excellent example in the wild, on 
>> a quite popular site run by people who generally know their HTML, of 
>> the exact problem I described.
> 
> Would be interesting to know more about Wikipedia's exact problem.

Agreed.  Unfortunately I have no contacts at Wikipedia, and some digging 
didn't locate any sort of revision history for the exact mechanism that 
transforms "== foo ==" into an <a> and <h2>.  If someone happens to have 
any such contacts or knows where said revision history can be found, I 
would much appreciate that.

>>> Also, if I want a place to be a destination anchor, without having  
>>> to investigate or find out (it may be impossible to know the content 
>>> of the A element if it is generated via DOM etc) the *further* 
>>> semantics of the anchored resource, then I can use <a 
>>> id="ref">resource</a> and be certain that I have at least not lied 
>>> plus that I solved a practical problem.
>>
>> I'm not sure I follow this use case.  Would you be able to give a 
>> concrete example?
> 
> Let's say I meant something like what Wikipedia could do, in HTML 5: ;-)
> 
> <a id="placeholder"><h2 id="something_or_nothing" >text</h2></a>

Why is that desirable, though?  Either as something one might want to do 
or as something one should be allowed to do?


>> The result is that far too many people end up with the impression that 
>> the <a> element is the only way to create a link target.
> 
> I think we agree that we cannot use Wikipedia as proof for that

While it's not "proof", I think it's pretty strong evidence for it...

> They probably have some other reason - good or bad.

I doubt that, actually.  Again, having hard data would be nice.

> However, not by *reading* HTML 4, because HTML 4 explains
> about both <a name> and @id in general when it explains the destination 
> anchor concept.

I've read HTML4 a few times, and I still forgot that you can target 
random IDs in HTML4.

And yes, most beginners don't read HTML4.  They read tutorials or 
copy/paste.  Very few tutorials mention linking to IDs.

On the other hand, if validators flagged situations like this and 
suggested linking to an ID, that would be a big step up, in my opinion.
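
As a rough sketch of what such a validator hint could look like, the
Python below flags an empty <a> without href that is immediately
followed by a heading and suggests moving the id onto the heading. The
class name AnchorBeforeHeadingLint and the helper lint() are
hypothetical, and the sample markup in the usage note is an assumed
approximation of the Wikipedia pattern under discussion, not a copy of
the actual page source.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class AnchorBeforeHeadingLint(HTMLParser):
    """Hypothetical lint: empty destination <a> right before a heading."""

    def __init__(self):
        super().__init__()
        self.suggestions = []    # list of (anchor label, heading tag)
        self._in_anchor = None   # label of the href-less <a> currently open
        self._empty = True       # whether that <a> has had any content so far
        self._pending = None     # label of an empty <a> that has just closed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._in_anchor is not None:
            self._empty = False  # an element inside the anchor is content
            return
        if self._pending is not None:
            # The first start tag after the empty anchor decides the outcome.
            if tag in HEADINGS:
                self.suggestions.append((self._pending, tag))
            self._pending = None
        if tag == "a" and "href" not in attrs:
            label = attrs.get("id") or attrs.get("name")
            if label:
                self._in_anchor = label
                self._empty = True

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor is not None:
            if self._empty:
                self._pending = self._in_anchor
            self._in_anchor = None

    def handle_data(self, data):
        if not data.strip():
            return               # whitespace never counts as content
        if self._in_anchor is not None:
            self._empty = False
        else:
            self._pending = None  # real text separates anchor and heading


def lint(html):
    """Return (label, heading) pairs where the id could move to the heading."""
    parser = AnchorBeforeHeadingLint()
    parser.feed(html)
    parser.close()
    return parser.suggestions
```

For example, lint('<p><a name="Anchoring_gear"></a></p><h2>Anchoring
gear</h2>') reports the pair ("Anchoring_gear", "h2"), which a validator
could turn into a message suggesting <h2 id="Anchoring_gear"> instead.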

>>> However, if I have gotten it right, in HTML 5 it will be 
>>> non-conforming to e.g. write
>>>
>>> <body><inline-element></inline-element></body>
>>>
>>> (or has that idea been left?)
>>
>> I have no idea, honestly.  What does that question have to do with the 
>> discussion at hand?
> 
> You mentioned that a "loose" <a id=ref></a> could easily float away from 
> the target it was placed beside, e.g. due to CSS float or something. 

Right.

> E.g. in this example:
> 
> <div><a id=ref></a><table ... /table></div>
> 
> I just thought that if "loose" inline elements like in the example above 
> (and I guess A would be inline here) were nonconforming, then we would 
> see less of it. For instance, in the Wikipedia example above, they 
> placed it inside the P-element no doubt because then it would not "float 
> away" very easily (even if it is certainly possible that it could).
> 
> In short: to place an <a> inside a <P>, or <Hn> element is one thing. To 
> just drop it inside the <body> is another thing.

I don't see what that has to do with the issue at hand, though.  In your 
example, if the <a> were in a <p> and the table were floated, the table 
could end up nowhere near the <a>.

In fact, the CSS box models for <a> inside <p> inside <div> and <a> 
directly inside <div> are basically identical, modulo the margins on the 
<p>.  Runs of inline kids inside a block are conceptually wrapped into a 
single block box.

I doubt that any of this had to do with the reasoning on the Wikipedia 
template.

-Boris
Received on Monday, 2 February 2009 03:41:53 UTC