- From: Ian Hickson <ian@hixie.ch>
- Date: Mon, 27 Mar 2006 20:21:20 +0000 (UTC)
- To: "Jukka K. Korpela" <jkorpela@cs.tut.fi>
- Cc: "'www-html@w3.org'" <www-html@w3.org>
On Sat, 25 Mar 2006, Jukka K. Korpela wrote:
> >
> > According to my studies it's used in around 0.1% of the Web's pages.
> > One in every thousand pages isn't bad, given how few pages could be
> > expected to be defining terms; In particular, it's used more than
> > <ins>, <del>, <var>, <samp>, <bdo>, etc.
>
> I can't argue with your statistics - the Google analysis
> http://code.google.com/webstats/2005-12/elements.html does not cover the
> <dfn> element.

The 0.1% number comes from the data that also produced the Google
analysis.

> Assuming that the figure 0.1% is representative, is it small or large as
> compared with the expected frequency of pages that actually contain
> definitions of terms?

I don't have data on that frequency. I would expect that it is low,
though.

> After all, what matters - for purposes like developing browsers and
> search engines - is the probability that you can actually locate
> defining occurrences by looking at markup for them (at present, <dfn>
> and <dt>). Even if you get a large amount of information that way, is it
> enough if it is just a small fraction of pages that actually define
> things?

Most of the Web is presentational. You can't use _any_ of HTML's
semantics to unambiguously get data out of the Web in the manner you
describe.

On Sat, 25 Mar 2006, Jukka K. Korpela wrote:
> > >
> > > How is the reader expected to know whether italics is used in
> > > printed matter to indicate a defining occurrence, or to emphasize,
> > > or to indicate
> >
> > The reality is that, in general they do,
>
> I'm afraid that's wishful thinking. Anything that can be understood in
> two or more ways will be understood in the wrongest way.

This seems like a reason to provide a way to unambiguously mark such
spans of text, rather than requiring authors to use one element for all
these cases. That way, at least there is a way to disambiguate if
necessary.

> If browsers used _different_ default styling for <dfn>, <cite>, and
> <var>, the message would be much clearer, and authors might have been
> more interested in using such markup.

Authors can set different styles in a stylesheet.

> > <dfn> etc., give the potential for machine processing
>
> But it has not been used.

It's been used on millions of pages.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Monday, 27 March 2006 20:21:32 UTC