
[whatwg] Deprecating <small>, <b> ?

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Fri, 14 Nov 2008 10:59:53 -0600
Message-ID: <dd0fbad0811140859t602d3a81j24bead092c2b2ccd@mail.gmail.com>
On Fri, Nov 14, 2008 at 10:44 AM, Pentasis <pentasis at lavabit.com> wrote:

>  >>>If we wish to communicate that level of semantics, yes.  It may not be
> useful to us.  If you *really* need some metadata/semantics, @class probably
> can't convey it with enough granularity.  Check out the big discussion from
> a few months ago about ccRel and RDFa.
>
>
> Not yet maybe, but we could at least try to keep options open for the
> future.
>

Of course, but I don't think having <small> in the language closes any
options.


> >>Second: Suppose I want to collect all copyright notices from 1000
> websites (don't ask me why, I just want to), how am I to do this when they
> are marked up in <small>s? I will definitely end up with a lot of text that
> has nothing to do with copyrights (and probably miss a lot of copyright
> notices as they are marked up differently). Whereas if they were marked up in
> (for example) <span class="copyright"> I could retrieve it all based on the
> class-name.
>
> >>>That would be a wonderful perfect world.  I'd like the copyright date as
> well, so I can retrieve only things copyrighted in the last ten years.
> Assuming that metadata will exist is a fool's errand.  The fact is that if
> you are searching for copyright notices, the most efficient way is likely to
> just search for the string "copyright" and the (c) symbol.  That'll net you
> copyright notices with a high accuracy, and some training on real data can
> yield further rules to improve the data-mining accuracy.
>
> You say it yourself, only in a perfect world where all websites in the
> world would be written in the same language would your "solution" work.
> Unfortunately I would miss out on all the Chinese copyright stuff.
>

Of course.  But would you expect Chinese speakers to use class="copyright"
on their pages anyway?
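
For illustration, the string-search approach mentioned in the quoted text could
be sketched roughly as follows. This is only a minimal sketch: the pattern, the
function name `find_copyright_lines`, and the sample text are assumptions made
for the example, not anything proposed in this thread.

```python
import re

# Hypothetical heuristic: flag lines that mention the word "copyright",
# the (c) stand-in, or the copyright sign. English-only by construction.
COPYRIGHT_HINT = re.compile(r"copyright|©|\(c\)", re.IGNORECASE)


def find_copyright_lines(text):
    """Return the stripped lines of `text` that match the heuristic."""
    return [
        line.strip()
        for line in text.splitlines()
        if COPYRIGHT_HINT.search(line)
    ]


# Made-up sample input, including one non-English notice.
sample = """Welcome to the site.
Copyright 2008 Example Corp.
(c) 2007 Someone Else
版权所有 2008
"""

matches = find_copyright_lines(sample)
```

With this sample, the two English notices match and the Chinese one does not,
which is the limitation raised above. The "(c)" alternative would also match
unrelated text such as a lettered list item, so real use would need the kind of
training on real data the quoted text mentions.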


> But another example (based on "siemens") wouldn't it be nice if I could
> tell Google I am looking for a person named "Siemens" so it would ignore the
> "brand"-name?
>

Certainly.  But at this point you're expecting authors to mark up their
pages with metadata every time they mention someone's name.  The use of <b>
doesn't prevent this, but your use-case certainly requires quite a lot more.


> >>>While we're hoping for copyright notices to be marked up as <span
> class="copyright">, though, why not wish for <small class="copyright">?  If
> you're going to be providing metadata, it works the same.  Is it that you
> believe people won't provide a special class for copyrights if the <small>
> tag already gives them the preferred display?  Do you believe that everyone
> will automatically use class="copyright" to mark up their copyright
> notices?  What if they use class="copyright-notice"?  Or class="license"?
> Or any of a million other distinct possibilities that would destroy any
> naive attempt to datamine based on a particular class name?
>
> Well, that would have to be defined in the standard, wouldn't it? I'm not
> saying -again- it should be defined NOW, but at least leave the door open.
> I have no problems with using small over span, neither one is correct as
> far as I can see, in this context. Using "copyright" instead of "license" or
> "copyright-notice" would have to be defined somewhere, either in the
> standard or in an externally maintained "document" that is accepted as "best
> practice" or "standards related".
>

Okay, then we have no issue with <small>.  There has been some discussion,
btw, about standardizing a set of normative class names.  You should be able
to turn something up about it.
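
To make the class-name problem from the quoted text concrete, here is a rough
sketch of a naive class-based collector using Python's standard `html.parser`.
The class `ClassTextCollector`, the helper `collect`, and the sample markup are
hypothetical, chosen only for the example.

```python
from html.parser import HTMLParser


class ClassTextCollector(HTMLParser):
    """Collect text inside elements whose class list contains `wanted`.

    Simplification: assumes matching elements contain no void elements
    written without a closing slash (e.g. a bare <br>), which would
    throw off the depth counter.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.depth = 0      # > 0 while inside a matching element
        self.found = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.wanted in classes:
            self.depth += 1
            if self.depth == 1:
                self.found.append("")   # start a new collected string

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.found[-1] += data


def collect(html, wanted):
    """Return the text of every element carrying the class `wanted`."""
    parser = ClassTextCollector(wanted)
    parser.feed(html)
    parser.close()
    return parser.found


# Made-up markup: three notices, three different class names.
page = (
    '<small class="copyright">(c) 2008 A</small>'
    '<span class="copyright-notice">(c) 2008 B</span>'
    '<small class="license">(c) 2008 C</small>'
)

notices = collect(page, "copyright")
```

Searching for the single class name "copyright" finds only the first notice
here; the other two are missed unless the class name is agreed on somewhere.
Note that the element name (<small> versus <span>) makes no difference to the
lookup.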

> PS: I find it very difficult to respond to rich-text/html messages as they
> seriously mess up the indentation. Sorry therefore if this message is unclear
> as original message and reply are mixed up.
>

No problem; it was clear enough.  The only richtext I use is quote levels,
and with the conversation context nearby anyway, it's not difficult to
puzzle out when it occasionally messes up.

~TJ
Received on Friday, 14 November 2008 08:59:53 UTC
