
Current HTML ruby markup usage

From: Philip Taylor <pjt47@cam.ac.uk>
Date: Tue, 27 May 2008 00:39:58 +0100
Message-ID: <483B4A4E.6080605@cam.ac.uk>
To: HTML WG <public-html@w3.org>

Given the recent work on adding ruby markup to HTML5, I thought I'd look
at how it's currently used (mainly, I hope, to confirm the data Hixie
based the design on). I've put some examples up at
<http://philip.html5.org/demos/html/ruby/wild-examples.html>, extracted
from every site that uses ruby elements in a random sample of 130K
pages from dmoz.org.

I haven't looked in any detail, but here are some simple observations:

* Ruby is used on about 0.3% of the 3286 .jp pages in my sample. (For
comparison, <acronym>, <sub> and <csaction> are each used on about
0.3% of all pages, not just .jp ones.) (I only have 315 .cn pages, so I
don't have enough data to compare.)

* End tags are often omitted. (Corollary: this is not XHTML.)

* <ruby>, <rb>, <rt> and <rp> are frequently used. <rbc> and <rtc> don't 
come up at all, but my sample is too small to know how rare they are.

* Ruby elements are sometimes used accidentally. (See 
http://www.bellabotanica.com/ - <rb></br>)

* <ruby> is always (in this sample) used whenever there are other 
intentionally-used ruby elements.

* The 'lang' attribute is not always to be trusted. (See 
http://za455.myweb.hinet.net/ - <span lang="EN-US" style="font-family: 
新細明體"><font size="4">僅</font></span>)
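
For anyone wanting to repeat this kind of count on their own sample,
here is a minimal sketch in Python. It is not the tool used to produce
the numbers above; the names RubyTagCounter, survey_page and
ruby_usage_rate are hypothetical, and it uses the standard library's
html.parser, which only reports tags as written and does no HTML5-style
error recovery. It assumes a page "uses ruby" only if it contains a
<ruby> start tag, and it treats a start tag without a matching end tag
as an omitted end tag.

```python
from collections import Counter
from html.parser import HTMLParser

# Ruby-related element names mentioned in the observations above.
RUBY_TAGS = {"ruby", "rb", "rt", "rp", "rbc", "rtc"}


class RubyTagCounter(HTMLParser):
    """Counts start and end tags of ruby-related elements (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.starts = Counter()
        self.ends = Counter()

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this.
        if tag in RUBY_TAGS:
            self.starts[tag] += 1

    def handle_endtag(self, tag):
        if tag in RUBY_TAGS:
            self.ends[tag] += 1


def survey_page(html):
    """Summarise ruby markup in one page given as a string."""
    parser = RubyTagCounter()
    parser.feed(html)
    parser.close()
    # Assumption: more start tags than end tags means end tags were omitted.
    omitted = {
        tag: parser.starts[tag] - parser.ends[tag]
        for tag in parser.starts
        if parser.starts[tag] > parser.ends[tag]
    }
    return {
        # Assumption: a stray <rb> without <ruby> does not count as ruby use.
        "uses_ruby": parser.starts["ruby"] > 0,
        "starts": dict(parser.starts),
        "omitted_end_tags": omitted,
    }


def ruby_usage_rate(pages):
    """Fraction of pages (an iterable of HTML strings) containing <ruby>."""
    pages = list(pages)
    if not pages:
        return 0.0
    return sum(survey_page(p)["uses_ruby"] for p in pages) / len(pages)
```

Feeding it a fragment such as <ruby><rb>漢字<rp>(<rt>かんじ<rp>)</ruby>
reports <rb>, <rt> and <rp> as having omitted end tags, while a page
containing only the accidental <rb></br> is not counted as using ruby.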

-- 
Philip Taylor
pjt47@cam.ac.uk
Received on Monday, 26 May 2008 23:40:37 UTC
