Re: [whatwg] <menuitem>: Issue reported by the web developers

On Mon, 08 Dec 2014 21:50:56 +0100, Simon Pieters <simonp@opera.com> wrote:

> On Thu, 27 Nov 2014 01:15:20 +0100, Ian Hickson <ian@hixie.ch> wrote:
>
>> On Wed, 26 Nov 2014, Simon Pieters wrote:
>>>
>>> - Make the end tag optional and have <menuitem>, <menu> and <hr>
>>> generate implied </menuitem> end tags. (Maybe other tags like <li> and
>>> <p> can also imply </menuitem>.) The label attribute would be honored if
>>> specified, otherwise use the textContent with leading and trailing
>>> whitespace trimmed.
>>>
>>> This would allow either syntax unless I'm missing something.
>>
>> That's another option, yeah. Probably the best so far if we can't just
>> power through and break the sites in question. It's not yet clear to me
>> how many sites we're talking about here and how possible it is to
>> evangelise them.
>
> In httparchive
> http://bigqueri.es/t/analyzing-html-css-and-javascript-response-bodies/442 :

FTR, the numbers were slightly wrong. I didn't count top-level pages, I  
counted resources (including e.g. iframes). Also there is a bug in the  
data with duplicate entries for some pages  
(https://twitter.com/zcorpan/status/542363458671747072 ).

> * 10101 pages use <menuitem>

8929 pages use <menuitem>

SELECT page, COUNT(*) as num
 FROM [httparchive:runs.2014_08_15_requests_body]
WHERE mimeType CONTAINS "html"
   AND REGEXP_MATCH(LOWER(body), r'<menuitem\s')
GROUP BY page
ORDER BY num desc
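
To illustrate the difference between counting resources and counting pages, here is a minimal Python sketch of what the GROUP BY page step does. The rows and example.com-style URLs are made up for illustration and are not taken from the httparchive data; only the regular expression and the "html" MIME type filter come from the query above.

```python
import re
from collections import Counter

# Hypothetical sample rows: (page URL, MIME type, response body).
ROWS = [
    ("http://a.example/", "text/html", '<menu><menuitem label="x"></menuitem></menu>'),
    ("http://a.example/", "text/html", '<menuitem label="y">'),  # e.g. an iframe on the same page
    ("http://a.example/", "text/html", '<menuitem label="y">'),  # duplicate entry in the data
    ("http://b.example/", "text/html", "<p>no menu here</p>"),
    ("http://c.example/", "text/css", "/* <menuitem label> */"),
    ("http://d.example/", "text/html; charset=utf-8", "<MENUITEM\nlabel=z>"),
]

# Same pattern as the query: "<menuitem" followed by whitespace.
MENUITEM = re.compile(r"<menuitem\s")

def resources_per_page(rows):
    """Count matching HTML resources per page, like GROUP BY page."""
    counts = Counter()
    for page, mime_type, body in rows:
        if "html" in mime_type and MENUITEM.search(body.lower()):
            counts[page] += 1
    return counts

per_page = resources_per_page(ROWS)
matching_resources = sum(per_page.values())  # what the first numbers counted
matching_pages = len(per_page)               # what the corrected numbers count
print(matching_resources, matching_pages)
```

In this sample, one page contributes three matching resources, so counting resources overstates the number of pages.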

> * 39 have no label attribute
> * 0 have non-whitespace content
> * 15 have no end tag
>
> Based on this, it seems possible to keep it as a void element and only  
> use the label attribute.
>
>
> SELECT COUNT(*) as num,
>   CASE
>    WHEN REGEXP_MATCH(LOWER(body), r'<menuitem\s([^>]+\s)?label\s*=') THEN "label present"
>    ELSE "no label"
>   END as stat
>  FROM [httparchive:runs.2014_08_15_requests_body]
> WHERE mimeType CONTAINS "html"
>    AND REGEXP_MATCH(LOWER(body), r'<menuitem')
> GROUP BY stat
> ORDER BY num desc
>
> Row num stat 
> 1 10062 label present 
> 2 39 no label 

8900 pages have a label present (so 29 have no label).

SELECT page, COUNT(*) as num
 FROM [httparchive:runs.2014_08_15_requests_body]
WHERE mimeType CONTAINS "html"
   AND REGEXP_MATCH(LOWER(body), r'<menuitem\s([^>]+\s)?label\s*=')
GROUP BY page
ORDER BY num desc
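
As a rough check of what the label pattern accepts, the same regular expression can be tried in Python on a few hand-written snippets. The sample markup and the has_label helper are hypothetical; Python's re module is assumed to treat this particular pattern the same way REGEXP_MATCH does.

```python
import re

# Pattern copied from the query: "<menuitem", whitespace, optionally other
# attributes ending in whitespace, then "label" followed by "=".
LABEL = re.compile(r"<menuitem\s([^>]+\s)?label\s*=")

def has_label(body):
    """Return True if the lowercased body matches the label pattern."""
    return LABEL.search(body.lower()) is not None

# Hypothetical snippets, not taken from the httparchive data.
samples = [
    '<menuitem label="Save">',
    '<MENUITEM type="checkbox" LABEL = "Wrap">',
    '<menuitem type="checkbox">',
    '<menuitem data-label="x">',
    '<menuitem label="a"><menuitem>',
]
for markup in samples:
    print(has_label(markup), markup)
```

The match is made per response body, not per element, so a body containing both a labelled and an unlabelled <menuitem> is reported as having a label.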

>
> SELECT COUNT(*) as num,
>   CASE
>    WHEN REGEXP_MATCH(LOWER(body), r'<menuitem[^>]*>(\s*[^<]+)+\s*</menuitem>') THEN "has content"
>    ELSE "no content"
>   END as stat
>  FROM [httparchive:runs.2014_08_15_requests_body]
> WHERE mimeType CONTAINS "html"
>    AND REGEXP_MATCH(LOWER(body), r'<menuitem')
> GROUP BY stat
> ORDER BY num desc
>
> Row num stat 
> 1 10101 no content 
>
>
> SELECT COUNT(*) as num,
>   CASE
>    WHEN REGEXP_MATCH(LOWER(body), r'</menuitem>') THEN "end tag"
>    ELSE "no end tag"
>   END as stat
>  FROM [httparchive:runs.2014_08_15_requests_body]
> WHERE mimeType CONTAINS "html"
>    AND REGEXP_MATCH(LOWER(body), r'<menuitem')
> GROUP BY stat
> ORDER BY num desc
>
> Row num stat 
> 1 10086 end tag 
> 2 15 no end tag 
>


-- 
Simon Pieters
Opera Software

Received on Tuesday, 9 December 2014 20:59:31 UTC