[Bug 29483] New: metatags already too redundant

https://www.w3.org/Bugs/Public/show_bug.cgi?id=29483

            Bug ID: 29483
           Summary: metatags already too redundant
           Product: HTML WG
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: CR HTML5 spec
          Assignee: robin@w3.org
          Reporter: Nick_Levinson@yahoo.com
        QA Contact: public-html-bugzilla@w3.org
                CC: public-html-admin@w3.org
  Target Milestone: ---

Redundancy is getting out of hand despite the instruction at WHATWG
MetaExtensions to reuse what's roughly close enough. To support different
parsers and avoid being misunderstood because a tag was omitted, my new
website (cold32.com) on a recent day had the following meta tag values on a
typical page (index.html):

-- nine naming me as author, creator, designer, or publisher: author;
article:author; web_author; creator; dc.creator; dcterms.creator; designer;
dc.publisher; dcterms.publisher (the roles can differ substantially, but that
still leaves three for author, three for creator, and two for publisher)

-- five copyright statements: rights; dcterms.rights; dcterms.rightsHolder;
dc.dateCopyrighted; dcterms.dateCopyrighted (dcterms.rightsHolder would be
useful if rights had been transferred, such as to a literary agent; without
such a transfer, omitting this tag is still potentially problematic if a parser
would understand the omission to mean there is no rights holder)
(dc.dateCopyrighted and dcterms.dateCopyrighted could presumably have a year
only, but when a copyright stretches over multiple years a distinction has to
be made between revisions and balance, which the main machine-readable
copyright notice already does)

-- five for description, all identical: description; dc.description;
dcterms.description; twitter:description; og:description

-- five for page title, four identical and one nearly so: dc.title;
dcterms.title; twitter:title; og:title; application-name

-- four for coverage by space and/or time: dcterms.coverage; dcterms.spatial;
dc.temporal; dcterms.temporal

-- four for date of first appearance: created; dc.created; dcterms.created;
article:published_time (including a specific time would be rare and this last
is explicitly date-time)

-- three for date of last modification: dc.modified; dcterms.modified;
article:modified_time (this last is explicitly date-time)

-- three that say this is a "website": dc.type; dcterms.type; og:type

-- three for language: dc.language; dcterms.language; og:locale (it's possible
the last was meant for markup applied in Russia to a page written in Chinese,
but since the spec calls for a language code it seems unlikely a literal place
is what they'd want to find)

-- three for an icon, two meta tags plus one link tag: twitter:image; og:image
(meta thumbnail arguably should be different); both redundant with link icon

-- two for audience type: audience; dcterms.audience

-- two short URLs (protocol-to-TLD only), one meta tag plus one link tag: meta
msapplication-starturl is redundant with link shortlink

-- two page URLs, one meta tag plus one link tag: meta og:url is redundant with
link canonical

-- two for main color: theme-color; msapplication-navbutton-color (these may
differ, but it is not clear whether the latter can be omitted)

While we might think dc.* and dcterms.* would not both be needed on the same
page, a claim that one replaces the other does not seem clearly supported by
DCMI, so I assume parsers expect both and I have to supply both.

And there are more metatags that aren't even registered in HTML5 or
MetaExtensions but seem to be in use for scholarly works.

I'm treating the name and property attributes in meta tags as interchangeable.
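
To make that counting concrete, here is a minimal sketch of how such
redundancy could be detected. It treats name and property as interchangeable,
as I do above. The class and function names (MetaCollector, group_by_content)
are made up for illustration, and it assumes a key appears at most once per
page; it is not how any particular parser works.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect meta content values keyed by the name or property attribute."""

    def __init__(self):
        super().__init__()
        self.values = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Assumption: name and property are treated as interchangeable keys.
        key = attributes.get("name") or attributes.get("property")
        content = attributes.get("content")
        if key and content is not None:
            self.values[key] = content


def group_by_content(html):
    """Return {content: [keys]} for every content value used by 2+ meta keys."""
    collector = MetaCollector()
    collector.feed(html)
    groups = defaultdict(list)
    for key, content in collector.values.items():
        groups[content].append(key)
    return {content: keys for content, keys in groups.items() if len(keys) > 1}
```

Run against a page like my index.html, each group with more than one key is a
candidate for the kind of redundancy listed above.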

This doesn't consider microdata, which doesn't use meta tags but does overlap
with them.

I predict redundancy will generally get worse, as more websites grow large,
decide to throw their weight around, and create their own branded,
comprehensive meta tag systems that don't leave gaps for what's already
available.

Suggestion: Require in HTML5 that if an attribute's value is absent and a
certain other value is present in the same file then interpret the metadata as
if both were present and identical. The attribute values to which this applies
would be enumerated, probably in WHATWG MetaExtensions in an additional column
(this is not legacy synonymy, so no existing column would fit).
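
A rough sketch of how a parser might apply the suggested rule, assuming a
fallback table of the kind that extra column would provide. The FALLBACKS
pairings and the resolve function are hypothetical, chosen from the tags listed
above only to illustrate; nothing like this table is currently registered.

```python
# Hypothetical fallback table: absent key -> keys whose value may stand in.
# These pairings are illustrative assumptions, not a registered mapping.
FALLBACKS = {
    "dc.description": ["description"],
    "dcterms.description": ["description"],
    "og:description": ["description"],
    "twitter:description": ["description"],
    "dcterms.creator": ["dc.creator", "creator"],
    "dcterms.rights": ["rights"],
}


def resolve(present, fallbacks=FALLBACKS):
    """Return a copy of `present` with absent keys filled from their fallbacks.

    Values the author supplied explicitly are never overwritten, and only
    values actually present in the file are used as sources (no chaining).
    """
    resolved = dict(present)
    for absent_key, sources in fallbacks.items():
        if absent_key in resolved:
            continue
        for source in sources:
            if source in present:
                resolved[absent_key] = present[source]
                break
    return resolved
```

With such a rule, a page carrying only description would be read as if
dc.description, dcterms.description, og:description, and twitter:description
were also present with the same value, while any tag the author did supply
would keep its own value.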

Received on Friday, 19 February 2016 02:58:42 UTC