
[Bug 6774] <mark> element: restrict insertion by other servers

From: <bugzilla@wiggum.w3.org>
Date: Sat, 25 Apr 2009 06:41:44 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1Lxba4-0006qg-0T@wiggum.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=6774

--- Comment #5 from Nick Levinson <Nick_Levinson@yahoo.com>  2009-04-25 06:41:43 ---
The calendar example is fine. It doesn't worry me and that's for exactly the
reason you give.

The standard, however, goes further in its wording. What might have been agreed
upon verbally in meetings won't matter once the standard is promulgated as
reliable for designers of user agents. What the standard says is what matters.
(An analogy in U.S. law: the plain words of a statute are to be applied in
disposing of a case on its facts. Only if the plain words do not provide the
necessary guidance may legislative intent be examined, and only then may
legislative committee reports and pre-enactment floor debates be considered.
The original sponsors' hopes and expectations are therefore irrelevant until
the text is found to be unclear in context. That usually means that drafters'
intentions never get considered and only the official words matter.)

The words of section 4.6.7: "Another example of the mark element is
highlighting parts of a document that are matching some search string. If
someone looked at a document, and the server knew that the user was searching
for the word 'kitten', then the server might return the document with one
paragraph modified as follows: . . . . ." Insofar as the search string comes
only from a website's internal search function, including an on-site search
box supplied by Yahoo or Google, your interpretation that 4.6.7 is safe and
does not give third parties adverse permission is valid.

But 4.6.7 just talks of "some" search string, i.e., pretty much any search
string, and so there's no limitation that the search string must have been
created at the website owner's website. It could have been created at an
external search engine or anywhere else before the URL arrives at a destination
website for page retrieval. It could have been created without the user
realizing it. Most users have no idea what a search string is.
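
As an illustration of that concern, here is a minimal sketch in Python of how
a server could take a search string from a referring URL and mark matches in
its own page. The function names, the "q" parameter name, and the example URL
are assumptions made for illustration; nothing in the draft specifies them.

```python
import html
import re
from urllib.parse import urlparse, parse_qs


def search_string_from_referrer(referrer, param="q"):
    """Return the search string carried in a referring URL, or None.

    The parameter name "q" is an assumption; search engines differ.
    """
    query = parse_qs(urlparse(referrer).query)
    values = query.get(param)
    return values[0] if values else None


def mark_matches(text, term):
    """HTML-escape text and wrap each case-insensitive match of term in <mark>."""
    if not term:
        return html.escape(text)
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the unmatched text before this hit, then wrap the hit itself.
        parts.append(html.escape(text[last:match.start()]))
        parts.append("<mark>" + html.escape(match.group(0)) + "</mark>")
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


# Hypothetical referring URL from an external search engine.
term = search_string_from_referrer("http://search.example/results?q=kitten&page=1")
print(mark_matches("I love my kitten. Kittens are cute.", term))
```

In this sketch the search string is created off-site and arrives only in the
referring URL, which is the case the wording of 4.6.7 does not exclude.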

And "matching some search string" from unlimited points of creation is only
"[a]nother example" of mark tagging. It is not the limit. The limit is defined
earlier in the same section: "The mark element represents a run of text in one
document marked or highlighted for reference purposes, due to its relevance in
_another context_. . . . When used in the main prose of a document, it
indicates a part of the document that has been highlighted due to its _likely
relevance to the user's current activity_." (emphases added). Nothing there
limits <mark> to being based only on activity within the same website.
"[A]nother context" is any other context. The "user's current activity" has to
be known or the provision is meaningless, and the standard presumption is that
every provision has meaning until shown otherwise. So the "user's current
activity" has to be known in "another context", wherever that may be.

As to what kind of nefarious use third-party modification would support,
injection of advertising is likely to be the most common, with the ads not
very distinct, so that users think the website supplied them. A local,
professionally written newspaper article the other day reported that 5 of 12
results on the first page of Google results shared a certain characteristic.
The problem is that Google doesn't put 12 results on a page; it puts 10 and
maybe 2 ads. (You can get 12 if you opt for 20 or more results, but then you
wouldn't have a "first" page, you'd have only one page, so that's not what
happened.) So the reporter didn't know the difference between a result and an
ad, even though the engines label ads. Some days I have to angle my head while
using an LCD with Yahoo/Google search results to figure out whether a result is
really an ad, because the color differentiation has gotten fairly subtle. And I
know this stuff. Most users don't.

If you are right about the drafters' intent, they need to tighten their wording
in the standard. In that case, I'm not sure what <mark> would do that <span>
won't. So it appears that <mark> exists specifically for third-party use, which
means permission for third parties is part of the intent as well as implicit in
the wording. If it's not meant to be, rewriting is required and I favor it.

Without a third-party role, <mark> seems largely presentational. If <mark> is
meant only to be more easily recognized as presentational than <span>, the
standard can be rewritten to say that.

Thank you.

-- 
Nick


Received on Saturday, 25 April 2009 06:41:53 GMT
