Re: [Selectors4] Semantic Pseudo Elements from Tab Atkins Jr. on 2011-05-11 (www-style@w3.org from May 2011)

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Wed, 11 May 2011 13:15:22 -0700
To: Christoph Päper <christoph.paeper@crissov.de>
Cc: W3C style mailing list <www-style@w3.org>
Message-ID: <BANLkTikw839oJCX1xZt3+_NN-Kmhm5AxjQ@mail.gmail.com>
On Wed, May 11, 2011 at 8:41 AM, Christoph Päper
<christoph.paeper@crissov.de> wrote:
> Tab Atkins Jr.:
>> On Tue, May 10, 2011 at 10:11 AM, Christoph Päper
>>> Tab Atkins Jr.:
>>
>> Yes, CSS was obviously designed with HTML in mind, and thus some types of other source languages don't mesh perfectly, such that you may want to invent tagnames for nodes in the CSS element-tree that don't perfectly match up with what the source language actually looks like.
>
> And that’s what seems utterly wrong to me.
>
> The Selectors3 spec currently says:
>
>  A type selector is the name of a document language element type …
>  …
>  ns|E  elements with name E in namespace ns …
>
>> You started talking about the source language not having the same notion of "start tags" and "end tags" as HTML, though, which is irrelevant - CSS has no notion of these things either.
>
> Yes, that I might have left unclear. What I meant was that the “name of a document language element type” is not always as obvious as Selectors makes it seem. Take this (probably hypothetical) snippet:
>
> * This is an *emphasized* example of a list item with a [[link]].
>
> Therein the asterisk ‘*’ hardly can be such a name, because it is used in two very different manners, and it is just as unclear whether the name for the link element type would be ‘[’, ‘[[’, ‘]’, ‘]]’, ‘[]’, ‘[[]]’ or something else entirely.

Yes, "*" is a bad tagname if you supported CSS on Markdown directly.
Not only is it used for three different elements (unordered list
items, emphasis, and horizontal rules), but it requires escaping in
selectors.  In general, I wouldn't use symbols for tagnames - those are
used in these types of markup languages for their brevity, not their
readability.  There's lots of other weirdness embedded in this too -
in Markdown, for example, a link can be specified with []() or with
<>.
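
As a rough sketch of the difference, assume a hypothetical UA that
styles Markdown natively.  No such UA exists, and the punctuation-based
tagnames in the first half are invented purely for illustration:

```css
/* Hypothetical: the UA exposes Markdown punctuation as element names. */
\* { font-style: italic; }   /* "*" has to be escaped, and it is ambiguous:
                                list item, emphasis, or horizontal rule? */
\[\] { color: blue; }        /* a link written as []() ... */
\<\> { color: blue; }        /* ... and the same kind of link written as <> */

/* Hypothetical: the same UA exposes descriptive names instead. */
li { margin-left: 1em; }
em { font-style: italic; }
hr { border-top: 1px solid; }
a  { color: blue; }
```

The second half needs no escaping and gives each of the three uses of
"*" its own selector.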


> You effectively say to put the burden on markup language creators to design a mapping to element names in a document tree which Selectors, type selectors specifically, can work on. That is unrealistic. It is not as unrealistic, though, that there is a quite limited set of semantic or stylistic keywords that pretty much every plain text markup language could easily be mapped to. Other languages, such as HTML, could be accessed with them, too, obviously.
>
> I don’t want to have Textile map “_foo_” to ‘emph’ and Markdown map the same to ‘i’, while HTML would use ‘em’. Instead I want a single token that implementers, authors and users can rely upon, e.g. in a common default stylesheet.
> Whether that token be ‘emphasis’, ‘:emphasis’, ‘::emphasis’, ‘@emphasis’ or something else is really a secondary point. I just believe that pseudo classes, i.e. one preceding colon, make the most sense, also given the link precedent.

That's for the markup languages and the implementors of the UAs that
natively support such markup languages to decide.  If such UAs arise,
and they support multiple such markup languages natively, I suspect
there will be decent pressure to stabilize a set of tagnames for
similar semantics.

Right now, there are no such UAs.  All such markup languages are
transpiled to HTML for display.  This sounds like a problem we can
avoid caring about until it actually appears.


>> Languages without obvious tagnames for their nodes are at a slight disadvantage here, but that's the price you pay for compactness, unfortunately.
>
> I proposed a solution to that, fortunately.

It's not clear to me how your solution helps.  Rather than remembering
that emphasis (denoted by surrounding text with "*") is selectable as
"em", you have to remember that it's selectable as an anonymous
element that matches ":em".  If multiple markup languages are
supported that all have spans of emphasis, they can all agree that the
tagname for it is "em" as easily as they can agree that they should be
matched by the ":em" pseudoclass.
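
To make the comparison concrete, here is a minimal sketch of the two
options side by side.  The ":em" pseudoclass is only the proposal under
discussion in this thread; it is not defined in any Selectors draft, so
a current UA would drop the second rule as invalid:

```css
/* Option 1: the markup languages agree on a tagname for emphasis. */
em { font-style: italic; }

/* Option 2 (hypothetical): they agree on a pseudoclass that matches
   an otherwise anonymous element. */
:em { font-style: italic; }
```

Either way, authors have one token to learn and the markup languages
have one name to agree on.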


>>>> keyword { color: blue; }
>>>
>>> Yes, that could work, although it probably works worse with namespaces.
>>
>> I don't understand what problem with namespaces you're referring to.
>
> Me neither. I guess I meant documents in a markup language where you would either want to highlight the source code or render the parsed tree, where “keyword” could mean very different things in them, although the namespace is the same.
>
> That means with
>
>  <keyword>foo</keyword>
>
> you would either get just a blue string “foo” or the two strings “keyword” marked blue, since semantically they’re keywords.

That's just a matter of specifying the interpretation.  If a UA
supports styling a view-source: url directly, it would know how to
construct a proper DOM for it such that you could style HTML/XML
source directly.


>> Why are source-language-specific pseudoclasses more generic than source-language-specific tagnames?
>
> The pseudo classes would be generic, their binding would be language-specific.
>
>> There's no such thing as a generic set of semantics
>
> To validate that assessment I suggested to study what text editors do today.

Your examples partially defeat you, though.  You talk about at least
two classes of markup you want to style: plaintext markup languages
like Markdown and Wikitext, and source code.  These have largely
disjoint sets of semantics.

Within markup languages there are also significant differences.
Markdown, for example, has three different kinds of links: the direct
link (with <>), the textual link (with []()), and the reference link
(with [][]).  Are these all the same thing, semantically?  Maybe.
Wikitext has tons of ways to link to something, producing internal
links, external links, footnote links, inclusions, etc.


>> Note that :link only exists in the first place for legacy reasons,
>
> It does not, as Øyvind Stenhaug already pointed out, and furthermore “a[href]” is not even the same as “a:link, a:visited”, because the former just requires the presence of the ‘href’ attribute, while the latter needs it to have a valid value, too.

Yes, I didn't realize that, in FF at least, :link won't match <a
href>s with invalid URLs.
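
A small sketch of the distinction, assuming the FF behavior described
above (other UAs may differ; the declarations themselves are arbitrary):

```css
/* Matches every <a> that carries an href attribute, whatever its value. */
a[href] { text-decoration: underline; }

/* Matches only the <a> elements the UA actually treats as links; under
   the FF behavior noted above, an href with an invalid URL is skipped. */
a:link, a:visited { text-decoration: underline; }
```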


> The least I hope to get from this thread is a note in an upcoming level of the Selectors module which says how to handle markup languages that don’t have obvious, inherent names for their element types.

That's a reasonable note.  I'd support that, though I'm not sure
Selectors is necessarily the right spec for it.  Maybe.

~TJ
Received on Wednesday, 11 May 2011 20:16:09 UTC