- From: Eduard Pascual <herenvardo@gmail.com>
- Date: Thu, 6 Nov 2008 00:32:57 +0000
First of all, I want to apologize. I'm quite afraid that the explosion of frustration and disappointment in my last message to this list was one of the triggers (if not the only or main one) igniting the conflict here. I'm really sorry for that: my only intention when I joined this list was to contribute to making HTML5 the best it can be, for web users, content authors, browser and tool implementers, and all other affected parties; and that message didn't help this intention at all.

Before going on, I want to make clear that most of what I express in my mails is an opinion or point of view. I, just like anyone else, may be wrong at any time; and I'm more than willing to accept that I am when I'm shown some solid evidence or argument proving so. If any of you thinks I'm wrong, about whatever I might say or have said, just let me know, and I'll listen to (or read) you.

Yet the source of my frustration wasn't being proven wrong or finding disagreement on some or many suggestions; it was the fact that most of the threads I tried to participate in ended up being ignored or drifting into unrelated side-topics. I'm quite willing to make one more effort to assume good faith, and to assume that my messages went unnoticed, or that I failed to express myself so miserably that nobody was able to get my point, or that they fell into oblivion for any other unintentional reason; rather than thinking that I'm being deliberately ignored. (I might add as well that a cold shower really helps with that :P ).

Now, let me add that my opinions are not based on just impulsive thoughts; but on over five years of professional experience as a webmaster, web application developer, and SEO specialist; plus roughly five more as a hobbyist "web tinkerer" (i.e. setting up dynamic websites just for the sake of it, with no other purpose than seeing how far I could go before messing things up).
I've also been into programming since I was 8; so even if I don't write browsers and authoring tools myself, I normally have a fairly good idea of how easy or hard something could be to implement. I know perfectly well that some people here will have more and deeper experience than me; yet I still think that I have enough background to at least contribute something useful.

My first opinion on the current discussion is that there is too much going on. If just an off-topic side comment has been enough to lead some entire discussions astray, I don't think it is possible at all to have a rational and/or useful discussion about so many topics at once. I'd like to (briefly) reply to some of the comments posted in this discussion, and even add a few of my own; but if somebody feels anything I say here is worth replying to, then it's probably worth updating the subject to something relevant, splitting this thread into more focused discussions.

To Pentasis's comment "Who ever said that the standards are here for browsers?" and all the replies to it:

There are several facts here to keep in mind. First of all, there is a significant group among web authors (I can't say how big it is, relative to all web authors) who simply don't trust browser vendors. And there are quite good reasons for that:

Microsoft and Netscape literally *negotiated* HTML3.2, with horrible consequences that we are still suffering over a decade later.

Microsoft has single-handedly boycotted the proper adoption of XHTML1 with IE's obnoxious treatment of the "application/xhtml+xml" MIME type (and, whether you like it or not, there are some cases where draconian error handling is a feature rather than a drawback, in addition to the extensibility mechanisms XML offers). There has never been an HTML vs. XHTML debate: IE took the choice away, forcing those who tried to use XHTML to put in a lot of extra effort, and it's quite obvious that the affected authors didn't like that too much.
Microsoft has also stalled the evolution of the Web, by taking over a decade to implement CSS2 decently. All those authors who were eager to make the most of the new CSS when it was published are quite pissed off by the fact that a single vendor denied them all these new features, for no other reason than laziness and the convenience of having the market under a monopoly.

These are the most obvious and sound reasons; and I hope people from other sectors can get an idea from them of why there are so many developers who are quite skeptical towards browser vendors' choices and claims.

Now, we have to add the second fact to the mix: the WHATWG is mostly made up of representatives of browser makers (although Google has recently stepped into the sector with its Chrome browser, I think Ian shouldn't be looked at as another browser maker representative; still, we should keep in mind that it's up to them to replace him with another editor if they feel like it). Top it off with the extra fact that the spec is copyrighted by three of these browser vendors. For those of us who don't trust browser vendors this is, in the best case, scary.

Hence, don't expect web authors (at least this subset of them) to blindly trust the vendor-centric WHATWG (by vendor-centric I simply mean that it's composed of browser vendors at its core) from the beginning. By joining this list, providing our feedback, and sharing our opinions and points of view, we are giving the group a chance to earn that trust. How the group uses that chance is up to its members.

Of course, we are quite aware that browser vendors have the final say on what they implement; or at least they think so... But content authors have the final say on what we use to mark up our documents; and currently we do have a choice: it is possible to render XHTML2 pages perfectly in all currently used browsers (although IE up to 7 can be a bit tricky, due to the lack of CSS2 support).
Someone (I won't name that person, because it was a private conversation) told me that if HTML5 didn't meet the requirements of browsers, in the end browsers would implement something else; the same way XHTML2 didn't meet those requirements and browsers aren't going to implement it (actually, by implementing XML, namespaces, and XSLT, they are already implementing enough of XHTML2, but that's a separate point). The same reasoning applies to content authors: if HTML5 doesn't meet authoring requirements, authors will end up authoring with something else. Actually, to give a specific example, there is only one issue that keeps me from using XHTML2 for my current website project (and it's completely unrelated to browsers); and it would currently be the best option: it serves my needs better than XHTML1.x or HTML4.x; and HTML5 isn't mature enough yet to even be taken into consideration for that site. However, the main reason why I joined the lists is that I think HTML5 has the potential to beat (read: become much better than) XHTML2.

Enough of that; let's go to the next point:

On Wed, Nov 5, 2008 at 10:11 AM, Henri Sivonen <hsivonen at iki.fi> wrote:
> On Nov 5, 2008, at 10:46, Pentasis wrote:
>
>> <var> is the best example I think. Why <var> but not <function> <operator>
>> <operand> etc. etc. etc.? And if code gets this attention why not language?
>> (<verb>, <noun> etc. etc.) If we do it like that it would never work.
>
> <var>, <cite> and <dfn> (and, one might argue, <em>) are legacy elements
> flowing out of a desire to replace <i> with something "semantic".
>
> Since the elements are part of the HTML legacy, there isn't a great
> rationale that would justify their inclusion today if they had never been in
> HTML and were proposed as new elements now.

This makes me wonder: is the backwards compatibility topic being dealt with appropriately? For example, why keep <var> (and others), but drop <big>? Why not keep <font> as well?
It is part of the HTML legacy, after all, and quite a large part if you look at the markup of currently existing documents (I'd bet that it's among the three most used elements on the current web, sharing the podium with <p> and <a>, but I can't say for sure).

I think following HTML4's and XHTML1's approach and having Transitional and Strict flavors wouldn't be a bad idea (I don't know if the Frameset one would still be needed: <table> + <iframe>, or even <iframe> + CSS's display: table-cell, seems quite a bit cleaner and more flexible, and doesn't require authors to use two separate content models for similar stuff). Browsers wouldn't need to care at all when using their "tag-soup" parsers; and it would be just a matter of feeding one DTD/schema or the other when using an XML parser. It would allow separating "obsolete" stuff that is only kept for backwards compatibility from really structural stuff.

The impact of doing this (ordered from the worst to the best side effects):

Validator implementers would face quite a lot of extra work, since they'd need to validate different kinds of document (namely, "Transitional soup", "Strict soup", "Transitional XML", and "Strict XML").

Spec writers would have to properly define the flavors. At this early stage, it could be enough to mark the appropriate stuff in the spec as "transitional", and then write the DTDs or similar formalizations once the content model becomes stable enough.

Authoring tool implementers wouldn't face too many issues: if they already make a distinction between HTML4's / XHTML1's flavors, reusing most of the code should be quite doable. As for those that don't separate flavors, simply don't expect them to start now :P.
Browser implementers would be only trivially affected: they'd just need to incorporate both flavors of the DTD/schema for XML parsing, and at most add a bit of logic to ensure the appropriate one is fed to the parser (but browsers are supposed to already be doing this when they deal with XHTML1, so it shouldn't be an issue).

Authors who choose Strict doctypes would enjoy a succinct, efficient, and non-bloated language allowing them to concisely and consistently mark up their documents.

As soon as different kinds of UAs (including browsers, assistive technologies, and search engines, among others) become aware (read: are updated) of the new markup stuff, users will enjoy a wide variety of benefits. Better bookmarking, smarter hints by assistive technologies, and more representative snippets in SERPs are the first ones that come to my mind.

Next point:

On Wed, Nov 5, 2008 at 10:22 AM, Markus Ernst <derernst at gmx.ch> wrote:
> Pentasis wrote:
>> [...]
>> First of all, I want to make it absolutely clear that these ideas are
>> strictly dealing with context and semantics. I do not wish to interfere in
>> the technical part of the spec. I do understand that sometimes there are
>> ideas that may involve technical solutions. My first and foremost concern is
>> about having a specification that deals with the naming of elements and
>> their usage in such a way that this would give us a standard which will
>> enable us to markup content consistently and flexibly without ambiguity,
>> and which is flexible enough to act on-the-fly (so we don't have to wait for
>> the next version of the spec if something is missing). [...]
>
> If I understand you correctly, you suggest a very basic set of structural
> elements, which are to be flexibly qualified by the authors via the class
> attribute. The composition of that set should follow some kind of basic
> language logic.
>
> If I understand HTML correctly, it provides a limited set of pre-qualified
> elements, some of them with a more structural emphasis, some of them with a
> more semantic (or even presentational) one. The composition of that set
> does not follow a higher logic, but the everyday needs of the common web
> author (or what the writers of the spec assume this is).
>
> (I hope this is understandable; I am not a native English speaker, either.)
>
> So, supposed I got these both correctly, you do not really talk about HTML,
> but about an alternative approach of marking up text documents. I
> personally find thinking about alternative approaches very interesting and
> useful for opening up one's mind.

Actually, I think that what Pentasis is talking about is nothing other than HTML in its earliest and purest form, untainted by the side effects of the browser wars and the mistakes of the past. Although we can't undo past mistakes, we can learn from them, and put some effort into fixing them.

Initially, HTML was entirely structural: no presentation, and no semantics. Just paragraphs, headings, anchors, and a few other things. With HTML3.2, there was an attempt to make HTML presentational, and it failed resoundingly. It was acknowledged as a mistake, and HTML4 (plus CSS) put a good deal of work into fixing it: presentational stuff went out (more precisely, was "deprecated"), and presentation was delegated to a separate language (CSS). HTML only kept @class for hooking to external information, and @style for when embedding was more appropriate. Then, to make sure no one was left out, a Strict flavor of the language was published, keeping it "pure", and a Transitional one, keeping all the deprecated stuff in it to ease transition, and to enable document-level backwards compatibility. I hope we all agree this was a good solution and that it worked; but if somebody doesn't, please let me know.
(It's true that shortly after came XHTML1, adding quite a bit of confusion to the scene, but that's a separate topic).

So, if it worked, why not reuse that approach? Why do we need to go through the same mistakes again? OK, that's an easy one: we need to 'cause we are human :P. Jokes aside; am I really the only one here who sees this as exactly the same thing!? Let me try to make it even clearer: after the 3.2 disaster, it was found that: (1) presentational markup didn't do enough to properly control the presentation of webpages; and (2) presentational markup clashed so often with structural markup that the markup itself was no longer reliable for inferring the structure of a document: either structure was sacrificed in favor of presentation, or presentation was sacrificed for structure.

Now, Pentasis's initial posts were pointing out a fact: semantic markup doesn't do enough to properly describe the semantics of webpages. I had already posted some comments and even a few examples showing how semantics and structure can often clash, requiring one to be tweaked to achieve the other. Doesn't that sound familiar? Can't we simply apply an equivalent solution to the one we used for an equivalent problem ten years ago?

Before going on, here are some of the simplest examples I posted a couple of months ago about this issue:

<nav> is the only facility in the spec right now to describe "navigation" semantics; but it also implies a "section" structure: hence there is no means to express "navigation" semantics for something that isn't structurally a "section" (for example, headings of the recent changes to a site on the site's main page, linked to the relevant sections, are very much "navigation" stuff, but they are definitely not sections).

Similarly, there is no way to mark something as "tangentially related" without making it a "section" (with the <aside> element).
And, for example, what about something that's both "navigation" and "tangentially related" (regardless of whether it is a section or not)? For example, a list of "see also" stuff on a documentation page: you would be forced to mark it up as <<a "navigation" section inside an "aside" section>> or as <<an "aside" section inside a "navigation" section>>: neither reflects the real structure of the page; but they are the only ways to represent both semantics.

I know these examples are really simple, and the workarounds wouldn't really hurt that much; but they should be enough to show how we are stepping into the same issues with semantics that we did over a decade ago with presentation. Do we really have to wait to be hurt by the issue before solving it, when we can see it approaching so clearly? I don't know about you, but I know I am *not* a masochist, so I don't really want to get hurt.

Now, to get more specific, we'd need:

1) Some (external to HTML) way to describe semantics. (And no, I don't think RDF, in its current form, is a solution for this; but maybe the solution could be based on or inspired by RDF.) That should be to semantics what CSS is to presentation. And we don't really need to care about browsers quickly implementing it, or about legacy browsers that don't implement it, because currently browsers don't care at all about semantics (at least, not beyond displaying @title values and default rendering, and rendering can be dealt with through CSS anyway).

2) A way to hook these external semantics to arbitrary elements of a page: we already have @class for this :D

3) A way to add inline semantics when needed. I guess a "semantics" attribute would be the most straightforward approach. As for the format it uses, we should worry about that once we have solved 1).
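
To make points 1) to 3) a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the SEMANTICS_SHEET mapping merely stands in for the external semantics language (which doesn't exist yet), the class names and role names are hypothetical, and the "semantics" attribute is only the suggestion from point 3), not anything in the spec. It only tries to show that a "see also" list could carry both "navigation" and "tangentially related" semantics without being wrapped in <nav> or <aside>.

```python
from html.parser import HTMLParser

# Hypothetical external "semantics sheet" (point 1): maps class names to
# semantic roles, much like a stylesheet maps selectors to presentation.
# The class names and role names are invented for this sketch.
SEMANTICS_SHEET = {
    "see-also": {"navigation", "tangentially-related"},
    "recent-changes": {"navigation"},
}


class SemanticsCollector(HTMLParser):
    """Collects the semantic roles attached to each start tag."""

    def __init__(self):
        super().__init__()
        self.found = []  # (tag, sorted list of roles), in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        roles = set()
        # Point 2: @class hooks the element to the external descriptions.
        for name in (attrs.get("class") or "").split():
            roles |= SEMANTICS_SHEET.get(name, set())
        # Point 3: a hypothetical inline "semantics" attribute.
        roles |= set((attrs.get("semantics") or "").split())
        if roles:
            self.found.append((tag, sorted(roles)))


def collect_semantics(markup):
    """Return the (tag, roles) pairs found in a markup string."""
    parser = SemanticsCollector()
    parser.feed(markup)
    parser.close()
    return parser.found


# Plain structural markup: a heading, a list, and a paragraph.
# None of them has to become a <nav> or <aside> section.
PAGE = """
<h2 class="recent-changes"><a href="#forms">Forms chapter updated</a></h2>
<ul class="see-also">
  <li><a href="/docs/tables">Tables</a></li>
</ul>
<p semantics="tangentially-related">A side remark.</p>
"""

for tag, roles in collect_semantics(PAGE):
    print(tag, roles)
```

The markup stays purely structural, and the semantics live either in the external mapping or in the inline attribute, in the same way presentation lives in a stylesheet or in @style.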
If we got that, then we could:

1) Get rid of all the "wannabe semantic" elements that didn't really work well enough, sending them to the deprecated/transitional/supported-for-backwards-compatibility-only limbo.

2) Get rid of all the *new* "wannabe semantic" elements that wouldn't really be serving any purpose (i.e. un-bloat the content model).

3) Have the simplest and cleanest markup, the most accurate presentation mechanisms, and the richest semantic descriptions of the last 10 (or even more) years, all in one package.

> I agree with you that there are many things in HTML that have a purely
> historic legitimation, such as the h1-h6 elements. <h level="n"> would be
> much more flexible. I personally often get mad about the IMO totally
> illogical set of form elements. I would highly appreciate such things to be
> cleaned up in a new HTML spec. But of course the task of those who design
> HTML5 is not to re-invent the wheel, but to evolve the existing HTML in a
> highly backwards-compatible way.

I have already mentioned what I think about the backwards-compatibility requirement, and the way it's being approached. Anyway, I think it's also worth pointing out the issue with headings: currently, the spec recommends using <h1> for all levels of headings, but that would make a real mess in current browsers. Hasn't anybody noticed that?

> I made the experience when I suggested a new set of form elements, that I
> did not get much response on those contributions. The same might happen to
> your suggestions, as they are on a more basic level, than the HTML5 works
> act on. I don't think you can blame the people working on HTML5 for this, as
> they are quite far in the process, and your suggestions do rather set new
> starting points, than contribute to the actual state of the work.

These are quite different cases: the main issue with form elements is that their functionality is normally hardcoded in the browser.
Pentasis's suggestions (and even my own) would only significantly affect the spec itself and validators; and maybe future "smart browsing" features that aren't yet implemented anyway.

Well, that's been a long enough message, and over 3 hours of typing and reviewing stuff are now asking me for a cigarette, so I'll post again soon with the "additional comments" I was planning to add. I want to remind you all that this message mostly reflects my point of view; and if someone disagrees I'm more than willing to pay attention to your arguments. Also, I think it'd be good to start branching stuff from here rather than keeping the multi-discussion on this thread.

Regards,
Eduard Pascual
Received on Wednesday, 5 November 2008 16:32:57 UTC