Re: Proposing <indent> vs. <blockquote>

Mike Schinkel wrote:

> Benjamin Hawkes-Lewis wrote:

[snip]

>> Everyone here is technically minded compared to common authors,
>> 
> And that's exactly the situation I was lamenting and why I said their
>  concerns are under-represented here.

Since helping authoring tools is now within its remit, should not the
Working Group conduct some actual usability testing with ordinary people
from different constituencies and with various abilities (e.g. geeks who
aren't web professionals, technophobic newbies, political bloggers,
MySpace users, people with visual or mobile or learning disabilities) of
different authoring forms? It is impossible to settle the question of
whether there are better models for web authoring than WYSIWYG or text
editor authoring without first developing tools exploring such models.
But we could at least assess how easy people really find using HTML with
current WYSIWYG and text editor systems (and learn how to make both
easier and produce superior markup).

In addition, should not the Working Group conduct some actual usability
testing for each feature, or at least each new feature, in HTML5? I do
not believe that simply dumping features on the World Wide Web
constitutes usability testing of any meaningful sort, since HTML
features are filtered through crass educational systems (e.g. w3schools,
MSDN, the dire deadtree books people learn HTML from), broken user
agents (e.g. the sorry fate of <q>), broken CMS (e.g. generating invalid
and oft inaccessible HTML), and broken WYSIWYG authoring tools (e.g.
misuse of <strong> for the [B] (bold) button).

> But I do have anecdotal evidence that the languages that have been 
> easy to hand code have gained more rapid adoption.  RSS vs RDF, HTML 
> vs SGML, Visual Basic vs. C++, Python vs. Perl, for example.

I think the problem with this is that what may hold true for developers
programming does not necessarily hold true for ordinary people writing
web content. My experience of recommending text editor authoring over
WYSIWYG to everyone who asks on the basis that WYSIWYG tools are
fundamentally broken suggests that ordinary newbies generally prefer
WYSIWYG and consider text editor authoring to be scary "programming".
This even holds true with newbie visually impaired authors, who I'd have
thought would be a natural non-technical constituency for text editor
authoring. See for example:

http://www.gwmicro.com/Support/GW-Info_Archives/index.php?postID=10515

http://www.freelists.org/archives/nvda/04-2007/msg00215.html

> most authors don't care about accessibility because it is a lot more 
> "costly" to develop accessible content than not.  Most authors 
> implicitly make a cost-benefit decision and chose not to address it.

This remains true of many web professionals and people commissioning
small retail websites. I doubt it's an accurate characterization of the
"common" authors of social media. Web accessibility issues are a novelty
to most of the ordinary people I speak to. They don't know what screen
readers are, for instance; some express surprise that blind people can
use the web at all.

> <blockquote> misused is less accessible than <indent>

I don't think people's desire to format text should take priority over
everyman accessibility issues. If newbie authors need to understand one
thing above all others, it's that HTML is about what they mean, not WYSIWYG.

> But rather than continue this debate adnaseum maybe we should look at
> addressing this problem with more correct semantic markup?  Add a 
> @rel or @type attribute, or create a handful of new elements.

See my next post. :)

>> You ask "how does [presentational markup] reduce the ability to 
>> communicate" but then go on to suggest that we need presentational
>>  information in order to organize material for "visually for
>> maximum comprehension".
>> 
> That still seems consistent to me.

I suppose it depends what you mean by "organize". For example, people
can indent blocks of text to make a list or a block quotation. That's
visual organization that communicates. My suspicion is that <indent>
would widely be used in such a way; this is confirmed by an analysis of
your own misuse of <blockquote> (more on that in my next post).

>> Surely any case where you need to specify custom visual prompts to 
>> make HTML communicate points either to a paucity of semantics in 
>> HTML or to a failure in the default rendering by browsers.
>> 
> That sentence did not make sense to me. Was it incomplete?

No, but I'll rephrase. HTML is a carrier of ideas. Assuming you control
the styling of content, if you /need/ to apply special styles to some
HTML in order to communicate an idea, that implies either:

1. HTML doesn't have a semantic element you could use to express that
idea (e.g. TEI's <soCalled>), or,

2. HTML has such an element, but a browser doesn't render or expose it
in an understandable fashion (e.g. <q>).

>> Can you think of any examples where there is not the case? What 
>> could indent communicate that a semantic element with an effective 
>> default rendering could not?
>> 
> Yes, most of the time it would mean something, though not always, in 
> the case of someone just wanting the visual appeal.

I think supporting "visual appeal" within HTML itself makes the language
harder to use for both authors and readers. And anything that even
sometimes "mean[s] something" must /not/ be ignored by assistive technology.

> But then this would mean something too, no?
> 
> <p style="margin-left:40px;">foo bar</p>
> 
> So in essence you are arguing for elimination of inline styling, 
> because people may use it when then should have denoted it with some 
> semantic markup, right?

I regard inline styling as a detrimental practice. But I don't seek the
style attribute's deprecation from HTML. Ruby on Rails is successful
because it is "opinionated software": it makes good practices trivially
easy without making "bad" (but conceivably /useful/) practices
impossible; I believe that is a good model for HTML 5 to follow.

> I've got a heavy database background, and I've spent
> the past 20+ years trying to fit data into nicely defined fields but
> I've come to realize that, no matter how much analysis I do, there is
> always something that doesn't quite fit so it's better to create 
> catch-alls rather than trying to force everything to be a round peg.

The problem is that innovation has moved much faster than specifications
or even user agents. I'd prefer W3C provided demonstrable use-cases with
new (preferably semantic) elements /before/ said use-cases pollute the
web with misused markup. XHTML 2's roles module, the microformats
movement, and WHATWG's aim of making HTML5 forwards compatible are all
attempts to meet this need.

> Universal accessibility is a nice goal, but its not realistic when 
> you have most of the first world becoming content authors. You have 
> to do your best to optimize accessibility, but draconian methods to 
> enforce will just result in people going the virtual walls you built.

100% accessibility is an ideal, not a target, but I certainly believe
that mass accessibility is reconcilable with widespread authorship and a
sine qua non of mass authorship.

> And the misuse of <blockquote> is a perfect example. The irony is
> I'm arguing for an <indent> with no semantics (or something similar)
> for the exact reason you are arguing against it; to improve the
> quality of semantic markup on the web!

Semantic markup that common authors aren't going to use doesn't 
particularly interest me (I recognize TEI and MathML have their places 
in specialist circles, but we're talking HTML here.)

>> And if it wouldn't communicate anything, than why would we want 
>> common authors using it?
>> 
> You know, doesn't that sound a bit "Big Brother-ish?" ;-)

Not really. Big Brother was all about shutting down thought. I want to
prevent the free flow of ideas being hindered by styling misused to
express ideas in ways that everyman (i.e. almost every person) cannot
understand.

>> HTML isn't designed as a medium for visual self-expression, unlike 
>> Flash for example.
>> 
> It isn't?  I feel like I'm expressing myself every time I use HTML. 
> What am I missing?

But you're not /visually/ expressing yourself (to me) when I view your
blog in Lynx, fire it up in my mobile browser, apply a user stylesheet
to it in Firefox, or have Opera read it to me. And that's how HTML was
designed. By contrast, Flash was designed for animations.

>> Our odd culture of formatting HTML differently on every site and 
>> providing a hideous default presentation
>> 
> Our culture?  I'm confused.  Isn't it user agents that render, not 
> people?  Do user agents have culture? Do androids dream of electric 
> sheep?

I meant the (human) culture that creates styled content and
develops and chooses browsers to view it.

>> has been far more burdensome on ordinary authors than semantic 
>> markup has.
>> 
> Semantic markup has not been a burden because the authors do not see 
> it, so they ignore it.

The evidence is against the notion that ordinary authors don't use
semantic markup. Consider the front page of a popular American political
blog:

http://www.instapundit.com/

(archived: http://www.webcitation.org/5O7HX7ewx)

Okay, so lots of terrible markup here. Yet <p> and <blockquote> are
generally used correctly for paragraphs and block quotations.

Or again, have a look at the front page of a popular blog from the other
end of the spectrum:

http://www.dailykos.com/

(archived: http://www.webcitation.org/5O7HZaxTV)

Again, plenty of bad markup. But also many correct uses of <p>, <hX>,
<blockquote>, <ul>, and <li>.

If authors do not "see" this markup, it is because they choose WYSIWYG
tools over text editor authoring and do not see /any/ markup at all.

> A wiki is a perfect example of the kind of problem that occurs! Look 
> at Mediawiki's syntax; it's so complex now as to be overwhelming.

At least in the case of Wikipedia, that's largely because there's more
than one way to do any one thing. That's another reason not to introduce
<indent>.

> I've actually thought a lot about this, and if I had a platform to 
> make changes I would be advocating that we get rid of wiki syntax and
> just get everyone to understand HTML. That way they could learn it 
> once and be done with it. Of course HTML would need to become a bit
> easier to hand-author.

More like a /lot/ easier - starting with the need to encode ampersands.
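
To illustrate the burden, here is a minimal sketch in Python (purely
illustrative, not anything from the HTML5 draft). Every literal ampersand
an author types has to become &amp; before it is valid element content.
The helper name encode_text_for_html is made up for this example;
html.escape comes from Python's standard library.

```python
from html import escape


def encode_text_for_html(text: str) -> str:
    """Escape &, < and > so plain text is safe as HTML element content."""
    # quote=False leaves " and ' alone; they only matter inside attributes.
    return escape(text, quote=False)


print(encode_text_for_html("Fish & chips"))
```

A hand-author has to perform this substitution mentally on every
ampersand, including those inside URLs in href attributes.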

>>> Try getting (almost) any company to change their CMS because it 
>>> doesn't accomodate your needs
>>> 
>> If you can make a strong accessibility or general business case 
>> it's not impossible.
>> 
> But almost completely impractical for the average person to get the 
> company to change. 99.999% of people will just consider the source 
> and go about their business.
> 
>> If you can't make a strong accessibility or general business case, 
>> then your desires probably revolve around styling not 
>> communication.
>> 
> Huh?  The strong business case is empowering a lot more people to not
>  write incorrect semantic markup, but that is apples and oranges as 
> we were discussing the everyman's ability to get a company to change 
> on his request, which is completely unrealistic to expect on a 
> general basis.

I suspect the business case for accessibility is stronger than most
people realize. Also, by raising the spectre of litigation, the
tightening of accessibility laws is strengthening it still further.

But perhaps the way to combat individual web user powerlessness against
companies is to begin to organize collective action. I've been thinking
recently that many inaccessible sites remain inaccessible because the
people experiencing the problems don't have the technical skills to
identify and articulate problems and solutions, and wondering if
creating a bugzilla where reports could be refined before being mailed
to the webmaster would help. Even a template email where users report what
OS, browser, and assistive technology they are using and how what
happens differs from their expectations, and which provides links to WAI
and other resources, would help.
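
A rough sketch of what such a template might look like, written in
Python only for concreteness. The function name, field labels, and
sample values are all hypothetical; the only things taken from the
paragraph above are the kinds of information collected and the pointer
to WAI.

```python
def build_accessibility_report(page_url, os_name, browser,
                               assistive_tech, expected, actual):
    """Return a plain-text problem report a user could mail to a webmaster."""
    lines = [
        "Subject: Accessibility problem on " + page_url,
        "",
        "Operating system: " + os_name,
        "Browser: " + browser,
        "Assistive technology: " + assistive_tech,
        "",
        "What I expected to happen:",
        expected,
        "",
        "What actually happened:",
        actual,
        "",
        "Background on web accessibility: http://www.w3.org/WAI/",
    ]
    return "\n".join(lines)


report = build_accessibility_report(
    "http://example.com/checkout",          # sample values only
    "Windows XP", "Firefox 2", "a screen reader",
    "The form fields are announced with their labels.",
    "Every field is announced only as 'edit'.",
)
print(report)
```

The point of fixing the structure is that a non-technical user only has
to fill in blanks, while the recipient gets the details needed to
reproduce the problem.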

> Well, that's rather loaded. Any examples I give you you will just say
>  were "not serious." 

Sorry, it was both loaded and, judging by your examples which all seem
to be would-be WYSIWYGs, hopelessly unclear. By "serious" I meant tools
that consciously move away from WYSIWYG models without becoming mere
text editors. FrontPage, Dreamweaver, and friends basically embrace the
WYSIWYG model wholeheartedly. WYMEditor, by contrast, attempts a
(sort-of) semantic view comparable to the Mellel word processor,
although unfortunately retaining the [B] and [I] buttons that generate
<strong> and <em> (<shudder/>).

> On the flipside, give me ONE, just ONE HTML editor that can represent
> the full set of HTML w/o switching to a source..

How is representing "the full set of HTML", rather than the /relevant/
set of HTML, a useful goal?

> One of the biggest problems with the "TOOLS WILL SAVE THE WORLD" 
> mindset is that each tools is only available in a subset of the 
> contexts where HTML can be used, yet text editors are available in 
> 100% of the contexts where HTML will be used.  This is the 
> insurmountable reality that the toolset mentality can never address.
> 
> AND, you are ignoring the benefits of the "view source" effect. 
> Hand-authorable HTML is also a lot more readable.

I don't object to hand-authorable HTML at all, but only to the notion
that HTML is /best/ authored by hand. And because I think hand-authoring
HTML in a text editor is a technical task, I don't think the "view
source" effect has much relevance for common authors like bloggers and
forum posters who tend to choose WYSIWYG tools.

I think <indent> would be detrimental to source view. I find semantic
elements buried in presentational soup extremely difficult to read. It
gives me headaches trying to untangle the stuff.

>>> And I have yet to see a WYSIWYG GUI for HTML that doesn't have 
>>> significant limitations. Name one and I'll show you one that has
>>>  unacceptable limitations.
>>> 
>> 
>> I'm confused. As far as I'm concerned, what makes WYSIWIG 
>> inappropriate for HTML is that:
>> 
> I'm confused too because I was replying to your comment about 
> WYSIWYG.

Sorry, I was confused because I'm used to objecting to WYSIWYG for very
different reasons than you.

>> 1. HTML is about what you mean (content/semantics), not what you 
>> see (presentation).
>> 
> If that is actually completely true, then we should eliminate ALL 
> default presentational behavior from ALL elements (p, h1..hn, 
> table/tr/td, ol/ul/li, etc.)

That's a non sequitur. What I'd like to see is authors writing semantic
HTML and user agents rendering all features of that HTML in a consistent
(at least within a single user agent), beautiful, and understandable
manner by default. Ordinary authors wouldn't have to worry about CSS.
Web designers could continue to use CSS. I'd like to see W3C produce
suggested stylesheets containing default presentation for all elements.
Web designers might wish to use those stylesheets as a base, so they
don't have to specify everything from scratch. I want the pain taken out
of web authoring.

>> You seem to be want to turn HTML into a presentational language
>> 
> You seem to misread my intentions.  I don't want to turn it into a 
> pure presentational language any more than I want to pure semantic 
> language. I want to address most common use cases in a pragmatic way.
> 
>> (or at east provide presentational alternatives to semantic 
>> elements, which amounts to practically the same thing).
>> 
> Your assertion that they are the same does not make it true.

What I mean is that creating presentational elements that could be used
instead of semantic ones will leave the semantic elements largely unused
by "common" authors, so they might as well not exist.

>> WYSIWIG would seem to be at least a passable model for authoring in
>>  such a language, whereas hand-authoring went out with the Mac and 
>> Word for Windows.
>> 
> Hand authoring went out with the Mac and Word for Windows?  What 
> planet are you living on?

The one where my non-technical friends and relations author paper
documents in WYSIWYG word processors like Word for Windows and
OpenOffice.org Writer, not WordPerfect 5.1 for DOS. ODF is an essentially
presentational format with some accessibility features; WYSIWYG is a
broken model for it but still preferred by users over coding it by hand.

> And I am saying we first make it hand-authorable with a text editor, 
> THEN we build those tools. Doing otherwise, therein lies madness; 
> believing that we can use tools to hide too complex markup will lead 
> us down a sordid path from which we cannot recover.

No objections here but: my experience is that semantic markup produces
simpler markup every single time.

> * <blockquote> is misused and a semantics-free markup would be 
> beneficial

I agree with the first half (although I'd need to see evidence that it
is being misused in much new content), but doubt the second.

> * HTML should be pragmatic, not ideological

In the absence of concrete distinctions this is fluff, but I think I agree.

> Hand-authorability should be a goal.

No objections.

>> (Not necessarily for newbie authors, but that's not the same 
>> thing.)
>> 
> Can you define your distinction between common and newbie authors as 
> I believe many of the former are the latter?

Fair point: newbie authors are properly a subset of common authors. What
I was getting at is this: newbies may initially find writing with
semantics confusing, just as once people found writing with WYSIWYG
interfaces confusing. But once familiar, semantics make writing much
easier. As social media uptake spreads and semantics becomes as
culturally familiar as WYSIWYG, this obstacle will cease to
exist - unless we occlude semantics with broken WYSIWYG tools, which is
what is in fact happening.

>>> Anyway, I love <sic> the "encourage people to have better habits"
>>>  solution mindset.
>>> 
>> Well, when I say that I include empowering people to do the right 
>> thing by giving them the right tools for the job and taking away 
>> the features that lead them astray as a /necessary/ component: 
>> evangelization alone is not enough.
>> 
> But an abstinence/prohibition stance is not workable either.

The operating idea here is that people abuse markup because they want to
achieve presentational effects. While this is obviously commonplace, I
think it often points to an underlying desire to express a semantic to
which HTML authoring tools do not properly cater. If authoring tools had
dropdowns for "Foreign phrase..." (using <span lang="whatever">) and
"Book title" and "Movie title" (using <span> with hCite), I think a lot
of <em> abuse would disappear. If tools offered even more functionality
around such features, such as reading your text with the correct
pronunciation or inserting (say) Amazon and IMDB links automatically,
then abuse of <em> might disappear from new content almost entirely.
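
As a minimal sketch of the dropdown idea, in Python for illustration
only: each menu entry maps to a small function that wraps the selected
text in the appropriate markup. The function names are invented, and the
class names used for titles are placeholders, NOT taken from the hCite
draft; a real tool would emit whatever the relevant microformat
specifies.

```python
from html import escape

# Placeholder class names for illustration only (not real hCite classes).
TITLE_CLASSES = {"Book title": "book-title", "Movie title": "movie-title"}


def foreign_phrase(text, lang):
    """Markup for a "Foreign phrase..." menu entry: a span with lang set."""
    return '<span lang="{}">{}</span>'.format(
        escape(lang, quote=True), escape(text, quote=False))


def work_title(menu_entry, text):
    """Markup for a "Book title" or "Movie title" menu entry."""
    return '<span class="{}">{}</span>'.format(
        TITLE_CLASSES[menu_entry], escape(text, quote=False))


print(foreign_phrase("sine qua non", "la"))
print(work_title("Book title", "Pride & Prejudice"))
```

The author picks what the text /is/, never sees the markup, and never
reaches for the [I] button to get italics.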

> Toolicious (see my signature) has many of the same goals, and might 
> be able to work in conjunction with yours.

Looks interesting. :)

--
Benjamin Hawkes-Lewis

Received on Sunday, 15 April 2007 14:31:13 UTC