Re: 9. WYSIWYG editor (enforcing the signature) from Robert Burns on 2007-08-04 (public-html@w3.org from August 2007)

From: Robert Burns <rob@robburns.com>
Date: Sat, 4 Aug 2007 08:37:27 -0500
To: Mihai Sucan <mihai.sucan@gmail.com>
Cc: "Ian Hickson" <ian@hixie.ch>, public-html <public-html@w3.org>
Message-Id: <B08CA9C2-4E72-4089-A481-D19A8EB54719@robburns.com>
On Aug 4, 2007, at 4:10 AM, Mihai Sucan wrote:

>
> On Fri, 03 Aug 2007 23:56:26 +0300, Ian Hickson <ian@hixie.ch> wrote:
>
>> On Fri, 3 Aug 2007, Mihai Sucan wrote:
>>>
>>> I have read the HTML 5 spec section on WYSIWYG editors [1] and  
>>> I'd like
>>> to express my concern on requiring the inclusion of "(WYSIWYG  
>>> editor)"
>>> in the META NAME="generator" CONTENT attribute value.
>>
>> I agree; in fact at this point I don't think anyone thinks it's a  
>> good
>> idea. We still need a better solution for handling the two tiers of
>> document quality, one targeting humans (who can know what they  
>> mean) and
>> one targeting today's computers (who rarely know what humans  
>> mean), but
>> I'm not sure what it is. One possibility I've considered is to  
>> just have
>> two conformance levels, "conforming html5 document" and "conforming
>> low-quality html5 document", with <font>, style="", and
>> <div>s-containing-inlines kicking documents into the second category.
>
> Others suggested the use of the Strict/Transitional terminology. I  
> don't agree with it, because it's not appropriate in this context.

Well, I think whether strict/transitional is an appropriate  
distinction probably depends on what problem we're trying to solve  
here. We have no clear agreement on which constructs we're even  
talking about. Suggestions related to WYSIWYG editors include:

  • Use of FONT
  • Use of @style
  • Use of DIV with an inline content model
  • Use of B and I elements

Except for FONT [1], none of these strike me as representing anything  
like low-quality markup (I'm thinking here mostly about semantics). I  
would expect to see DIV (even with inline content model), SPAN,  
possibly B and I and even @style in well done semantic pages.

Earlier, I had also said we might want to require DIV and SPAN  
elements to have at least one of the global attributes. That's  
something that I think helps ensure documents are authored in a  
semantic and accessible way. In the case of editors that are not  
semantically oriented, this is the one place where a distinction in  
quality might be necessary. It would be difficult for a non-semantic  
editor to add a global attribute to a SPAN or DIV in a non-gratuitous  
fashion. It would be better for us to insist that a non-semantic  
editor not simply add a global attribute to fulfill this requirement.  
However, that would mean that there would be a distinction between  
semantic editor created content and non-semantic editor created  
content. I think the way to solve that problem is to make the  
requirement into a recommendation (authors SHOULD include at least  
one global attribute on any SPAN or DIV element). In this way, a  
document created with a non-semantic editor would end up validating  
successfully, though its conformance check report would include  
warnings about the lack of global attributes on each SPAN and DIV  
element. This to me would be a satisfactory way to deal with the  
problem.
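
A minimal sketch of how a conformance checker might report such warnings without failing validation. It uses Python's standard html.parser module. The set of global attributes below is an assumption for illustration (this message does not enumerate them), and the names GLOBAL_ATTRS, GlobalAttrChecker and check are hypothetical.

```python
from html.parser import HTMLParser

# Assumed set of global attributes; the thread does not list them.
GLOBAL_ATTRS = {"id", "class", "title", "lang", "dir"}


class GlobalAttrChecker(HTMLParser):
    """Collects warnings (not errors) for DIV and SPAN without a global attribute."""

    def __init__(self):
        super().__init__()
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before this call.
        if tag not in ("div", "span"):
            return
        names = {name for name, _ in attrs}
        if not names & GLOBAL_ATTRS:
            line, _ = self.getpos()
            self.warnings.append(f"line {line}: <{tag}> has no global attribute")


def check(markup):
    """Return the list of warnings; an empty list means nothing to report."""
    checker = GlobalAttrChecker()
    checker.feed(markup)
    checker.close()
    return checker.warnings


print(check('<div class="note"><span>x</span></div>'))
# ['line 1: <span> has no global attribute']
```

Because the result is only a list of warnings, a document with bare SPAN or DIV elements still validates; the report simply flags each one.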

> It's non-trivial (if not impossible) for a machine to make the  
> difference between high/low quality code. The style attribute  
> should be allowed in "high quality" documents too, same goes for  
> DIVs and SPANs. The one left is the FONT tag.

I think FONT is the one we can safely deprecate. However, this is  
true whether the editor is WYSIWYG or otherwise. So it's not really  
needed in the strict/transitional (or strict/loose or whatever)  
distinction we're trying to make.


> [...]
>
> A very good user can have good quality documents come out of  
> Dreamweaver - albeit doing so is hard. In this case, it's certainly  
> unfair to require the editor to add a signature of a "low quality"  
> document.
>
> Even humans don't always agree on what's "low quality" on the Web.
>
> Why even make such distinction?
>
> Ian Hickson said:
> "We still need a better solution for handling the two tiers of  
> document quality, one targeting humans (who can know what they  
> mean) and one targeting today's computers (who rarely know what  
> humans mean), but I'm not sure what it is."
>
> As you present the two "worlds", yes, there's a stark difference  
> between the two. Yes, it makes the reader want to be able to  
> tell the difference between the two. Doing so is harder.

I don't really see the stark difference there. If a developer of software  
can mistakenly map bold to STRONG and italics to EM, then so too can  
a web author working in a text editor without any software support.  
The only difference is that when the maker of a popular authoring  
tool does it we see the mistake consistently reproduced across  
millions of pages.

> If the distinction MUST be made, here's an idea with a different  
> approach to the problem:
>
> <meta name="edit-modes" content="human, tool, WYSIWYG, CMS">
>
> The new meta-tag "edit-modes" (the name is irrelevant, a better  
> name can be picked) tells how the document was edited over its entire  
> history. If only a human ever edited the document, then the value  
> is only "human". If only a WYSIWYG editor was ever used, then only  
> "WYSIWYG". However, if one opens a human-edited document in a  
> WYSIWYG editor, then the editor should add WYSIWYG to the list, if  
> it does not already exist (such that the list doesn't grow much in  
> time).
>
> Several edit-modes can be defined: tools (file converters,  
> cleaners, etc), WYSIWYG editors, and CMS which is a different  
> "beast". You could even have two types WYSIWYG-Full (web authoring,  
> like Dreamweaver) and WYSIWYG-Embedded (like Awebitor).
>
> The edit-modes don't directly tell the quality, but they are a  
> strong indication of the code quality. Besides, I was thinking of  
> adding even more indications of code quality: the edit-modes could  
> be like a "log". Here's a scenario:
>
> - a human creates a document with edit-modes=human
> - then the same document is edited with a WYSIWYG editor. Now  
> edit-modes=human, WYSIWYG
> - the document is cleaned up. Now edit-modes=human, WYSIWYG, tool
> - again, back to the editor. Now edit-modes=human, WYSIWYG, tool,  
> WYSIWYG
> - the next day, edit the document with another or the same WYSIWYG  
> editor. Now edit-modes is not modified because the last string in  
> the list is WYSIWYG.
>
> The idea is that you *append* each edit-mode to the list, if the  
> last one doesn't equal the "new" one.
>
> This would provide an improved indication of code quality.

There may be reasons to provide this metadata (I think I would want  
each mode to appear only once, with the latest editing mode appearing  
last in the list). However, I do not think it provides any indication  
of quality. Every one of these modes should take care not to create  
low-quality markup. If the various tools take those steps, then I see  
no reason to rank one above the other in terms of quality.
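
To make the two update rules concrete, here is a small sketch, assuming the list is stored as plain strings and compared exactly. The first function follows the quoted "log" proposal (append unless the last entry already matches); the second follows the variant suggested in this reply (each mode once, latest last). The function names are hypothetical and neither rule is part of any specification.

```python
def append_log(modes, new_mode):
    """Quoted proposal: append unless the last entry is already new_mode."""
    if modes and modes[-1] == new_mode:
        return list(modes)
    return [*modes, new_mode]


def append_unique(modes, new_mode):
    """Variant from this reply: each mode appears once, most recent last."""
    return [m for m in modes if m != new_mode] + [new_mode]


def to_meta(modes):
    """Serialize the list in the form used by the quoted meta element."""
    return '<meta name="edit-modes" content="{}">'.format(", ".join(modes))


def replay(rule, edits):
    """Apply one rule to a sequence of editing sessions, starting from empty."""
    modes = []
    for mode in edits:
        modes = rule(modes, mode)
    return modes


# The scenario from the quoted message.
edits = ["human", "WYSIWYG", "tool", "WYSIWYG", "WYSIWYG"]
print(to_meta(replay(append_log, edits)))
# <meta name="edit-modes" content="human, WYSIWYG, tool, WYSIWYG">
print(to_meta(replay(append_unique, edits)))
# <meta name="edit-modes" content="human, tool, WYSIWYG">
```

The log rule preserves the order of alternation but can grow without bound; the unique rule stays bounded by the number of defined modes and records only which mode touched the document most recently.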

> Now, the spec could hint at which edit-modes indicate high/low  
> quality documents. However, the spec must not set in stone what's  
> defined high/low quality.

This seems completely backwards. If we provide no criteria for what is  
low and what is high quality, then how can we expect anyone,  
operating in any edit mode, to achieve high or low quality markup?

> The edit-modes suggestion could have a wider range of applications,  
> than simply having a meta-tag "thumbs up" (high quality) or "thumbs  
> down" (low quality).
>
> One of the advantages of edit-modes is, you get to see the  
> "quality" in a single tag, without checking the document. This is  
> *without* directly telling the quality.

Again, this may be interesting metadata, but no one should draw any  
conclusions about the quality of a document from it (that's something  
we should say in the recommendation if we include this metadata).

Take care,
Rob


[1]: Nicholas Schanks has an interesting example of using FONT in an  
entirely semantic (though very rarefied) way:
<http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-April/010899.html>.
Received on Saturday, 4 August 2007 13:37:43 UTC