- From: Robert Burns <rob@robburns.com>
- Date: Mon, 13 Aug 2007 18:50:14 -0500
- To: HTMLWG WG <public-html@w3.org>
Summary
------------------------------------------------
• A chapter should be devoted to visual editing UAs
• The WYSIWYG categorization is not focussed enough
• Either the visual editing UAs chapter or another chapter should discuss non-semantic editing UAs and converters
• Another chapter should cover sending and receiving mail
• Another chapter should cover content management solutions

Taxonomy of editing UAs and converting UAs: WYSIWYG, GUI, visual, and semantic, non-semantic
------------------------------------------------
Currently the chapter addresses WYSIWYG editing UAs. However, the term WYSIWYG does not really fit here. WYSIWYG is often used to describe fidelity between the screen and print versions of the same document. In the context of HTML editing UAs, it often implies a user-invoked or live rendered view of the HTML. This can occur even when the editor is a text editor (for example, BBEdit is a GUI text editor that provides WYSIWYG editing). With this in mind, I think clarifying precisely how to use these terms is warranted. Also, I think the main topic that needs to be addressed is how non-semantic editing UAs and converting UAs should produce HTML serializations (either text/html or XML). The key categories to consider here are:

WYSIWYG versus source-view only
GUI versus command-line
Visual versus non-visual
Semantic versus non-semantic

Considering BBEdit again, this is an HTML editor that is WYSIWYG, GUI, non-visual and mostly semantic (though flexible). Another editor such as emacs is source-view-only, command-line, non-visual and flexible with respect to semantics or non-semantics. NVU represents an example of a WYSIWYG, GUI, visual and largely non-semantic editor.

Microsoft Word constitutes a non-HTML editor that has the ability to convert its content to HTML. As a converter, the other aspects really do not come into play. Here we're only concerned with its native format and that format's ability and likelihood of containing semantic constructs.
Since Word is largely a non-semantic editor, it will produce mostly non-semantic data. The conversion of a Word file to HTML should similarly produce non-semantic output. This could be done entirely through DIV and SPAN elements with linked or embedded stylesheets (TextEdit on Mac OS X is an excellent example of a WYSIWYG, GUI, visual, non-semantic editor that produces great, fully conforming HTML output separating HTML from CSS). The converter may also use the I and B elements to map from italic and bold styling respectively. Some formats may even contain heading data (implicit and explicit) and be able to map semantic concepts such as paragraphs, sections, and headings into the converted HTML. Other semantics may be included in a format produced by a largely non-semantic editor. Converters should do their best to preserve those semantics and map them to the appropriate HTML constructs, but should be careful in heuristically mapping non-semantics to semantic constructs.

WYSIWYG editing UAs
------------------------------------------------
The current draft basically addresses the issue of WYSIWYG editing UAs in relation to the FONT element and the @style attribute. However, these are more problems of a non-semantic editor [1] than anything inherent to WYSIWYG editing UAs. The chapter should be renamed accordingly. Moreover, there's nothing really gained by maintaining a separate element in HTML5 named FONT. It offers no more semantic information than DIV or SPAN. It simply adds a distinction where none exists. Non-semantic editing UAs, and converting UAs converting from non-semantic formats, should simply use DIV or SPAN. Whatever ways HTML5 pursues to associate style sheet data should also be used, whether that is an @style attribute, embedded style sheets or linked style sheets.

So much of what is said in this chapter is both misdirected and unnecessary for HTML5. The chapter as written should be removed.
However, we certainly have a need to provide norms and other guidance for authoring tools, particularly non-semantic authoring tools, whether they be editing UAs or converting UAs. We should also provide guidance for semantic visual editing UAs to ensure these authoring tools produce the best HTML output they can. Since the visual interface in these editing UAs provides a layer of abstraction between the author and the content produced by the tool, it is important to keep the visual layer and the underlying source output as tightly integrated as possible.

Fully-semantic visual editing UAs
------------------------------------------------
For visual editing UAs that are fully semantic, we should provide the following guidance and norms:

• Editing UAs must map visual objects accurately to HTML constructs.
  Examples:
  - a strong emphasis menu item or button to the STRONG element
  - a bold menu item to the B element
  - an editor that produces a 2D grid or spreadsheet might map that to a table
• Editing UAs must avoid applying heuristics to authors' work without providing immediate visual feedback to authors, and must allow authors to override the heuristics.
  Examples:
  - an editor that adds @headers attributes to TD elements or @scope attributes to TH elements should present the associated cells to the author (editing UAs may even want to present associations to provide feedback when not adding @scope or @headers)
• Editing UAs must not include attributes or elements to circumvent conformance norms without direct author action.
  Examples:
  - adding alt='' to an IMG element (to fool conformance checkers)
  - adding summary='' to a TABLE element
• Editing UAs that provide greater semantic differentiation than HTML should use the @class attribute to provide their own semantic extensions to HTML.
  Examples:
  - a DocBook editor might use a <p class='' > ... or <div class='' >... to express the semantics of a ...
• Visual editing UAs must provide a mechanism for authors to edit portions of the document outside the main flow, including metadata, attributes, and element contents not otherwise visible from the default styling (OBJECT, long descriptions, TABLE summary, etc.).
  Examples:
  - OBJECT contents
  - @title, @class, @xml:lang and other global attributes
  - document fragment contents targeted by @longdesc
  - TABLE summary
  - TD and TH @abbr attributes
  - @alt attribute
  - @href and @src attributes

These norms should apply to editing UAs claiming to be HTML editing UAs as well as to editing UAs that natively manipulate other semantic content that can be fully mapped to HTML.

Non-semantic (partially or fully non-semantic) editing UAs and UAs converting non-semantic content
------------------------------------------------
For editing UAs that visually expose non-semantic constructs, or semantic constructs that express fewer semantics than HTML, a difficulty arises in mapping the vocabulary of the tool to the vocabulary of HTML. Accordingly, we should develop a series of norms for these editing UAs (and for UAs that convert formats) to deal with the semantic deficiencies.

• Editing UAs and converters must not map presentational constructs to semantic constructs.
  Examples:
  - must not map bold to STRONG or italics to EM
  - must not map italics to STRONG or to EM
  - should map bold to B and italics to I
• Editing UAs and converters must solicit user input when heuristically mapping presentational idioms to semantic HTML constructs.
• Editing UAs and converters of non-semantic constructs should map presentation with ambiguous semantics to SPAN and DIV elements, incorporate @class and @id attributes on these elements, and convert the presentational properties of the elements to CSS (either in a linked separate stylesheet or as an embedded stylesheet) [2].
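
To make the last three norms concrete, here is a rough sketch in Python of how a converter might behave. Everything in it is my own assumption for illustration: the Run record stands in for whatever styled-text unit a non-semantic format uses, and the generated class names (style1, style2, ...) are arbitrary. It maps bold to B and italics to I, never to STRONG or EM, and pushes presentation with no clear semantics (here, color) into a SPAN with a @class and a separate stylesheet.

```python
from dataclasses import dataclass
from html import escape
from typing import Optional


@dataclass(frozen=True)
class Run:
    """A styled text run from a hypothetical non-semantic source format."""
    text: str
    bold: bool = False
    italic: bool = False
    color: Optional[str] = None  # assumed to be a trusted CSS color value


def convert(runs):
    """Return (html, css) for a list of runs without guessing at semantics."""
    classes = {}  # color -> generated class name
    parts = []
    for run in runs:
        html = escape(run.text)
        if run.italic:
            html = f"<i>{html}</i>"  # italics map to I, never to EM
        if run.bold:
            html = f"<b>{html}</b>"  # bold maps to B, never to STRONG
        if run.color is not None:
            # Ambiguous presentation goes to SPAN plus @class; the style
            # itself lives in the stylesheet, not in the markup.
            name = classes.setdefault(run.color, f"style{len(classes) + 1}")
            html = f'<span class="{name}">{html}</span>'
        parts.append(html)
    css = "\n".join(
        f".{name} {{ color: {color}; }}" for color, name in classes.items()
    )
    return "".join(parts), css
```

Calling convert([Run("big", bold=True), Run("red", color="red")]) yields a B element followed by a SPAN carrying a generated class, with the color rule returned separately so it can go into a linked or embedded stylesheet. A tool wanting to promote any of this to STRONG, EM or a heading would, under the second norm, have to ask the user first; the sketch does not attempt that.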

Conclusion
------------------------------------------------
By keeping the issues of WYSIWYG, GUI, visual and semantic separate, we will send a clearer message to editing and converting tools. Following this approach, there is no need to lower the bar for visual editing UAs. Even for non-semantic tools, the bar need not be lowered. These tools could still use linked or embedded stylesheets and id and class selectors to preserve presentation information. Even if we followed an earlier suggestion of mine [3] to require that authors include at least one attribute on each SPAN (and DIV), the non-semantic tools would still produce fully compliant content, since their SPAN and DIV elements would need to include a @style, @class or @id attribute.

Take care,
Rob

[1]: Nicholas Schanks has an interesting example of using FONT in an entirely semantic (though very rarefied) way <http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-April/010899.html>. I don't think HTML calls for that level of specialized semantics. That could be left to a TypographyML or something like that.
[2]: Again, see Apple's TextEdit application for an example of a non-semantic editor that could produce fully compliant HTML5.
[3]: <http://lists.w3.org/Archives/Public/public-html/2007Jul/0963.html>
Received on Monday, 13 August 2007 23:50:31 UTC