- From: Andrew Fedoniouk <news@terrainformatica.com>
- Date: Sun, 25 Feb 2007 18:18:27 -0800
----- Original Message -----
From: "Adrian Sutton" <adrian.sutton@ephox.com>
To: "Andrew Fedoniouk" <news at terrainformatica.com>; "Dave Raggett" <dsr at w3.org>
Cc: "Karl Dubost" <karl at w3.org>; <whatwg at lists.whatwg.org>
Sent: Sunday, February 25, 2007 2:26 PM
Subject: RE: [whatwg] Authoring Re: several messages about HTML5

>> | > I agree that HTML DOM is not suitable for WYSIWYG editing.
>> |
>> | I beg to differ. It is true that an editing style sheet may be
>> | needed to avoid problems with delivery style sheets that use the
>> | display and visibility properties to hide content, or which use CSS
>> | positioning to layer things in complex ways. But apart from that,
>> | The HTML DOM is just fine as it is.
>>
>> So this means relaxation of requirements - strictly speaking
>> it will not be WYSIWYG anymore. If "editing style sheet may be
>> needed" then what you will see is not what you will get.
>
> There are a couple of problems here. Firstly as far as I know there is
> not and has never been an editor that does What You See Is *Precisely*
> What You Get, they all have various ways to help the user understand the
> structure of the document or visualize physical elements such as page
> boundaries which don't exist on screen. Furthermore, users don't want to
> see precisely what they'll get, what they want and what WYSIWYG is
> really about, is that they can see and work with the document in visual
> form instead of having to learn a mark-up language and for the rendering
> to be close enough that they don't need to use preview (or don't have to
> use it more than as a final check). HTML editors in particular don't
> provide a precise rendering of the document because no two browsers
> render a document precisely the same and even the same browser on
> different machines will render it differently depending on user
> settings, screen resolution etc.
>
> Secondly, HTML DOM is not suitable for WYSIWYG editing because it is an
> inefficient and difficult representation to use for editing operations.
> It is perfectly possible to use it to write an editor - it's just not
> the best representation. For instance, if you have content like:
>
> <p><em>This is <strong>my content</strong></em></p>
>
> If the user then selects "is my" and applies an inline style (<span
> class='stuff'>), with a DOM representation you have a couple of tree
> operations to perform, plus you probably should do some normalization to
> ensure that the generated tree structure is consistent regardless of how
> the content got to this point. However, if you represent the text as an
> attributed string, you simply add "span class='stuff'" to the attribute
> set for "is my" and it's done. When you serialize the model you have to
> construct a valid DOM out of it, but it's fairly trivial to make the DOM
> come out in a consistent form and speed is a lot less of an issue at
> that point. The attributed string model also has the benefit of better
> matching the way the user thinks about the document - they rarely think
> of a tree structure, they think of text with formatting. A good example
> of the problems caused by the backend model not matching the user's
> model is http://www.codinghorror.com/blog/archives/000583.html also see
> my response
> http://www.symphonious.net/2006/06/12/the-invisible-formatting-tag-problem/
> - there's a whole series of back and forth responses if you have
> nothing better to do, I seem to write about this far too much.
>
>> | Manipulating the DOM is a straightforward matter of tree walking
>> | algorithms. The really difficulty is understanding what the users
>> | would like to do. For example, you might type some text and click on
>> | the list bullet button. The enter key then starts a new paragraph
>> | within that list item, whilst enter followed immediately by another
>> | enter starts a new list item.
>> | Pressing enter on an empty list item
>> | closes the list. When it comes to the markup produced, you can
>> | conceptualize this in terms of a collection of critics that look for
>> | and fix particular problems, e.g. merging adjacent ul elements, or
>> | for moving leading and trailing whitespace out of inline elements.
>
> True, the challenge is always to 1) find out what the user meant and 2)
> do it. The model you choose can make that task easier or more difficult
> - in fact it generally does both because no single model is ideal for
> every editing operation. In our editor, we have a specific list model
> which is different to both the attributed string model and the DOM model
> - it simply has a list of LIs that are indented at different levels and
> have attributes like list type (OL/UL), style attributes etc. There are
> then a relatively small number of atomic changes users can make to lists
> (change indent level, change list type or attributes or merge items) and
> the model knows how to perform them on itself. The model also knows how
> to serialize itself in terms of a valid HTML document structure. The
> challenge is then simply mapping user actions to list operations in a
> way that is intuitive and consistent for users. By picking an
> appropriate model you can save a lot of effort in terms of keeping the
> model sane and correct. Lists and tables are excellent examples of where
> the HTML DOM is not ideal for editing because there are too many extra
> elements around that the user doesn't want to think about. Reconciling
> this difference in HTML DOM to user expectations is one of the reasons
> why lists and tables are so hard to get right.
>
>> HTML and its DOM has logical/semantic elements that
>> have no visual representation at all. Think about DIVs that
>> are used for wrapping portions of textual data and has no
>> visual representation.
>> Your style sheet may rely
>> on some containment in some DIV - so the same
>> sequence of editing actions made by the user produces
>> dramatically different (for the user) visual results.
>
> This is why it's so useful to have a visual representation - the user
> can *see* that with the DIV there it renders one way and without it, it
> renders a different way. Of course, getting the user to understand the
> inherent tree structure that DIVs work with is difficult, so is the
> rules for how CSS applies. I don't have a good solution for this, but
> we're continuing to try to find one.
>
>> Let's take a look on example you've provided:
>> "The enter key then starts a new paragraph within that list item"
>> So after enter you will have:
>> Case #1:
>> <ul>
>>   <li>First paragraph.
>>     <p>Second paragraph.</p>
>>   </li>
>> </ul>
>>
>> *or* Case #2:
>>
>> <ul>
>>   <li>
>>     <p>First paragraph.</p>
>>     <p>Second paragraph.</p>
>>   </li>
>> </ul>
>>
>> So even in this "simple" case you have two options how to make
>> mapping of "action -> dom structure".
>>
>> And now imagine that you will get following markup:
>> Case #3:
>> <ul>
>>   <li>First paragraph.</li>
>>   <p>Second paragraph.</p>
>> </ul>
>
> All three of these are completely wrong. When the user presses enter in
> a list item, the editor should insert another <li>. There is simply no
> excuse for doing anything else. A startling number of editors get this
> completely wrong though. Another very popular error is indenting a list
> item and generating:
>
> <ul>
>   <ul>
>     <li>Item</li>
>   </ul>
> </ul>
>
>> Visually and by default this is indistinguishable from the Case #1.
>> So looking on these two list items user has no clue of
>> what he/she just did by pressing "Enter". Enter press is the most
>> unpredictable action in all WYSIWYG editors in the wild.
>> For the simple reason: there is no single unambiguous way
>> of changing DOM for such an action. Again tree-alike
>> DOM is simply not suitable for that.
>
> The DOM isn't ideal for this, but it's still quite possible to get it
> right, it just takes a lot more effort. It also helps if you run the
> HTML you start with through something like Tidy so that it starts out
> sane and you don't have to deal with the infinite range of completely
> broken HTML as a starting point. I suspect, but don't know for sure,
> that many JavaScript editors struggle with this, but our Java based
> editor has thankfully managed to avoid this particular set of problems.
>
>
>> | p.s. one missing feature in CSS that would really help would be a
>> | means to add a forced line break symbol to the rendering of <br>
>> | elements. It is already easy to add a paragraph symbol, but CSS
>> | balks at <br> elements for inappropriate reasons.
>
> We just changed the renderer to put it there without CSS, but I guess
> that's cheating.
>
> Regards,
>
> Adrian Sutton
>
> PS: My apologies, I've lost track of who actually said what above.
>

Adrian,

My statement "HTML DOM model is not suitable for WYSIWYG editing"
referred to a logical limitation, not a physical one.

I agree with you - theoretically it is possible to create a WYSIWYG
HTML editor that is asymptotically close to some ideal. But somewhere
on the way there, the system will hit a point where it becomes
"deterministic chaos": each line of code is a perfect finite state
automaton, but the whole system is not manageable.

Yes, a practical solution is to simplify the DOM structure (you use
"attributed strings" and I am using a "flat DOM" in my
http://blocknote.net and a couple of other editors).

I think that an HTML WYSIWYG editing solution to be used as the
<htmlarea> engine in HTML should not use HTML; at least, it should not
use the HTML DOM "as is" but something more visually comprehensible
to humans.

Andrew Fedoniouk.
http://terrainformatica.com
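
For readers following the thread, here is a minimal sketch of the
"attributed string" idea Adrian describes, using his example
<p><em>This is <strong>my content</strong></em></p>. It is not taken
from either editor mentioned above; the class name AttributedString,
its methods, and the choice to store each attribute as the text of its
opening tag are all assumptions made for illustration. Applying a style
only adds an entry to the attribute set of each selected character, and
serialization emits tags in a fixed (sorted) order so the markup comes
out the same regardless of the order in which styles were applied.

```python
import html


class AttributedString:
    """Hypothetical model: plain text plus a set of attributes per character."""

    def __init__(self, text):
        self.text = text
        # One attribute set per character; each attribute is the text of
        # an opening tag, e.g. 'em' or 'span class="stuff"'.
        self.attrs = [set() for _ in text]

    def apply(self, start, end, attr):
        """Add attr to every character in the half-open range [start, end)."""
        for i in range(start, end):
            self.attrs[i].add(attr)

    def to_html(self):
        """Serialize to properly nested markup in a canonical tag order."""
        out = []
        open_tags = []
        for ch, attr_set in zip(self.text, self.attrs):
            wanted = sorted(attr_set)  # fixed order gives consistent output
            # Keep the longest common prefix of currently open tags.
            keep = 0
            while (keep < len(open_tags) and keep < len(wanted)
                   and open_tags[keep] == wanted[keep]):
                keep += 1
            # Close everything past the shared prefix, innermost first.
            for tag in reversed(open_tags[keep:]):
                out.append("</" + tag.split()[0] + ">")
            # Open whatever this character additionally needs.
            for tag in wanted[keep:]:
                out.append("<" + tag + ">")
            open_tags = wanted
            out.append(html.escape(ch))
        for tag in reversed(open_tags):
            out.append("</" + tag.split()[0] + ">")
        return "".join(out)


doc = AttributedString("This is my content")
doc.apply(0, 18, "em")                   # whole text is emphasized
doc.apply(8, 18, "strong")               # "my content"
doc.apply(5, 10, 'span class="stuff"')   # user selects "is my"
print("<p>" + doc.to_html() + "</p>")
```

The edit itself is a single range update with no tree surgery; the cost
moves to to_html(), which is where a valid tree has to be constructed.
A real editor would need a richer attribute representation and a
deliberate nesting policy, which this sketch does not attempt.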
Received on Sunday, 25 February 2007 18:18:27 UTC