Re: Exposing the critical ACTUAL style values? from Ray Whitmer on 2002-12-17 (www-dom@w3.org from October to December 2002)

From: Ray Whitmer <rayw@netscape.com>
Date: Tue, 17 Dec 2002 11:57:30 -0800
To: Shelby Moore <shelby@coolpage.com>
CC: www-dom@w3.org
Message-ID: <3DFF81AA.70305@netscape.com>
Shelby Moore wrote:

>Ray,
>
>Your points in 2nd email are worded more forcefully, so please do not take
>it personally if I respond with equal force to make my points.  Our goal is
>same.  Quickly this needs to efficiently lead to workable path forward,
>else I am sure we will both tire quickly (since only the 2 of us so far).
>  
>
I do not take such arguments, made purely on a technical basis, 
personally at all.  The forcefulness only conveys that I believe I can 
demonstrate the technical merits of my argument, not that it should be 
so obvious that I should not have to make the case.

We certainly need to at least understand better why we each are holding 
to the arguments we make.  I want us both to understand the other's 
point of view.  If that does not come across in the way I have written, 
it is a flaw, perhaps, in the way I write.  

I suspect that both of us think they understand the other's point of 
view, and at least one of us is wrong, and we will have to keep making 
some noise until someone is better able to express and someone is better 
able to comprehend what the other has written.

>>It was my opinion at the time that the model 
>>advocated would actually be quite trivial to wrap around existing 
>>implementations
>>    
>>
>
>My goal is in the real world of commercial success, to always reuse
>existing work where possible, and typically to rewrite bad designs before
>digging deeper hole.  Which is related to why I am here at this cross
>roads.  Which is why I am surprised that you frown on virtual methods.  I
>have never been involved with a failed commercial project.  Every major
>project I have touched has lead in it's market.  From WordUp with > 15%
>market share on AtariST, to what became Corel Painter (the world's first
>natural media painting app), to Cool Page (our current popular product for
>novices which is at a cross roads).  How many programmers have hit 3 major
>homeruns in commercial (mass market) programs?
>  
>
I certainly have no reluctance to use virtual methods.  Perhaps I 
misunderstood basically what you mean by "using virtual methods".  I 
wrote a long response, but I deleted it because I may have misunderstood 
your meaning of "Using Virtual Methods".  Were you referring to:

1.  My reluctance to make Node castable to XPath and Segments.

This has very solid technical reasons, that have to do with modularity, 
non 1:1 mappings between content and presentation, the limitations of 
C++ and Java, etc.

2.  The use of generic accessors with string arguments instead of 
specific accessors.

This only has to do with satisfying some critics who didn't want a 
potentially-infinite set of interfaces.  I am just as happy to create 
distinct interfaces to handle these things and instead of having strings 
have IDL.

>Cool Page has a simple one (does not use the browser to render content),
>but I would not say it is good or general.  But it efficiently does the
>limited job it was designed for being not dependent on browser or
>proprietary apis.
>  
>
Does it rely on extensive formatting to determine where text lands, what 
fonts, etc, or is it more like a final form or Graphics layout, where 
there is a strict 1:1 correspondance between presentation and content?

>>and there is 
>>seldom a direct mapping from the content to the type of values that are 
>>needed from a view.
>>    
>>
>
>It depends on the goal.  It wouldn't hurt to expose the content
>transformations as a first step, since conceptually (deciding what to
>expose and what not to) it is the most obvious and leverage existing work
>on what is important.
>  
>
I know a great deal about transforming content into presentation, but 
this statement says nothing useful to me.  Which content transformations 
would you expose, and how would you expose them 1:1?  Perhaps you are 
only talking about actual computed style values, not things like 
position, hit tests, physical events, etc.

The values I am concerned with in the presentation have no corresponding 
single content object that could be cast to obtain it.

>For DHTML, the DOM transformation to the rendered view is very often what
>is needed.  
>
There is no "DOM transformation" to the rendered view, other than 
proprietary ones that have been created.  DOM defines how to access the 
content, which either has no view or has a view that is created in a 
very proprietary way.  That is the strength of the markup language -- it 
is independent of the view.

>One is typically needing to know what the position, z-index, or
>state of content node is within the rendered view.
>
Such a question is often not answerable.  In some cases, there may be no 
position in the geometric coordinate sense.  In other cases, there may 
be any number of different positions and z indexes, depending upon how 
the text is formatted, in how many views it is visible, and so on.  You 
did not answer the questions in my last response about what that really 
means in common cases where there is no 1:1 correspondance.

That is why it is useful to have:

1.  A model that can really give the correct answer, which is not 1:1. 
 An impossible model does not help the implementer to implement.
2.  Simplifications that keep the simple cases simple.  This obviously 
needed work in the proposal, and making more-specific APIs instead of a 
generic API with strings is one possible good course to follow.

>  Even for using the
>browser as the rendering agent for Cool Page, I think the one-to-one
>mapping addresses many of the needs, which could be supplemented as
>necessary.  For example, if Cool Page wanted to know what color the link is
>currently rendered as (to show user current color for editing), then the
>ACTUAL style value is what is needed.  Similarly for z-index, position,
>selection state (not CSS yet?), etc..
>
I cannot judge with respect to Cool Page, because I do not know how 
indirectly it's content model is related to it's presentation.

I agree that the call should be very simple and intuitive to do simple 
things, just not that it is anywhere close to 1:1 if you expect to 
expose information on the view and formatting.

Perhaps what you really want is actual style values instead of a views 
and formatting model (which is clearly where your original message's 
title came from).

>I do not (yet) see a compelling reason to reimplement the DOM for each
>special case module.  That seems to me to fly in the face of OO paradigm--
>which espouses maximum reuse.
>
>It is disruptive.  It is also likely impacts on backward compatibility (or
>at least ease of forward migration).
>
>The Match interface could come later if a compelling need is demonstrated
>after at least covering the basics in the least disruptive manner possible.
> Is there a compelling case of future generality that is lost by starting
>with reuse of the DOM??
>  
>
I do not understand your comment about reimplementing DOM or not getting 
maximum reuse by the decisions that were made.

If the model is too simple, it is, itself, disruptive, by requiring the 
implementation to answer questions that cannot be reasonably answered.

It is not a views and formatting model if it does not permit the obvious 
disassociation to be properly represented.

I can easily accept the position that what you want is actual style 
values, which often DO have a direct association, unlike a formatted 
view.  But please do not ask for the position of an element in the 
presentation unless you want a potentially-compound answer.

>>>Specifically I suggest that the View's Segment properties for any object be
>>>exposed by cast and method call of any node of DOM.  The eliminates the
>>>Match model.  Matching can be performed with scripting on the DOM and thus
>>>could be left for future spec if necessary at all.
>>> 
>>>
>>>      
>>>
>>This denies the nature of the formatting problem.  There is no 1:1 
>>correspondence between these.
>>    
>>
>
>
>Actually I was probably going to suggest dropping the Formatting from the
>module.  What is the compelling case of first release use for Formatting?
>And what do you mean by "Formatting"?  Isn't all formatting done by editing
>the content?
>  
>
Not in any text-based non-final form I am aware of.  Formatting is an 
integral part of the display process.  The HTML does not store where 
soft line breaks occur, nor which fonts are used, etc.

Certainly you can have this sort of 1:1 correspondance in final form 
such as PDF or SVG.

HTML is designed such that it adapts automatically to the constraints of 
the display device.  If the user resizes the window of a particular 
view, not only does the position at which an element is presented in 
that specific view change, but it changes in compound fragments.

>>How do you get the properties of a line? 
>>    
>>
>
>Agreed this is an important special case not in content.  Cool Page's
>presentation class creates an array of line properties.  But I am not sure
>this is very necessary for first release.  For first release, the browser
>could handle line issues internally with normal unattended user interaction
>(such as cursor to end of line or beginning of line), and expose only
>current selection and importance issues.  With a HitTestChildren(), which
>could take either a point, rect, polygon, etc, then you can deduce these
>metrics in a slow way for interim, although I not advocating that use.
>  
>
I am not comfortable answering the question "what is the position of 
this element" without the possibility of answering in-compound, and 
answering that the medium for other reasons cannot answer that question, 
and permitting alternative questions and answers appropriate to other 
situations and media, such as in an aural presentation, your hit test 
may be by time.

>In any case, I think this is best exposed as an extension to DOM-- a
>extended interface for nodes to get more detail information about inline or
>replaced sub-content.
>  
>
I agree that it is best exposed as an extension to DOM.  The only thing 
that is not clear to me is whether this means it is something that every 
DOM Node must be forced to implement.

View.findSegment(element[,...]).getFoo()

seems much preferrable to

element.getPresentationFoo()

which cannot be answered because it often has no single answer.

>>How about of a column?
>>    
>>
>
>CSS3 has Columns.  I think they are working on words.  And I had a long
>debate yesterday about sentences.
>
>  
>
>> How about of a character?
>>    
>>
>
>Some form of HitTest() plus a CursorEvaluator which can return properties
>at the cursor perhaps.  I think exposing non-CSS (non-content) things in
>chunks (in context) is more OO than a global Segment properties
>enumeration.  Incremental design always wins if OO is used properly.  An OO
>is incremental design.
>  
>
The segment design is incremental.  Binding these things to elements 
seems pretty non-incremental.

>>If you ask for the segment of a "bold" tag, what does this mean if the 
>>bold starts on one line and finishes on the next?
>>    
>>
>
>What ever CSS properties are allowed on <B>.  From the content, we know it
>is immutable in content.  What additional properties could we need in
>compelling first use case?
>  
>
If all you want is CSS properties, then this is not views and 
formatting.  I want to know where the text that is bolded, which may 
have some special significance to me, is located and various other 
things.  CSS is not views and formatting.  Position is seldom a CSS 
property in HTML.

>I think you are asking about nodes which are not of much interest in the
>View from the perspective of first compelling use applications.  As more is
>learned, additional properties can be added to the extended interface of
>the DOM.
>
These are very interesting to simple applications I have written.  I 
think you are confusing CSS with the view, which it is not.  A view of 
HTML relies heavily on formatting.  CSS computed values do not (unless 
you are trying to artificially extend them where there was no CSS input 
of a simplar CSS property).  Also, CSS computed values are not 
necessarily the same as the value used by the view, I think, even where 
there is overlap.

>I am just advocating building on the DOM and CSS, then extending and
>caveats from that foundation.
>
>It makes sense to me that if the applications are basically trying to sync
>what is happening on content side (where they edit) with the actual state
>and actions in the view, then the 1-to-1 mapping is the most obvious
>logical first step.
>  
>
I guess it just depends upon what you are trying to accomplish.

>>A character has no corresponding DOM object, so how do you get the 
>>position and size of it?
>>    
>>
>
>First you need to identify the character.  So I say this is block level
>node.CharAtOffset(), which could be result of a HitTest().
>
>Then query it's interface (class) for properties.  This is OO.  A flat
>enumation model of Segment types and properties is not.
>  
>
I have no problem with specializing segment interfaces to an appropriate 
level of specialization as long as we keep the more-abstract base 
classes so that an application can be as abstract as it needs to be when 
dealing with them.

>Casting the document to get View?  You were doing that also.
>
>I am casting the content to get an extended interface on the content which
>can take a View and return new properties.  Why is that unreasonable?
>  
>
I think it is unreasonable to place the burden for maintaining the 
relationship to the view on the individual content nodes.  Any such 
relationship should be owned by the view, and the most the content model 
should do is support registered listeners.  that has been a 
highly-successful model for me, and it is the only way to properly 
maintain modularity of many diverse views that have no 1:1 
correspondance (I am not referring to actual style values which have 
1:1, but actual presentation values which do not).

>There is nothing stopping you from also exposing the SAME code in method(s)
>of View which takes a Node pointer.
>
And likewise, a method on node, which accepts a view, might just be a 
simple convenience method that calls the method on the view.   It will 
not work in common cases because it lacks the ability to properly ask 
the question, but I am all for gathering common cases and making 
convenient conveniences.  It is also not more convenient to a 
moderately-sophisticated implementation of languages such as HTML that 
clearly require a formatting operation to make all the final choices.

It is quite possible for the font to be different depending upon where 
it appears and how it wraps.  You can answer a computed font value, but 
not necessarily the final presentation font.

>The point is that these are properties of the transformed (rendered)
>content, so why not expose them as a transformed interface of the DOM?  Of
>course there are some properties which are exclusively View properties,
>such as View's position, size, scoll, etc..
>  
>
You are talking about computed style values.  This seems reasonable.  It 
is not reasonable for the formatted presentation in 1:1 correspondance.

>If someone wanted all methods to be in View, so that it is not possible to
>do node.Property, then I have no major qualms with that.  I just think it
>is more intuitive to stick with the DOM hierarchy.
>
In the general sense, this is a false intuition, because it involves 
questions that cannot be properly answered.  I have no problem making 
convenience methods as long as the limits of those methods are clearly 
spelled out in terms of the real formatted presentation model.

>node.TransformedProperty is like saying give me x View's transformation of
>the DOM.  In fact, the extended interface does not need to be polled on
>each node, just once on the document (without error checking):
>  
>
I would need more details on this to understand what you mean to have 
this method do.  There is no transformation of the DOM ever defined. 
 There are certain CSS transformations defined, but that has nothing to 
do with DOM.  There is a reasonable expectation of computed style 
values, but that has little to do with position, for example, of most 
elements which are not positioned explicitly by CSS.

>View* v = (View*)document.GetxView();
>ViewTransformedDOM* vtdom = ((View*)document.getInterface( "View"
>)).getViewTransformation( v );
>Node* n = vtdom.getElementById( ... );
>n.TransformProperty (i.e. n.ActualCSSValue)
>  
>
Again, I have no clue how you would expect this to behave.

>>But I would not claim it is easier for the browser implementer to 
>>answer unanswerable questions than to expose a simple wrapper which 
>>corresponds to internal frames, lines, etc. that every reasonable 
>>implementation has to have internally.  Once we have refined the model, 
>>it will not be difficult from an implementation standpoint.
>>    
>>
>
>First you assume there will be lines.  CSS does not guarantee you will get
>lines in your View.
>  
>
That is why they are called segments.  It is reasonable to expect that a 
visual presentation may have smaller presentational segments that are 
not a continuous presentation of the information on the screen.  This is 
why maintaining whatever degree of presentational abstraction is 
important.  If you are not assuming a visual presentation, then it is 
not reasonable to expect font, x,y position, z order, etc.

>And you are defining NEW things which have not been calculated to be
>exposed.  Whereas, with the 1-to-1 DOM mapping, the implementation already
>calculated the ACTUAL style values on each node.
>  
>
OK.  Drop everything but actual style values, and we are on the same 
page now?  This is layers away from the actual presentation and is not 
what was intended to be addressed by the Views and Formatting model.

Drop the position, drop the hit tests, etc., none of which were 
calculated in a way to answer a question on an element.

>And above I postulate that many of these news things may be unnecessary in
>first release.  And you have to be careful about exposing too deeply and
>lock yourself into future.  At least with exposing 1-to-1 you know you are
>exposing things that have to be maintained for existing standards.  It is
>no brainer to me.  And there is the issue of flat enumeration not being OO,
>etc.
>
Then you are not talking about Views and Formatting.  Views and 
Formatting, from A to Z is not 1:1.  Actual style values can be 1:1, 
because it is not bound to the actual presentation for formatting of 
that presentation.

>I mostly agree, except that at least if we expose a 1-to-1 mapping, it is
>safer because:
>
>1. CSS is a standard so those values are calculated any way (better chance
>our standard gets and stays supported and bug-free...bugzilla :)
>
>2. Less to spec and make mistakes on
>
>3. A one-to-one mapping is what DHTML needs
>
>4. Less chance of exposing too deeply apis which become impossible to
>maintain in future (in theory CSS has been designed with great care for future)
>  
>
It is not views and formatting.  My own interest lies in the field of 
views and formatting.  Actual computed values do not necessarily match 
the view and formatting.  They were the best guess before the formatter 
tried to make it proper for presentation on the screen.  It has nothing 
to do with hit tests or positions of formatted text.

>It can't Match only children or sub-content of a Node.  It could be
>extended to do that though.  I am just saying it is a dangerously broad,
>unnecessary, and flat enumeration.  It requires a re-work (non-reuse) to
>define new Properties and object model.  It gains some unknown capabilities
>at cost of some other known capabilities.
>  
>
It gains access to a standardized presentation in an intentionally flat 
abstract (as abstract as possible) manner, which has never been 
standardized before.

If you are content with not being able to ask for position or hit tests, 
which have nothing to do with CSS, and hoping that the computed values 
are close to what the presentation actually used, fine.  That is not a 
subset that I found compelling, based upon personal experience.

>Your interest seems to be in a very general and new abstract way of looking
>at the rendered content.
>
Yes, keeping it as simple as possible, exposing the presentation and 
formatting, which have never been exposed before and require new 
standardization.  Admittedly, this is way beyond getting the actual CSS 
values computed by the CSS hierarchy.  I have never felt much need for 
actual CSS values, and I thought Mozilla even already provides that.

>Whereas my pragmatic focus in on how to get the basic data we need to edit
>content (DHTML and applications) based on the transformed state.  It seems
>like we have an immediate problem that should have been address 2 - 3 years
>ago, and no interest.  We don't have the luxury of inventing totally new
>things.
>
>I'd rather expose the DOM and CSS, then extend from there as more is learned.
>  
>
I apologize for not reading more carefully in the beginning.  DOM and 
CSS has nothing to do with hit tests or positions of content in the 
presentation.

I believe Mozilla already implements the standardized method for this: 
"getComputedStyle".  There are nuances, and I have not spent too much 
time on this.  It might be useful to identify where this does not meet 
your needs and see if the remaining requirements are properly CSS 
functions (in which case the CSS WG has taken over maintenance of that) 
or Views and Formatting issues, in which case, we need to discuss these 
issues, because they are not CSS.

>>I did not see you address issue 5 at all (perhaps you did not understand 
>>it), nor were the others addressed conclusively to my satisfaction.  But 
>>these are truly details, compared to the caring that has to happen by 
>>enough interested parties.
>>    
>>
>
>
>#5 disappears because Segment becomes a Node and there is one-to-one
>mapping to DOM and CSS (with caveats).
>  
>
Computed content is not present in nodes.  We received feedback that 
computed content needed to be addressed.  Computed content is, 
typically, where the presentation may contain new text, for example, 
that are not actually contained in the content.  They are computed by 
the presentation.

>#3 is reusing values computed from CSS.  The browser better know what color
>it is displaying :)
>  
>
It knows what color it is displaying, and it knows what is computed by 
the CSS hierarchy, but these may not always be the same for the entire 
element.  For simple cases, using the computed style will suffice.  That 
still does not make it exactly the presentation, which may choose, for 
example, to change the color of part or all as the mouse passes over or 
for other purposes.

>#2 disappears because of using DOM model
>
>See how quickly reuse eliminates the nasty design issues of a new abstract
>metaphor?
>  
>
You are using computed styles, which don't answer the questions people 
were trying to ask in Views and Formatting (many, not just computed 
content).  You have your answer to a different set of questions.  CSS 
can be reused as long as you are talking CSS results (which may consult 
the display) and not views and formatting (which has a model that does 
not match CSS's).

>It is very difficult to teach people to actually use the 80/20 rule.  Or to
>use it wisely.
>
>[snip] answered previously
>
>Unfortunately, I am getting the sense that W3C is not the right place to
>take up this matter.  It looks like I will stuck with writing my own
>proprietary one.  It has always been faster for me to implement than talk
>and politic.  But I am here to see if there is any serious interest at W3C
>for a non-proprietary Views api.
>  
>
You have rejected a views model, and asked for computed actual values of 
styles, which matches the view sometimes in some ways, because the 
display resources were taken into account, but is far from being the 
view or presentation of the formatted content.

I think we need to get back to specific cases that you think are not 
solved, and ask how to solve them.  You obviously do not like exposing 
the view and formatting, which this specification tried to do.

>Any way, I still have an open-mind at this point. And perhaps IE will win,
>and I will leverage it for another year or so (Cool Page v4) and then
>retire instead.
>  
>
The better question is, is it suitable for your purposes.  I doubt it 
will ever be suitable for mine, but many people have considered it 
suitable for their purposes for a very long time.

>If anything, I hope my feedback has been shocking to you, given you are the
>DOM Lead.  I hope I have laid some questions about fundamental design which
>will be useful forward.
>

Certainly not shocking.  You showed that your requirements were 
different from the ones that the WG was trying to fill with the Views 
and Formatting specification.

Certainly enough people wanted computed style values that it is 
supported today.  It does not go far enough for me, and it may also not 
be exactly what you want, but it may well go far enough for you, if your 
have no requirements that require the presentation and formatting, for 
which a 1:1 mapping still makes little sense.

It is easy to confuse things which look similar and then think they 
should be the same.  That is why when I talk about reuse, I first stop 
to make sure it is appropriate.  It makes sense4 to reuse CSS for actual 
computed style values, but not so much for Views and Formatting.

The CSS working group continues, I believe, to work on the CSS API for 
DOM, which permits computed style values to be obtained.  I cannot 
guarantee that it is supported in all major browsers, but I think 
Mozilla does a reasonable job.

Ray Whitmer
rayw@netscape.com
Received on Tuesday, 17 December 2002 14:58:04 UTC