Re: ISSUE-41/ACTION-97 decentralized-extensibility from Shelley Powers on 2009-10-05 (public-html@w3.org from October 2009)

From: Shelley Powers <shelleyp@burningbird.net>
Date: Mon, 05 Oct 2009 08:06:23 -0500
To: Henri Sivonen <hsivonen@iki.fi>
CC: "public-html@w3.org WG" <public-html@w3.org>, Adrian Bateman <adrianba@microsoft.com>, Sam Ruby <rubys@intertwingly.net>
Message-ID: <4AC9EF4F.9070504@burningbird.net>
Henri Sivonen wrote:
> On Oct 3, 2009, at 02:45, Shelley Powers wrote:
>
>> I can speak for an SVG editor, Inkscape, which uses namespaced 
>> elements and attributes to record information about the SVG it 
>> produces. And before you mention using HTML comments, you should 
>> spend time with an Inkscape SVG file, to see how extensive the use of 
>> namespaced elements and attributes is in an Inkscape-managed SVG file.
>
> Inkscape is indeed a good example.
>
> It isn't clear to me why Inkscape couldn't serialize its state using a 
> proprietary key-value syntax inside comment nodes instead of using 
> attributes as the key-value syntax. I can see why one might consider 
> it unnatural to invent a product-specific key-value syntax when the 
> off-the-shelf part of the underlying XML language framework already 
> offers attributes. (OTOH, xml-stylesheet is precedent for minting 
> above-XML-layer key-value syntax.)
>
> The copious amounts of internal state that Inkscape dumps into files 
> raise the question of whether it's a good idea to dump that kind of 
> state in files used for interchange.

Whatever the reason, Inkscape's usage of namespaced elements exists, is 
real, and does have an impact on HTML5 because of the inclusion of SVG in HTML.
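
To make the point concrete, here is a minimal sketch of what an XML parser 
reports for such a file. The SVG fragment is a hypothetical one written in 
the style of an Inkscape-managed file, and the function name 
foreign_attributes is an assumption made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the style of an Inkscape-managed SVG file.
SVG = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape">
  <g inkscape:label="Layer 1" inkscape:groupmode="layer" id="layer1"/>
</svg>"""


def foreign_attributes(svg_text):
    """Return (element tag, attribute name) pairs for namespaced attributes.

    ElementTree reports a namespaced name as "{namespace-uri}local-name",
    so any attribute name starting with "{" carries a namespace.
    """
    root = ET.fromstring(svg_text)
    found = []
    for element in root.iter():
        for name in element.attrib:
            if name.startswith("{"):
                found.append((element.tag, name))
    return found


for tag, name in foreign_attributes(SVG):
    print(tag, name)
```

The plain id attribute is not reported because it has no namespace; only 
the two inkscape:* attributes are.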

>
>> I'm also aware of other applications that have defined attributes to 
>> be applied to HTML elements so that JavaScript libraries can do 
>> something with the data. A case in point is an accordion JS 
>> application that would depend on these attributes. To use the 
>> application, I had to add these attributes to the div elements that 
>> formed the accordion panels and header bar.
>>
>> Now, the HTML5 custom data attributes could be useful for something like this, 
>> except for a problem: According to the HTML5 Specification, "These 
>> attributes are not intended for use by software that is independent 
>> of the site that uses the attributes."
>
> As I understand it, JS libraries included by the page itself are not 
> "independent of the site" for the purpose of the spec sentence even if 
> the libraries are treated as off-the-shelf commodity parts that aren't 
> modified by the site author.
>

But the attributes and elements are defined within a library meant to be 
used in places other than where the script originated.

Obviously the specification needs clarification in this regard.

>> Probably the reason is that there is nothing in data-* to handle name 
>> collision. After all, there's only so many types of words to use, and 
>> there could very well be collision between a JavaScript library that 
>> does an accordion, and perhaps another one that does a tabbed UI, and 
>> both are used in the same site.
>
> If library authors are concerned about collisions, the probability of 
> collision goes near enough zero if you use the name of the library as 
> part of the identifier and rely on social and trademark structures to 
> avoid collisions of library names. E.g. data-dojo-foo and 
> data-jquery-foo don't collide with each other and also don't collide 
> with data-foo.
>
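
The prefix convention suggested above can be sketched as follows. This is 
only an illustration: the helper names data_attribute and collisions are 
hypothetical and are not part of HTML5 or of any library.

```python
def data_attribute(library, key):
    """Build a library-prefixed custom data attribute name.

    Example: ("dojo", "foo") gives "data-dojo-foo".
    """
    return "data-{}-{}".format(library.lower(), key.lower())


def collisions(*name_sets):
    """Return the attribute names claimed by more than one library."""
    seen = set()
    clashes = set()
    for names in name_sets:
        for name in set(names):
            if name in seen:
                clashes.add(name)
            seen.add(name)
    return clashes


# Two libraries that both mint an unprefixed "data-tab" collide.
print(collisions({"data-tab"}, {"data-tab"}))

# With the library name in the identifier they no longer collide,
# provided the library names themselves are distinct.
accordion = {data_attribute("library1", "tab")}
tabbed_ui = {data_attribute("library2", "tab")}
print(collisions(accordion, tabbed_ui, {"data-tab"}))
```

The sketch only shows that the scheme moves the collision problem from 
attribute names to library names; nothing in it enforces that library 
names are unique.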

And eventually we'll have separate specifications for people to use to 
avoid name collisions, so that there's consistency in the result. Oh 
wait, we have something already...

>> Using RDFa for these purposes would be inappropriate because that's 
>> not the underlying purpose for RDFa.
>
> I think a stronger argument for why RDFa (or Microdata for that 
> matter) is inappropriate for this use case is that the RDF graph 
> represented by RDFa doesn't have data model-level correspondence to 
> particular elements in the DOM even though syntactically the graph and 
> the DOM are overlaid.
>

Perhaps my concern is just as valid, and my argument is strong enough.

>> An extensibility mechanism that is decentralized would be 
>> appropriate, because such a mechanism would have, built in, the 
>> capability of dealing with name collision. That means JS library 1 
>> could define its own "js-tab" attribute, and JS library 2 could 
>> define its own "js-tab2" attribute, and a person could use both 
>> libraries without a problem.
>
> You can do this as data-library1-tab and data-library2-tab. These 
> don't collide with data-tab, so this scheme works even without 
> everyone participating.
>

And again, perhaps someday we'll have a specification that details how 
to handle name collision so that such uses would be consistent. Oh, 
wait...

>> In my opinion, RDFa is a decentralized extensibility, yes, but with a 
>> limiting constraint: the data that is recorded using RDFa is based on 
>> a specific model, RDF.
>
> OK.
>
>> As such, it's extensible, in that many data vocabularies can be 
>> defined within RDF, and recorded with RDFa. It's extensible in the 
>> same way that the SQL data model is extensible. Like RDF, the SQL 
>> model is decentralized, too, in that you don't have to go to some 
>> governing body in order to define a new use for SQL. Wordpress has 
>> its own database design that differs from Drupal's, but both are 
>> based on the SQL Data model.
>
> I don't follow this example. Surely typical SQL databases have a table 
> called 'users' and you can't just go and merge two SQL databases 
> without a table name or schema collision.
>
Ignore the example, then, as irrelevant to the discussion.

>>>>> How does the Web become better if additional pieces of native-code 
>>>>> software hook to the DOM in addition to hooking to 
>>>>> <object>/<embed> and a byte stream?
>>>>>
>>>> Native code software?
>>>
>>> Code implemented as instructions native to the CPU. The way NPAPI 
>>> plug-ins and ActiveX controls are implemented.
>>>
>>
>> You're talking about the additional functionality necessary in order 
>> to parse the namespaced elements in HTML, and attributes, and store 
>> them in the DOM? We're not talking about XHTML, I'm assuming, since 
>> this capability already exists. You're talking about the proposal and 
>> its impact on the parsing, and JS access, in HTML?
>
> I'm not talking about additional functionality in building the DOM in 
> the parser. I'm assuming that the MS proposal meant to specify changes 
> to the functionality shipped in every HTML parser.
>
> I'm talking about the processing of the DOM subtrees that contain 
> extension elements or attributes once the tree has been built by the 
> browser-supplied parser. I'm asking if the purpose of enabling such 
> subtrees is to increase the plug-in API surface of browsers and enable 
> third-party native code to implement the presentation of such subtrees.
>
I would imagine so.

>>>> Well, when it comes to namespaced elements in SVG in an HTML 
>>>> document, I can see immediate benefit to JS libraries accessing 
>>>> those elements.
>>>
>>> SVG is not a "decentralized" extension to HTML, AFAICT. It's 
>>> centralized right here at the W3C together with HTML.
>>>
>>
>> I think you misunderstood my answer. And I'm not quite sure if your 
>> understanding of "centralization" is the same as mine, either, from 
>> how you're using it in the above sentence.
>>
>> I wasn't talking about SVG as an extension to HTML. I was talking 
>> about SVG being part of HTML, and the fact that it's fairly common 
>> for namespaced elements and attributes to be used in SVG. Common, and 
>> permissible based on the SVG specification.
>
> I see. Are existing JS libraries operating on SVG trees in existing 
> Web content using namespaced attributes and elements in a way that 
> data-* attributes don't address?
>

So let me get this straight: you're expecting that when a person copies 
and pastes an SVG file into HTML5, they will go through the SVG and for 
every namespaced attribute, they will replace it with a data-* 
attribute?  And what are they supposed to do with the namespaced elements?

Do I understand you correctly? Is this your proposal?

>> I tried to explain some uses and interests in distributed 
>> extensibility above. Let me know if these weren't sufficient.
>>
>> I believe that Tony also referenced a view of distributed 
>> (decentralized) extensibility, as well as some possible use cases.
>
> I'm interested in seeing a definition of what "decentralized 
> extensibility" means so that alternative proposals can be tested against the 
> definition to determine if they constitute "decentralized 
> extensibility". (I guess it wouldn't be unexpected if you, Tony and 
> Sam came up with different definitions, although so far Sam seems to 
> be avoiding writing down a definition even when asked.)
>
> Currently, the WG lacks a definition against which to assess if e.g. 
> the naming scheme Jonas mentioned (<org_example_foo>) would be 
> "decentralized extensibility".
>

I believe I have answered the question, and I think others have also. 
I'm not sure how else to answer it, though, so that it meets your 
criteria for a definition.

I think there may be a language or other communication problem, though. 
You're saying, in effect, you don't understand what decentralized 
extensibility is, or why it is needed. I would have thought this had been 
expressed, but evidently, we've used the incorrect terminology. Hmm, 
I've noticed that Ian also has trouble with understanding some of our 
concerns and expressed interests.

I think there is a real disconnect in the perceived importance of the 
needs of web page authors, developers, and accessibility advocates, and 
those who write HTML parsers. It seems that if we can't word our 
requests or concerns in terms that would primarily benefit HTML parser 
developers (browser or validator), the concern or need is, somehow, 
unimportant, or difficult to understand.


>>> Is the set of characteristics a proper subset of the characteristics 
>>> of Namespaces in XML or are the sets one and the same?
>>
>> Being able to handle name collision is probably the biggest area of 
>> concern.
>>
>> Any form of extensibility has to be able to handle a wide, diverse 
>> audience, who may or may not be aware of the work others are doing 
>> when they take advantage of the extensibility.
>
> For clarity, do you mean element and attribute names?
>
> Are there any other salient characteristics of "decentralized 
> extensibility", in your view, than
>  * a diverse audience being able to mint names
> AND
>  * collision avoidance when the minters are unaware of each other
> ?
>
I think avoidance of name collision is important, particularly if we 
need to phrase this discussion in terms parser developers can 
understand. I think parser developers can understand this one.

I think it's also important to facilitate an approach that is based on a 
mature specification that has widespread use. More importantly, one that 
could be used in SVG regardless of whether the SVG is parsed as XML, or 
as part of an HTML document.


> Would a scheme where names with an underscore are reserved for 
> extensions and anyone minting an extension name is told to prefix the 
> name with their domain name with components reversed and joined with 
> underscores (followed by an underscore), e.g. org_example_foo, be 
> "decentralized extensibility"?
>
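
The naming scheme described above can be sketched as follows. The function 
name extension_name is hypothetical, and the sketch assumes a domain made 
only of letters, digits and dots; the proposal does not say how other 
characters, such as hyphens, would be handled.

```python
def extension_name(domain, local_name):
    """Build a reverse-DNS extension name joined with underscores.

    Example: ("example.org", "foo") gives "org_example_foo".
    Assumes the domain contains only letters, digits and dots.
    """
    components = domain.lower().split(".")
    return "_".join(reversed(components)) + "_" + local_name


print(extension_name("example.org", "foo"))
```

Under this scheme the resulting name is a single plain string, not a 
namespace and local name pair.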
>>>>> * When content depends on language extensions that need client 
>>>>> software extensions to process, the ability of users to read Web 
>>>>> content is harmed in software/hardware environments for which the 
>>>>> client software extensions aren't available.
>>>>>
>>>>
>>>> I would say that a significant proportion of HTML5 falls into the 
>>>> category of needing implementation that isn't universally available 
>>>> in all environments.
>>>
>>> As far as I know, the HTML5 spec is royalty-free and it's being 
>>> implemented in multiple engines some of which are Open Source. There 
>>> doesn't seem to be any one party controlling the availability of an 
>>> HTML5 implementation for a given computing platform.
>>>
>>
>> You really misunderstood me on this one. I think it's the different 
>> views we each have.
>>
>> By this I meant that there are elements of HTML5 that are not 
>> currently supported in any browser, and won't be available in all 
>> environments for potentially years in the future.
>>
>> I can see now you must have meant something about royalty or patents? 
>> Is that it? I'm not sure where you're coming from with this one, 
>> based on your most recent answer.
>
> I mean that the ability of a user to read HTML5 or SVG content is not 
> gated by one party. The user's ability to read content in HTML+FooML 
> (where FooML is an extension) could be gated by one party if the 
> ability to process FooML is only available from the proprietor of 
> FooML as a binary extension to those browsers that the proprietor of 
> FooML chooses to support.
>

Gated?

So an application provides several namespaced elements and/or attributes 
that a person could add to their SVG or HTML, but the application 
developers are claiming that the elements and/or attributes are 
proprietary and can't be processed by anyone else.

So therefore, the use of namespaced elements and/or attributes is invalid?

I would assume, then, that we should pull all of the W3C specifications, 
because there isn't a single one of them that can't be used to create 
something proprietary.




>>>>> * Working with string tuple identifiers is harder than working 
>>>>> with simple string identifiers.
>>>>
>>>> Again, this has nothing to do with your concern about decentralized 
>>>> extensibility. I think we should focus on the most significant 
>>>> concern, address it, and then move on to implementation once past 
>>>> that initial concern. Don't you think?
>>>
>>> It depends on whether 'decentralized extensibility' is synonymous 
>>> with Namespaces.
>>>
>> No, I don't believe that decentralized, or distributed, extensibility 
>> has been defined as another name for Namespaces.
>
> OK.
>
>>>>> * Prefix-based indirection (where the prefix expands to something 
>>>>> as opposed to being just a naming convention) confuses people.
>>>>>
>>>> Again, outside of the initial concern about decentralized 
>>>> extensibility as a whole.
>>>
>>> It depends on whether 'decentralized extensibility' is synonymous 
>>> with Namespaces.
>>>
>>
>> So, what you're saying, then, is that your concern really isn't about 
>> distributed/decentralized extensibility. Your concern is about the 
>> specific implementation. Do I have that right?
>
> Assuming that the <org_example_foo> naming convention without API 
> changes counts as decentralized extensibility, then this particular 
> concern doesn't apply to decentralized extensibility in general but to 
> a particular implementation.
>

I'm confused, sorry. You have said we have not stated what decentralized 
extensibility is, but then you're saying whatever it is, we can use 
<org_example_foo> instead.

It seems to me, then, that you do understand what we mean by 
decentralized extensibility, and your requests for clarification in this 
regard are, sorry to be so frank, disingenuous.

So your counter proposal is, then, that people just make up whatever 
elements, without namespaces, and incorporate reverse DNS into them, as 
a way of handling name collision?

Then how do we handle the round trip for SVG, from XML to HTML5 and back 
again?

I'm also disturbed at the willingness to support an approach that 
creates such a divergence between the HTML and XHTML serializations of 
HTML5.

Frankly, I'm disappointed at the suggestion of a proposal that creates 
such a divergence from what's been practiced, and promoted, in the past 
from the W3C.



> However, my concern about needing proprietary processing software for 
> a FooML extension seems to apply to any extension scheme where 
> extensions are processed via broadened plug-in API surface.
>
>> Hopefully others will jump in and provide you the answers I don't 
>> seem to be able to provide. I do feel you deserve answers to your 
>> questions.
>
> Thanks. I'm hoping Tony (when he returns from vacation) and Sam would 
> clarify what their goals are when it comes to the "decentralized 
> extensibility" label.
>

You'll have to forgive me for being frank again--frank, not shrill--by 
saying that not everyone is confused by what Tony and Sam have said 
about this phrase. But hopefully they can provide the clarification you 
need.

Shelley
Received on Monday, 5 October 2009 13:07:03 UTC