Re: Use cases from Benjamin Hawkes-Lewis on 2011-01-02 (public-html-xml@w3.org from January 2011)

From: Benjamin Hawkes-Lewis <bhawkeslewis@googlemail.com>
Date: Sun, 2 Jan 2011 18:01:40 +0000
To: Julian Reschke <julian.reschke@gmx.de>
Cc: Norman Walsh <ndw@nwalsh.com>, public-html-xml@w3.org
Message-ID: <AANLkTi=Kd1hZKRm-=ALS29t2m4pPg_VtFkxtkShYr6LF@mail.gmail.com>
On Sun, Jan 2, 2011 at 10:56 AM, Julian Reschke <julian.reschke@gmx.de>
wrote:
>> The hypermedia interface is enabled by vendors agreeing on the common
>> interpretation of the semantics of a vocabulary and building
>> interfaces that reflect that interpretation, not by any particular
>> mechanism of enforcing the uniqueness of vocabulary terms.
>
> So?

My objection to arbitrary vocabularies was that they break the uniform
interface. You said they don't because of namespaces. I think this shows
that's a red herring.

>>> It can also happen in controlled environments, where you may be able to
>>> rely on certain browser extensions to be there.
>>
>> At that point, by definition, the interface is no longer uniform and
>> instead requires specialized client knowledge, breaking REST.
>
> Um, no. "Uniform" doesn't necessarily mean that everybody needs to
> understand it right now.
>
> It can depend on the intended audience, and a point
> of time. What's not understood widely today may be tomorrow. If this wasn't
> the case, we couldn't evolve the language, and add vocabularies the way we
> just did (with SVG and MathML).

Agreed.

But the implementors of the client software of the world wide web coming
together to agree how they will evolve the uniform interface is
different from producing gobbledygook only understood in "controlled
environments" like an intranet with known software characteristics, or
only understood via special communication between clients and servers
like in an RPC service.

And while the uniform interface cannot block evolution, text/html needs
to be evolved to maximize the uniformity of interface even for today's
user agents. Hence the attention paid to backwards compatibility in
HTML5; for example with the new "input" types that fall back to "type=text".
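
As a rough illustration of that fallback, here is a minimal sketch of how a
user agent might resolve the "type" attribute of an "input" element. The
function name and the abbreviated sets of types are assumptions made for
illustration; they are not taken from any browser's source.

```python
# Hypothetical model of how a user agent picks the type of an <input>
# element. All names and the abbreviated type sets are illustrative.

# Types an older, pre-HTML5 user agent might recognise (assumed subset).
LEGACY_TYPES = {"text", "password", "checkbox", "radio", "submit",
                "reset", "file", "hidden", "image", "button"}

# A few of the types HTML5 adds on top of the legacy set.
HTML5_TYPES = LEGACY_TYPES | {"email", "url", "number", "date", "search"}


def effective_input_type(type_attr, known_types):
    """Return the type the user agent will actually use.

    A missing or unrecognised value falls back to "text", so the
    control stays usable even when the agent predates the new type.
    """
    if type_attr is None:
        return "text"
    candidate = type_attr.lower()  # attribute values match case-insensitively
    return candidate if candidate in known_types else "text"
```

Under these assumptions, an older agent given type="email" presents a plain
text field, while a newer agent presents its dedicated email control; either
way the user can still enter the data.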

>> Also, in controlled environments you can just use other media types
>> including all the text/html vocabularies if you want arbitrary XML
>> vocabularies, so this *cannot* be a use case for adding such functionality
>> to text/html.
>
> Just because there's more than one way doesn't mean the other way "can't" be
> used.

No need has been demonstrated.

> I like progressive enhancement. It would be nice if it's always possible to
> use.
>
> It works best if you start with data that's close enough to what HTML
> already allows.

When the text/html media type does not allow you to represent your data,
maybe you're using the wrong media type …

>> If the consumer applies untrusted JS, then at some later
>> state the document might be made sensical by converting the nonsense
>> into recognized semantics. This is a big "if", because of network
>> unreliability and varying implementations of the language and DOM
>> APIs. But worse is that this is "forcing users to put themselves at
>> risk by executing untrusted code just to gain access to basic content
>> and functionality", as I mentioned before.
>>
>> At web scale, not all consumers will apply untrusted JS or even
>> implement JS. For these consumers, the document will remain
>> nonsensical. In this way, unrecognized vocabularies break the uniform
>> interface.
>
> Yes, that's a drawback.

QED.

>> Or as Fielding puts it:
>>
>> "Distributed hypermedia provides a uniform means of accessing
>> services through the embedding of action controls within the
>> presentation of information retrieved from remote sites. An
>> architecture for the Web must therefore be designed with the context
>> of communicating large-grain data objects across high-latency
>> networks and multiple trust boundaries."
>>
>> http://www.ics.uci.edu/~fielding/pubs/dissertation/introduction.htm
>>
>> Breaking RESTful expectations and endangering end-users in this way
>> is the exact opposite of what W3C should be encouraging.

[snip]

> What does the 2nd sentence have to do with what we're discussing?

You agreed above that "unrecognized vocabularies break the uniform
interface". How is that not "Breaking RESTful expectations"?

You agreed above that "unrecognized vocabularies" require consumers to
apply "untrusted JS". How is that not "endangering end-users"?

How is the mission of the W3C irrelevant to the use cases this task
force should address?

>>> But sometimes, annotating the HTML clearly is not the best approach, in
>>> which case alternate embeddable vocabularies may be the better choice.
>>
>> Prove it.
>
> We just added MathML and SVG, right?

That suggests we sometimes need to expand the core vocabulary by means
of the standards process, not that we need to bypass the standards
process.

>> Please give a real example of a resource that you imagine *cannot* be
>> represented in terms of the uniform interface provided by text marked
>> up with generic HTML/MathML/SVG semantics.
>
> Oh, so you say that after adding these, no new use cases will ever
> surface?

No, I'm trying to focus the discussion on solving real human problems.

> There are many more vocabularies that might qualify; the 3D stuff is
> one, Music and Chemistry might be others.

These are entire vocabularies rather than the concrete examples of
resources I was hoping for, but thanks for raising them nevertheless.

3D and music are essential aspects of the human experience and chemistry
is one of the fundamental sciences, so it is very important that W3C treats
them seriously and gets them right by adding any required features to
the core vocabulary, rather than just trusting in the magic of
distributed extensible gobbledygook to meet the W3C's commitments to
interoperability, internationalization, accessibility, security, etc.

I disagree that the information that might be represented via these
vocabularies *cannot* be represented, albeit sometimes cumbersomely,
with text marked up with today's generic text/html semantics.

Resources that could be described using 3D graphics can also be
projected in SVG and described using text. The latter is critical
to preserving media independence and accessibility even if we added
3D graphics markup to the core vocabulary.

3D environments today have an accessibility deficit:

http://wiki.secondlife.com/wiki/Accessibility

The involvement of W3C in specifying 3D markup for the baseline
hypermedia format would increase the chances of baking in the
necessary accessibility features. Indeed, Web3D work along these
lines is already looking to W3C for help:

http://web3d.org/x3d/wiki/index.php/X3D_and_E-Learning

There's a proposal on the table to add X3D to HTML5:

http://www.web3d.org/x3d/wiki/index.php/X3D_and_HTML5

Web3D want "to make the native authoring and use of declarative" 3D
"scenes as natural and well-supported for HTML5 authors as the support
provided for Scalable Vector Graphics (SVG) and Mathematical Markup
Language (MathML)." Getting vendors to agree on a common approach
through the standards process might achieve that; dumping random 3D
vocabularies straight into text/html is unlikely to do so.

With respect to music, there's been a recent discussion over on the
Audio Incubator Group mailing list about musical notation on the web:

http://lists.w3.org/Archives/Public/public-xg-audio/2010Dec/0029.html

I strongly urge everyone on this task force to read it with care, because it
discusses some of the human issues at stake and is really very
illuminating. In particular, I can't see how allowing arbitrary
vocabularies inside text/html would help solve the problems described
such as:

   - There are commercial incentives against making musical scores
     available in an open format on the web, including bandwidth,
     copyright, and vendor lock-in.
   - We need to support multiple representations (different
     scoring notations from different cultures, Braille, talking
     music, possibly actual performance).
   - There is more than one digital format for musical notation.
   - Existing formats have interoperability problems.
   - Existing formats confuse semantics and formatting.
   - Some existing formats are hard to hand author (not least because
     of XML).
   - Rendering music notation is extremely complex.
   - Browsers don't render the existing formats natively, although
     there is some plugin support.

I'd note that there is general enthusiasm for W3C being involved in the
standardization work when it comes to browser rendering, and that text/html
parsing was not mentioned as a problem.

On the other hand, there might be utility in defining the text/html
algorithm so that we can add MusicXML disambiguated by a <music> element
later. Leaving our options open in this way seems a far cry from Norm's
use case as described though. In passing, another approach might be to
look at mapping text/html, XML, and other vocabularies alike to common
RDF vocabularies, so that the serialization details cease to be
relevant.

You claim that music cannot be represented using the core vocabulary in
text/html today. While text/html is certainly not ideal for the job,
it's by no means impossible. You can render a musical score with SVG
(some music software like LilyPond already supports SVG export),
provide text equivalents of the form that would be spoken in talking
music, and allow direct data extraction from the markup by annotating
with the RDA Form of Musical Notation.

http://wiki.lilynet.net/index.php/SVG_backend

http://projects.dedicon.nl/am/talkingmusic.html

http://metadataregistry.org/vocabulary/show/id/55.html

Obviously, you can supplement this with links to alternate formats
(using "audio" for direct performance, and "a href" for MusicXML
and Braille Music Markup Language).

Chemistry: why can't chemical information be represented using text
marked up with the core vocabulary? Can't chemical construction be
diagrammed with SVG, articulated through MathML equations, and described
in plain text, then annotated for data extraction with RDFa?

http://chem-bla-ics.blogspot.com/2006/12/including-smiles-cml-and-inchi-in.html

Again, you can link to alternate representations like Chemical Markup
Language for more specialized clients:

http://cml.sourceforge.net/

> What I'm saying is that there are cases where you want to *embed* this data
> in an HTML document.

Is anyone embedding the arbitrary vocabularies you mentioned in
non-experimental contexts, rather than using text/html semantics and
linking to/annotating with/transcluding more specialised content?

>> Please further prove that it is better to break the uniform interface
>> rather than extend the uniform interface (by adding to the common
>> text/html vocabulary) in order to represent that resource.
>
> The only difference here is that you want central control. That's a process
> question.

Central steering is critical to the uniform interface.

The "process" (getting vendors and users and authors together in a place
where multiple concerns like accessibility and internationalization are
addressed) is critical to the quality of the uniform interface.

If you don't think central steering is useful, if you don't think having
at least one format that provides a baseline of access to the riches of
the internet is handy, then you don't need the W3C or IETF. Just publish
gobbledygook as text/html. The internet police can't stop you.

--
Benjamin Hawkes-Lewis
Received on Sunday, 2 January 2011 18:02:16 UTC