Semantic Argument (Warning: Long Post) from Doug Schepers on 2004-11-11 (www-svg@w3.org from November 2004)

From: Doug Schepers <doug@schepers.cc>
Date: Thu, 11 Nov 2004 04:45:37 -0500
To: <www-svg@w3.org>, "'Ian Hickson'" <ian@hixie.ch>
Message-Id: <20041111094537.8F0A314968C@pillage.dreamhost.com>

Hi, Ian-

You and I both have been throwing the word "semantic" around quite a lot.
I'm not sure we always mean the same thing by it, nor am I entirely certain
that we each have been self-consistent about its use. I'm almost certain
that we disagree on how best to achieve, facilitate, and present semantic
content.

Naturally, I understand what I mean by "semantic," and I've defined it
several times. In short, as I see it in a Web context, semantic content is
that which is marked up consistently with tags and attributes from a defined
ontology within a particular domain.

The audience for this content need not be universal. I care very little
about, for example, literary criticism or automobile manufacturing, but
those fields have their jargon and needs, and they care about it very
deeply. (In the case of lit-crit, perhaps a little too deeply).

The question is how best to present those data. I believe that the proposed
feature-set of SVG 1.2 concisely and specifically addresses how we can do
just that. It is bottom-up, not top-down; SVG doesn't dictate domain
semantics, nor should it... it just provides the basic building blocks to
create appropriate representation of those ontologies. This is, in my
opinion, an improvement on the model of an 80-Percent-Solution, where we
will always be constrained by what the W3C has decided to pay attention to.
This empowers users and authors to extend the Web, not just stumble along in
the status-quo.

Neither you nor I can decide what every user wants or needs. But given the
right tools, they can, and they can express it in a semantic fashion.

I don't see a very consistent usage of "semantic" in your posts. This isn't
surprising; you're clearly an intelligent guy, and I'm sure you have a
sophisticated idea of Web semantics that can't be expressed in a few
out-of-context emails. But I want to understand where you're coming from, so
I've quoted you on the subject from several recent posts. I'll respond to
specific points inline, and I'll welcome your response. Note that I'm not
continuing the threads I'm quoting, per se, but trying to establish a
rationale for semantics and its role in SVG.


Ian Hickson wrote:
| Sent: Wednesday, November 10, 2004 7:08 AM
| Subject: Re: SVG 1.2 Comment: vector effects
|
|
| On Sat, 6 Nov 2004, Andreas Neumann wrote:
| >
| > Besides that, vector effects can help to solve quite a few GIS
| > analysis features (such as intersection, excluding, merging of
| > elements). Doing it all by script would be complex and slow, besides
| > the problems that Doug mentioned, regarding semantics.
|
| I don't really see that it is appropriate for the Web browser
| to have built-in support for GIS analysis. I don't doubt that
| it would be very useful in your domain, but there is a line
| to be drawn at how much a Web browser needs to support.
|
| To give parallels with HTML -- HTML has support for a
| definition list - <dl>/<dt>/<dd> -- which can let you write,
| for instance, a glossary.
| That's great,

It is? It's better than nothing (maybe), but that's a far cry from
"great".


| but dictionary and encyclopedia authors ask why
| doesn't HTML also have support for saying that a word is a
| noun? Or an adjective? Why is there no way to define the
| syllables of a word? Why is <dd> only one level deep,
| allowing for multiple definitions but not sub-definitions?
| Why is there no markup for highlighting the root of the word,
| the pronunciation of a word, the etymology of a word?

You claim elsewhere that one of your objections to textflow in SVG is that
it loses the semantic value available in HTML, and yet here you admit that
the HTML semantics are inadequate for all but the lowest common denominator
of certain established print-legacy domains. HTML has a few semantic tags,
such as 'p', 'cite', and 'acronym', but the set is not very rich, and we
won't get to the Semantic Web that way.

The answer I would give is for a dictionary definition language to be
developed, and the markup to be preserved, merely styled with CSS or sXBL.


| The answer is that HTML -- like SVG! -- should be a generic
| language, suitable for a wide range of fields, relatively
| simple to implement. For more specific work, like writing a
| dictionary, more specific languages should be used, which can
| then be transformed into HTML when it comes to the final
| stage of showing it to the user.

But then you lose almost all the semantic value that's available! It does
very little good to have a rich structure that is then stripped of all that
structure in order to be presented. This goes directly against your stated
goal of increasing semantic content.

The proposed scheme for SVG+sXBL, as far as I understand it, is for the rich
semantic structure to be preserved in the presentation document, available
at the DOM level (and to document-scrapers that might read the content in
and render it appropriately), while the SVG (whose only semantics are
geometric and style-oriented) is what is shown at a visual level.

Doesn't that seem like a better choice?
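
To make the distinction concrete, here is a minimal sketch in Python. It is
not sXBL, and the dictionary element names ('entry', 'headword', 'pos',
'etymology', 'sense') and the render_entry helper are invented purely for
illustration. It only shows the general idea: the domain markup stays intact
and queryable, while a separate SVG tree is generated for display.

```python
import xml.etree.ElementTree as ET

# Hypothetical dictionary markup; these element names are assumptions,
# not part of any real dictionary language.
ENTRY_XML = """
<entry>
  <headword>duck</headword>
  <pos>noun</pos>
  <etymology>Old English duce</etymology>
  <sense>a waterbird with a broad blunt bill</sense>
  <sense>a score of zero in cricket</sense>
</entry>
"""

SVG_NS = "http://www.w3.org/2000/svg"


def render_entry(entry, x=10, y=20, line_height=18):
    """Build a separate SVG tree depicting the entry; the entry is not modified."""
    svg = ET.Element(f"{{{SVG_NS}}}svg")
    lines = [f"{entry.findtext('headword')} ({entry.findtext('pos')})"]
    lines += [f"{i}. {s.text}" for i, s in enumerate(entry.findall("sense"), 1)]
    for n, line in enumerate(lines):
        # One <text> element per displayed line, stacked vertically.
        text = ET.SubElement(
            svg, f"{{{SVG_NS}}}text", x=str(x), y=str(y + n * line_height)
        )
        text.text = line
    return svg


entry = ET.fromstring(ENTRY_XML)
svg = render_entry(entry)

# The presentation carries only geometry and strings...
print(ET.tostring(svg, encoding="unicode"))
# ...while the source tree still answers domain questions.
print(entry.findtext("pos"), "|", entry.findtext("etymology"))
```

The SVG output knows nothing about parts of speech or etymology; a scraper or
script that wants those facts asks the original tree, which is the role I
understand the preserved markup to play in the SVG+sXBL scheme.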


| Sent: Monday, November 01, 2004 3:00 PM
| Subject: Re: SVG 1.2 Comment: 4 Flowing text and graphics
|
|
| > You did not answer my question. How about drawings? Maps, schematics,
| > diagrams - pure drawings - use multiline text more often than not.
|
| I can't off hand think of any drawing that used multiline text with
| automatic word-wrapping where that text would not be better marked up
| using a semantic markup language. Maps rarely use multiline
| text in my experience (mostly text on a path),
| schematics typically have just labels, and when the
| labels expand into multiline text that text would need rich
| semantic markup such as HTML <var> elements.

Which rich semantic markup would that be? How useful is <var> in a larger
context? Here are most of the logical or semantic tags in HTML:
'h1-6', 'p', 'em', 'strong', 'cite', 'blockquote', 'dfn', 'acronym', 'abbr',
'address', 'ins', 'del', 'samp', 'code', 'var', 'kbd', 'href', 'alt',
'longdesc', 'title', 'ul', 'ol', 'li', 'dl', 'dt', 'dd', 'table', 'tr',
'th', 'td', 'col', 'caption', 'label', 'legend', 'sub', 'sup'

I may be missing some, but not many. You might argue for the inclusion of
form widgets, but those are so variable in function, as I mentioned before,
that only the labels, legend, and submit button have any sort of reliably
conveyable meaning. Of these 40-ish tags, many are structural, some are
hyperspecialized (4 for computer terms), some are near-synonyms ('cite' vs.
'blockquote', 'dfn' vs. 'dl'), some are for editing ('ins', 'del'), many are
loosely defined as far as content ('address', 'h1-6'), and others are
misused (tables for layout, lists for navigation menus as well as shopping
items). Add the fact that many are seldom used where they should be (how
often does a shopping cart really use lists? How many acronyms are tagged?),
and in a pragmatic sense, you don't have a very semantic Web.
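
For anyone who wants to check this against real pages, here is a rough
sketch in Python of how one might tally it. The SEMANTIC_TAGS set is taken
from my list above (leaving out 'href', 'alt', and 'longdesc', which are
attributes rather than elements), and the sample markup and the
semantic_share helper are made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# Element names from the list above; attribute names are omitted.
SEMANTIC_TAGS = {
    "h1", "h2", "h3", "h4", "h5", "h6", "p", "em", "strong", "cite",
    "blockquote", "dfn", "acronym", "abbr", "address", "ins", "del",
    "samp", "code", "var", "kbd", "title", "ul", "ol", "li", "dl", "dt",
    "dd", "table", "tr", "th", "td", "col", "caption", "label", "legend",
    "sub", "sup",
}


class TagTally(HTMLParser):
    """Count every start tag seen in the document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def semantic_share(html):
    """Return (start tags from SEMANTIC_TAGS, all start tags)."""
    tally = TagTally()
    tally.feed(html)
    tally.close()
    total = sum(tally.counts.values())
    semantic = sum(n for t, n in tally.counts.items() if t in SEMANTIC_TAGS)
    return semantic, total


# A made-up shopping cart fragment that uses no lists at all.
SAMPLE = (
    '<div><font size="2"><b>Cart</b></font>'
    "<div><span>Eggs</span><br><span>Milk</span></div>"
    "<p>Total: 2 items</p></div>"
)

semantic, total = semantic_share(SAMPLE)
print(f"{semantic} of {total} start tags are from the semantic list")
```

A count like this is generous to HTML, since it cannot tell a data table from
a layout table or a real list from a navigation menu; it only shows how often
the tags appear, not whether they are used as intended.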

And this ignores the case of specialized domains, which would need their own
lexicon. So, not only is the Web not truly an incarnation of the subset of
semantic capability it does have, but it leaves out almost everything!

This isn't HTML's fault. It's a hard problem. XBL could be used with HTML to
preserve semantic content while using HTML's strong side: presentation of
text in a standardized document format.


| > There is nothing that we can reuse as of today.
|
| Except for the entire line layout model and the semantic richness of
| using HTML as the markup language (not to mention the ability to then
| mix in MathML and instantly allow math expressions in SVG text as well).

Yes, this is what I'm enthusiastic about: using SVG to depict meaningful
markup, as a sort of transmission medium.



| Sent: Wednesday, November 03, 2004 11:23 AM
| Subject: Re: SVG 1.2 Comment: 4 Flowing text and graphics
|
|
| On Tue, 2 Nov 2004, Robin Berjon wrote:
| >
| > Ever seen poetry laid out inside a shape? Ever seen ad text following
| > the shiny curves of the latest spacecraft? Ever seen some sombre
| > lament about the passing of time animated as it falls through an
| > hourglass? *That* is what it's for. It's for text when used as graphics.
|
| All three of those examples are great examples of documents
| that need semantic markup. Sure, they are presented with
| lovely shapes. But at the heart of the issue, they are still
| text, and it would make just as much sense for them to be
| rendered aurally using a speech CSS stylesheet, or to a TTY
| using a UA's built-in styling rules, or to have them indexed
| using Semantic Web inference rules.
|
| If those three examples are examples of when multiline text
| is to be used in SVG, then multiline text in SVG should be
| done by applying SVG to documents in other markup languages,
| not by adding more text markup to SVG, in clear violation of
| both AWWW and WCAG.

The flowing features aren't meant to convey meaning, specifically. They are
generalized metaphors that can be applied across a broad range of tasks with
similar constraints. Need a dictionary? Represent entries in a block of
flowing text that looks much like a paragraph (even though it's not), and
retain the entry syntax "behind the scenes." That's flowPara. Need a list of
items? Separate them with a flowLine. Need to cite someone? In classic Ted
Nelson/Xanadu style, use a flowTref. (Okay, so it's not full-featured
transcopyright, but it is the closest thing we have).

As for the naming conventions, if it quacks like a duck, let's call it a
damn duck. It's just a metaphor for ease of use and understanding, not a
claim to deep semantics.

The semantics are in the original content being represented, which is stored
in the document and the DOM.


| Sent: Wednesday, November 03, 2004 11:27 AM
| Subject: Re: SVG 1.2 Comment: 4 Flowing text and graphics
|
|
| On Tue, 2 Nov 2004, Jon Ferraiolo wrote:
| >
| > In my opinion, all of this talk about how SVG 1.2 flowing text
| > infringes on XHTML or infringes on CSS represents barking up the
| > wrong tree. What it really "infringes" on, if anything, is XSL-FO's
| > <fo:block> and <fo:inline>.
|
| You'll notice that I've been saying the same thing about
| XSL:FO as I have been saying about SVG's proposed multiline
| text feature. It violates AWWW and WCAG, has poor
| accessibility, and semantic markup styled with CSS is a much
| better model which should have been used instead.

Why is semantic markup styled with CSS better than semantic markup styled
with SVG? I submit that it isn't.



| Sent: Wednesday, November 03, 2004 8:20 PM
| Subject: RE: SVG 1.2 Comment: Detailed last call comments (all chapters)
|
| On Mon, 1 Nov 2004, Doug Schepers wrote:
| >
| > Are you saying that sXBL is not a good use of SVG? That creating a GUI
| > using a guiML rendered in SVG is a bad idea? I doubt that's what you
| > mean, but that's what it sounds like.
|
| Yes, I am saying that. It would be very bad for any unknown
| XML language to be sent over the Web -- sXBL doesn't change that.

Just because it's not from the W3C doesn't mean it's unknown.


| > The semantics would come from the domain-specific XML; this in turn
| > would lead to accessibility (when coupled with SVG1.2's new focus
| > attributes). In fact, this would be a very good accessibility case.
|
| Using a language that was well-known (e.g. one that was a W3C
| Recommendation, such as XForms) would mean that the content
| had semantics.

That's the only criterion for semantics? That it's a W3C Recommendation? I
know some people who have designed ontologies for medical records that would
take issue with that analysis.


| Using a language that is known only to the sender, and that the
| user's Web browser has no built-in support for, would lead to very
| _poor_ accessibility. sXBL can't add semantics any more than CSS can.

Again, there are many cases (in fact, the overwhelming majority of them) in
which domain-specific semantics have user bases of many thousands of people,
and which are very well-defined, but which cannot currently be represented
in a meaningful way on the Web (or on intranets, as the case may be). sXBL
doesn't add semantics, and nobody made that claim. It simply allows them to
be represented.

But let's take your extreme example of an XML dialect that is developed by
some solipsist who sees a need for it. He makes a graphical interface or
representation of it using sXBL+SVG. Worst case scenario: he is
misunderstood, ignored, and his work goes the way of all flesh. Best case
scenario: his work is seen as laudable, it is taken up and improved by
others, it becomes well known and well understood, and the semantics of his
XML become entrenched.

If I may wax metaphoric for a moment, please think of Esperanto... a
top-down, designed language that died because it had no roots; contrast this
with the lively and growing sign language that arose spontaneously among
deaf children in Central America, which flourished because it was created
out of a need. [1] This is how I think the Semantic Web will grow, by
allowing all sorts of XML dialects to be expressed, using basic tools; not
by mandating limited sets of XML and relying only on legacy technologies to
render them.


| Sent: Wednesday, November 03, 2004 8:57 PM
| Subject: Re: SVG 1.2 Comment: 4 Flowing text and graphics
|
| The point is all of those ideas would re-use the existing
| text layout model from SVG, wouldn't introduce an entire
| chapter's worth of new features, wouldn't step on CSS's toes,
| and wouldn't encourage the abuse of SVG for what should be
| semantic-level (HTML) markup.

Let's let RDF and other ontological XML dialects do their job of providing
rich semantics, and use HTML, CSS, and SVG to do their job of presenting
them.


Regards-
Doug

[1] http://encyclopedia.thefreedictionary.com/Nicaraguan%20Sign%20Language
Received on Thursday, 11 November 2004 09:45:40 UTC