[whatwg] Ghosts from the past and the semantic Web

Shannon wrote:
> However to be on par with RDFa this proposal simply needs a CSS-like
> @import statement or vocabulary property and possibly an inline
> attribute as Silvia suggested.
> 
> <link rel="vocabulary"
> href="http://some.official.vocabulary/1.1/metadata.cm">

Not workable, as this is in the HEAD of the document and oftentimes we
simply can't expect users to be able to modify the head of the document
(widgets, blog engines where you can only modify the content of a blog
entry, etc.).

> <metadata>
> @import http://some.official.vocabulary/1.1/metadata.htmd

So, here you're making a new *element* up, which means that if you want
to add this to your document, your DOM tree changes in such a way that
your existing CSS rules may well go bust, at least if you intend on
reusing the rendered content and the machine-readable data the way RDFa
does:

  <h2 property="dc:creator">Ben Adida</h2>

[...]

> <div meta="vocabulary:
> url(http://some.official.vocabulary/1.1/metadata.htmd); title: Computer
> Engineer"></div>
> 
> These CSS behaviours each have benefits and drawbacks but all are widely
> used and understood by authors.

I don't think what you wrote above is widely used or understood. In
fact, I think it's not used at all, whereas RDFa is actually being used
today.

Also, one big hole: how do you make a statement about another item? How
do you describe multiple items on a page? How do you relate two items on
a page? Say, the Craigslist example, with multiple listings?

eRDF tried to squeeze everything into @class, and it isn't able to be as
flexible as RDFa (and thus as we need) in this respect. It has a lot of
trouble expressing data about multiple items.

What's surprising to me is this attempt to shoe-horn so much unexpected
stuff into @class. What is so sacred about HTML4 that *this* issue can't
be helped by a bit of rethinking? Certainly, everything else seems to be
up for rethinking in HTML5.

> Metadata comes in a large number of syntaxes of which RDF is only one.

RDF is not a syntax, it is a model for metadata.

> Since nearly all are text-based most can be easily transcribed from one
> to the other.

No, that's not true. The models have to be compatible. RDF provides a
model that is the most web-like of the models I've seen. I could be
wrong, of course, let me know if there are other models that you know
of. Maybe you're confusing RDF and RDF/XML, its XML serialization?

> I don't think the format is important one way or the other
> except when you want to embed the vocabulary in a HTML page.

You mean the syntax? Yes, the syntax is what we're after. RDFa is a
syntax for RDF in HTML.

> RDF can't
> do that because it's vocabularies are XML.

No, not at all. Its vocabularies have nothing to do with XML (that's
actually why we don't use QNames). I think you're confused about RDF. It
can be serialized as XML, but that is just one choice of serialization,
not anything that ties RDF to XML conceptually.
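
To make the model-versus-syntax point concrete, here is a minimal
sketch in Python. It assumes the dc: prefix maps to the Dublin Core
elements namespace; the page URL, the function names, and the JSON
layout are made up for illustration and are not any standard format.
The same triple is written out in two syntaxes, and the model is
unchanged:

```python
import json

# One statement in the RDF model: (subject, property, value).
# The subject URL is a made-up example; the property is dc:creator,
# assuming dc: maps to the Dublin Core elements namespace.
triples = [
    ("http://example.org/page",
     "http://purl.org/dc/elements/1.1/creator",
     "Ben Adida"),
]

def to_ntriples(triples):
    """Write the triples as N-Triples-style lines (no literal escaping)."""
    return "\n".join(f'<{s}> <{p}> "{o}" .' for s, p, o in triples)

def to_json(triples):
    """Write the same triples in an ad hoc JSON layout (illustrative only)."""
    return json.dumps([{"s": s, "p": p, "o": o} for s, p, o in triples])

def from_json(text):
    """Read the ad hoc JSON layout back into triples."""
    return [(d["s"], d["p"], d["o"]) for d in json.loads(text)]
```

Neither output involves XML, and reading the JSON back gives the same
list of triples we started with.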

> RDFa simply specifies a
> relationship between parts of two documents and is therefore not
> entirely different to @class, @rel or anchor fragments

Exactly, it's not that different! That's why we use @rel and @href. But
we need a little bit more, which is why we also have @about, and
@typeof. By the way, we considered using @class instead of @typeof, but
we met with serious opposition from folks who didn't want us to mess
with the existing uses of @class.

> and in itself
> does not appear to be "years of work" (except in advocacy maybe).

I said *RDF* was years of work. RDFa was also years, but a lot of that
was framing of the issue and figuring out if we could do things using
existing attributes (we couldn't).

> I don't see why RDF can't be parsed as an import. I was simply
> demonstrating that it is not necessary or desirable for all metadata
> properties of an object to be defined directly ON the element. They can
> be described elsewhere and associated through the use of CSS-like rules.

Except that means you have to coordinate multiple documents, and that
really doesn't follow the goal of having HTML be the carrier of all the
information. That's an important requirement for Creative Commons and
others.

> RDFa is just a bunch of custom attributes. I meant you can't embed the
> RDF vocabulary. You may say site-specific vocabularies are a bad idea

No, I don't think they're a bad idea, I just think that you need to have
the *ability* to create global, reusable vocabularies.

> but I have to disagree there. It is extremely common for small groups to
> develop their own "lingo".

Yes, and what better way to do that than to use URLs they own, at least
for the machine-readable portion?

> But the real point of all this wasn't just to recommend a new syntax but
> rather to recommend the reuse of class and selectors rather than the
> creation of new and clearly controversial attributes.

The existing attributes don't do everything we need, as the Primer makes
clear with the use of, for example, @about.

> Since CSS is
> familiar to web developers AND already implements the extension and
> selection mechanisms to target specific elements and groups it is
> superior to RDFa in many ways.

This argument is *exactly* the opposite of the criticism we heard during
our open period for comments: that whether @class is a semantic extension
or just a CSS hook, most users think of it as purely a CSS hook.

Plus, it's *not enough* to express the full range that we need.

> * Import OR embed official OR unofficial metadata and vocabularies with
> OR without modifying the target element. This is consistent with authors
> (including CC's) actual needs.

It's not consistent with CC's needs. Please let me speak to CC's needs :)

We need to be able to make statements about embedded images, sound
files, etc., with a consistent syntax across all. @class is not enough.

> * You can assign metadata properties by tag type, attributes and
> adjacency. The primary benefit is where custom classes and attributes
> aren't an option.

That's one extra level of indirection, which makes things more complicated.

> * You can specifically target elements by class and id. The primary
> benefit is more clarity or specificity.

One extra level of indirection doesn't make anything clearer. That's the
criticism we heard about CURIEs, and Manu explained well why we think we
gain from CURIEs more than we lose.

In this case, you're proposing indirection that gains us nothing, since
@class will still have to provide some semantic meaning that is *then*
mapped to more generic semantics in a separate file. What's the gain?
There's no real separation of concerns, unlike with CSS.

> * You can store the metadata locally or remotely and in any format (ie,
> RDF, ID3) that can be parsed by the agent to key/value pairs.

Key/value pairs are not enough. We need the subject, too. Triples.
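
A rough sketch of why, in Python, using two made-up listings on one
page (the URLs, property names, and the describe() helper are all
hypothetical). With bare key/value pairs there is nothing to say which
item a value belongs to; with triples each statement carries its
subject:

```python
# Hypothetical data: key/value pairs scraped from a page with two listings.
pairs = [
    ("title", "Red bicycle"), ("price", "50"),
    ("title", "Oak desk"), ("price", "120"),
]

# Without a subject, the pairs collapse: later values overwrite earlier ones.
as_dict = dict(pairs)

# Triples: (subject, property, value). The subject URLs are made up.
triples = [
    ("http://example.org/listing/1", "title", "Red bicycle"),
    ("http://example.org/listing/1", "price", "50"),
    ("http://example.org/listing/2", "title", "Oak desk"),
    ("http://example.org/listing/2", "price", "120"),
]

def describe(subject, triples):
    """Collect the property/value pairs stated about one subject."""
    return {p: v for s, p, v in triples if s == subject}
```

Here as_dict only remembers the second listing, while
describe("http://example.org/listing/1", triples) still returns the
title and price of the first one.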

> * The structure and metadata can be separated. The benefit is both the
> metadata and the HTML are cleaner and easier to maintain.

As I mentioned above, I don't buy this argument one bit. You still have
to assign @class, which is effectively an indirection on the metadata,
but really is another way of expressing (a limited amount of) metadata.
So there's no real separation of concerns.

> * Vocabularies can be joined (cascaded) or scoped using multiple classes
> or nesting.
> * Metadata can be commented.

I'm not sure I understand those two points, but I don't think it matters
much: the solution you propose appears to be quite a bit more convoluted
than the simple examples Manu described.

It's surprising to me that there is this aversion to additional,
well-thought-out attributes, but no aversion to stuffing untold
complexity inside @class and CSS, in a way that most HTML authors would
never expect.

It is quite a bit cleaner with the RDFa attributes.

> Anyway I'm just saying the OP was right. RDFa solves nothing that CSS
> syntax and classes can't, except that CSS syntax is familiar, simpler,
> more flexible AND more powerful. A win-win-win-win for authors.

No, it's not a win for Creative Commons and other folks who have worked
on RDFa because they needed a consistent syntax. Your solution doesn't
provide the expressiveness we need, and, in terms of design, I find it a
*lot* more crufty to start pushing untold amounts of extra information
into CSS.

@class may be a semantic extension for HTML, but CSS is not. It's
Cascading *Style* Sheets, so your proposal tries to fit a square peg
into a round hole.

-Ben

Received on Wednesday, 27 August 2008 23:36:59 UTC