[whatwg] RDFa Features from Manu Sporny on 2008-08-27 (public-whatwg-archive@w3.org from August 2008)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Wed, 27 Aug 2008 12:56:32 -0400
Message-ID: <48B58740.7010802@digitalbazaar.com>
Hi Smylers,

Thanks for taking the time to read this rather long thread and
contribute to the discussion. Responses to your comments are below. :)

Smylers wrote:
> Hi Manu.  Do you disagree with the Microformats community's belief about
> namespaces being more difficult, or do you think they are more difficult
> but that this doesn't matter?

I agree with the Microformats community's belief that namespaces are
more difficult to understand if you do not have any sort of programming
background.

I know that everyone who worked on RDFa also believes that namespaces
are more difficult to grasp than no namespaces at all. Some of the
questions that we wrestled with are:

1. How much more difficult are namespaces to grasp than not having
   namespaces?
2. Are the trade-offs worth the steeper learning curve?
3. Will it harm adoption?

How much more difficult are namespaces to grasp than no namespaces?
-------------------------------------------------------------------

The concept of namespaces is something that most regular folks can grasp
if you explain it to them in the correct way. This is mostly an
education issue because users use namespaces all the time on the web.
Case in point, the URL. Another way of explaining namespaces is via the
example of file folders on a computer hard drive. These are just two
examples; there are more. The point is that namespaces are a fairly
simple concept as long as you don't get into the linguistics and
computer science behind why they are important in knowledge representation.

Namespaces are difficult for people to grasp because most technical
people don't ground the student with a solid example of a namespace -
the URI or files within file folders.

We realized that it would be much easier for people to understand
namespaces if we used URIs as the method of namespace expression. It is
a concept that everybody is already familiar with and it has several
other benefits that I explained in the previous e-mail (uniqueness and
differentiability).

Are the trade-offs worth the steeper learning curve?
---------------------------------------------------

To understand this point, we must first examine why namespaces exist in
the first place. Namespaces are a method of disambiguation - they exist
to provide more context to the item to which we are referring. If we
choose not to have namespaces, we risk introducing ambiguity into the
language we are constructing. This ambiguity results in vocabulary term
collisions as evident in our work in the Microformats community.
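
A minimal sketch of what such a collision looks like, in Python. The two
vocabularies, their example.org URIs and the term "title" are all
hypothetical and chosen only for illustration:

```python
# Two hypothetical vocabularies that both define a term called "title".
AUDIO_VOCAB = "http://example.org/audio#"  # assumed URI, illustration only
JOBS_VOCAB = "http://example.org/jobs#"    # assumed URI, illustration only

# Without namespaces, both terms share one flat key, so the second
# definition silently overwrites the first.
flat = {}
flat["title"] = "name of a song"
flat["title"] = "a person's job title"

# With a URI prefix, each term keeps its own unambiguous key.
namespaced = {}
namespaced[AUDIO_VOCAB + "title"] = "name of a song"
namespaced[JOBS_VOCAB + "title"] = "a person's job title"

print(len(flat))        # 1
print(len(namespaced))  # 2
```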

Microformats chose not to use namespaces because we are not trying to
solve the larger problem of knowledge representation in that community.
We are solving very specific problems, but the community never intended
our approach to scale to the scope of the entire Web (all vocabularies).
In other words, Microformats are not extensible beyond a certain point
due to not using namespaces.

The alternative is to not use namespaces at all, and thus accept that
the language won't scale to all vocabularies.

RDFa is designed to be extensible to all vocabularies and heavily guards
against namespace collisions with the use of URIs.

Will it harm adoption?
----------------------

Only time will reveal the answer to this question. It will affect
adoption to a small degree, but not to a large one because people use
URLs all the time on the web. At the very worst, users don't have to
understand that they're namespacing anything; they will just cut and
paste from the examples listed on any one of the vocabulary tutorials on
rdfa.info:

http://rdfa.info/wiki/audio-tutorial

We also plan to release tools for Wordpress, Drupal and other CMSes that
will aid in the markup of semantic data. We also plan to release
validation tools to check the RDFa on a page (make sure the links are
dereferenceable, catch invalid input on the property names, etc).

> So that is one disadvantage of URIs: they are long.  

Yes, absolutely. This is one disadvantage of URIs. One must weigh this
disadvantage against the advantages that are being provided by using URIs:

1. Dereference-ability.
2. Easy namespacing - the ability to create your own vocabularies that
   are guaranteed unique.
3. A concept that is easily taught and learned by regular folks on the
   web.

> In fact they are so
> long that people have gone to the bother of inventing additional syntax
> to avoid having to write them out.

Yes, we did create another syntax to ease URI authoring called CURIEs,
but probably not for the reasons that you're thinking. We also rejected
QNames to fill that role for the following reasons:

1. We believe QNames should not be used in attribute values.
2. QNames are really restrictive in what can be used as a "reference".
3. QNames do not expand to URIs; they map to a tuple, and RDFa (and many
   other approaches that use URIs as resources) needs things to map to
   URIs.

The CURIE spec explains these points in more detail:

http://www.w3.org/MarkUp/2008/ED-curie-20080617/
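
As a rough illustration of the third point, here is a Python sketch of
how a CURIE expands to a full URI by concatenating the prefix's URI with
the reference. The function name expand_curie and the prefix name
"video" are assumptions made for this example, and the sketch ignores
details of the spec such as safe CURIEs and default prefixes:

```python
def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' to a URI by plain concatenation."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"cannot expand {curie!r}")
    return prefixes[prefix] + reference


# The prefix name "video" is hypothetical; the URI is the vocabulary
# used as an example elsewhere in this message.
prefixes = {"video": "http://purl.org/media/video#"}

print(expand_curie("video:Recording", prefixes))
# http://purl.org/media/video#Recording
```

The result is a single URI string rather than a (namespace, local name)
tuple, which is the property RDFa relies on.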

> The other advantage of unique prefixes over URIs is the one you mention:
> they are not dereferenceable.  As has been mentioned on this list, that
> means nobody (human or system) will attempt to reference them, either by
> mistake or in the hope of finding something there. 

All URLs listed as vocabulary terms in RDFa should be dereferenceable -
that's the whole point. What you are listing as an advantage has been
identified as a severe disadvantage for users.

We are strongly suggesting that the document that is dereferenced have
both a machine-readable part (the RDF vocabulary) and a human-readable
part (a human explanation of the vocabulary and all of its terms). Take
a look at the following vocabulary for an example of what should be at
the end of an RDFa vocabulary link:

http://purl.org/media/video#Recording

The vocabulary term above is dereference-able and if you put the term
into a web browser, you will get both a machine-readable and
human-readable definition of that term. Contrast that functionality with
the following as a namespaced vocabulary term that you cannot dereference:

foo.blah

If you wanted to know what the definition of foo.blah is, there is no
way for you to do so other than relying on a search engine to find the
vocabulary for you. The page that you land on might not even be the
correct page for the vocabulary that was used to mark up the original
page that you were viewing.

Using URLs allows one to specify, with great accuracy, both for machines
and for humans, the meaning behind semantic statements.
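
To make that concrete, here is a small Python sketch using the standard
library's urllib.parse.urldefrag. It only splits the term URL into the
document a client would fetch and the fragment naming the term inside
that document; it does not make a network request:

```python
from urllib.parse import urldefrag

term = "http://purl.org/media/video#Recording"

# The part before '#' is the vocabulary document to fetch; the fragment
# identifies the individual term within that document.
document_url, fragment = urldefrag(term)

print(document_url)  # http://purl.org/media/video
print(fragment)      # Recording
```

A bare term like foo.blah carries no such address, so there is nothing
to split and nothing to fetch.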

> So unique prefixes have 2 advantages over URIs; therefore they cannot be
> dismissed as unnecessary merely because URIs exist.

The approach wasn't dismissed - we chose a better solution that
addressed all of the needs of a reliable knowledge representation
system, something that was scalable without needing a central authority,
and a system that could be adopted by ordinary folks on the Internet.

> Of course those advantages don't necessarily apply to all users in all
> situations; there may be users whom don't find the above advantageous,
> and prefer URIs for other reasons.  That's OK, because such users can
> still choose to use a URI as their unique prefix.  (And there can be a
> rule which says you are only allowed to have something which is
> syntactically a URL as a unique prefix if you own that URL.)

You can always do something like

xmlns:foo="whatever-"

...

property="foo:blah-fnurt-jackalope"

although doing so is highly frowned upon for the reasons explained above:

1. The link isn't dereference-able.
2. It uses a namespace mechanism that is not used anywhere else on the
   web.
3. It has a much higher learning curve than plain old URIs.
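
The first point can be checked mechanically. The following Python sketch
uses a purely syntactic test (an http or https scheme plus a host) as a
stand-in for "dereferenceable"; the helper name looks_dereferenceable is
hypothetical, and a real validator would also have to fetch the URL:

```python
from urllib.parse import urlparse


def looks_dereferenceable(value):
    """Syntactic check only: an http(s) scheme and a host are present."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Expanding property="foo:blah-fnurt-jackalope" with xmlns:foo="whatever-"
opaque = "whatever-" + "blah-fnurt-jackalope"

# Expanding a term against a URL-based prefix instead
url_based = "http://purl.org/media/video#" + "Recording"

print(looks_dereferenceable(opaque))     # False
print(looks_dereferenceable(url_based))  # True
```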

> That suggests that giving users the freedom to use either URIs or any
> other prefixes of their choice is superior to forcing them to use URIs,
> surely?

I think we do give people the freedom to use any other sort of prefix of
their choice, within reason. I do not think that using anything other
than a dereferenceable URL is superior for the reasons outlined above.

Does that make sense? Do you have any further concerns with these responses?

-- manu

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: Bitmunk 3.0 Website Launches
http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches
Received on Wednesday, 27 August 2008 09:56:32 UTC