ISSUE-1: Using RDF to express tokens/keywords

On 03/14/2010 08:08 AM, Mark Birbeck wrote:
> I think you're alighting on a very small point here, when as I say,
> the key thing is to first establish whether we should be using RDF at
> all.
>
> You seem to be implying that we *must* use RDF, because if we use
> name/value pairs there will be 'prefix leakage'...

Ha! I like the term 'prefix leakage' - sounds like a type of semantic
incontinence. =P

Background
----------

Good - I think this discussion is helping. It's certainly helping me
understand some of the more nuanced arguments that you are making as
well as clarifying some of your goals. For example:

* I thought you were proposing that @token/@vocab would operate exactly
  like xmlns:, but with a different attribute value syntax. Now I know
  that wasn't your intention.
* I thought you were proposing JSON strictly because of the Cross-Origin
  Resource Sharing issue, but it has more to do with name/value than it
  does with CORS.
* I thought you had been on the call where we discussed the "using RDF
  to express prefixes/tokens" issue, but now that I think back, you
  may not have been there... which is why we may have been talking
  past each other.

I don't mean to imply that we *must* use RDF. The 'prefix leakage' issue
is one that I'm concerned about, but you have largely mitigated that in
your last e-mail by saying that @token/@vocab/@whatever could be a
better solution than xmlns: and would be processed differently to
prevent the 'prefix leakage' issue.

We briefly discussed whether or not it makes sense to utilize RDF to
express prefixes/tokens/keywords and also whether it makes sense to use
RDF to express RDFa processing instructions. IIRC, there didn't seem to
be anyone opposed to the idea of using RDF to declare
prefixes/tokens/keywords.

RDF Triples Influencing RDFa Processors
---------------------------------------

Historically, we have been very wary of polluting the default graph. We
don't want leaks - we want to save ourselves from the embarrassment of
semantic incontinence (sorry, I just couldn't resist).

Speaking from a strictly philosophical standpoint concerning language
design, there is no hard and fast rule stating that a language processor
cannot be affected by the language itself. Lisp is a perfect example of
this philosophy in action - code and data are interchangeable, and the
programmer has access to the parse trees created from the program. The
parse trees can be modified during the operation of the program to get
different effects. One can even create Domain Specific Languages in Lisp
at runtime, using Lisp, to express concepts more succinctly.

Languages can be written in themselves, and the ability to affect the
runtime can be built into the language itself. C compilers, after first
being written in assembler, were then rewritten in C. JavaScript's
prototype-based object extension mechanism
(foo.__proto__ = proto_object) lets a running program change how its own
objects behave. These paradigms show that it's quite possible to have
the interpretation of a language be deterministically affected by the
language itself.

We can look at this as an RDF vs. name/value discussion, but I don't
think that's the main issue. I think the main issue has more to do with
in-band vs. out-of-band messaging to affect how a language is interpreted.

Two RDFa Processor Behavior Alteration Alternatives
---------------------------------------------------

I could go either way on the RDF vs. name/value discussion as I have
come to understand it over the past few days.

1. We choose a new attribute @token/@keyword/@whatever to add mappings
   to the "list of mappings". That mechanism would be preferred to
   xmlns: going forward and would operate like it, save for one
   difference. When this new attribute is used in an RDFa Profile
   document, the mapping would also become available in the document
   that utilizes the RDFa Profile.

2. We choose to create an RDFa Processor vocabulary that could
   future-proof us against changes to the RDFa Processor. We may or may
   not add @token/@keyword/@whatever, but if we do, it would have
   exactly the same functionality as xmlns:. The RDFa Processor
   vocabulary would have "rdfa:keyword" or "rdfa:token" or something
   similar to provide an out-of-band mechanism of expressing mappings
   to the RDFa Processor. The triples generated from these
   "rdfa:"-specific terms would be placed into a different graph - not
   the default graph.
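
To make the difference between the two alternatives a bit more
concrete, here is a rough Python sketch of what a processor might do in
each case. Everything in it is an assumption for illustration only: the
function names, the placeholder "rdfa:" namespace URI, and the rule that
a document's own mappings override those inherited from a profile are
not things the group has decided.

```python
# Hypothetical sketch only; none of these names come from an RDFa spec.

# Placeholder namespace for the proposed RDFa Processor vocabulary (assumption).
RDFA_NS = "http://example.org/rdfa#"


def mappings_from_attribute(profile_mappings, document_mappings):
    """Alternative 1: mappings declared with the new attribute in a profile
    document become available in the document that uses the profile."""
    merged = dict(profile_mappings)   # start with what the profile declares
    merged.update(document_mappings)  # assumption: local declarations win
    return merged


def split_graphs(triples):
    """Alternative 2: triples whose predicate is in the rdfa: vocabulary go
    to a separate processor graph, never to the default graph."""
    default_graph, processor_graph = [], []
    for triple in triples:
        _subject, predicate, _obj = triple
        if predicate.startswith(RDFA_NS):
            processor_graph.append(triple)
        else:
            default_graph.append(triple)
    return default_graph, processor_graph
```

In both cases the default graph stays free of mapping declarations; the
difference is whether the mappings travel as name/value pairs or as
triples in a second graph.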

So, fundamentally, both approaches are perfectly workable... which is
why this is going to be a difficult decision for the group. Our final
choice is not going to be influenced by "Which solution is workable?",
because both are, but rather "What gives us the greatest benefit for the
future of the RDFa language?"

Benefits of an RDFa Vocabulary
------------------------------

There are some other future-proofing benefits of an RDFa Vocabulary,
which may influence our decision on the final solution. One of these may
be how to trigger new functionality in RDFa Processors.

ISSUE-15 [3] deals with the deprecation of the @version attribute. It's
going to be a hard argument to get @version re-instated in the HTML WG.
Alternatively, we could create a new property "rdfa:version" - which
authors could optionally use to ensure that their document is parsed
according to a certain version's processing rules:

<head>
...
 <meta property="rdfa:version" content="1.1" />
...
</head>

Assuming that we're going to make the default @datatype a plain literal
in RDFa 1.1, we could do other tricks like "rdfa:default_datatype" -
which would tell the RDFa Processor to generate XMLLiterals for elements
that contain children if no datatype is present:

<head>
...
 <meta property="rdfa:version" content="1.1" />
 <link rel="rdfa:default_datatype" resource="rdf:XMLLiteral" />
...
</head>
...
<div property="math:formula">E = mc<sup>2</sup></div>

I'm not arguing that we want to do this... just that we /could/ do this
if we wanted to, along with expressing tokens/keywords, without
inventing a new attribute each time we wanted to extend the
functionality of RDFa Processors.
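
As a rough illustration of that extensibility, the Python sketch below
shows one way a processor could configure itself from triples held in a
separate processor graph. The placeholder namespace URI, the
configure() function, and the choice to silently ignore unknown
settings are all assumptions on my part, not proposed behavior.

```python
# Hypothetical sketch only; none of these names come from an RDFa spec.

# Placeholder namespace for the proposed RDFa Processor vocabulary (assumption).
RDFA_NS = "http://example.org/rdfa#"

# Settings this imaginary processor understands; None means "not specified".
KNOWN_SETTINGS = {"version": None, "default_datatype": None}


def configure(processor_graph):
    """Read rdfa:-prefixed triples and turn them into processor settings."""
    settings = dict(KNOWN_SETTINGS)
    for _subject, predicate, obj in processor_graph:
        if not predicate.startswith(RDFA_NS):
            continue  # not a processor instruction
        name = predicate[len(RDFA_NS):]
        if name in settings:
            settings[name] = obj
        # Assumption: settings we do not recognize are ignored, so newer
        # documents do not break older processors.
    return settings
```

A new setting would then only need a new term in the vocabulary and a
new entry in the table of known settings, rather than a new attribute.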

Other Factors Influencing this Decision
---------------------------------------

There are a number of other questions that may better influence this
decision - among them are:

* Does one of these solutions solve other issues that we have
  identified?
* Does one of these solutions provide some degree of language
  future-proofing?

If we choose Solution #1 above, we may simultaneously solve ISSUE-1 [1]
and ISSUE-14 [2].

If we choose Solution #2 above, together with the addition of
@token/@keyword/@whatever, we may simultaneously solve ISSUE-1,
ISSUE-14, and ISSUE-15 [3]. We may also provide some degree of
future-proofing for RDFa Processors.

-- manu

[1] http://www.w3.org/2010/02/rdfa/track/issues/1
[2] http://www.w3.org/2010/02/rdfa/track/issues/14
[3] http://www.w3.org/2010/02/rdfa/track/issues/15

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
President/CEO - Digital Bazaar, Inc.
blog: PaySwarming Goes Open Source
http://blog.digitalbazaar.com/2010/02/01/bitmunk-payswarming/

Received on Sunday, 14 March 2010 21:15:00 UTC