RDFa generalization (part 2)

More conversation on ways to extend RDFa to support the Microformats
view of the world:

[11:18:17] Manu Sporny: RE: XMDP being weird
The alternative is to express the vocabulary extension terms in RDFa...
which would fit cleanly with the RDFa model (of course). The uF
community wouldn't have to deal with it either... but I have a feeling
that they might want to use XMDP because that's how the uF community
specifies the machine readable part of their standards.
[11:18:38] … Using RDFa to mark up the vocabulary term extensions would
be much better.
[11:18:54] Shane McCarron: I think we could use something like it with
rdfa embedded so it was actually machine parseable
[11:19:01] … been thinking about it all night - no solution yet
[11:19:31] Manu Sporny: no, I think that is the solution: XMDP profile
(makes the uF community happy)... XMDP marked up using RDFa (to enable
the machine-readability portion of it) makes us happy.
[11:20:17] Shane McCarron: perhaps.  note that this is a slightly
different problem than what I was addressing in the page I wrote
[11:20:26] … I was trying to help eliminate all the xmlns boilerplate
nonsense
[11:20:44] … I think the two solutions can dovetail but they are for
different audiences
[11:22:08] Manu Sporny: I thought the solution to the xmlns boilerplate
nonsense was @prefix?
[11:22:14] … are you proposing @profile instead of @prefix?
[11:22:15] Shane McCarron: that's still nonsense
[11:22:20] … no - both really
[11:22:32] … @prefix is a replacement for @xmlns.  and you can use it if
you want
[11:23:19] … if you, however, need to define 10 prefixes in every
document it would be better to just reference a profile that did it for
you.  a well known profile that processors would already know about.
like w3.org/xhtml/rdfaProfile or something
[11:23:56] Manu Sporny: So, the tradeoff is that your parser then has to
go out and fetch another document in order to understand the current
document?
[11:24:41] … I don't know if that issue is serious enough that we need
to require that of all conforming parsers.
[11:24:43] Shane McCarron: yes - and that's not great.  but processors
will cache external references if they are written well.  it's the same
with microformats and their processors, isn't it?
[11:24:56] Manu Sporny: nope, not the same with uF processors.
[11:25:04] … uF processors just know which vocabulary they're parsing.
[11:25:12] … There is no "follow-your-nose" concept in uF (in practice).
[11:25:13] Shane McCarron: based upon values in @profile
[11:25:33] Manu Sporny: Yes, in theory, but nobody uses @profile when
writing Microformats (in practice, people just haven't done it).
[11:25:43] Shane McCarron: erk
[11:25:47] … then how the hell does it *work* ?
[11:25:51] Manu Sporny: yeah... Microformats == hardcoded.
[11:26:00] Shane McCarron: omg.  you have got to be joking
[11:26:03] Manu Sporny: You have a different parser for each Microformat.
[11:26:18] Shane McCarron: but how does it know which format to use on
which document?
[11:26:21] Manu Sporny: The parser hits the page and starts trying to
find, for example, "haudio" values in each @class attribute.
[11:26:27] … It doesn't.
[11:26:31] Shane McCarron: wow
[11:26:41] Manu Sporny: You seem to think that there's some sort of
grand design behind uFs.
[11:26:48] … :)
[11:26:50] Shane McCarron: tantek always said there was
[11:27:02] Manu Sporny: *shrug* - well, kinda.
[11:27:07] … but in practice, we don't use that design.
[11:27:19] … because nobody actually uses @profile in their uF documents.
[11:27:36] … They just start tagging their HTML and Operator does its
best to detect Microformats without looking at @profile.
[11:28:00] … Operator does such a good job of finding stuff without
@profile that it has become unnecessary.
[11:28:01] Shane McCarron: it's tag soup all over again...  that's tragic
[11:28:16] Manu Sporny: yes.
[11:28:19] Shane McCarron: and we want to perpetuate this nonsense?
[11:28:24] Manu Sporny: no
[11:28:47] … we want to provide a mechanism that maps all Microformats
to RDFa (which will, by default, pull people away from this nonsense)
[11:29:17] … so, while they use the same sort of markup (which makes uF
people happy)... perfectly valid triples are being generated behind the
scenes (which makes RDFa people happy).
[11:29:40] Shane McCarron: well - to be fair if they were using @profile
I could do that now via GRDDL
[11:30:11] Manu Sporny: our requirement is that they use something like
@prefix or @profile to tell RDFa processors what the valid values are.
If they don't use it, the triples aren't generated.
[11:30:30] … I, unfortunately, don't know that much about how GRDDL works.
[11:30:55] … I know there are crazy XSLT transforms that you can use
with GRDDL to produce RDF, but it's really above my head at this point.
[11:30:58] Shane McCarron: ignore that...  I am sure we can provide a
reasonable solution here
[11:33:34] … but any such solution is going to require an RDFa processor
to retrieve a document that defines the reserved values
[11:34:40] Manu Sporny: yes, and I think that's fine.
[11:34:40] Shane McCarron: some combination of role and an xhtml vocab
item that we standardize on, then a collection of triples that mean rdfa
reserved CURIE value for rel and rev
[11:35:29] … and, of course, we need to use the same convention in the
vocab# document 'cause any processor should in theory be reading that
document to get the default reserved values
[11:39:20] Manu Sporny: *nod*
[11:39:22] Shane McCarron: that's a little circular but I will get over it.
[11:40:21] Manu Sporny: The biggest down-side of this method is that
we're now requiring the RDFa processor to go get its configuration from
somewhere else (which is not a big deal for the parsers written in
Perl/Python/etc. - but it is a big deal for parsers written in C, C++, etc.)
[11:40:47] … I'm not looking forward to writing the code to go out to
the network and fetch a document containing yet more triples.
[11:41:04] … we should definitely make this extension an optional thing,
not an RDFa requirement.
[11:41:07] Shane McCarron: don't you already have code to fetch the
document you are parsing in the first place?
[11:41:20] Manu Sporny: nope, the application does that, not the parser.
[11:41:25] … The application feeds the HTML to the parser.
[11:42:02] … I'll probably end up putting in a callback from the parser
to the application: requestVocabularyTriples() or something to that effect.
[11:42:43] Shane McCarron: well - just generalize it.  getN3(URI) -
returns a graph. ;-)
[11:42:50] … probably by calling you
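
A minimal Python sketch of the callback idea above, assuming the
application hands the parser a function that returns triples for a URI so
the parser itself never touches the network. The names Parser,
fetch_graph, app_fetch, and the example.org URIs are hypothetical and are
not taken from any RDFa implementation.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]
GraphFetcher = Callable[[str], Iterable[Triple]]


class Parser:
    """Hypothetical parser that delegates all retrieval to the application."""

    def __init__(self, fetch_graph: Optional[GraphFetcher] = None):
        # The application supplies the callback; None disables the extension.
        self.fetch_graph = fetch_graph

    def vocabulary_triples(self, vocab_uri: str) -> List[Triple]:
        """Ask the application for the triples found at vocab_uri."""
        if self.fetch_graph is None:
            return []  # extension not supported: no extra triples
        return list(self.fetch_graph(vocab_uri))


def app_fetch(uri: str) -> List[Triple]:
    """Stand-in for the application's fetcher; serves a canned graph."""
    canned = {
        "http://example.org/vocab": [
            ("http://example.org/vocab#foobar",
             "http://example.org/ns#label",
             "foobar"),
        ]
    }
    return canned.get(uri, [])


with_extension = Parser(fetch_graph=app_fetch)
without_extension = Parser()
```

A parser built without the callback simply returns no vocabulary triples,
which is one way the feature could stay optional.
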
[11:43:11] Manu Sporny: Yeah, the more I think about this, the more I
think that this should be an optional parser feature, as far as parser
conformance issues are concerned.
[11:44:36] Shane McCarron: I don't mind that.  an "extension".  we need
some sort of feature test too then.  no problem
[11:47:53] Manu Sporny: *nod* - we could have a separate part of the
test harness to test optional conformance requirements.
[11:48:05] Shane McCarron: what would happen if a document required a
parser that supported the extension and some parser did not - how does a
consumer of triples know that?
[11:48:32] Manu Sporny: damn, good point.
[11:48:47] … I want to say that the consumer won't know that there are
more triples in the document.
[11:49:07] … They'd just get the "immediate" triples - the ones that are
defined using purely RDFa.
[11:49:15] … and not the extension.
[11:49:35] … "default prefix term expansion"  extension?
[11:50:32] Shane McCarron: "reserved word definition" extension
[11:50:39] … we call those things reserved words
[11:52:11] … "reserved word collect expansion" extension....
[11:52:31] Manu Sporny: "reserved word expansion" extension...
[11:52:40] Shane McCarron: kk
[11:54:02] … still need to create a mechanism for actually declaring
additional reserved words and mapping them to URIs, of course.
[11:56:39] … and of course having some sort of OWL declaration that
gives the terms meaning would be nice too, but let's not go crazy
[11:59:12] … I personally like a model where we say that a given uF
vocab has a prefix foo, which maps to URI bar.  and it also adds a bunch
of reserved values to the collection that, if you use them, act just as
if you said foo:value
[12:01:25] Manu Sporny: The triples that we would want to generate in
the @profile could be something along the lines of:

<http://example.org/vocab#foobar> rwe:alias "foobar".
[12:01:47] … or just rdfs:label?
[12:02:08] … rwe == "reserved word expansion" vocabulary
[12:02:36] Shane McCarron: I got that....  I think that it's important we
use a term that has semantics for us as a predicate so we don't run into
them in the wild and start creating accidental reserved terms
[12:03:11] … could just be xhd:reserved
[12:03:35] … err.... xhv:reserved
[12:03:44] Manu Sporny: *nod*
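
A minimal Python sketch of why a dedicated predicate helps, assuming
triples arrive as (subject, predicate, object) tuples. xhv:reserved is
only the name proposed in this chat, not a standardized term, and
reserved_words is a hypothetical helper.

```python
XHV = "http://www.w3.org/1999/xhtml/vocab#"
RESERVED = XHV + "reserved"  # proposed in this chat, not standardized
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def reserved_words(triples):
    """Map each reserved word (literal object) to its term URI (subject)."""
    table = {}
    for subject, predicate, obj in triples:
        # Only the dedicated predicate creates a reserved word.
        if predicate == RESERVED:
            table[obj] = subject
    return table


triples = [
    ("http://example.org/vocab#foobar", RESERVED, "foobar"),
    # An ordinary label found in the wild must not become a reserved word.
    ("http://example.org/vocab#other", RDFS_LABEL, "other"),
]
table = reserved_words(triples)
```

Because rdfs:label triples are ignored, only terms that a vocabulary
author marked on purpose end up in the table.
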
[12:05:53] Shane McCarron: so we use @about and @property to declare the
mappings?
[12:06:35] Manu Sporny: I think so.
[12:06:58] … I'm genning up some sample text now to see how it feels.
[12:07:02] Shane McCarron: ok
[12:08:05] … if there is a way to combine it with the prefix definition
strategy I proposed that would be cool
[12:15:21] Manu Sporny: yeah, this looks like it's going to be really
simple (defining the vocabularies, that is).
[12:15:33] …
  <dl>
   <dt about="#haudio" property="xhv:reserved">haudio</dt>
   <dd>
    Used to identify and describe metadata associated with an individual
    audio recording.
   </dd>
  </dl>
[12:16:14] … That's how you define the vocabulary - that file would be
placed at something like: http://microformats.org/rdfvocab.html
[12:16:42] … Then in your document, you would use either @profile or
@prefix to extend the allowable default prefix terms.
[12:18:53] …
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
 "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html version="XHTML+RDFa 1.0" xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>An hAudio example document</title>
</head>

<body prefix="EXTEND=http://microformats.org/rdfvocab.html">

<div typeof="haudio">
   <span property="title">Start Wearing Purple</span> by
   <span property="contributor">Gogol Bordello</span>
   <span property="published" content="20020514">May 14th, 2002</span>
</div>

</body>
</html>
[12:20:05] … or replace:

<body prefix="EXTEND=http://microformats.org/rdfvocab.html">

with:

<body profile="http://microformats.org/rdfvocab.html">

[12:21:06] Shane McCarron: yeah not wild about EXTEND.  Also, I would
like the rdfvocab document to contain a default prefix declaration so if
it is used in @profile you could also in your documents say ha:title if
you wanted to
[12:21:17] … at least I think I would
[12:21:22] Manu Sporny: Note that I used XMDP to specify the
Microformats vocab, but marked it up using RDFa for the benefit of the
RDFa parser.
[12:21:31] Shane McCarron: yeah I got that
[12:21:50] Manu Sporny: yeah, EXTEND is a hack - but i'm trying to find
a solution that doesn't use @profile.
[12:22:00] … just to see if one could exist.
[12:22:41] Shane McCarron: well @prefix="ha:URI" would be fine.  parsers
that support the extension could retrieve the URI to see if there are
reserved value extensions
[12:22:56] Manu Sporny: I think "rdfvocab document to contain a default
prefix declaration so if it is used in @profile you could also in your
documents say ha:title if you wanted to" is a very good idea.
[12:23:03] Shane McCarron: k
[12:23:11] Manu Sporny: ahh, right.
[12:23:51] … I'd prefer that we stay away from @profile since there are
a number of people that are using it for other stuff (GRDDL)... and what
we're talking about isn't some sort of "transformation mechanism".
[12:23:54] Shane McCarron: ok
[12:24:33] … how about @prefix="ha==URI" if you want to retrieve the
content... or some other convention
[12:25:09] Manu Sporny: that's a solution, but would probably be lost on
most developers. People would complain because we have this weird syntax
now that doesn't exist in any other attribute.
[12:25:24] … could we re-use CSS syntax... like from the style="" attribute?
[12:25:56] … prefix="foo:http://bar.org/ ; ..."
[12:26:01] … blech... don't really like that.
[12:26:07] Shane McCarron: me either
[12:27:32] Manu Sporny: It would be nice, though, to be able to hint to
the parser that the document should be retrieved and does contain
reserved words.
[12:27:35] Shane McCarron: some other attribute that means that
[12:27:56] … extendReserved="true"
[12:28:00] Manu Sporny: or, we just state that "there could be reserved
words, a parser must download and process the document in order to find
out if that's the case".
[12:28:23] … HTML5 would flip out if we try to add more attributes than
we already have...
[12:28:32] … (not saying it's a good reason)
[12:28:48] … ... just saying that there would be push-back for something
like extendReserved="true".
[12:29:13] … Plus, you never know if that target document
updated/removed certain reserved words... so it would be good to check
every time.
[12:29:27] … (from the cache, or retrieving the document each time).
[12:29:32] Shane McCarron: HEAD
[12:30:03] Manu Sporny: ?
[12:30:19] Shane McCarron: you use HTTP HEAD requests to validate your cache
[12:30:33] Manu Sporny: right.
[12:30:36] Shane McCarron: checks the last modified date.  you can also
use GET with an If-Modified-Since header
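
A minimal Python sketch of the two cache checks mentioned above, with no
network access: one helper builds the header for a conditional GET (a
server answers 304 Not Modified when nothing changed), the other compares
a cached Last-Modified value with the one a HEAD response reports. The
function names are hypothetical.

```python
from email.utils import parsedate_to_datetime
from typing import Dict, Optional


def conditional_headers(cached_last_modified: Optional[str]) -> Dict[str, str]:
    """Headers for a conditional GET; empty when nothing is cached yet."""
    if cached_last_modified is None:
        return {}
    return {"If-Modified-Since": cached_last_modified}


def cache_is_stale(cached_last_modified: str, head_last_modified: str) -> bool:
    """True when the HEAD response reports a newer Last-Modified date."""
    cached = parsedate_to_datetime(cached_last_modified)
    current = parsedate_to_datetime(head_last_modified)
    return current > cached
```

Either check lets a processor reuse a cached vocabulary document without
downloading the body again.
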
[12:31:24] Manu Sporny: My (veiled) point being that you don't want the
document using the vocabulary to state whether or not it should retrieve
the vocabulary document.
[12:31:49] Shane McCarron: oh I just meant we could flag that some
vocabs need retrieval.  some will not
[12:31:56] Manu Sporny: The client document never knows what reserved
values the vocabulary document specifies until it goes and reads the
values - and that's probably the way it should be.
[12:32:03] Shane McCarron: foaf for example.  there are no reserved values
[12:32:11] Manu Sporny: meaning, we shouldn't do something like
extendReserved="true"
[12:32:16] … *nod*
[12:32:33] … and I don't think we should promote the use of extending
the set of reserved words.
[12:32:39] Shane McCarron: well quite
[12:33:29] Manu Sporny: so, prefix="uf=http://microformats.org/rdfvocab"
[12:33:57] … then people can do stuff like "uf:haudio, uf:title, uf:hresume"
[12:34:13] … and in that target vocabulary document are triples that
specify an extension to the set of reserved words.
[12:34:33] …    <dt about="#haudio" property="xhv:reserved">haudio</dt>
[12:35:11] … So, really - the only thing we need is @prefix.
[12:35:26] … and the syntax doesn't change from what was proposed
several weeks ago.
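
A minimal Python sketch of pulling reserved words out of a vocabulary
document marked up as above, using only the standard library. It is not a
real RDFa parser: it matches the literal string xhv:reserved rather than
expanding the CURIE, and it assumes the marked element has no child
elements. ReservedWordExtractor is a hypothetical name.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ReservedWordExtractor(HTMLParser):
    """Collect {reserved word: term URI} from elements marked xhv:reserved."""

    def __init__(self, base_uri):
        super().__init__()
        self.base_uri = base_uri
        self.reserved = {}
        self._subject = None  # term URI while inside a matching element
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Simplification: match the literal CURIE instead of expanding it.
        if attributes.get("property") == "xhv:reserved" and "about" in attributes:
            self._subject = urljoin(self.base_uri, attributes["about"])
            self._text = []

    def handle_data(self, data):
        if self._subject is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: assumes the matching element has no child elements.
        if self._subject is not None:
            self.reserved["".join(self._text).strip()] = self._subject
            self._subject = None


extractor = ReservedWordExtractor("http://microformats.org/rdfvocab")
extractor.feed(
    '<dl><dt about="#haudio" property="xhv:reserved">haudio</dt>'
    "<dd>An individual audio recording.</dd></dl>"
)
extractor.close()
```

The relative @about value is resolved against the vocabulary document's
own URI, so "#haudio" becomes a full term URI.
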
[12:35:42] Shane McCarron: and processors retrieve every referenced
vocab to check for reserved word extensions
[12:36:01] Manu Sporny: yes - optionally, right?
[12:36:18] … optionally retrieve.
[12:36:31] Shane McCarron: maybe.   surely for now.  If they don't
retrieve it the triples won't work
[12:37:02] Manu Sporny: ? - do you mean the RDFa triples, or the
Microformat triples.
[12:37:15] Shane McCarron: well.... same thing
[12:37:42] Manu Sporny: not really, the only triples that are not
generated are the unprefixed ones.
[12:37:55] … if downloading the source document is optional.
[12:38:03] Shane McCarron: quite
[12:38:12] … but presumably any vocab can extend the collection of
reserved values
[12:38:22] Manu Sporny: yes, but it's heavily frowned upon.
[12:39:26] … if we don't frown upon that behavior, everyone might start
making all their vocab terms prefixed as well as unprefixed.
[12:39:32] … which will lead to vocab clashes.
[12:39:35] … pretty quickly.
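
A minimal Python sketch of the behaviour discussed above: prefixed terms
always resolve, while unprefixed terms resolve only if the parser has
loaded reserved words from the vocabulary document. It assumes the
name=URI form of @prefix used in this chat and, as a further assumption,
a prefix URI ending in "#" so that simple concatenation yields the term
URI. Both function names are hypothetical.

```python
def parse_prefix_attribute(value):
    """Parse 'name=URI name2=URI2' into a prefix table."""
    prefixes = {}
    for declaration in value.split():
        name, _, uri = declaration.partition("=")
        prefixes[name] = uri
    return prefixes


def resolve(term, prefixes, reserved):
    """Return the URI for a term, or None if no triple should be generated."""
    if ":" in term:
        prefix, _, local = term.partition(":")
        base = prefixes.get(prefix)
        return None if base is None else base + local
    # Unprefixed: only known if the reserved-word extension loaded it.
    return reserved.get(term)


prefixes = parse_prefix_attribute("uf=http://microformats.org/rdfvocab#")
reserved = {"haudio": "http://microformats.org/rdfvocab#haudio"}

with_ext = resolve("haudio", prefixes, reserved)
without_ext = resolve("haudio", prefixes, {})
prefixed = resolve("uf:haudio", prefixes, {})
```

A parser without the extension passes an empty reserved table, so it
drops the unprefixed term but still resolves uf:haudio.
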
[12:43:24] Shane McCarron: we need a rule for resolving clashes.....
[12:43:34] … I say last loaded reserved term wins
[12:43:52] Manu Sporny: *nod*
[12:43:55] … agreed.
[12:44:16] Shane McCarron: that way if I wanted to do something insane
like a scoped @prefix reference that brought in reserved terms (e.g. in
a div) it would override some global
[12:44:19] Manu Sporny: that also lets people override xhv values :)
[12:44:37] … *nod*
[12:44:44] … yeah, this is sounding really good.
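
A minimal Python sketch of the "last loaded reserved term wins" rule and
of a scoped override, using collections.ChainMap. The vocabulary tables
and example.org URIs are hypothetical; only the rule itself comes from
the discussion.

```python
from collections import ChainMap

# Hypothetical reserved-word tables, one per loaded vocabulary.
xhv_defaults = {"license": "http://www.w3.org/1999/xhtml/vocab#license"}
vocab_a = {"title": "http://example.org/a#title"}
vocab_b = {
    "title": "http://example.org/b#title",
    "license": "http://example.org/b#license",
}


def load_all(*tables):
    """Merge tables in load order; later tables overwrite earlier entries."""
    merged = {}
    for table in tables:
        merged.update(table)
    return merged


# Document-wide scope: vocab_b was loaded last, so it wins both clashes.
document_scope = ChainMap(load_all(xhv_defaults, vocab_a, vocab_b))

# A scoped @prefix (for example on a div) layers its terms on top.
div_scope = document_scope.new_child({"title": "http://example.org/a#title"})
```

Lookups in div_scope check the innermost table first and fall back to the
document-wide one, so the override ends with the element that declared it.
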
[12:46:24] Shane McCarron: earlier you said "so,
prefix="uf=http://microformats.org/rdfvocab""
[12:46:52] … didn't you mean so,
prefix="ha=http://microformats.org/rdfvocab" ??  you wouldn't combine
more than one uf in a vocab, would you?
[12:47:31] Manu Sporny: they're all combined.
[12:47:46] … All terms are re-usable by all other vocabularies (which is
why Microformats don't scale).
[12:47:55] … It's one gigantic namespace.
[12:48:06] … flat... ugly... unscalable.
[12:48:18] Shane McCarron: I get that is how THEY did it.  but does that
mean we need to do that too?  or are you suggesting we do this sort of
once and be done
[12:49:05] Manu Sporny: You can do it either way - I think the correct
way to do it for uFs is: prefix="uf=http://microformats.org/rdfvocab"
[12:49:29] … hAudio already has its own RDF vocabulary:
http://purl.org/media/audio
[12:49:59] Shane McCarron: so would that mean if you wanted to
reference uf in a scoped way you would say uf:haudio ?
[12:50:01] … correct.
[12:50:07] Manu Sporny: However, the Microformats vocabulary for hAudio
is different (they map between each other, but there are different terms
due to Microformats constraints).
[12:50:48] Shane McCarron: well if you think they would swallow that I
don't mind.  I think it would be cool if each uf mapped into its own
prefix too - to promote a correct use going forward.
[12:51:36] Manu Sporny: I don't know if they'd swallow that approach.
There will be people on both sides of the debate.
[12:52:31] … I agree that it would be good if each uf mapped to its own
prefix, but that's not the way it's done now... there is a global
Microformats vocabulary namespace and that's the approach that they've
taken right now.
[12:53:29] … http://microformats.org/wiki/existing-classes
[12:53:38] Shane McCarron: I understand that.  I think it would be nifty
if we could do both.....  here's this reserved term that works, but if
you wanted to reference it in a scoped manner here is the right way to
do that
[12:53:48] … there are no prefixes now so i think that it would work
just fine
[12:53:53] Manu Sporny: *nod* - I agree.
[12:54:11] Shane McCarron: one crisis at a time
[12:54:18] Manu Sporny: We should support both.
[12:54:26] Shane McCarron: exactly...
[12:54:48] Manu Sporny: Especially since there is no downside to
supporting both.
[12:55:09] … *nod* - I'll repost this discussion to the list, to keep
everyone apprised?

-- manu

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.

Received on Monday, 1 September 2008 17:17:17 UTC