Re: RDFa and Web Directions North 2009 from Kjetil Kjernsmo on 2009-02-14 (public-rdfa@w3.org from February 2009)

From: Kjetil Kjernsmo <kjetil@kjernsmo.net>
Date: Sat, 14 Feb 2009 01:51:49 +0100
To: Ian Hickson <ian@hixie.ch>, public-rdfa@w3.org, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
Message-id: <200902140151.54545.kjetil@kjernsmo.net>
On Saturday 14 February 2009, you wrote:

> Please don't take these questions as personal attacks. I honestly am
> trying to find out how RDF and RDFa are to work in HTML5, to see if
> they make sense to add.

Sure! Skepticism is sound, but you have to be aware that the questions you 
raise have all been discussed at length elsewhere, and sometimes all 
this advocacy seems to be a waste of time, time that would be better 
spent actually writing code (and sticking to XHTML for the web page needs) 
to prove the case with actual running code. Thus, I will be very brief.

> On Fri, 13 Feb 2009, Kjetil Kjernsmo wrote:
> > On Friday 13 February 2009, Ian Hickson wrote:
> > > Note that you can already "ask questions" on the Web. For
> > > example, I just searched for "which country napolean", which is
> > > neither the right question nor correctly spelt (though that
> > > wasn't intentional), and Google answered:
> >
> > Well, you just proved that google sucks, didn't you? It couldn't
> > get the answer to that basic question right...
>
> Would a system based on RDF or RDFa give a better answer to the same
> question? How? Is there a system running somewhere that can
> demonstrate this? Does it require all data to be marked up as RDFa?

I suggested a SPARQL query builder for KDE yesterday. It would be very 
good at cases like this. 

> > Another example, I'd like to have the latest version of the SPARQL
> > Update spec, and I expect to get it if I ask for "sparql update".
>
> How does RDF or RDFa solve this problem?

dct:date 
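
To make that slightly less terse, here is a minimal sketch of the idea, 
assuming each version of a spec carries a dct:date triple. The example.org 
URIs and the dates are made up for illustration, and the triples are plain 
Python tuples rather than anything from a real RDF library.

```python
from datetime import date

# Dublin Core Terms "date" property.
DCT_DATE = "http://purl.org/dc/terms/date"

# Hypothetical (subject, predicate, object) triples; URIs and dates are invented.
triples = [
    ("http://example.org/sparql-update/draft-1", DCT_DATE, "2008-07-15"),
    ("http://example.org/sparql-update/draft-3", DCT_DATE, "2009-01-22"),
    ("http://example.org/sparql-update/draft-2", DCT_DATE, "2008-10-03"),
]


def latest_by_date(triples):
    """Return the subject with the most recent dct:date, or None if none is dated."""
    dated = [
        (date.fromisoformat(obj), subj)
        for subj, pred, obj in triples
        if pred == DCT_DATE
    ]
    if not dated:
        return None
    # max() compares the date first, so the newest version wins.
    return max(dated)[1]


print(latest_by_date(triples))
```

A client that sees these triples can rank versions by an explicit date 
instead of guessing from link popularity or page text.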


> Do we have reason to believe that it is more likely that we will get
> authors to widely and reliably include such relations than it is that
> we will get high quality natural language processing? Why?

Yeah. Because high quality natural language processing is very unlikely 
to ever happen. It will remain a niche auxiliary system, and something 
that is only half-decent for English.

>
> How would an RDF/RDFa system deal with people gaming the system?

trust networks.

> How would an RDF/RDFa system deal with the problem of the _questions_
> being unstructured natural language?

See my tuberculosis use case. You make the false assumption that the 
user needs to formulate a question.

> How would an RDF/RDFa system deal with data provided by companies
> that have no interest in providing the data in RDF or RDFa? (e.g.
> companies providing data dumps in XML or JSON.)

I think we need something I've called GRLLA, i.e. the guerilla version 
of GRDDL ;-)

> How would an RDF/RDFa system deal with companies that do not want to
> provide the data free of charge?

That's OK. As long as there are links to something that the rest of the 
world likes, this is not a problem, it is a good thing.

>
> How would an RDF/RDFa system deal with companies that want to track
> per-developer usage of their data?

Wrong question, developers as we see them today will be an anachronism, 
that's part of the fun.

>
> On Fri, 13 Feb 2009, Kjetil Kjernsmo wrote:
> > On Friday 13 February 2009, Ian Hickson wrote:
> > >If Amazon couldn't even be bothered to add a class for "price" in
> > > the last  decade, why do we believe they will add RDFa?
> >
> > Because RDF(a) is actually powerful, class isn't. That's what I
> > think anyway...
>
> The problem description was just to get a relationship between an
> item and a price. Both a simple set of classes and RDFa completely
> solve this problem. Being more powerful is irrelevant in the context
> of that problem.

Again, you're looking at it from a single-use-case perspective. Please 
take a step back.

>
> > >How does RDFa solve the problem that they have that I described
> > > but that you cut from the above quotes, namely that they want to
> > > track usage on a per-developer basis?
> >
> > OK, it doesn't.
>
> If the problem is that we want price data out of Amazon pages, and
> RDFa doesn't solve the problem to Amazon's satisfaction, then why is
> RDFa being put forward as a solution?

I think Amazon will realise that they do not act in their own best 
interest, though it may take some time. 


> What did you do with the genres once you had them all aligned with
> union, intersection, and same-as relationships? That doesn't seem
> like the most useful structure for data to be exposed to a random
> user.

We did a bit of reasoning, constructed a graph from it where all the 
relations between genres are expressed, then found that we didn't 
have the hardware to do what we wanted, so we chopped it up to a tree 
again. So the user has a nice 2D tree on a ball that can be rotated at 
30 frames per second. With better hardware, we want to do a 3D 
rendering of it. Well, this is not an HTML page application, as you can 
tell. So I don't particularly care about serialising RDF in HTML, as we 
haven't got any uses where HTML is of importance. So my agenda here is 
mostly that I'd like to include the full semantics of what we have in 
the HTML that we do generate now and then, just in case somebody finds 
it useful. I think the most fascinating thing about the innovation that 
we've seen on the web is how people have taken things that were 
meant for one purpose and reused them for something entirely different. To 
me the question is not whether we need RDFa, the question is whether we 
need HTML5. 

> On Sat, 14 Feb 2009, Karl Dubost wrote:
> > I love natural language processing too. It is useful, though it
> > doesn't solve everything (except maybe in an English centric
> > world.)
>
> Nobody said it would solve everything. My point was just that it
> solved one specific problem as well as RDFa does.

No, you demonstrated that there were cases where it didn't solve the 
problem.


> > I want to provide pointers to detailed descriptions of the things I
> > mention in what I write.
>
> Isn't an <a href=""> suitable for this already?

Nope, this should be self-evident.

>
> > I want to be able to express myself succinctly with pointers to
> > other places on the Web where descriptions of the people, places,
> > subject matter can be obtained.
>
> Again, <a href=""> seems to have solved this problem well until now,
> why does it no longer solve the problem?

I really don't understand how you cannot see the problem with how this 
is done today...

>
> > Note, I don't want to point them to another chunk of blurb, I want
> > to point my readers to a page that has the sole function of
> > describing the aforementioned entities via their attributes and
> > relationships.
>
> Why?

Oh, please... This is the kind of question that gives people a strong 
impression that talking to you is a total waste of time...

> > As a page reader:
> > I want to have access to the entities behind the blurb. Today I can
> > see an opaque but nice looking Web page, I can also see the markup
> > behind the page, but I cannot easily discern the description of
> > entities mentioned in a Web Page.
>
> What good are these entities? What is my dad supposed to do with
> them?

The same thing that the people talking with our librarians are doing 
with them, actually find the information they look for.

> Can an RDF/RDFa system do better from a natural language query?

To date, I haven't even seen a half-decent natural language system in my 
own language, and I've seen a lot of nice sales-pitches. NLP is nice 
for some things, but part of the idea here is that we have a billion 
people out there who can tell us everything with much higher precision 
than NLP can, and they are happy to.

You know, I've heard a lot of criticism of the Semantic Web that amounts 
to "it is the failed AI of the '80s", and so it is rather funny to 
see all this fascination with NLP...

> If the above represents the state of the art for RDF or RDFa, then we
> are a _long_ way from RDF being ready to be exposed to regular users.

Yeah... Well, it is a question of how you'd expose it... It is the data, 
not the model that is interesting to expose to the user right now.

> People have a hard enough time (as you point out!) doing simple
> natural language queries where all they have to do is express
> themselves in their own native tongue.
>
> Asking them to understand "yago:BattlesOfTheNapoleonicWars" or
> "dbpedia-owl:MilitaryConflict" isn't going to fly.

Actually, this is an easier problem than you might think, it just 
hasn't had any attention yet. It is easy enough to attach an rdfs:label 
to those URIs, in any language, which would make it a lot more 
friendly.
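
A rough sketch of what I mean, assuming labels are stored as rdfs:label 
triples with a language tag. The label texts and the display_name helper 
are hypothetical, and the prefixed names are used as plain strings rather 
than expanded URIs.

```python
RDFS_LABEL = "rdfs:label"

# Hypothetical label triples: (subject, predicate, (text, language tag)).
labels = [
    ("dbpedia-owl:MilitaryConflict", RDFS_LABEL, ("military conflict", "en")),
    ("dbpedia-owl:MilitaryConflict", RDFS_LABEL, ("militær konflikt", "nb")),
    ("yago:BattlesOfTheNapoleonicWars", RDFS_LABEL,
     ("battles of the Napoleonic Wars", "en")),
]


def display_name(uri, lang, triples, fallback_lang="en"):
    """Pick a label in the user's language, else the fallback language, else the URI."""
    found = {}
    for subj, pred, obj in triples:
        if subj == uri and pred == RDFS_LABEL:
            text, tag = obj
            found[tag] = text
    return found.get(lang) or found.get(fallback_lang) or uri


print(display_name("dbpedia-owl:MilitaryConflict", "nb", labels))
print(display_name("yago:BattlesOfTheNapoleonicWars", "nb", labels))
```

The user interface only ever shows the label; the URI stays under the 
hood, and falls through to the raw identifier only when nobody has 
labelled it yet.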

Cheers,

Kjetil
-- 
Kjetil Kjernsmo
Programmer / Astrophysicist / Ski-orienteer / Orienteer / Mountaineer
kjetil@kjernsmo.net
Homepage: http://www.kjetil.kjernsmo.net/     OpenPGP KeyID: 6A6A0BBC
Received on Saturday, 14 February 2009 00:52:40 UTC