Re: SemWeb Non-Starter -- Distributed URI Discovery

On Apr 10, 2005, at 11:16, ext Charles McCathieNevile wrote:

> On Mon, 04 Apr 2005 19:04:28 +1000, Patrick Stickler 
> <patrick.stickler@nokia.com> wrote:
>
>>
>>
>> On Apr 4, 2005, at 11:31, ext Jeremy Carroll wrote:
>>
>>> Al Miles wrote:
>>>>> Second, there is a more general discovery requirement which can be 
>>>>> loosely phrased as, 'I want to find out what x said about y,' or, 
>>>>> 'who said what about what?'  I have no ideas for how to solve 
>>>>> that.
>>
>> Hmmm...  I'm trying to grok how I might best rephrase the
>> question in a more practical form to avoid philosophical
>> nuances.
>>
>> Perhaps "what is the core, authoritative body of knowledge provided by
>> the owner of this URI which describes the resource identified by the 
>> URI?"
>>
>>> The right question is Alistair's.
>>
>> My point was that there is no single "right question". There are 
>> several.
>>
>> *A* right question is certainly Alistair's.
>>
>> *Another* right question is the one I pose above (hopefully better
>> worded than previously).
>
> Actually Alistair poses two questions, one of which strikes me as too 
> general to be one of the "most right" ones :-). I think that Patrick's 
> question is also somewhat general - what does the author of X have to 
> say about X is a pretty unconstrained question. It strikes me as 
> something that is often interesting, but is the kind of question I 
> would avoid asking if I were trying to do any specific work.


So you are not interested in using any authoritative
assertions in any of your work?

???


>
>>> A google like system seems to be a plausible answer, we just need an 
>>> economic model for it.
>>
>> A google like system is certainly a part of the answer, but we
>> also need access to authoritative descriptions of resources in
>> an analogous manner to how we now have access to authoritative
>> representations.
>
> I'm not sure this is true, if we have a google-like system that can 
> read RDF.

I think that it's being assumed that any such "google-like system"
will read RDF, and presumably also support SPARQL.

The issue is not about whether it supports RDF/SPARQL, but about
depending on centralized repositories to serve us data rather
than going to the authoritative sources directly.

Both approaches will be useful. And both will be necessary. And
it will be the direct, distributed access via web authorities
that feeds/builds those centralized repositories, while also
providing checks and balances against fraud and (data) corruption.

>  After all, RDF should be capable of defining various "authority" 
> relations, so you just describe the kind of authority that you mean 
> as part of your query, no?

Yes, but this is entirely separate from where that knowledge is
obtained. No matter where the knowledge is obtained, we will need
to be able to authenticate it.

(though, note that knowledge obtained directly from the web
authority brings with it a de facto form of authentication
and validity -- even if that will not be sufficiently robust
for some applications, e.g. financial transactions, etc.)

>
>> One reason why the web is a success is because it is distributed,
>> not centralized. One does not have to be aware of third party
>> centralized repositories of representations in order to ask for
>> one, given a particular URI. One just asks the web authority of
>> the URI. Yes, centralized repositories (or indexes) of representations
>> such as google are tremendous tools, but they simply augment the
>> fundamental architecture of the web.
>>
>> GET is to MGET as GOOGLE is to SPARQL
>>
>> Given a particular URI, and no further knowledge, one could
>> ideally obtain an authoritative description of the resource
>> identified by that URI from the web authority of the URI.
>
> Yep. But the idea that any collection of descriptions selected by 
> whoever sets up a webserver is intrinsically more interesting than a 
> specific query over descriptions that may have come from anywhere 
> strikes me as pretty flawed.

That's not at all what I'm arguing. Perhaps you should
re-read what I wrote.

I'm not arguing for either-or. I'm arguing for using both.

> If I could rely on that data to answer a handful of chosen questions 
> then I can see it being more useful, and if I could know in advance 
> when it would be useless that would be even better.

It may be more useful for some applications to query
third party sources -- but where shall those third
party sources get their knowledge, in a manner that
is traceable to its authoritative source?

Again, it's not either-or. Rather, google-esque crawlers
can harvest authoritative knowledge from the web authorities
of URIs, recording the source/authority of the knowledge
harvested using e.g. signed graphs, and make that knowledge
available in a centralized knowledge base.

If anyone questions the validity, freshness, or completeness
of that third-party-maintained knowledge, they can check
with the authority directly.
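
A rough sketch of that harvest-then-verify arrangement, in Python,
might look like the following. It is illustrative only: the names
KnowledgeBase, Harvested, and the fetch callable are hypothetical,
the fetch callable merely stands in for asking the web authority of
a URI (no actual MGET request is made), and a plain content hash is
used as a placeholder where signed graphs would really be needed.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple
from urllib.parse import urlsplit

Triple = Tuple[str, str, str]
# Hypothetical stand-in for "ask the web authority of this URI".
Fetch = Callable[[str], FrozenSet[Triple]]


def digest(triples: FrozenSet[Triple]) -> str:
    """Content hash of a description (a placeholder, not a signature)."""
    canonical = "\n".join(" ".join(t) for t in sorted(triples))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Harvested:
    authority: str               # host the description was obtained from
    triples: FrozenSet[Triple]   # the description as harvested
    digest: str                  # hash recorded at harvest time


class KnowledgeBase:
    """Third-party store that records the source of each description."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch
        self._records: Dict[str, Harvested] = {}

    def harvest(self, uri: str) -> Harvested:
        """Fetch a description and store it with its authority."""
        triples = self._fetch(uri)
        record = Harvested(urlsplit(uri).netloc, triples, digest(triples))
        self._records[uri] = record
        return record

    def lookup(self, uri: str) -> Harvested:
        """Answer from the centralized copy, without contacting the authority."""
        return self._records[uri]

    def is_current(self, uri: str) -> bool:
        """Re-ask the authority and compare against the harvested copy."""
        return digest(self._fetch(uri)) == self._records[uri].digest
```

A client would normally use lookup() for speed, and fall back to
is_current() (or to the authority itself) only when it has reason
to doubt what the third party is serving.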


>
>> But a centralized solution to knowledge discovery cannot be
>> the foundational or primary solution if we are to see
>> global, ubiquitous scalability achieved for the SW in the
>> same manner as has been achieved for the web.
>
> Right. Of course a lot depends on what you mean by "a centralised 
> solution"...

By 'centralized' I mean that (efficient) access to knowledge
must be via third parties, not directly from the web authority
of the URI identifying the resource in question.

Regards,

Patrick

>
> cheers
>
> Chaals
>
> -- 
> Charles McCathieNevile                      Fundacion Sidar
> charles@sidar.org   +61 409 134 136    http://www.sidar.org

Received on Sunday, 10 April 2005 18:02:48 UTC