Re: RDFSync 0.1 from Danny Ayers on 2006-03-22 (semantic-web@w3.org from March 2006)

From: Danny Ayers <danny.ayers@gmail.com>
Date: Wed, 22 Mar 2006 14:26:37 +0100
To: "Dan Brickley" <danbri@danbri.org>
Cc: "Hans Teijgeler" <hans.teijgeler@quicknet.nl>, "Giovanni Tummarello" <g.tummarello@gmail.com>, semantic-web@w3.org, reto@gmuer.ch
Message-ID: <1f2ed5cd0603220526x1b952715k35378fc11c72acb8@mail.gmail.com>
I've been very much looking forward to someone dealing with the sync
problem (reasons below); I only got as far as realising it's harder
than it first seems... Nice work Giovanni!

Given that Reto's been working in the same general area with diff,
patch and leanify [1], maybe you could coordinate on test sets - help
keep the competition friendly ;-)

One of his big targets is RDF version control (how best to *implement*
provenance & temporal labelling). This must also be in general scope
for your algorithms/tools. Has your previous digital-signing work
suggested any good strategies for managing the names/labels/signatures
of MSGs (/molecules/CBDs)? What kind of granularity seems appropriate
for practical version rollback?

That reminds me, after seeing your DBin demo I planned to re-read the
docs on Minimum Self-Contained Graphs [2], RDF Molecules [3] and
Concise Bounded Resource Descriptions [4], but still haven't had the
chance. I'm still confused over the similarities/differences and the
circumstances in which each would be better. Has anyone by any chance
done a short compare and contrast?

> * Hans Teijgeler <hans.teijgeler@quicknet.nl> [2006-03-22 00:30+0100]

> > Interesting, but what requirement is fulfilled by this?

On 3/22/06, Dan Brickley <danbri@danbri.org> wrote:
> Perhaps addressbook synchronisation over bluetooth? Syncing my
> Mac addressbook with my P800 mobile phone used to take many
> minutes (I believe it used SyncML).
>
> I have however solved that particular problem by thinking outside
> the box, and losing my mobile phone. My bluetooth addressbook
> problems are no more...

Heh, the drawback of that strategy is that it usually results in a
more complex replacement...

My Holy Grail for SemWeb tech is the Personal Knowledgebase:
addressbook and *everything* else. Although I share the view sometimes
expressed around these parts that looking for an individual "killer
app" for the Semantic Web is to miss the point somewhat, the PKB seems
like a scopable application of the technologies which could provide
direct, tangible benefit to the majority of computer users. What's
more it doesn't depend on the greater vision - such a (set of) tool(s)
could be useful without any network effects, although its benefits
would be magnified by such effects. For practical reasons, I think
graph sync is a prerequisite.

There's considerable overlap between what I'm calling a PKB and the
aims of the Semantic Desktop efforts (an SD could be a PKB). But what I
have in mind is looking at the info management problem more loosely:
the unified UI is certainly a worthy goal, but not essential for a
PKB. Additionally an SD could be viewed as one aspect of a wider
(personal) system encompassing lots of remote, distributed components.
To put it another way, the ultimate Personal Knowledgebase *is* the
Semantic Web (with tools to enable personal relevance filtering, trust
and access control).

Back to the lower-hanging fruit and the value of sync. For a PKB
system to be generally useful it should be possible to work offline,
and distributing data across machines probably makes sense for
robustness. Performance is another issue with which sync can help,
e.g. by having the addressbook part of the graph local to a mobile
phone rather than visiting a remote server for every lookup.
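
To make the offline case concrete, here is a minimal sketch of the
naive version of sync: diff two copies of a graph and patch the stale
one. It assumes every triple is ground (URIs and literals only, no
blank nodes), which is exactly the assumption that stops holding in
real data and part of why the problem is harder than it first seems.
The function names and the ex:/foaf: example data are made up for
illustration; this is not how RDFSync or Reto's diff/patch tools work.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
# Assumption: all terms are ground (no blank nodes), so set
# difference is enough to compare two graphs.

def diff(local, remote):
    """Return (to_add, to_remove) that turn `local` into `remote`."""
    return remote - local, local - remote

def patch(graph, to_add, to_remove):
    """Apply a diff and return the updated graph as a new set."""
    return (graph - to_remove) | to_add

# Hypothetical addressbook data: a stale copy on the phone and a
# newer copy on the server.
phone = {
    ("ex:alice", "foaf:name", "Alice Example"),
    ("ex:alice", "foaf:phone", "tel:+00-000-0000"),
}
server = {
    ("ex:alice", "foaf:name", "Alice Example"),
    ("ex:alice", "foaf:phone", "tel:+00-111-1111"),
    ("ex:bob", "foaf:name", "Bob Example"),
}

to_add, to_remove = diff(phone, server)
phone = patch(phone, to_add, to_remove)
```

Only the two sets returned by diff() need to cross the wire, which is
where the performance benefit would come from.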

In practice, in today's typical system architecture data for different
application domains are managed through different, usually *discrete*
applications with specialised, evolved user interfaces. But there's no
reason for such applications not to hook into a shareable data model,
i.e. RDF. So I might have the addressbook model fronted by one
application, calendaring by another, feed aggregator, weblog and
wiki/notebook and (oh yes) email by yet more. From the application
developer's and vendor's point of view there are advantages to having
application-local/domain-specific data, and that's how the current
generation of tools is generally built. But this needn't be an
obstacle to increased integration.

Even with RDF-based applications, most of the time it won't be
necessary for e.g. a Semantic Wiki and iCal/RDF app to know about each
other's model. But if the data from these can be periodically exposed
and sync'd into a common store (probably using named graphs to
maintain provenance info), when queries or navigation across the whole
knowledgebase are needed, a more general purpose tool (e.g. Longwell)
can be applied.
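
A rough sketch of that common store, assuming each application
periodically exports its triples and the store keeps them in a graph
named after the source. The CommonStore class, its methods, the urn:
graph names and the ex: terms are all hypothetical, and replacing the
whole named graph on each sync is a simplification rather than a real
sync algorithm.

```python
from collections import defaultdict

class CommonStore:
    """Toy quad store: graph name -> set of (s, p, o) triples."""

    def __init__(self):
        self.graphs = defaultdict(set)

    def sync(self, graph_name, triples):
        """Replace one application's named graph with its latest export."""
        self.graphs[graph_name] = set(triples)

    def about(self, term):
        """Return sorted (graph_name, triple) pairs where `term` is the
        subject or object, across every application's graph."""
        return sorted(
            (name, triple)
            for name, triples in self.graphs.items()
            for triple in triples
            if term in (triple[0], triple[2])
        )

store = CommonStore()
# The wiki and the calendar know nothing about each other's model.
store.sync("urn:example:wiki", {
    ("ex:alice", "ex:mentionedIn", "ex:ProjectNotes"),
})
store.sync("urn:example:calendar", {
    ("ex:meeting1", "ex:attendee", "ex:alice"),
})
# A later export from the wiki replaces only the wiki's graph.
store.sync("urn:example:wiki", {
    ("ex:alice", "ex:mentionedIn", "ex:ProjectNotes"),
    ("ex:bob", "ex:mentionedIn", "ex:ProjectNotes"),
})

hits = store.about("ex:alice")
```

Each hit carries the name of the graph it came from, so the
provenance survives the merge.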

Scale to the enterprise and I think this approach is even more
(pragmatically) compelling: the company addressbook may be in LDAP,
the mail in IMAP, accounts in Excel. All more-or-less straightforward
to translate to an RDF model (slightly harder work going the other way
because of the filtering needed, but in practice you probably wouldn't
need to).
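
As an illustration of the LDAP direction, a small sketch that turns
one directory entry into triples. It assumes the entry has already
been fetched as a dict of attribute name to list of values. The cn
and mail attributes and the FOAF name and mbox properties are real,
but the mapping table, the entry_to_triples function, the example.org
base URI and the sample entry are invented for the example. Attributes
without a mapping are simply skipped.

```python
from urllib.parse import quote

FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical mapping: LDAP attribute -> (RDF property, value converter).
MAPPING = {
    "cn": (FOAF + "name", str),
    "mail": (FOAF + "mbox", lambda value: "mailto:" + value),
}

def entry_to_triples(dn, attrs, base="http://example.org/people/"):
    """Translate one LDAP-style entry into a set of (s, p, o) triples.

    The subject URI is minted from the percent-encoded DN; that naming
    scheme is an assumption of this sketch.
    """
    subject = base + quote(dn, safe="")
    triples = set()
    for attr, values in attrs.items():
        if attr not in MAPPING:
            continue  # no mapping defined, so the attribute is dropped
        predicate, convert = MAPPING[attr]
        for value in values:  # LDAP attributes can be multi-valued
            triples.add((subject, predicate, convert(value)))
    return triples

# Made-up sample entry.
entry = {
    "cn": ["Alice Example"],
    "mail": ["alice@example.org"],
    "objectClass": ["inetOrgPerson"],
}
triples = entry_to_triples(
    "cn=Alice Example,ou=people,dc=example,dc=org", entry
)
```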

As long as it's possible to sync, integration of data sourced across
diverse applications can be relatively transparent. Just push stuff
onto the RDF Bus [5].

Cheers,
Danny.

[1] http://wymiwyg.org/2005/12/22/announicing-rdf-utils
[2] http://www2005.org/cdrom/docs/p1020.pdf
[3] http://ebiquity.umbc.edu/_file_directory_/papers/178.pdf
[4] http://www.w3.org/Submission/CBD/
[5] http://www.w3.org/2005/Talks/1110-iswc-tbl/#(24)



--

http://dannyayers.com
Received on Wednesday, 22 March 2006 13:26:47 UTC