RE: SIMILE Research Drivers

Rob,

> -----Original Message-----
> From: Tansley, Robert 
> Sent: 7 April 2003 16:19
> To: Seaborne, Andy; 'Kevin Smathers'
> Cc: 'www-rdf-dspace@w3.org'
> Subject: RE: SIMILE Research Drivers
> 
> 
> Hi Andy,
> 
> > 1/ Just using RDF as a transport format is not really 
> > utilizing RDF because the semantics are hidden in the 
> > internal processed representation and not necessarily 
> > preserved on converting into and out of RDF.  As it is the 
> > internal semantics that matter (i.e. are first-class), you 
> > might as well use XML for a transport as you are relying on 
> > the converters to maintain the semantics across the Web.  
> 
> Could you explain what you mean here, perhaps with an example?
> 
> One of the big issues that SIMILE (and the Semantic Web as a 
> whole) will have to deal with (IMHO) is that [meta]data is 
> not necessarily going to be 'born RDF'.  One of the main 
> points of the project was, I thought, allowing communities to 
> use their own metadata schemas, most of which won't currently 
> be in RDF ('RDF native').  So, a big research question in 
> SIMILE as I see it is the feasibility of representing all 
> [meta]data in RDF, and how much work it is to translate 
> [meta]data in some other format into RDF.  We have to be able 
> to do this, do it without losing data, without unreasonable 
> effort required, otherwise we're not going to get very far 
> with the RDF approach in this area.

This is a significant issue for SIMILE.  A community isn't going to want a
reduction in the information in their [meta]data.

There are two ways of approaching interoperability between different
metadata schemas: a number of pairwise conversions (least lossy, more work)
or a single canonical form.  So I agree that a research question is the
feasibility of representing everything in RDF.  Maybe we also need to store
the original, for use solely within a community, to ensure the
original information is not lost.

The phrase "RDF as a transport format" conjures up the idea that RDF is only
used between systems, and not internally, so a lot of conversion is being
done.  This is potentially lossy as (1) no system is perfect and (2)
typically the internal form is the master and is not accessible except
through the transport format.  One *small* step with RDF is that there is a
way to say what was meant by a property - you give it an identifying name
and say "it's defined over there", providing a point of reference.  It may
well still take a person (programmer) to use that definition - I'm not
saying that it is completely automatic - but at least a fact (statement) can
be moved around through intermediate points with less risk of alteration or
confusion.

Example: conference site says "paper X in session with Y and Z"; SIMILE can
harvest that and pass it on to the client which can display X with a note to
say presented with Y and Z.

Converters usually choke on this because they are restrictive about things
they don't grok, and the internal form can't capture it.
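
A rough sketch of that difference, using plain Python tuples as triples
rather than any RDF library.  The conf: namespace, the presentedWith
property and the paper URIs are made up for illustration; only the Dublin
Core title URI is a real property name.

```python
# Triples as plain (subject, predicate, object) tuples; no RDF library assumed.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
CONF = "http://example.org/conf#"            # hypothetical conference vocabulary
PRESENTED_WITH = CONF + "presentedWith"      # hypothetical property name

# What the conference site says about paper X (URIs are illustrative).
harvested = [
    ("http://example.org/paper/X", DC_TITLE, "Paper X"),
    ("http://example.org/paper/X", PRESENTED_WITH, "http://example.org/paper/Y"),
    ("http://example.org/paper/X", PRESENTED_WITH, "http://example.org/paper/Z"),
]

# A restrictive converter only has slots for the properties it was built for.
KNOWN_PROPERTIES = {DC_TITLE}


def restrictive_convert(triples):
    """Keep only statements the internal form can capture; the rest are lost."""
    return [t for t in triples if t[1] in KNOWN_PROPERTIES]


def pass_through(triples):
    """Forward every statement unchanged, understood or not."""
    return list(triples)


def session_mates(triples, paper):
    """What a client could show as 'presented with', if the statements arrive."""
    return [o for s, p, o in triples if s == paper and p == PRESENTED_WITH]
```

With the restrictive converter the client has nothing to display beyond the
title; with pass-through it can still show X alongside Y and Z, even though
the middle system never understood the property.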

> I'm also curious about your use of the word 'semantics'.  How 
> are 'semantics' represented in RDF or any other data format?  
> My understanding was that the 'semantics' of data were the 
> real-world interpretations of the data.  For example if some 
> resource has a property 'foo:length' that is a literal '30', 
> the semantics of that are that the real-world object that 
> resource represents has a length of 30mm.  This information 
> isn't in the RDF, isn't in the schema; it's not in the 
> computer at all, it's a concept that's fundamentally outside 
> of the computer which is just dealing with symbols.
> 
> Is there some other meaning of the word or something else I'm missing?
> 
>  Robert Tansley / Hewlett-Packard Laboratories / (+1) 617 551 7624
>  

"Semantics" is a somewhat loaded term but I use it in a low-level way.  Here
the definition in the RDFS vocabulary should say foo:length works in terms
of millimetres.  Today, this might be in the rdfs:comment, so it takes
someone reading the vocabulary to use the data vocabulary correctly.
Something should be GETtable at the namespace URI, but there is no common
practice for what should go at the end of a namespace URI: it could be an
HTML page rather than RDFS.  We will have to see what common practice emerges.
I prefer the style of putting the vocabulary file there.

A better data model here might have been a bnode with units and rdf:value
hanging off it, but the definition you use is a good example because, in the
real world, we have to deal with less-than-ideal vocabularies, which may
have been fine for their original purpose in a closed world.
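
To show the two shapes side by side, here is a minimal sketch, again with
plain Python tuples standing in for triples.  The foo: namespace URI, the
foo:units property name and the "_:len1" bnode label are assumptions made
for the example; rdf:value is the only real term.

```python
# Triples as plain (subject, predicate, object) tuples; no RDF library assumed.
RDF_VALUE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#value"
FOO = "http://example.org/foo#"      # hypothetical namespace for foo:
FOO_LENGTH = FOO + "length"
FOO_UNITS = FOO + "units"            # hypothetical property name


def flat_length(resource, amount):
    """The model in the example: a bare literal, units known only from the docs."""
    return [(resource, FOO_LENGTH, amount)]


def structured_length(resource, amount, units, bnode="_:len1"):
    """Alternative: a blank node carrying rdf:value plus an explicit units statement."""
    return [
        (resource, FOO_LENGTH, bnode),
        (bnode, RDF_VALUE, amount),
        (bnode, FOO_UNITS, units),
    ]


def units_of(triples, resource):
    """Return the units stated in the data for resource's length, or None if unstated."""
    for s, p, o in triples:
        if s == resource and p == FOO_LENGTH:
            for s2, p2, o2 in triples:
                if s2 == o and p2 == FOO_UNITS:
                    return o2
    return None
```

In the flat form units_of finds nothing, so a person still has to read the
vocabulary; in the structured form the units travel with the data.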

	Andy

Received on Tuesday, 8 April 2003 11:27:05 UTC