RE: datatypes: conversation with Mark Butler, chair of cc/pp

> -----Original Message-----
> From: ext Butler, Mark [mailto:Mark_Butler@hplb.hpl.hp.com]
> Sent: 16 August, 2002 13:48
> To: Stickler Patrick (NRC/Tampere); Butler, Mark; McBride, Brian;
> w3c-rdfcore-wg@w3.org
> Cc: Reynolds Franklin (NRC/Boston)
> Subject: RE: datatypes: conversation with Mark Butler, chair of cc/pp
> 
> 
> Hi Patrick
> 
> Firstly thanks for your detailed reply.
> 
> > The datatype that constrains the value of BitsPerPixel to
> > the set of integers is implicit in the CC/PP standard. It
> > couldn't be explicit yet, because RDF doesn't provide a
> > mechanism for making it explicit.
> 
> The people who originally worked on UAProf and CC/PP seem to think of it as
> being implicit, but my view is slightly different: I think CC/PP and UAProf
> desperately need a way to make it explicit, because I think CC/PP processors
> should be general purpose and able to process any vocabulary as this is
> essential for device independence. This means that vocabulary data typing
> information shouldn't be hardcoded into processors, because processors (even
> if they are only processing UAProf) have to process several vocabularies.
> For example UAProf currently puts the data type in the comments field in the
> schema e.g.
> 
> <rdf:Description ID="BitsPerPixel">
>   <rdf:type rdf:resource="http://www.w3.org/TR/PR-rdf-schema#Property" />
>   <rdfs:domain rdf:resource="#HardwarePlatform" /> 
>   <rdfs:comment>
> Description: The number of bits of color or grayscale 
> information per pixel, related to the number of colors or shades of 
> gray the device can display. 
> Type: Number 
> Resolution: Override 
> Examples: "2", "8"</rdfs:comment> 
>   </rdf:Description>
> 
> So at the moment my processor has to parse the comments field to get the
> data type information! Clearly this is not ideal.

I would consider that an understatement, but you're
probably just being nice ;-)

Definitely, the more explicitly the semantics can be
expressed the better off we'll be.
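
To illustrate how brittle the comment-parsing approach is, here is a
minimal sketch in Python. The helper name fields_from_comment and the
assumption that the labels are always spelled "Type:" and "Resolution:"
are mine, not anything defined by UAProf; the comment text is the one
quoted above.

```python
import re

# Comment text as it appears in the UAProf schema fragment quoted above.
COMMENT = '''Description: The number of bits of color or grayscale
information per pixel, related to the number of colors or shades of
gray the device can display.
Type: Number
Resolution: Override
Examples: "2", "8"'''

# Assumption: labels are spelled exactly as in the quoted comment.
# Any change in wording or layout silently breaks the match.
FIELD = re.compile(r"^\s*(Type|Resolution):\s*(\S+)\s*$")


def fields_from_comment(comment):
    """Hypothetical helper: pull 'Type' and 'Resolution' out of free text."""
    fields = {}
    for line in comment.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


print(fields_from_comment(COMMENT))
```

The processor ends up depending on the layout of human-readable prose,
which is exactly the kind of hidden convention that explicit schema
statements would remove.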

> I've proposed an
> alternative form to OMA (see below) 
> 
> <rdf:Description rdf:about="&ns-prf;BitsPerPixel">
>   <rdf:type rdf:resource="&ns-rdfs;Property"/>
>   <rdfs:domain rdf:resource="&ns-prf;HardwarePlatform"/>
>   <rdfs:range rdf:resource='&ns-prf;Number'/>
>   <prf:resolutionRule rdf:resource='&ns-prf;Override'/>
>   <rdfs:comment xml:lang="en">
> Description:  The number of bits of color or grayscale information per
> pixel, related to the number of colors or shades of gray
> the device can display.
> Type:         Number
> Resolution:   Override
> Examples:     "2", "8"
>   </rdfs:comment>
> </rdf:Description>
> 
> which is much easier for an RDF processor to read. Now obviously when data
> types are finalized, the schema should use the approved data typing method.

Right. And if/when any new properties and/or datatypes
are added to CC/PP, applications which were written properly
to take their information from the schema itself rather than
having it hard coded need not be modified.

(Not to mention the opportunity to create/use generic
software components for dealing with arbitrary datatyped
content)
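
As a rough sketch of what "taking the information from the schema"
could look like, the Python below reads rdfs:range statements from a
schema and uses them to interpret inlined literals. Several things here
are assumptions for illustration only: the prf namespace URI is a
placeholder (the real one is behind the &ns-prf; entity), the CONVERTERS
table mapping prf:Number to a Python int is my own, and the "Vendor"
property in the usage note is made up.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
PRF = "http://example.org/prf#"  # placeholder, NOT the real UAProf namespace

# Cut-down version of the proposed schema entry, with entities expanded.
SCHEMA = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdf:Description rdf:about="{PRF}BitsPerPixel">
    <rdfs:range rdf:resource="{PRF}Number"/>
  </rdf:Description>
</rdf:RDF>"""

# Assumed mapping from range URIs to Python converters.
CONVERTERS = {PRF + "Number": int}


def ranges_from_schema(schema_xml):
    """Return {property URI: range URI} for every described property."""
    root = ET.fromstring(schema_xml)
    ranges = {}
    for desc in root.findall(f"{{{RDF}}}Description"):
        prop = desc.get(f"{{{RDF}}}about")
        rng = desc.find(f"{{{RDFS}}}range")
        if prop is not None and rng is not None:
            ranges[prop] = rng.get(f"{{{RDF}}}resource")
    return ranges


def typed_value(prop, lexical, ranges):
    """Interpret a literal using the property's declared range, if any."""
    converter = CONVERTERS.get(ranges.get(prop), str)
    return converter(lexical)


ranges = ranges_from_schema(SCHEMA)
print(typed_value(PRF + "BitsPerPixel", "10", ranges))
```

A property with no declared range (say, a made-up prf:Vendor) simply
falls back to a plain string, so adding a new property or datatype means
editing the schema and the converter table, not the processing code.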

> The only problem then would be persuading OMA to make this change. Perhaps
> as you work for Nokia you could help persuade OMA that revising the UAProf
> schemas in this way would be helpful?

I'm not myself directly involved in the OMA work, but
certainly can ping the folks who are about this.

> From your other comments, it sounds like we are in agreement. 

Looks like we are -- though a word of warning, not all WG
members subscribe to the above solution, whereby inlined
property values such as <BitsPerPixel>10</BitsPerPixel>
denote an integer value and rdfs:range can be used as above
to define the datatype by which such values are to be
interpreted. The WG has been a little stuck on this issue,
though I am hoping that your comments here will help nudge
us towards the above solution.

> > IMO, we will want our systems to become more modular and
> > generic insofar as knowledge representation and inference
> > is concerned, and to have to rely less on application specific
> > interpretation, so having the above sort of fuzziness in
> > the datatyping semantics and pushing the ultimate interpretation
> > out to the application layer will negatively impact scalability
> > and portability of knowledge, as one will have to be concerned
> > whether all applications utilizing the same RDF expressed
> > knowledge employ the same actual interpretations.
> 
> Yes, exactly. The only problem is this is hard: we don't have this in CC/PP
> yet, because processing CC/PP i.e. performing resolution to merge multiple
> profiles isn't fully defined so is potentially application dependent. It is
> fully defined in UAProf, but other CC/PP vocabularies don't have to follow
> the UAProf model. So my big concern with CC/PP at the moment is that when
> people create new vocabularies they may use it in a very application
> specific way, which negates the advantage of using CC/PP. If CC/PP has this
> problem, my guess is it's potentially even harder for RDF.

Quite so. And one step towards managing this larger problem
is to be sure that the semantics of those vocabularies are
expressed in RDF and not left at the internalized application 
level, and that means capturing the full meaning for all 
literals, as much as possible, including inlined literals
which are implicitly typed by schema assertions on the
properties. 

> Incidentally I characterise the resolution problem as follows: you have
> multiple values for the same attribute of the same resource but with
> different contexts so it is necessary to determine the most appropriate
> context to get the most appropriate value. I think this is not just a
> problem for CC/PP but also a general task for RDF. Is it likely future work
> will consider standardized approaches to this problem?

Contextual or qualified knowledge is certainly something that is not
addressed very well in RDF at the moment, and also something
that I myself have been thinking about, as have many others. I
expect and hope that such issues will be better addressed in the
next incarnation of RDF, as well as in intermediate solutions in
the not-so-distant future.

Interestingly, it was thinking about such issues that got me
concerned about datatyping, as I consider strong and explicit
datatyping to be a foundational component of that larger problem.
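
For the resolution problem as Mark characterises it, a very rough
Python sketch might look like the following. It assumes the candidate
values are already ordered from least to most specific context (for
example, defaults first), and that the rule names are the UAProf ones
as I understand them: "Override" as in the schema above, plus "Locked"
and "Append". The function name resolve is hypothetical, and this is
not a general answer for RDF.

```python
def resolve(values, rule):
    """Pick the effective value from context-ordered candidates.

    values: candidates ordered from least to most specific context
            (an assumption of this sketch).
    rule:   resolution rule name taken from the schema.
    """
    if not values:
        return None
    if rule == "Override":
        return values[-1]      # the last (most specific) value wins
    if rule == "Locked":
        return values[0]       # the first value cannot be changed
    if rule == "Append":
        return list(values)    # every value is kept, in order
    raise ValueError(f"unknown resolution rule: {rule}")


# BitsPerPixel default of 2, later overridden to 8.
print(resolve([2, 8], "Override"))
```

The interesting part is that the rule comes from the schema
(prf:resolutionRule) rather than from the application, which is the same
argument as for datatypes.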

Cheers,

Patrick

> best regards
> 
> Mark H. Butler, PhD
> W3C CC/PP Working Group Chair
> Research Scientist                HP Labs Bristol
> mark-h_butler@hp.com
> Internet: http://www-uk.hpl.hp.com/people/marbut/
> 

Received on Friday, 16 August 2002 07:11:29 UTC