what triple bloat? [was: RDFCore WG minutes for the telecon 2002-08-02 (rdf doc, datatypes)]

[...]

> 8) Datatypes
>
> Discussion of Guha's proposal to concentrate on local typing only;  his
> basic position is that we are signally failing to make progress on "global
> idiom", but it would be highly unsatisfactory to publish without a way to
> express (say) that some given literal is a number.
>
> Mike and Frank expressed a desire for the global idiom to be available.
> We believe that PatrickS (in absentia) strongly desires global idiom.
> DanC asked to explore going to last call without any datatyping.
>
> Guha clarifies:  proposal is to start with local idiom, allow
> application-specific or other layers to add global idiom;  i.e. not to rule
> it out completely.
>
> MikeD raises two objections:  local idiom adds triple-bloat;  local idiom
> provides no way for schema to say that a particular data type is expected
> for a given property.
> Guha clarifies:  "local idiom" means some way of specifying the type of a
> particular literal, not necessarily extra triples.  In particular, allow
> literals to include things other than strings.
>
> Jeremy:  what does graph look like with local typing?
>
> (Jeremy?):  restricting ourselves to local typing is one thing, but there's
> also the issue of tidy vs untidy literals that still needs to be addressed.
>
> ... (more discussion of various details.  General tone is supportive, other
> than concerns already mentioned.)
>
> Question:  can use of schema to express expected typing be consistent with
> local typing?  Yes, but the schema-described typing would not be enforced
> by the model theory.
>
> For clarity, the proposal is something like this:  <age
> xsi:type="xsd:integer">10</age>
>
> DECIDED:  Guha, Sergey, PatH, Mike, Jos will work on a new proposal to do
> local typing only;  jjc will do a test case or two.
>
> DaveB notes that if there are syntax changes, the final draft of syntax due
> next week is at risk.
>
> MikeD reasserts that triple bloat would be a real problem (current
> databases c. 0.5million triples).  The new proposal won't introduce new
> triples;  still some concern about need to annotate literals with type.
>
> ACTION 2002-08-02#3, Guha:   lead submission of new datatyping proposal, ASAP
> ACTION 2002-08-02#4, Jeremy: prepare test case(s) for new datatyping proposal


in the meeting I argued
[[[
[14:43:02] gk-scribe
JosD: don't buy triple-bloat argument -- prefers
"interpretation properties" approach, and believes
it is scalable
]]]
-- http://ilrt.org/discovery/chatlogs/rdfcore/2002-08-02.html#T14-43-02

a bit later from MikeD
[[[
[15:02:02] gk-scribe
MikeD: triple-bloat argument -- has a number of
databases in the 0.5million triple range -- taking
these to 2 million would be a problem.
]]]
-- http://ilrt.org/discovery/chatlogs/rdfcore/2002-08-02.html#T15-02-02

I really have a very hard time evaluating that
so-called "triple bloat" argument...
Let me take our implementation as an example
(although it may not be typical).
In the worst case, where all RDF objects are literals
and all of them are typed (i.e. no plain strings),
we go from 6x Java objects to 8x Java objects,
which is an increase of 33%.
In a more typical case, I would guess maybe 3%
or less, which is far from the 300% mentioned above
(0.5 million triples going to 2 million).
also

"The all knowledge is contained in here"
is not true (*)

and

we can translate from the more precise to
the less precise; one way

-- ,
Jos De Roo, AGFA http://www.agfa.com/w3c/jdroo/

(*) taken from Tim Berners-Lee
    this is at least true for this message, RDF
    and Graham's meeting minutes :-)

Received on Friday, 2 August 2002 16:36:22 UTC