Fwd: a concern on SW technologies: document content

---------- Forwarded message ----------
From: Henry Story <Henry.Story@sun.com>

Yes. It's a good point. I think XML is excellent as a markup
language, which is what it is. I.e., start with some text:

[[
Hello Joe,

How are you. I'll be over on 20th Dec.
]]

The markup is there to bracket parts of a document, to give them more
meaning.

[[
<b>Hello Joe</b>,
<p>
How are you. I'll be over on <date>20th Dec</date>.
</p>
]]

In RDF we treat the whole marked-up document as a literal.

So my thought is that rdf treats documents as literals. And XQuery is
great at querying such documents. RDF is good at showing the
relationship between the documents.
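
To make that division of labour concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the identifiers henry, doc1, wrote, content and addressedTo are hypothetical, the triples are plain tuples rather than a real RDF store, ElementTree path queries stand in for XQuery, and the letter is wrapped in a <letter> root element so that it parses as well-formed XML.

```python
import xml.etree.ElementTree as ET

# The marked-up letter from above, wrapped in a <letter> root element
# (an assumption) so that it is well-formed XML.
LETTER = (
    "<letter><b>Hello Joe</b>,"
    "<p>How are you. I'll be over on <date>20th Dec</date>.</p>"
    "</letter>"
)

# RDF side: plain (subject, predicate, object) tuples with hypothetical
# names. The whole document is one opaque literal in object position.
triples = [
    ("henry", "wrote", "doc1"),
    ("doc1", "content", LETTER),
    ("doc1", "addressedTo", "joe"),
]


def objects(subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def dates_in(xml_text):
    """Look inside the literal with a tree query (a stand-in for XQuery)."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("date")]


# The relations find the document; the tree query looks inside it.
docs = objects("henry", "wrote")
found = [
    date
    for doc in docs
    for literal in objects(doc, "content")
    for date in dates_in(literal)
]
```

The triple lookup never inspects the markup, and the tree query never sees the relations between documents; each tool only does the job it is suited to.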

As has been pointed out, you can't stuff everything on the planet inside
a document. Or if you do, you have rdf/xml, and XQuery is no use for
querying rdf/xml.

So I think one should use the tool appropriate for the job. When you
have text to mark up, use xml. When you have to express relations
between things, use rdf.

So the problem goes both ways. XQuery is no good at querying rdf,
which is exactly the same problem rdf has querying xml. The issue
gets confused because one can express rdf in xml. The pain XQuery and
XSLT people feel with rdf/xml, and that is causing the seeming rift
in the W3C, has nothing to do with the rdf/xml format. It has to do
with the fact that those tools are no good at querying relational
data. For the XQuery/XSLT people to work with their tools they need
to impose an arbitrary tree structure on the data even when this
brings nothing to the discussion. One can of course always do this
for local problems, but the cost of doing so is huge [1]. Since the
structure is completely arbitrary there is no way to really settle
the discussions, which then go on for ever. Imagine if, when
serializing java objects, one also had to decide on a tree
structure for each such serialization. Working
groups all over the world would have to spring up to deal with the
problem, and could never arrive at a satisfactory solution.
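
A small sketch of what "arbitrary tree structure" means in practice, again in Python with ElementTree standing in for the XML tooling. Both XML layouts and all element and attribute names below are hypothetical, invented for this illustration. They encode the same two facts (henry wrote doc1; doc1 is addressed to joe), yet a path query written for one layout finds nothing in the other.

```python
import xml.etree.ElementTree as ET

# Two hypothetical layouts for the same two facts.
AUTHOR_FIRST = (
    '<person id="henry"><wrote><doc id="doc1">'
    '<addressedTo ref="joe"/></doc></wrote></person>'
)
DOC_FIRST = (
    '<doc id="doc1"><author ref="henry"/>'
    '<addressedTo ref="joe"/></doc>'
)


def docs_by(root, person):
    """Path query written against the AUTHOR_FIRST layout only."""
    if root.tag != "person" or root.get("id") != person:
        return []
    return [d.get("id") for d in root.findall("./wrote/doc")]


def triples_from_author_first(root):
    """Flatten the person-rooted tree into a set of triples."""
    out = set()
    for doc in root.findall("./wrote/doc"):
        out.add((root.get("id"), "wrote", doc.get("id")))
        for to in doc.findall("addressedTo"):
            out.add((doc.get("id"), "addressedTo", to.get("ref")))
    return out


def triples_from_doc_first(root):
    """Flatten the doc-rooted tree into a set of triples."""
    out = set()
    for author in root.findall("author"):
        out.add((author.get("ref"), "wrote", root.get("id")))
    for to in root.findall("addressedTo"):
        out.add((root.get("id"), "addressedTo", to.get("ref")))
    return out


a = ET.fromstring(AUTHOR_FIRST)
b = ET.fromstring(DOC_FIRST)

same_graph = triples_from_author_first(a) == triples_from_doc_first(b)
hits_a = docs_by(a, "henry")  # layout the query was written for
hits_b = docs_by(b, "henry")  # same facts, different tree
```

Once flattened to triples the two layouts are indistinguishable, which is why neither nesting can be argued to be the right one.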

In the same way it is a little useless to try to model an xml marked
up document in rdf. It can of course be done, but it would be a
little heavy handed.

Perhaps one should find a way to mix the two in SPARQL, like this:

SELECT ?dtString
WHERE {
    :henry :wrote ?doc .
    ?doc :xquery ?dtString .
    FILTER xquery(?doc, ...some xquery to select the <date>...</date> subtree of doc...)
}

The XQuery and SPARQL people could easily get together to work out
the details.
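
As a rough illustration of how such a mixed query might behave, here is a sketch in Python. It is not SPARQL and not XQuery: a toy triple-pattern matcher plays the part of the WHERE clause, and an ElementTree path expression plays the part of the xquery filter. The data, the content predicate and the helper names (match, select, xpath_texts) are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: two documents stored as XML literals.
TRIPLES = [
    ("henry", "wrote", "doc1"),
    ("henry", "wrote", "doc2"),
    ("doc1", "content",
     "<letter><p>I'll be over on <date>20th Dec</date>.</p></letter>"),
    ("doc2", "content", "<letter><p>No plans yet.</p></letter>"),
]


def match(pattern, binding):
    """Yield extensions of `binding` under which `pattern` matches a triple.

    Terms starting with '?' are variables; anything else must match exactly.
    """
    for triple in TRIPLES:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new


def select(patterns):
    """Join the triple patterns in order, like a basic graph pattern."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, b)]
    return bindings


def xpath_texts(xml_text, path):
    """Run a path query inside an XML literal (stand-in for the xquery filter)."""
    return [el.text for el in ET.fromstring(xml_text).findall(path)]


# Graph part: which documents did henry write, and what is their content?
rows = select([("henry", "wrote", "?doc"), ("?doc", "content", "?xml")])

# Tree part: look inside each literal for <date> elements.
results = [
    (row["?doc"], date)
    for row in rows
    for date in xpath_texts(row["?xml"], ".//date")
]
```

The graph patterns pick out the documents and the path expression then selects the <date> subtree inside each one, so documents without a date simply contribute no rows.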

Henry

[1] http://blogs.sun.com/bblfish/entry/how_applying_xml_to_data


Home page: http://bblfish.net/
Sun Blog: http://blogs.sun.com/bblfish/
Foaf name: http://bblfish.net/people/henry/card#me



On 9 Dec 2006, at 02:31, Danny Ayers wrote:

> Any thoughts on this? (from the Semantic Web Education & Outreach list:
> http://lists.w3.org/Archives/Public/public-sweo-ig/2006Dec/0080.html )
>
> ---------- Forwarded message ----------
> From: Lee Feigenbaum <feigenbl@us.ibm.com>
> Date: 08-Dec-2006 21:52
> Subject: a concern on SW technologies: document content
> To: public-sweo-ig@w3.org
>
>
>
> Hi SWEOids,
>
> Wing and I had an interesting and somewhat enlightening conversation with
> another IBMer today. Our colleague was somewhat familiar with the SW world
> and is very familiar with the XML world, and he expressed concerns that SW
> technologies (and RDF / SPARQL in particular) may fall short in one
> prominent area in which XML / XQuery shines: dealing with content-oriented
> (often mixed content) documents. He was concerned about this given some of
> our claims about the value of RDF/SW technologies as a unifying
> environment for data and metadata.
>
> He gave various examples ranging from insurance policies to resumes to
> rental agreements, with the basic idea being that XQuery can easily
> answer questions that involve searching within a document (or, more so,
> searching for text in a particular paragraph of a document, perhaps with
> emphasis added) which uses XML markup. He wondered aloud and we discussed
> what the SW approach to this would be, and we agreed that it's lacking
> right now. He expressed worry that whereas XML can wrap data that might be
> best expressed as relational or RDF data (and then join that data in
> XQuery queries with document data), the RDF world may not have as nice a
> story.
>
> I (personally) need to think the issues here through a bit more, but to me
> it was not an objection that I've heard commonly, but it was an
> interesting one to which I had no immediate response, so I wanted to throw
> it out here and solicit thoughts and/or feedback. (I don't think it's
> imperative that we have an immediate or bulletproof response to every
> potential SW objection, but thinking about where the technologies fall
> short in addition to where they excel should help us craft our
> messaging.)
>
> have a good weekend everyone,
> Lee
>
>
>
>
> --
>
> http://dannyayers.com




Received on Saturday, 9 December 2006 17:15:07 UTC