RE: RDF [was: Proposed changes]

(cc'd to rdf-interest)

Ken MacLeod:
> > Sam Ruby is blogging his experiments with XPath and XQuery over his
> > accumulated store of XML.  I expect as part of his experimentation
> > he'll be adding more islands of information to that store as time goes
> > along.  My bet is that he's going to be writing a lot more supporting
> > code and queries to "tie together" those islands of information than
> > someone using an RDF store would have to do.

Sam Ruby:
> IMHO, the most significant source of metadata in my accumulated store of
> information (only recently being converted over to XML) is within the
> hypertext links within the content.  My bet is that writing code and
> queries to extract that particular information from well formed XHTML
> will be a lot easier than trying to mine that information from within
> the parse type = "Literal" islands that have been proposed to date.
>
> Are there any proposals for "RDFizing" this information?

It's a very interesting idea. The <a href="..."> information is likely to be
a lot more useful for queries than knowing, e.g., where the <p>s occur.

The direct approach may be to pull out the links as extra metadata, given
something like:

<entry rdf:about="http://www.intertwingly.net/blog/1624.html">
    <title>Atom in Depth</title>
	<content>
		&lt;img src="http://www.intertwingly.net/images/xml03.jpg"
		class="floatleft" alt="XML 2003" /&gt;&amp;nbsp; I'll giving a
		presentation entitled
		&lt;a href="http://www.xmlconference.org/xmlusa/2003/friday.asp#22"&gt;Atom
		in Depth&lt;/a&gt; on Friday, December 12th at the
		&lt;a href="http://www.xmlconference.org/xmlusa/"&gt;2003 XML
		Conference&lt;/a&gt; in Philadelphia.
	</content>
...
</entry>

it would be possible to extract something like:

<entry rdf:about="http://www.intertwingly.net/blog/1624.html">
	<embeds>
		<link rdf:about="http://www.xmlconference.org/xmlusa/2003/friday.asp#22">
			<title>Atom in Depth</title>
		</link>
		<link rdf:about="http://www.xmlconference.org/xmlusa/">
			<title>2003 XML Conference</title>
		</link>
	</embeds>
</entry>
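
A rough sketch of that extraction step in Python, standard library only. It
assumes the <content> text has already been unescaped by the XML parser, and
it emits plain (subject, predicate, object) tuples rather than real RDF. The
names LinkCollector, extract_links and links_to_triples, and the bare
"embeds"/"title" predicates, are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from <a href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse line breaks and runs of spaces in the link text.
            title = " ".join("".join(self._text).split())
            self.links.append((self._href, title))
            self._href = None


def extract_links(markup):
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links


def links_to_triples(entry_uri, markup):
    """Express each link as an 'embeds' triple plus a 'title' triple."""
    triples = []
    for href, title in extract_links(markup):
        triples.append((entry_uri, "embeds", href))
        triples.append((href, "title", title))
    return triples


ENTRY = "http://www.intertwingly.net/blog/1624.html"
CONTENT = (
    '<img src="http://www.intertwingly.net/images/xml03.jpg" '
    'class="floatleft" alt="XML 2003" />&nbsp; I\'ll giving a presentation '
    'entitled <a href="http://www.xmlconference.org/xmlusa/2003/friday.asp#22">'
    'Atom\nin Depth</a> on Friday, December 12th at the '
    '<a href="http://www.xmlconference.org/xmlusa/">2003 XML\nConference</a> '
    'in Philadelphia.'
)

for triple in links_to_triples(ENTRY, CONTENT):
    print(triple)
```

A tolerant HTML parser is used here so that &nbsp; in the content doesn't
trip things up; a strict XML parser would need that entity declared.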

A potentially cool bit is that I think it should be possible to do this on
the fly: an RDF query could ask for all links in the content of that entry.
The RDF store could get as far as <content>, at which point an XPath query
could be run over that content and the results expressed in the RDF model.
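
To make the on-the-fly idea a bit more concrete, here is a toy sketch in
Python. It is not any existing RDF store's API: LazyLinkStore, its methods,
and the "content"/"embeds"/"title" predicates are all hypothetical. The store
keeps only the raw content; link triples are derived at query time using the
limited XPath support in xml.etree.ElementTree. It assumes the content is
well-formed XML with no undeclared entities such as &nbsp;.

```python
import xml.etree.ElementTree as ET


class LazyLinkStore:
    """Toy triple store; 'embeds'/'title' triples are derived when queried."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add((s, p, o))

    @staticmethod
    def _links(subject, content):
        # Wrap in a single root element so mixed content parses as XML.
        root = ET.fromstring("<div>" + content + "</div>")
        for a in root.findall(".//a[@href]"):   # XPath-style query
            href = a.get("href")
            title = " ".join("".join(a.itertext()).split())
            yield (subject, "embeds", href)
            yield (href, "title", title)

    def _all(self):
        # Stored triples first, then triples derived from each content literal.
        yield from self._triples
        for s, p, o in self._triples:
            if p == "content":
                yield from self._links(s, o)

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        pattern = (s, p, o)
        hits = {
            t for t in self._all()
            if all(want is None or want == got for want, got in zip(pattern, t))
        }
        return sorted(hits)


ENTRY = "http://www.intertwingly.net/blog/1624.html"
store = LazyLinkStore()
store.add(ENTRY, "title", "Atom in Depth")
store.add(
    ENTRY,
    "content",
    '<p>See the <a href="http://www.xmlconference.org/xmlusa/">2003 XML '
    "Conference</a> in Philadelphia.</p>",
)

print(store.match(ENTRY, "embeds"))
```

Nothing about the links is stored up front; the query for "embeds" is what
triggers the XPath pass over the content.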

Why not do it all as XML/XPath? The main reason is that the XML tree
structure doesn't match the web's general directed graph structure. Probably
a more immediate practical problem is that the subtrees aren't very portable
across systems (merging isn't straightforward).
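
The merging point can be illustrated with a small Python sketch, treating a
graph as nothing more than a set of (subject, predicate, object) tuples. The
second blog URI (example.org) and the helper names are invented for the
example. With triples, merging data from two systems is just set union, and
statements both sides agree on collapse into one.

```python
def merge(*graphs):
    """Merge any number of triple sets; shared triples are kept once."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def incoming_links(graph, target):
    """Subjects that embed a link to the given target URI."""
    return sorted(s for s, p, o in graph if p == "embeds" and o == target)


CONFERENCE = "http://www.xmlconference.org/xmlusa/"

BLOG_A = {
    ("http://www.intertwingly.net/blog/1624.html", "embeds", CONFERENCE),
    (CONFERENCE, "title", "2003 XML Conference"),
}
BLOG_B = {  # hypothetical second source
    ("http://example.org/blog/42", "embeds", CONFERENCE),
    (CONFERENCE, "title", "2003 XML Conference"),
}

merged = merge(BLOG_A, BLOG_B)
print(incoming_links(merged, CONFERENCE))
```

Doing the same with two XML subtrees would first need a decision about which
element becomes the parent and how duplicates are detected.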

Some related notes: http://dannyayers.com/archives/001981.html

Cheers,
Danny.

Received on Sunday, 26 October 2003 04:45:53 UTC