- From: Terry Brooks <tabrooks@u.washington.edu>
- Date: Thu, 11 Jun 2009 11:41:47 -0700
- To: "public-lod@w3.org" <public-lod@w3.org>
I'm preparing course material about querying DBpedia from a web page using Firefox and Greasemonkey, unpacking the payload received, and patching the information into the web page. My sample SPARQL query asks for the state flowers of the U.S. states; the query is listed on the Meow meow meow blog at http://www.craigethomas.com/blog/2009/02/anatomy-of-a-sparql-query-part-1-select/
Unpacking the payload is complicated by unpredictable structural irregularities. Could someone suggest an explanation, or point me to explanatory documentation that I could give my students?
Most of the states have a predictable XML payload that is structured like this:
<result>
  <binding name="state">
    <uri>http://dbpedia.org/resource/Mississippi</uri>
  </binding>
  <binding name="flower">
    <uri>http://dbpedia.org/resource/Magnolia_Blossom</uri>
  </binding>
</result>
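For illustration, the regular case above can be unpacked roughly as follows. This is a minimal sketch in Python, chosen only to keep the example self-contained; the course material itself targets Greasemonkey. The SAMPLE document and the unpack helper are hypothetical names, and the sketch assumes the payload follows the W3C SPARQL Query Results XML Format, whose namespace is used below.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hypothetical sample payload, trimmed to a single result for illustration.
SAMPLE = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="state"/><variable name="flower"/></head>
  <results>
    <result>
      <binding name="state"><uri>http://dbpedia.org/resource/Mississippi</uri></binding>
      <binding name="flower"><uri>http://dbpedia.org/resource/Magnolia_Blossom</uri></binding>
    </result>
  </results>
</sparql>"""


def unpack(payload):
    """Return one dict per <result>, mapping variable name to (kind, value)."""
    rows = []
    root = ET.fromstring(payload)
    for result in root.iterfind(".//sr:result", NS):
        row = {}
        for binding in result.iterfind("sr:binding", NS):
            child = binding[0]                  # the <uri>, <literal>, or <bnode> element
            kind = child.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
            row[binding.get("name")] = (kind, child.text or "")
        rows.append(row)
    return rows


rows = unpack(SAMPLE)
```

Recording the element kind next to the value lets later code treat a uri binding differently from a literal one, instead of assuming every binding is a URI.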
But West Virginia's state flower is structured as a literal with an embedded HTML tag:
<literal xml:lang="en">Rhododendron<br>(''Rhododendron maximum'')</literal>
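One way such a literal might be reduced to plain text is sketched below, in Python purely for illustration. The clean_literal helper is a hypothetical name, and the sketch assumes the only markup to handle is HTML tags (escaped or not) and the doubled single quotes, which look like wiki-style italics.

```python
import html
import re


def clean_literal(text):
    """Reduce a literal carrying HTML and wiki-style markup to plain text."""
    text = html.unescape(text)            # turn "&lt;br&gt;" into "<br>" if it is still escaped
    text = re.sub(r"<[^>]+>", " ", text)  # replace tags such as <br> with a space
    text = text.replace("''", "")         # drop doubled single quotes (assumed wiki italics)
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_literal("Rhododendron<br>(''Rhododendron maximum'')")
```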
And Florida's state flower listing contains escape characters:
<uri>http://dbpedia.org/resource/Orange_%28fruit%29</uri>
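The %28 and %29 sequences are percent-encoded parentheses. A minimal sketch of turning such a URI into a display label follows, in Python purely for illustration. The resource_label helper is a hypothetical name, and treating underscores as spaces is an assumption about how the resource names are formed.

```python
from urllib.parse import unquote, urlsplit


def resource_label(uri):
    """Derive a readable label from the last path segment of a resource URI."""
    path = urlsplit(uri).path               # "/resource/Orange_%28fruit%29"
    name = path.rsplit("/", 1)[-1]          # "Orange_%28fruit%29"
    return unquote(name).replace("_", " ")  # decode %XX escapes, then underscores to spaces


label = resource_label("http://dbpedia.org/resource/Orange_%28fruit%29")
```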
There is also the general problem of multiple listings. For example, California is listed with the California_Poppy twice.
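On the client side, repeated rows can at least be collapsed after unpacking. The sketch below, in Python purely for illustration, assumes each row has already been reduced to a (state, flower) pair of strings; the dedupe helper and the sample rows are hypothetical.

```python
def dedupe(pairs):
    """Drop repeated (state, flower) pairs while keeping first-seen order."""
    seen = set()
    unique = []
    for pair in pairs:
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique


# Hypothetical rows mirroring the California case.
sample_rows = [
    ("California", "California_Poppy"),
    ("California", "California_Poppy"),
    ("Mississippi", "Magnolia_Blossom"),
]
unique_rows = dedupe(sample_rows)
```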
What is an explanation for these structural irregularities?
Thanks, Terry
Terrence Brooks
Information School
University of Washington
Voice: 206 543-2646
Fax: 206 616-3152
E-mail: tabrooks@u.washington.edu
Web: http://faculty.washington.edu/tabrooks/
Received on Thursday, 11 June 2009 18:42:22 UTC