- From: Brian McBride <bwm@hplb.hpl.hp.com>
- Date: Thu, 31 Jul 2003 18:05:44 +0100
- To: rdf core <w3c-rdfcore-wg@w3.org>, i18n <w3c-i18n-ig@w3.org>
This is a use case concerning XML literals which we identified during our discussions this week, and a little analysis of it. The use case may need some refining to fully capture i18n concerns.

Consider an application which is building an RDF store of metadata about web pages. It crawls the web, extracting title information from web pages and representing this data as RDF.

Let's say it is searching for <title> elements, which may contain arbitrary markup. Trying for example:

[[
<title><em>title</em></title>
]]

Hmm, checking: Amaya behaves oddly in this situation, and Mozilla gets it wrong. And the validator objects - says you ain't allowed <em> in titles. This is XHTML 1.1. Let's try span. No, that doesn't seem to be legal either.

[[
<title><span xml:lang="en">title</span></title>
]]

Doesn't validate. Checking, the content model for the title element is PCDATA.

OK, let's suppose it's:

[[
<head xml:lang="en">
  <title>chat</title>
</head>
]]

That validates. But I note that XHTML 1.1 does not allow markup in titles!

How does the application represent this in RDF?

Since you can't use markup in a title element, use a plain literal :)

But let's assume we are far-sighted and assume that markup will be allowed in titles in the future. Well, in that case, you could use an rdf:XMLLiteral and include a span element to hold the lang tag.

Objection: But then you couldn't use that literal with XHTML 1.1.

Response: Record that information separately in the graph, e.g.

  <rdf:Description rdf:about="...">
    <ex:title>
      <rdf:Description>
        <rdf:value rdf:parseType="Literal">chat</rdf:value>
        <ex:lang>en</ex:lang>
      </rdf:Description>
    </...

Objection: You've changed the title. You can't recover the exact markup that was there in the first place, because you can't tell whether the span was added by the crawler or was in the original page.

Response: Most of the time, you won't care. If you do care, you can record the extra information in the graph.

Objection: <span> is HTML-specific. You might want to use the literal in another context.

Response: Really need to refine the use case here, but in general, if you are not prepared to commit to a specific markup language, you can use the graph to represent the underlying structure.

Brian
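
For illustration only, here is a minimal Python sketch of the plain-literal option discussed above: a crawler reads the <title> of an XHTML page, picks up whatever xml:lang is in scope, and emits one N-Triples statement with a language-tagged plain literal. The property URI http://example.org/terms/title (standing in for ex:title), the page URI, and the function names are all assumptions made for the example; none of them come from the message or from any working group draft. Any markup inside the title is flattened to text, which is exactly the loss of information the objections above are about.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# ElementTree expands the reserved "xml:" prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
# Hypothetical property URI standing in for ex:title.
EX_TITLE = "http://example.org/terms/title"


def find_title(elem, lang=None):
    """Return (text, lang) for the first <title>, tracking inherited xml:lang."""
    lang = elem.get(XML_LANG, lang)
    if elem.tag == f"{{{XHTML_NS}}}title":
        # itertext() flattens any nested markup into plain text.
        return "".join(elem.itertext()), lang
    for child in elem:
        found = find_title(child, lang)
        if found is not None:
            return found
    return None


def escape_literal(text):
    """Escape the characters that N-Triples requires escaping in a literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def title_triple(page_uri, xhtml_source):
    """Build one N-Triples line for the page title, or None if there is no title."""
    found = find_title(ET.fromstring(xhtml_source))
    if found is None:
        return None
    text, lang = found
    literal = f'"{escape_literal(text)}"'
    if lang:
        literal += f"@{lang}"
    return f"<{page_uri}> <{EX_TITLE}> {literal} ."


PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head xml:lang="en">
    <title>chat</title>
  </head>
  <body/>
</html>"""

if __name__ == "__main__":
    print(title_triple("http://example.org/page", PAGE))
```

The language tag travels with the literal here, so nothing extra is needed in the graph; the separate ex:lang node only becomes necessary once the title is stored as an XML literal.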
Received on Thursday, 31 July 2003 13:07:18 UTC