- From: Lars Marius Garshol <larsga@garshol.priv.no>
- Date: 02 Mar 2002 12:22:35 +0100
- To: Chris Croome <chris@webarchitects.co.uk>
- Cc: www-rdf-interest@w3.org
* Lars Marius Garshol
|
| So it seems that you have a property derived from the XHTML 'dir'
| attribute which associates the string "ltr" with your abstract.
| Formally that looks like perfectly OK RDF to me. The property you
| use also seems reasonable, though http://www.w3.org/1999/xhtml#dir
| might be better.

* Chris Croome
|
| Yes I did wonder about that, http://www.w3.org/1999/xhtmldir, seems
| wrong, but can/should one use http://www.w3.org/1999/xhtml# for the
| XHTML namespace?

Well, you are translating between two different systems. In XML
Namespaces the xhtml:dir attribute has no defined URI, but in RDF you
need a URI in order to create a property for this. Putting in a hash
mark seems the most reasonable approach to me.

* Lars Marius Garshol
|
| An obvious question is whether there's any point in doing this at all.
| English written in the Latin script is *always* LTR, so you are adding
| no useful information.

* Chris Croome
|
| The reason that I was having a play with ltr, rather than rtl, is
| that I can't write any rtl languages...

Even if you had written in an RTL script (languages have no
directionality, only the scripts used to write them), you wouldn't have
needed this, as RTL scripts are *always* RTL. It's only when you mix
scripts with different directionality that you need to specify the
base direction of the text. (As my other posting explains in more
detail.)

| I'm anticipating that I'm going to be asked by a client to set up a
| web site with Arabic and Urdu content and this will result in Dublin
| Core RDF metadata files that have a mixture of directionality and
| wasn't sure how to implement the advice from the W3C and Unicode
| Consortium [1], that's why I posted to this list.

Then I understand. If this content only contains Arabic/Urdu and no
Latin or other LTR text then you shouldn't need to do anything special
with the RDF.
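As a rough illustration of the "only when you mix scripts" point, the sketch below checks whether a literal contains strong characters of both directions. It uses Python's standard unicodedata module; the function names strong_directions and mixes_directions are made up for this example, and the check is a simplification (digits and punctuation are weak or neutral and are ignored), not a full implementation of the Unicode bidirectional algorithm.

```python
import unicodedata

# Bidi classes of "strong" characters: L is left-to-right,
# R (e.g. Hebrew) and AL (Arabic letters) are right-to-left.
LTR_CLASSES = {"L"}
RTL_CLASSES = {"R", "AL"}


def strong_directions(text):
    """Return the set of strong directions ("ltr", "rtl") found in text."""
    found = set()
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in LTR_CLASSES:
            found.add("ltr")
        elif bidi_class in RTL_CLASSES:
            found.add("rtl")
    return found


def mixes_directions(text):
    """True if text contains both LTR and RTL strong characters."""
    return len(strong_directions(text)) > 1
```

A literal for which mixes_directions returns False is the case where nothing special should be needed; only the mixed case calls for an explicit base direction.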
If it does contain bidirectional text, what you will need for this to
work is to capture any specifications of base direction for text that
you want to become RDF literals, and also to turn any elements used to
specify embedding levels (that is, elements with xhtml:dir attributes
below paragraph level) into Unicode control codes. Capturing the base
direction can be done either with an RDF property, or by inserting a
right-to-left/left-to-right mark at the beginning of the text.

| The reason I wasn't sure that specifying the directionality was
| unnecessary was from an experiment in which I took a Hebrew [2] and
| a Farsi [3] file from the Unicode web site, removed all HTML and CSS
| directionality markup and then opened them in Mozilla, the Farsi one
| still displayed the text correctly but the Hebrew one was backwards.

If you don't explicitly set base direction in HTML/XHTML the browsers
assume that the base direction is LTR. They don't do the text
analysis that I discussed in the other email. This is acceptable
according to the Unicode standard, but it's not obvious that RDF
software will behave the same way. In fact, I think RDF software
shouldn't. It seems much more reasonable for RDF software to analyze
the literals and get the base direction that way.

I suspect that most RDF software does nothing at all in order to
support bidi correctly, and in that case indicating base direction
using RLM/LRM codes seems safer than using a property. (It is also a
lot simpler, since you no longer have to reify anything.)

-- 
Lars Marius Garshol, Ontopian         <URL: http://www.ontopia.net >
ISO SC34/WG3, OASIS GeoLang TC        <URL: http://www.garshol.priv.no >
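A minimal sketch of the RLM/LRM approach discussed above, again using only Python's standard unicodedata module. The helper names base_direction and with_direction_mark are hypothetical, and the first-strong-character scan is a simplified reading of how the Unicode bidirectional algorithm picks a paragraph direction; it is not taken from any RDF toolkit.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK


def base_direction(text, default="ltr"):
    """Guess the base direction from the first strong character in text."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    # No strong character found (e.g. only digits or punctuation).
    return default


def with_direction_mark(text, direction):
    """Prefix text with RLM or LRM so its base direction is explicit."""
    mark = RLM if direction == "rtl" else LRM
    return text if text.startswith(mark) else mark + text
```

Because LRM and RLM are themselves strong characters, software that analyzes a literal this way would see the mark first: a literal that begins with Latin text but was authored as RTL reports "rtl" once it has been passed through with_direction_mark(text, "rtl").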
Received on Saturday, 2 March 2002 06:23:17 UTC