- From: Martin Duerst <duerst@w3.org>
- Date: Mon, 27 May 2002 14:41:59 +0900
- To: Aaron Swartz <me@aaronsw.com>
- Cc: www-tag@w3.org
At 18:40 02/05/24 -0500, Aaron Swartz wrote:
>On Friday, May 24, 2002, at 06:11 PM, Martin Duerst wrote:
>>First some procedural points, starting with the end
>>of your mail:
>>
>>>I'm considering appealing this decision,
>>
>>The Character Model is in last call, so you can raise a comment.
>
>Oops, I should have been more clear. It was the RDF decision I was
>thinking of appealing.

I see. But I guess they are not even in last call yet.

>I assume that charmod will be decided in its own way.

Yes, but obviously things should work together, and the RDF spec
should conform to the character model.

>>>I can understand presenting strings this way for user-display and
>>>user-entry but storing them this way and making them the official
>>>encoding seems to be going too far.
>>XML can 'store' them without problems. N3 also should be able to do it.
>
>XML and N3 are interchange formats, I meant storage in the sense of
>databases and APIs.

The RDF spec defines the XML representation. I don't think there is
any W3C spec for RDF databases or RDF APIs. I also don't think there
are any serious databases that would have problems with 8-bit data.
Same for APIs. The easiest way to define an API is to say that the
parameters are encoded in UTF-8 (or maybe UTF-16). But of course you
are always free to define some other conventions for your own API.

>>>I would think that simply using UTF-8 %-encoding would be fine for these
>>>purposes.
>>
>>Why do you think so? Would you think it would make sense to replace
>>    mailto:me@aaronsw.com
>>with something like
>>    mailto:%6d%65@%61%61%72%6f%6e%73%77.%63%6f%6d
>>or maybe even more appropriately, with something like the above
>>but using Greek letters instead of Latin ones? This is just about
>>how people using another script than Latin in their day-to-day
>>work would feel. Why should they have to use special tools
>>(having to do syntax analysis so that they can figure out
>>where a % is an escape character and when not,...) just to
>>be able to read the text, just because some tools make too
>>restrictive assumptions?
>
>I totally understand the feeling and agree with it. It's silly to have to
>enter something in like that. But that's why I have a computer to convert
>it for me. I already have my computer convert "Aar" to
>"mailto:me@aaronsw.com" and "D端r" mailto:duerst@w3.org.

[Sorry here, my email client is not up to the job (it thinks
everything is Japanese). My lame excuse is that I'm working
on Web i18n, not email i18n.]

>I don't expect them folks to use any special tools. In fact, requiring
>Unicode would require me to go and replace a lot of my software with
>special i18nized tools.

Unicode is already allowed in RDF literals. Why do you say you
need additional tools if it's also allowed in resource identifiers?
Do you think fewer tools are needed? My guess would be that more
tools are needed, because there are two different forms of
representation of the same characters.

Also, how many tools do you think there are to input/edit/... UTF-8?
And how many to input/edit/... %hh?

Also, what kind of software are you using? For most of it
(APIs, databases,...), the only requirement is that they pass
through all 8 bits. That's a lot easier than having to check that
they only contain ASCII. So the tools you would need are really not
special internationalized tools, but just tools that don't pretend
they know better than you about your data.

Regards,    Martin.
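
As a rough illustration of the two representations discussed in this message, here is a minimal Python sketch. It is not taken from any spec or from either correspondent's software: `escape_all` is a hypothetical helper written only for this example, and the sample name is an assumption chosen to contain one non-ASCII character. It shows a fully %-escaped address, the UTF-8 bytes of a non-ASCII string, and the same string in %hh form.

```python
from urllib.parse import quote, unquote


def escape_all(text: str, keep: str = "@.") -> str:
    """Hypothetical helper: %-escape every UTF-8 byte of `text`,
    leaving only the characters listed in `keep` untouched."""
    parts = []
    for ch in text:
        if ch in keep:
            parts.append(ch)
        else:
            parts.extend(f"%{byte:02x}" for byte in ch.encode("utf-8"))
    return "".join(parts)


# A plain ASCII address, escaped far more than any URI syntax requires.
address = "me@aaronsw.com"
escaped_address = escape_all(address)
print("mailto:" + escaped_address)

# unquote() reverses the escaping, so no information is lost,
# but the escaped form is unreadable without a tool.
assert unquote(escaped_address) == address

# A name with one non-ASCII character (illustrative sample only).
name = "D\u00fcrst"

# Form 1: raw UTF-8. An 8-bit-clean API or database just passes these bytes on.
raw_bytes = name.encode("utf-8")
print(raw_bytes)

# Form 2: UTF-8 %-encoding. Pure ASCII, but a second representation
# of the same characters that every consumer has to decode.
percent_form = quote(name, safe="")
print(percent_form)

# Both forms carry the same characters.
assert raw_bytes.decode("utf-8") == unquote(percent_form) == name
```

Storing or passing the raw bytes needs no knowledge of the data beyond not truncating to 7 bits, whereas the %hh form needs an escaping step on the way in and an unescaping step on the way out.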
Received on Monday, 27 May 2002 02:34:07 UTC