Re: Oracle's stand regarding N-TRIPLES from Zhe Wu on 2011-08-20 (public-rdf-wg@w3.org from August 2011)

From: Zhe Wu <alan.wu@oracle.com>
Date: Fri, 19 Aug 2011 18:34:35 -0700
To: Steve Harris <steve.harris@garlik.com>
CC: public-rdf-wg@w3.org
Message-ID: <4E4F0F2B.7010500@oracle.com>

Hi Steve,

Thanks for the clarification! One thing I'd like to understand: if we keep the current N-TRIPLES syntax, then presumably
- existing tools/platforms dealing with N-TRIPLES don't have to change (less work for engineers :)), and
- there is no risk of breaking existing client applications that accept the current N-TRIPLES syntax.

On top of that, the existing N-TRIPLES syntax can already represent all international characters. It's not like we are missing anything.
Why fix something that is not broken?

I don't see how adding UTF-8 encoding would make N-TRIPLES much more useful. I do see
a lot of potential interoperability and backward compatibility issues associated with a new encoding.
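
To make the escaping under discussion concrete, here is a minimal Python sketch of how a literal is written under the 2004 N-TRIPLES grammar, where a document is plain ASCII and every other character becomes a \uXXXX or \UXXXXXXXX escape. The function name escape_ntriples_2004 is hypothetical and chosen only for illustration; it is a reading of the string escapes in the 2004 RDF Test Cases recommendation, not code from any product.

```python
def escape_ntriples_2004(text: str) -> str:
    """Escape a literal's lexical form for the ASCII-only 2004 N-Triples syntax.

    Illustrative sketch only: the name and structure are assumptions,
    not taken from any existing parser or serializer.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")          # backslash
        elif ch == '"':
            out.append('\\"')           # double quote
        elif ch == "\t":
            out.append("\\t")
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)              # printable ASCII passes through unchanged
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)  # remaining BMP characters and controls
        else:
            out.append("\\U%08X" % cp)  # characters beyond the BMP
    return "".join(out)


if __name__ == "__main__":
    word = "caf\u00e9"
    print(escape_ntriples_2004(word))
    print(len(word.encode("utf-8")), len(escape_ntriples_2004(word)))
```

With this sketch, the four-character word "café" takes 5 bytes as UTF-8 and 9 ASCII characters once escaped. Both forms carry the same information, so the disagreement is about cost and compatibility rather than expressiveness.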

Thanks,

Zhe


On 8/19/2011 3:40 PM, Steve Harris wrote:
> Yes, we support N-Triples, but it's much less useful than it could be, as it doesn't support a common Unicode encoding.
>
> - Steve
>
> On 2011-08-19, at 16:56, Zhe Wu wrote:
>
>> Hi Steve,
>>
>> I was under the impression that your product supported N-TRIPLES. Guess I was wrong.
>> Adding a new format can be more efficient for one system and less efficient for another
>> system.
>>
>> Thanks,
>>
>> Zhe
>>
>> On 8/19/2011 2:17 AM, Steve Harris wrote:
>>> I agree with Jeremy.
>>>
>>> For us, the lack of UTF-8 support is a serious impediment to using N-Triples as a bulk dump/restore format.
>>>
>>> We use UTF-8 internally to hold RDF literals, as every other format is natively UTF-8, so the export to N-Triples requires a lot of unnecessary and inefficient escaping.
>>>
>>> - Steve
>>>
>>> On 2011-08-18, at 23:26, Jeremy Carroll wrote:
>>>
>>>> Hi Zhe
>>>>
>>>> I find this a surprisingly strong position.
>>>> When ingesting N-Triples the code path to read UTF-8 and the code path to read \uXXXX escape sequences are probably equally horrible. The UTF-8 code path is the more conventional one to be following on the Web.
>>>>
>>>> It seems like a fairly small amount of extra code for a vendor to support, with negligible impact on performance. The only downside that I can see would be that new data will not be readable by old software, which is the normal downside with new versions of a format.
>>>>
>>>> We may differ in our judgment about how important that downside is, or I may have missed some other disadvantage that motivates Oracle's strong reaction.
>>>>
>>>> My understanding is that 2004 N-Triples docs will be valid Turtle docs ....
>>>>
>>>> Jeremy
>>>>
>>>>
>>>>
>>>> On 8/18/2011 9:05 AM, Zhe Wu wrote:
>>>>> Hi,
>>>>>
>>>>> After discussing with the whole Oracle Database Semantic Technologies team, we
>>>>> have the following consensus within Oracle.
>>>>>
>>>>> 1) The existing N-TRIPLES format [1] is key to Oracle's product;
>>>>> 2) Oracle hasn't received any change requests or suggestions from its customers regarding the current N-TRIPLES syntax;
>>>>> 3) As a platform vendor, Oracle does not see any significant justifications to change/mend the existing syntax;
>>>>>
>>>>> Hence Oracle will not support any major changes to the existing N-TRIPLES format, including
>>>>> support for UTF-8.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Zhe & Souri
>>>>>
>>>>> [1] http://www.w3.org/TR/rdf-testcases/#ntriples (in "RDF Test Cases: W3C Recommendation 10 February 2004")
>>>>>
>>>>>
>>
Received on Saturday, 20 August 2011 01:36:10 UTC