Re: Content-in-RDF stable draft from Johannes Koch on 2008-03-12 (public-wai-ert@w3.org from March 2008)

From: Johannes Koch <johannes.koch@fit.fraunhofer.de>
Date: Wed, 12 Mar 2008 13:49:10 +0100
To: public-wai-ert@w3.org
Message-ID: <47D7D146.3080101@fit.fraunhofer.de>

Carlos Iglesias wrote:

[TextContent]

>>> Additionally the transcodification problem still remains.
>> Could you please elaborate on that?
> 
> It just reminds me of the problem we already faced while working on the HTTP Vocabulary in RDF and the body property [3]
> 
> [3] - [http://www.w3.org/WAI/ER/HTTP/WD-HTTP-in-RDF-20070301#body]

If you cannot create a string from a byte sequence, you cannot create a 
TextContent resource. If you do have a string, you can create a 
TextContent resource.

Situation A:
Given the byte sequence of non-text content (contentBytes)
-> Create a cnt:Base64Content resource. No cnt:TextContent resource 
should be created, although in some cases it's technically possible to 
create a string from the contentBytes using a certain character encoding.

Situation B:
Given the byte sequence of text content (contentBytes), the character 
encoding used (ce)
-> Create a cnt:Base64Content resource. Transform contentBytes to string 
s using character encoding ce. Then create cnt:TextContent resource with 
cnt:chars property with literal object for string s and 
cnt:characterEncoding property with literal object ce.

Situation C:
Given the byte sequence of text content (contentBytes), the 
inappropriate character encoding (ce)
-> Create a cnt:Base64Content resource. Transforming contentBytes to 
string s using character encoding ce fails. No cnt:TextContent resource 
can be created.
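
The three situations above can be sketched roughly as follows. This is
only an illustration, not part of the draft: the function name
describe_content, the is_text flag and the plain dictionary layout are
assumptions made for the example; only cnt:Base64Content,
cnt:TextContent, cnt:chars and cnt:characterEncoding come from the
vocabulary discussed here.

```python
import base64


def describe_content(content_bytes, is_text, ce=None):
    """Return a dict naming the resources that could be created.

    Hypothetical helper: the dict keys mirror the vocabulary terms,
    the structure itself is invented for this sketch.
    """
    # A cnt:Base64Content resource can be created in every situation.
    resources = {
        "Base64Content": base64.b64encode(content_bytes).decode("ascii")
    }

    # Situation A: non-text content, so no cnt:TextContent resource,
    # even if decoding would technically succeed.
    if not is_text or ce is None:
        return resources

    try:
        s = content_bytes.decode(ce)
    except (UnicodeDecodeError, LookupError):
        # Situation C: ce is inappropriate (or unknown), decoding fails.
        return resources

    # Situation B: decoding worked, so cnt:TextContent is possible.
    resources["TextContent"] = {"chars": s, "characterEncoding": ce}
    return resources
```

For example, describe_content(b"\xff\xfe\xfd", True, "utf-8") yields
only the Base64Content entry, because those bytes are not valid UTF-8.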


IMHO, the question of being able to transform a character sequence like 
a serialized RDF model (RDF/XML, N3, etc.) into a byte sequence using a 
certain character encoding is a general issue and not a specific 
Content-in-RDF or HTTP-in-RDF one.

OTOH, if for transforming character sequences you are somehow restricted 
to character encodings that cannot represent all of Unicode, like 
ISO-8859-1 or US-ASCII, you may not be able to encode certain characters 
like the euro sign in the first place.
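
A small check of that point, as a sketch only (can_encode is a
hypothetical helper name, not something from either vocabulary):

```python
def can_encode(text, ce):
    """Report whether text can be turned into bytes using encoding ce."""
    try:
        text.encode(ce)
    except UnicodeEncodeError:
        return False
    return True


euro = "\u20ac"  # the euro sign
for ce in ("us-ascii", "iso-8859-1", "utf-8"):
    print(ce, can_encode(euro, ce))
```

The euro sign cannot be encoded in US-ASCII or ISO-8859-1, while UTF-8
encodes it without trouble.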


But what about the following situation?

Situation D:
Given the character sequence of text content created in memory (i.e. not 
created by transforming from a byte sequence) (contentString)
-> Create a cnt:TextContent resource.
Can we also create a cnt:Base64Content resource using an appropriate 
character encoding? Should we allow an optional cnt:characterEncoding 
property for cnt:Base64Content resources as well, and thus for 
cnt:Content resources in general?
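
To make the question concrete, here is one way Situation D could look
if the answer were "yes". This is purely a sketch of the open question:
describe_string, the default of UTF-8 and the dictionary layout are
assumptions for illustration, and attaching a character encoding to
the Base64Content entry is exactly what is being asked about, not
something the draft defines.

```python
import base64


def describe_string(content_string, ce="utf-8"):
    """Sketch of Situation D: start from an in-memory string."""
    # A cnt:TextContent resource is always possible from a string.
    resources = {"TextContent": {"chars": content_string}}

    try:
        encoded = content_string.encode(ce)
    except UnicodeEncodeError:
        # ce cannot represent the string: no byte sequence available.
        return resources

    # Assumption: record ce alongside the Base64 data so that a
    # consumer knows how to get back from bytes to the string.
    resources["Base64Content"] = {
        "base64": base64.b64encode(encoded).decode("ascii"),
        "characterEncoding": ce,
    }
    return resources
```

Without the recorded encoding, a consumer holding only the Base64 data
would have to guess how to turn the bytes back into the string.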

-- 
Johannes Koch
BIKA Web Compliance Center - Fraunhofer FIT
Schloss Birlinghoven, D-53757 Sankt Augustin, Germany
Phone: +49-2241-142628    Fax: +49-2241-142065

Received on Wednesday, 12 March 2008 12:50:05 UTC