RE: AGENDA 2012-08-15: Internationalization Teleconference

Added to the agenda.

Unicode normalization is not inherently reversible and there are, potentially, corner cases in which it will affect the display of text. It isn’t harmful to the meaning to apply NFC. But you won’t necessarily get back byte-for-byte the same HTML5 file you started with.

Because HTML5 doesn’t require NFC, and because we lost the battle over requiring identifiers (such as attribute values) to be normalized, at least on comparison, you might break badly constructed (or deliberately strangely constructed?) files that rely on the difference. But this is a very small risk, IMO.
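
To make the two points above concrete, here is a minimal Python sketch using only the standard-library unicodedata module. The sample strings and the ids dictionary are invented for illustration; they do not come from any real HTML5 file.

```python
import unicodedata

# "é" written two ways: precomposed U+00E9, and "e" + combining acute U+0301.
composed = "caf\u00e9"
decomposed = "cafe\u0301"

# NFC maps both spellings to the same code points, so the original
# spelling cannot be recovered from the normalized text alone.
nfc_a = unicodedata.normalize("NFC", composed)
nfc_b = unicodedata.normalize("NFC", decomposed)
same_after_nfc = nfc_a == nfc_b == composed

# The UTF-8 bytes of the decomposed input change under NFC,
# so the file is no longer byte-for-byte identical.
bytes_changed = decomposed.encode("utf-8") != nfc_b.encode("utf-8")

# Hypothetical document whose two IDs differ only by normalization form.
ids = {composed: "first element", decomposed: "second element"}
ids_after_nfc = {unicodedata.normalize("NFC", k): v for k, v in ids.items()}
ids_collide = len(ids) == 2 and len(ids_after_nfc) == 1
```

The last step is the "badly constructed file" case: two IDs that are distinct before normalization become one afterwards.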

Talk to you in 15 minutes :-)

From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Wednesday, August 15, 2012 1:38 AM
To: Phillips, Addison
Cc: www-international@w3.org
Subject: Re: AGENDA 2012-08-15: Internationalization Teleconference

agenda+ NFC, HTML5 and RDF conversion

http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0196.html

Background: in the MLW-LT working group, we are defining a transformation of HTML5 (or XML) content to the RDF format NIF, see
http://wiki.nlp2rdf.org/wiki/ITS2NIF2ITS

for details. The details are not important here, but one aspect of the transformation is described in the 0196 mail cited above:

"- RDF recommends Unicode NormalForm C :
http://www.w3.org/TR/rdf-concepts/#section-Literals

This is why, we will make it mandatory. Some of the RDF parsers might
complain, if any literals are not in Unicode Normalform C . Sometimes
these are just warning and sometimes parsing fails completely."

What I'd like to discuss briefly:
- is this the right thing to do?
- will it work with HTML5 content in a round tripping scenario, that is HTML5 > HTML5 converted to NFC > NIF > HTML5?
- IIRC, and looking at this thread
http://lists.w3.org/Archives/Public/www-validator/2011May/thread#msg31

HTML5 doesn't require NFC. Can this cause trouble for the NIF conversion, and how could we avoid it?
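
One way to make the round-tripping question concrete is to record, before conversion, whether NFC changes the content at all. The sketch below assumes the text has already been extracted from the HTML5 source; nfc_report is a hypothetical helper name and is not part of NIF or of any RDF toolkit.

```python
import unicodedata

def nfc_report(text):
    """Return (normalized_text, changed) for one piece of text content."""
    normalized = unicodedata.normalize("NFC", text)
    # changed is True when the input was not already in NFC, i.e. when a
    # round trip through the normalized form cannot restore the input exactly.
    return normalized, normalized != text

# Sample inputs, invented for illustration.
already_nfc = nfc_report("caf\u00e9")
not_nfc = nfc_report("cafe\u0301")
```

Content for which changed is False survives the NFC step unchanged, so any round-trip difference can only arise from the inputs that are flagged.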

Thanks,

Felix

On Wednesday, 15 August 2012, Phillips, Addison wrote:
----------------------------------------------------------------------
Time     : 15:00 UTC
Bridge   : +1-617-761-6200 (Zakim)
         with conference code I18N (4186)
Duration : 60-90 minutes
------------------------------------------------------------------------
Zakim information    : http://www.w3.org/2002/01/UsingZakim
Zakim bridge monitor : http://www.w3.org/1998/12/bridge/Zakim.html
Zakim IRC bot        : http://www.w3.org/2001/12/zakim-irc-bot.html

IRC channel          : #i18n on irc.w3.org:6665
IRC via the Web      : http://www.w3.org/2001/01/cgi-irc (#i18n channel)
------------------------------------------------------------------------
Meeting: Internationalization Core Working Group
Chair:   Addison Phillips
Scribe: Addison Phillips
ScribeNick: aphillip
Agenda:  http://www.w3.org/International/wiki/Core_Homework


=== Meeting Agenda ===

Topic: Agenda and Minutes

Topic: Action Items

Topic: Info Share

Topic: Writing Modes

Topic: TURTLE

Topic: Bidi Isolation?

Topic: AOB?

Addison Phillips
Globalization Architect (Lab126)
Chair (W3C I18N WG)

Internationalization is not a feature.
It is an architecture.


--
Felix Sasaki
DFKI / W3C Fellow

Received on Wednesday, 15 August 2012 14:48:01 UTC