RE: [all] XML from drupal from Yves Savourel on 2012-10-25 (public-multilingualweb-lt@w3.org from October 2012)

From: Yves Savourel <ysavourel@enlaso.com>
Date: Thu, 25 Oct 2012 13:13:21 -0600
To: "'Karl Fritsche'" <karl.fritsche@cocomore.com>, "'Mauricio del Olmo'" <mauricio.delolmo@linguaserve.com>
CC: <public-multilingualweb-lt@w3.org>
Message-ID: <assp.0645b5308c.assp.0645d1557d.008701cdb2e4$ccbbcdc0$66336940$@com>

Great!

Thanks for being flexible on this.

 

-ys

From: Karl Fritsche [mailto:karl.fritsche@cocomore.com] 
Sent: Thursday, October 25, 2012 12:57 PM
To: Yves Savourel; Mauricio del Olmo
Cc: public-multilingualweb-lt@w3.org
Subject: Re: [all] XML from drupal

 

Hi Yves, Mauricio and all,

After speaking with Mauricio, I have now flattened the XML.
It now looks like this:



<job job_id="11" id="11" type_id="8" type="node" xml:lang="de" domain="Presse">
 <item id="11-body-0-value" its:allowedCharacters="."><![CDATA[blah]]></item>
</job>

Hope this is better for all of us.

Cheers
Karl


On 25.10.2012 14:27, Yves Savourel wrote:

Hi Mauricio,
 

I didn't answer this at first because the nested structure 
of the item nodes is given to me by the export step 
from Drupal.

 
Yes, that's why I was saying it was more a question for Moritz and his team.
 
 

It does not cause us any problems on the translation side 
because we only extract the content of the nodes that are 
marked by the ITS rules as translatable (rules, default, 
inheritance, etc.) and have content.

 
From an XML processing viewpoint, <item id="11-body"> and <item id="11-body-0"> are both translatable and both have content (each contains some white-space nodes and an <item> element, and that counts as content).
 
You probably really mean "we only extract the content of the nodes that are marked by the ITS rules as translatable (rules, default, inheritance, etc.) and have only text nodes, and at least one with non white-space characters."
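As a rough illustration of that stricter condition, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The nested sample is reconstructed from the ids mentioned in this thread and is an assumption, as are the function names and the added xmlns:its declaration (the snippets in this thread omit it). The sketch only checks "no child elements and some non-white-space text"; it does not evaluate ITS translate rules.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of the nested export; the its prefix is declared
# here only so that the sample parses.
NESTED = """<job xmlns:its="http://www.w3.org/2005/11/its" job_id="11" id="11" type_id="8" type="node" xml:lang="de" domain="Presse">
 <item id="11-body">
  <item id="11-body-0">
   <item id="11-body-0-value" its:allowedCharacters="."><![CDATA[blah]]></item>
  </item>
 </item>
</job>"""


def has_real_content(elem):
    """True if elem has no child elements and some non-white-space text."""
    return len(elem) == 0 and bool((elem.text or "").strip())


def extractable_items(xml_text):
    """Return (id, text) pairs for <item> elements that hold real content."""
    root = ET.fromstring(xml_text)
    return [(e.get("id"), e.text)
            for e in root.iter("item")
            if has_real_content(e)]


if __name__ == "__main__":
    for item_id, text in extractable_items(NESTED):
        print(item_id, text)
```

The point of the sketch is that has_real_content is format-specific knowledge: a generic ITS-aware tool would not apply it unless told to.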
 
And sure, any tool dedicated to processing this specific format can do that. My feedback (again, more for the creators of the document than for consumers like you) is that I think most translation tools should be able to process a document provided by Drupal. And the way it is now, conditions like "have only text nodes" (which is not an ITS thing) will make it difficult for many tools to distinguish between structural <item> elements and <item> elements with real content.
 
It would be relatively simple to generate a file where the nodes with content are clearly different from the structural nodes, something like this for example:
 
<job job_id="11" id="11" type_id="8" type="node" xml:lang="de" domain="Presse">
 <item id="11-body">
  <item id="11-body-0">
   <content id="11-body-0-value" its:allowedCharacters="."><![CDATA[blah]]></content>
  </item>
 </item>
</job>
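With a distinct element for content, a consumer no longer needs the "only text nodes" test and can select by element name alone. A minimal sketch, again in Python with xml.etree.ElementTree; the function name and the added xmlns:its declaration are assumptions made so the sample parses:

```python
import xml.etree.ElementTree as ET

# The proposed structure from above, with the its prefix declared
# (an addition for this sketch only).
PROPOSED = """<job xmlns:its="http://www.w3.org/2005/11/its" job_id="11" id="11" type_id="8" type="node" xml:lang="de" domain="Presse">
 <item id="11-body">
  <item id="11-body-0">
   <content id="11-body-0-value" its:allowedCharacters="."><![CDATA[blah]]></content>
  </item>
 </item>
</job>"""


def content_units(xml_text):
    """Return (id, text) pairs for every <content> element in the job."""
    root = ET.fromstring(xml_text)
    return [(e.get("id"), e.text) for e in root.iter("content")]


if __name__ == "__main__":
    for unit_id, text in content_units(PROPOSED):
        print(unit_id, text)
```

The same distinction could be expressed once in ITS global rules keyed on the element name, which is what makes this shape easier for generic tools.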
 
I suppose I had different expectations of the Drupal output. The fact that you, as a consumer, have to create your own XML file from the Drupal output and add things like <xlasbloq> illustrates that the Drupal output is not easily consumable. It shows that any consumer of the Drupal output will have to go through the same "massaging" of the data.
 
Sorry if I'm annoying and seem to be picky :)
I'm just trying to think ahead: if we end up with an open API for the Drupal module, it would provide this same type of document. And I would hope such files would be easily consumable by most translation tools.
 
Cheers,
-yves
 
-- 
Karl Fritsche, Junior Software Developer
Tel.: +49 69 972 69 2604; Mob.: +49 1520 206 30 93; Fax: +49 69 972 69 199; Email: Karl.Fritsche@cocomore.com
Cocomore AG, Gutleutstraße 30, D-60329 Frankfurt
Internet: http://www.cocomore.de  Facebook: http://www.facebook.com/cocomore  Google+: http://plus.cocomore.de
Cocomore is an active member of the World Wide Web Consortium (W3C)
Vorstand: Dr. Hans-Ulrich von Freyberg (Vors.), Dr. Jens Fricke, Marc Kutschera, Vors. des Aufsichtsrates: Martin Velasco, Sitz: Frankfurt/Main, Amtsgericht Frankfurt am Main, HRB 51114

Received on Thursday, 25 October 2012 19:13:52 UTC