RE: [all] XML from drupal from Yves Savourel on 2012-10-25 (public-multilingualweb-lt@w3.org from October 2012)

From: Yves Savourel <ysavourel@enlaso.com>
Date: Thu, 25 Oct 2012 06:27:57 -0600
To: <public-multilingualweb-lt@w3.org>
Message-ID: <assp.0645bfcac1.assp.0645719948.005a01cdb2ac$2a12da10$7e388e30$@com>

Hi Mauricio,

> I didn’t answer to this in first instance because that 
> item nodes nested structure is given to me in the 
> export step from Drupal.

Yes, that's why I was saying it was more a question for Moritz and his team.


> It does not cause any problem to us in the translation aspect 
> because we only extract the content of the nodes that are 
> marked by the ITS rules as translatable (rules, default, 
> inheritance, etc.) and have content.

From an XML processing viewpoint, <item id="11-body"> and <item id="11-body-0"> are both translatable and both have content: each contains some white-space nodes and an <item> element, and that counts as content.

What you probably mean is "we only extract the content of the nodes that are marked by the ITS rules as translatable (rules, default, inheritance, etc.) and have only text nodes, at least one of which contains non-white-space characters."
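
To make that extra condition concrete, here is a minimal sketch in Python of the check a format-specific tool would need. The sample document is hypothetical: it is modeled on the nested <item> structure discussed in this thread, with a leaf id made up for illustration. The sketch also assumes every element is translatable (the ITS default) and does not evaluate any ITS rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modeled on the nested <item> structure discussed here;
# the real Drupal export may differ.
SAMPLE = """<job id="11">
 <item id="11-body">
  <item id="11-body-0">
   <item id="11-body-0-value"><![CDATA[blah]]></item>
  </item>
 </item>
</job>"""


def has_real_content(elem):
    """True if elem has no child elements and its text is not only white space."""
    return len(elem) == 0 and bool((elem.text or "").strip())


def extractable_items(root):
    """Return the <item> elements that carry text rather than structure."""
    return [e for e in root.iter("item") if has_real_content(e)]


root = ET.fromstring(SAMPLE)
ids = [e.get("id") for e in extractable_items(root)]
print(ids)
```

Note that has_real_content() is not something ITS can express: it is extra, format-specific logic on top of the ITS translatability rules.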

And sure, any tool dedicated to processing this specific format can do that. My feedback (again, more for the creators of the document than for consumers like you) is that I think most translation tools should be able to process a document provided by Drupal. And the way it is now, conditions like "have only text nodes" (which is not an ITS thing) will make it difficult for many tools to distinguish between structural <item> elements and <item> elements with real content.

It would be relatively simple to generate a file where the nodes with content are clearly different from the structural nodes, something like this for example:

<job job_id="11" id="11" type_id="8" type="node" xml:lang="de" domain="Presse">
 <item id="11-body">
  <item id="11-body-0">
   <content id="11-body-0-value" its:allowedCharacters="."><![CDATA[blah]]></content>
  </item>
 </item>
</job>
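
With a layout like the one above, a generic tool could select the translatable text by element name alone. The following is only a sketch in Python under a few assumptions: the <content> element name comes from the example above, not from any actual Drupal output, and the xmlns:its declaration is added here so that the fragment parses on its own (the namespace URI is the standard ITS one).

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# Same fragment as above, with the ITS namespace declared (added assumption)
# so that the its: prefix resolves when the fragment is parsed standalone.
SAMPLE = f"""<job xmlns:its="{ITS_NS}" job_id="11" id="11" type_id="8" type="node" xml:lang="de" domain="Presse">
 <item id="11-body">
  <item id="11-body-0">
   <content id="11-body-0-value" its:allowedCharacters="."><![CDATA[blah]]></content>
  </item>
 </item>
</job>"""

root = ET.fromstring(SAMPLE)

# Structural <item> elements are skipped simply because they are not <content>.
segments = {c.get("id"): c.text for c in root.iter("content")}

# ITS local attributes stay available on the element that holds the text.
allowed = {
    c.get("id"): c.get(f"{{{ITS_NS}}}allowedCharacters")
    for c in root.iter("content")
}

print(segments)
print(allowed)
```

No "has only text nodes" test is needed here; a single rule keyed on the element name is enough.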

I suppose I had different expectations of the Drupal output. The fact that you, as a consumer, have to create your own XML file from the Drupal output and add things like <xlasbloq> illustrates that the Drupal output is not easily consumable. It shows that any consumer of the Drupal output will have to go through the same "massaging" of the data.

Sorry if I'm annoying and seem to be picky :)
I'm just trying to think ahead: if we end up with an open API for the Drupal module, it would provide this same type of document. And I would hope such files would be easily consumable by most translation tools.

Cheers,
-yves

Received on Thursday, 25 October 2012 13:26:35 UTC