Re: Missing requirement - content format/type

From: Arle Lommel <arle.lommel@gmail.com>
Date: Mon, 30 Apr 2012 09:29:17 -0800
Cc: public-multilingualweb-lt@w3.org
Message-Id: <3C600952-DA17-4164-87A4-FB81F95C5D0C@gmail.com>
To: Dave Lewis <dave.lewis@cs.tcd.ie>
Hi Dave,

formatType wasn't from me. It was in the original requirements doc, IIRC, and I just ported it over. It was not always clear where things came from. I took ownership of this one simply because Linport was already addressing something similar, but I don't have a strong desire to own it.

I do think, however, that your description of formatType is correct: it was designed to convey to the translator (who might see only an XML file) something about the intended destination at the end of the process. Since that is not always apparent from what the translator receives, it is useful. However, it may not be crucial since, in general, this would already be conveyed outside of the files, in the project negotiation.

Best,

Arle

On Apr 30, 2012, at 09:19, Dave Lewis wrote:

> Hi Yves,
> Yes, I had similar thoughts about the contentType in processTrigger. Even to decode the different implementation variants possible with HTML5 and XML, e.g. ITS 1.0, microdata, RDFa, it would be helpful for parsers to know what they are dealing with. Is it also conceivable that we could have documents that mix host format and ITS markup formats? How much can we build on MIME here?
> 
> Similar parsing issues arise with segmentation, I guess. Either way, it seems this might lead to use cases that are separate from the 'processTrigger' one.
> 
> Format type, though, is something different. The examples given (e.g., subtitles, spoken text) seem to indicate it's more a specification of the intended delivery modality. Arle, was that one from you?
> 
> So perhaps we can:
> 1) rename 'formatType' to 'delivery-modality'
> 2) promote 'contentType' to an independent data category.
> 
> Any thoughts on that, anyone?
> 
> If it sounds OK, I'll make the change tomorrow, as I was planning an update to the processTrigger data category based on previous exchanges anyway.
> 
> cheers,
> Dave
> 
> On 30/04/2012 14:05, Yves Savourel wrote:
>> Hi,
>> 
>> I think there is a need for a data category indicating the format of the content to be processed.
>> 
>> More and more data formats hold content in different formats; a classic one is HTML inside an XML document. But there are many variations of this. There are also the complex cases of nested formats.
>> 
>> It would be useful to have a standard way to indicate such variations of content to extraction tools (and to the other tools down the chain). It would probably be something in the 'internationalization' category.
>> 
>> I see we have a "formatType" requirement [1] and a 'contentType' property in the "processTrigger" requirement [2].
>> 
>> [1] http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#formatType
>> [2] http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#processTrigger
>> 
>> What I have in mind is very close or identical to the 'contentType' Pedro has listed, but as a distinct data category (that, obviously, could be used also in the processTrigger information).
>> 
>> (I also think the 'formatType' data category may be clearer with a different name)
>> 
>> Cheers,
>> -yves
>> 
>> 
>> 
> 
> 
Received on Monday, 30 April 2012 17:29:53 UTC
