- From: <w3t-archive+esw-wiki@w3.org>
- Date: Wed, 14 Sep 2005 03:30:54 -0000
- To: w3t-archive+esw-wiki@w3.org
The following page has been changed by GoutamSaha:
http://esw.w3.org/topic/its0908LinguisticMarkup
------------------------------------------------------------------------------
3-tier schemas embed linguistic metadata in the structure
of an XML document in order to improve the translation
process and obtain more meaningful translations.
+ This 3-tier schema scheme is also useful for Translation Memory
+ processes, which can retain context markup when I18N and L10N
+ developers apply the scheme to both source and target text.
- Develop the first XML schema, which contains various categories of
content domain,
@@ -50, +53 @@
- Examples of word POS categories: for the noun type, proper,
abstract, compound, and so on.
- Developers can categorize the linguistic-related metadata information in the following ways.
+ Developers can categorize the linguistic-related metadata information in the following ways.
Metadata information may vary slightly from one language to another.
Content Domain
@@ -266, +269 @@
<right_parenthesis>
<mid_sentence>
<other_punctuation>
- ==============================================
+
+ ================
+
A typical schema of the content_domain is shown below.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
@@ -353, +358 @@
</xs:complexType>
</xs:element>
</xs:schema>
- ==================================
+
+ ==================
+
A typical schema for Sentence_categories is shown below.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
@@ -427, +434 @@
</xs:complexType>
</xs:element>
</xs:schema>
+
- ================
+ ==============
+ A typical schema for Parts-of-Speech (POS) Markups is shown below.
+
+ <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+ <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.kolkatacdac.in/w3ci18npos" elementFormDefault="qualified">
+ <xs:element name="cat">
+ <xs:complexType mixed="true">
+ <xs:attribute name="name" use="required">
+ <xs:simpleType>
+ <xs:restriction base="xs:string">
+ <xs:enumeration value="adjective"/>
+ <xs:enumeration value="adverb"/>
+ <xs:enumeration value="conjunction"/>
+ <xs:enumeration value="noun"/>
+ <xs:enumeration value="post_position"/>
+ <xs:enumeration value="preposition"/>
+ <xs:enumeration value="pronoun"/>
+ <xs:enumeration value="punctuation"/>
+ <xs:enumeration value="verb"/>
+ </xs:restriction>
+ </xs:simpleType>
+ </xs:attribute>
+ <xs:attribute name="type">
+ <xs:simpleType>
+ <xs:restriction base="xs:string">
+ <xs:enumeration value="abstract_concrete"/>
+ <xs:enumeration value="abstract"/>
+ <xs:enumeration value="addressing"/>
+ <xs:enumeration value="adjective_adjective"/>
+ <xs:enumeration value="adjective_adverb"/>
+ <xs:enumeration value="adversative_coordinating"/>
+ <xs:enumeration value="cardinals"/>
+ <xs:enumeration value="case"/>
+ <xs:enumeration value="causative"/>
+ <xs:enumeration value="collective"/>
+ <xs:enumeration value="comma"/>
+ <xs:enumeration value="common"/>
+ <xs:enumeration value="comparative"/>
+ <xs:enumeration value="compound"/>
+ <xs:enumeration value="conclusive"/>
+ <xs:enumeration value="coordinating"/>
+ <xs:enumeration value="correlative"/>
+ <xs:enumeration value="demonstrative"/>
+ <xs:enumeration value="denoting_others"/>
+ <xs:enumeration value="disjunctive"/>
+ <xs:enumeration value="end_inflecting"/>
+ <xs:enumeration value="eternal_joined"/>
+ <xs:enumeration value="exclusion"/>
+ <xs:enumeration value="following_noun_of_title"/>
+ <xs:enumeration value="fractional_number"/>
+ <xs:enumeration value="general"/>
+ <xs:enumeration value="group"/>
+ <xs:enumeration value="hyphenated_numbers"/>
+ <xs:enumeration value="imperative"/>
+ <xs:enumeration value="inclusive"/>
+ <xs:enumeration value="indeclinable"/>
+ <xs:enumeration value="indefinite"/>
+ <xs:enumeration value="interjectory"/>
+ <xs:enumeration value="interrogative"/>
+ <xs:enumeration value="intransitive"/>
+ <xs:enumeration value="joining"/>
+ <xs:enumeration value="left_parenthesis"/>
+ <xs:enumeration value="material"/>
+ <xs:enumeration value="mid_sentence"/>
+ <xs:enumeration value="negative"/>
+ <xs:enumeration value="non_finite"/>
+ <xs:enumeration value="noun"/>
+ <xs:enumeration value="noun_location"/>
+ <xs:enumeration value="numbers"/>
+ <xs:enumeration value="numeral"/>
+ <xs:enumeration value="onomatopoeic"/>
+ <xs:enumeration value="ordinal"/>
+ <xs:enumeration value="ordinal_number"/>
+ <xs:enumeration value="other"/>
+ <xs:enumeration value="participle/principal"/>
+ <xs:enumeration value="personal"/>
+ <xs:enumeration value="place"/>
+ <xs:enumeration value="preceding_noun_of_title"/>
+ <xs:enumeration value="preceding_noun_title"/>
+ <xs:enumeration value="primary/root"/>
+ <xs:enumeration value="pronoun's"/>
+ <xs:enumeration value="pronounian"/>
+ <xs:enumeration value="proper"/>
+ <xs:enumeration value="quantitative"/>
+ <xs:enumeration value="quote"/>
+ <xs:enumeration value="reflexive"/>
+ <xs:enumeration value="relative"/>
+ <xs:enumeration value="repeated_verb_ending"/>
+ <xs:enumeration value="repetitive"/>
+ <xs:enumeration value="right_parenthesis"/>
+ <xs:enumeration value="sentence_final"/>
+ <xs:enumeration value="subordinating"/>
+ <xs:enumeration value="superlative"/>
+ <xs:enumeration value="suspicion"/>
+ <xs:enumeration value="tense"/>
+ <xs:enumeration value="time"/>
+ <xs:enumeration value="transitive"/>
+ <xs:enumeration value="unit_of_measurement"/>
+ <xs:enumeration value="verb_ending"/>
+ <xs:enumeration value="verbal"/>
+ </xs:restriction>
+ </xs:simpleType>
+ </xs:attribute>
+ </xs:complexType>
+ </xs:element>
+ </xs:schema>
+
+ <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+ <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:p="http://www.kolkatacdac.in/w3ci18npos"
+ elementFormDefault="qualified">
+ <xs:import namespace="http://www.kolkatacdac.in/w3ci18npos"
+ schemaLocation="file:///C:/Documents%20and%20Settings/Administrator/My%20Documents/ITS-XML-Test/auth-postag-14091.xsd"/>
+ <xs:element name="pos_tag">
+ <xs:complexType>
+ <xs:sequence>
+ <xs:element ref="p:cat" maxOccurs="unbounded"/>
+ </xs:sequence>
+ </xs:complexType>
+ </xs:element>
+ </xs:schema>
+
+ ====================================
+
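As a rough illustration of how the two POS schemas above fit together, an instance document might look like the sketch below. It assumes the wrapper schema is saved as pos_tag.xsd next to the instance (a hypothetical file name, not given in the original), and the tagged words are invented sample text. Because the wrapper schema declares no target namespace, pos_tag is unqualified, while each cat element belongs to the http://www.kolkatacdac.in/w3ci18npos namespace. The name and type attributes are unqualified.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical instance; pos_tag.xsd is an assumed file name for the wrapper schema. -->
<pos_tag xmlns:p="http://www.kolkatacdac.in/w3ci18npos"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="pos_tag.xsd">
  <!-- name is required; type is optional. Both must use values enumerated in the schema. -->
  <p:cat name="noun" type="proper">Kolkata</p:cat>
  <p:cat name="verb" type="intransitive">sleeps</p:cat>
  <p:cat name="punctuation" type="sentence_final">.</p:cat>
</pos_tag>
```

Since cat is declared with mixed="true" and no child elements, each cat element can carry the tagged word directly as text. A name or type value outside the enumerated lists should be rejected by a validating parser.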
Received on Wednesday, 14 September 2005 09:51:50 UTC