- From: <w3t-archive+esw-wiki@w3.org>
- Date: Wed, 14 Sep 2005 03:30:54 -0000
- To: w3t-archive+esw-wiki@w3.org
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "ESW Wiki" for change notification.

The following page has been changed by GoutamSaha:
http://esw.w3.org/topic/its0908LinguisticMarkup

------------------------------------------------------------------------------
  3-Tier Schemas to embed linguistic-related metadata information in the structure of an XML document in order to improve the translation process for obtaining more meaningful translation.
+ This 3-tier schema scheme is also useful for the Translation Memory
+ processes to keep context markups when I18N & L10N developers
+ use this scheme for both source and target text.
  - Develop the 1st xml schema that contains various categories on content domain,
@@ -50, +53 @@
  - Examples on word POS categories: for Noun type- proper, abstract,compound etc, and so on.
- Developers can categorize the linguistic-related metadata information in the following ways.
+ Developers can categorize the linguistic-related metadata information in the following ways. Metadata information may vary little from one language to another language.
  Content Domain
@@ -266, +269 @@
  <right_parenthesis>
  <mid_sentence>
  <other_punctuation>
- ==============================================
+
+ ================
+ A typical schema of the content_domain is shown below.
  <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
@@ -353, +358 @@
  </xs:complexType>
  </xs:element>
  </xs:schema>
- ==================================
+
+ ==================
+ A typical schema for Sentence_categories is shown below.
  <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
@@ -427, +434 @@
  </xs:complexType>
  </xs:element>
  </xs:schema>
+
- ================
+ ==============
+ A typical schema for Parts-of-Speech (POS) Markups is shown below.
+
+ <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+ <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.kolkatacdac.in/w3ci18npos" elementFormDefault="qualified">
+   <xs:element name="cat">
+     <xs:complexType mixed="true">
+       <xs:attribute name="name" use="required">
+         <xs:simpleType>
+           <xs:restriction base="xs:string">
+             <xs:enumeration value="adjective"/>
+             <xs:enumeration value="adverb"/>
+             <xs:enumeration value="conjunction"/>
+             <xs:enumeration value="noun"/>
+             <xs:enumeration value="post_position"/>
+             <xs:enumeration value="preposition"/>
+             <xs:enumeration value="pronoun"/>
+             <xs:enumeration value="punctuation"/>
+             <xs:enumeration value="verb"/>
+           </xs:restriction>
+         </xs:simpleType>
+       </xs:attribute>
+       <xs:attribute name="type">
+         <xs:simpleType>
+           <xs:restriction base="xs:string">
+             <xs:enumeration value="abstarct_concrete"/>
+             <xs:enumeration value="abstract"/>
+             <xs:enumeration value="addressing"/>
+             <xs:enumeration value="adjective_adjective"/>
+             <xs:enumeration value="adjective_adverb"/>
+             <xs:enumeration value="adversative_coordinating"/>
+             <xs:enumeration value="cardinals"/>
+             <xs:enumeration value="case"/>
+             <xs:enumeration value="causative"/>
+             <xs:enumeration value="collective"/>
+             <xs:enumeration value="comma"/>
+             <xs:enumeration value="common"/>
+             <xs:enumeration value="comparative"/>
+             <xs:enumeration value="compound"/>
+             <xs:enumeration value="conclusive"/>
+             <xs:enumeration value="coordinating"/>
+             <xs:enumeration value="correlative"/>
+             <xs:enumeration value="demonstrative"/>
+             <xs:enumeration value="denoting_others"/>
+             <xs:enumeration value="disjunctive"/>
+             <xs:enumeration value="end_inflecting"/>
+             <xs:enumeration value="eternal_joined"/>
+             <xs:enumeration value="exclusion"/>
+             <xs:enumeration value="following_noun_of_title"/>
+             <xs:enumeration value="fractional_number"/>
+             <xs:enumeration value="general"/>
+             <xs:enumeration value="group"/>
+             <xs:enumeration value="hyphenated_numbers"/>
+             <xs:enumeration value="imperative"/>
+             <xs:enumeration value="inclusive"/>
+             <xs:enumeration value="indeclinable"/>
+             <xs:enumeration value="indefinite"/>
+             <xs:enumeration value="interjectory"/>
+             <xs:enumeration value="interrogative"/>
+             <xs:enumeration value="intransitive"/>
+             <xs:enumeration value="joining"/>
+             <xs:enumeration value="left_parenthesis"/>
+             <xs:enumeration value="material"/>
+             <xs:enumeration value="mid_sentence"/>
+             <xs:enumeration value="negative"/>
+             <xs:enumeration value="non_finite"/>
+             <xs:enumeration value="noun"/>
+             <xs:enumeration value="noun_location"/>
+             <xs:enumeration value="numbers"/>
+             <xs:enumeration value="numeral"/>
+             <xs:enumeration value="onomatopoeic"/>
+             <xs:enumeration value="ordinal"/>
+             <xs:enumeration value="ordinal_number"/>
+             <xs:enumeration value="other"/>
+             <xs:enumeration value="participle/principal"/>
+             <xs:enumeration value="personal"/>
+             <xs:enumeration value="place"/>
+             <xs:enumeration value="preceeding_noun_of_title"/>
+             <xs:enumeration value="preceeding_noun_title"/>
+             <xs:enumeration value="primary/root"/>
+             <xs:enumeration value="pronoun's"/>
+             <xs:enumeration value="pronounian"/>
+             <xs:enumeration value="proper"/>
+             <xs:enumeration value="quantitative"/>
+             <xs:enumeration value="quote"/>
+             <xs:enumeration value="reflexive"/>
+             <xs:enumeration value="relative"/>
+             <xs:enumeration value="repeated_verb_ending"/>
+             <xs:enumeration value="repetitive"/>
+             <xs:enumeration value="right_parenthesis"/>
+             <xs:enumeration value="sentence_final"/>
+             <xs:enumeration value="subordinating"/>
+             <xs:enumeration value="superlative"/>
+             <xs:enumeration value="suspicion"/>
+             <xs:enumeration value="tense"/>
+             <xs:enumeration value="time"/>
+             <xs:enumeration value="transitive"/>
+             <xs:enumeration value="unit_of_measurement"/>
+             <xs:enumeration value="verb_ending"/>
+             <xs:enumeration value="verbal"/>
+           </xs:restriction>
+         </xs:simpleType>
+       </xs:attribute>
+     </xs:complexType>
+   </xs:element>
+ </xs:schema>
+
+ <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+ <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:p="http://www.kolkatacdac.in/w3ci18npos"
+   elementFormDefault="qualified">
+   <xs:import namespace="http://www.kolkatacdac.in/w3ci18npos"
+     schemaLocation="C:\Documents and Settings\Administrator\My Documents\ITS-XML-Test\auth-postag-14091.xsd"/>
+   <xs:element name="pos_tag">
+     <xs:complexType>
+       <xs:sequence>
+         <xs:element ref="p:cat" maxOccurs="unbounded"/>
+       </xs:sequence>
+     </xs:complexType>
+   </xs:element>
+ </xs:schema>
+
+ ====================================
+
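For illustration only, an instance document that the two POS schemas above appear to accept might look like the sketch below. The sentence and its tagging are assumptions made up for this example and are not taken from the wiki page. Because the wrapper schema declares no targetNamespace, pos_tag is written without a namespace, while cat is taken from the imported http://www.kolkatacdac.in/w3ci18npos namespace. Validation would also depend on the import's schemaLocation resolving on the validating machine.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sample sentence: "Kolkata sleeps." (not from the original page) -->
<pos_tag xmlns:p="http://www.kolkatacdac.in/w3ci18npos">
  <!-- "name" is required; "type" is optional; both must use enumerated values -->
  <p:cat name="noun" type="proper">Kolkata</p:cat>
  <p:cat name="verb" type="intransitive">sleeps</p:cat>
  <p:cat name="punctuation" type="sentence_final">.</p:cat>
</pos_tag>
```

The cat element is declared with mixed="true" and no child elements, so each tagged word is carried as plain text inside the element, with its category in the two attributes.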
Received on Wednesday, 14 September 2005 09:51:50 UTC