- From: Marie Bilde Rasmussen <mariebilderas@gmail.com>
- Date: Fri, 15 Sep 2006 22:54:21 +0200
- To: xmlschema-dev@w3.org
- Message-ID: <c36097090609151354r5d9fef8dieefea05582529108@mail.gmail.com>
Hello everybody.

I can't represent the grammar that I need in a W3C schema without violating the UPA (Unique Particle Attribution) constraint. My task is to represent hyphenation (acceptable word division) of Danish words.

This is my grammar expressed as an EBNF:

    (
      hyphen,
      ( wordpart, ( ( ( hyphen, blank? ) | ( blank, hyphen? ) )?, wordpart )+ )
    )
    |
    (
      ( wordpart, ( ( ( hyphen, blank? ) | ( blank, hyphen? ) )?, wordpart )+ ),
      hyphen?
    )

In (my somewhat broken) English this could be formulated as:

- each represented word consists of at least 2 word parts
- between two word parts there may occur (at most) one hyphen and (at most) one blank; their order is not significant and neither of them is obligatory
- a word can have an initial OR a trailing hyphen (suffixes and prefixes)
- a word can't have both, and most words have neither the initial nor the trailing hyphen

The hyphens represented as elements are NOT a representation of word division points; they are part of the orthography of the word.

I can see that my EBNF representation violates the UPA constraint, in the sense that it is ambiguous which branch of the grammar tree is to be used when a hyphen is encountered immediately following a wordpart in the input data: it could be a separator before another wordpart, or it could be the trailing hyphen.

Can anybody help me reformulate this rule, or tell me why this isn't possible without violating the UPA constraint? If so, I would be very grateful :o)

Marie Bilde Rasmussen
Gyldendal Publishers, Copenhagen (Denmark)
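
To make the intended language concrete, here is a minimal Python sketch that restates the EBNF above as a regular expression over one-letter token codes. The codes (h, b, w), the CODES table and the function name is_valid are assumptions made only for this illustration. It does not solve the UPA problem: a backtracking regex engine tolerates exactly the ambiguity that a schema content model must not have. It can only be used to check whether a candidate reformulation accepts the same sequences.

```python
import re

# Hypothetical one-letter codes for the three element kinds in the grammar.
CODES = {"hyphen": "h", "blank": "b", "wordpart": "w"}

# Optional separator between two word parts: at most one hyphen and
# at most one blank, in either order.
SEP = r"(?:hb?|bh?)?"

# At least two word parts, each pair joined by an optional separator.
CORE = rf"w(?:{SEP}w)+"

# Either an initial hyphen, or an optional trailing hyphen, never both.
WORD = re.compile(rf"h{CORE}|{CORE}h?")


def is_valid(tokens):
    """Return True if the sequence of element names matches the grammar."""
    encoded = "".join(CODES[t] for t in tokens)
    return WORD.fullmatch(encoded) is not None
```

For example, is_valid(["wordpart", "hyphen", "blank", "wordpart"]) accepts a word with a hyphen and a blank between its two parts, while a sequence with both an initial and a trailing hyphen is rejected.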
Received on Saturday, 16 September 2006 00:36:11 UTC