- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Fri, 25 Mar 2022 10:29:43 +0000
- To: "Bethan Tovey-Walsh" <accounts@bethan.wales>, ixml <public-ixml@w3.org>
- Message-Id: <1648203820883.3495840559.840456358@cwi.nl>
You're covered. See the last part of this:

    If more than one parse tree describes the input, the processor must serialize one of them. It is not defined how this choice is made, but the resulting parse must by default be marked as ambiguous by including on the document element of the serialization the attribute ixml:state="ambiguous". Processors may provide a user option to suppress that attribute; they may also provide a user option to produce more than one parse tree.

The point is that the parses may not be equivalent, but there's no efficient way to tell, and there is no meaningful way to specify how one of the parses should be chosen. So we just require one to be produced, with an indication that it is ambiguous, so the user at least has a warning.

Steven

On Thursday 24 March 2022 17:39:52 (+01:00), Bethan Tovey-Walsh wrote:

Hello, all,

There was some discussion in Tuesday's meeting of ambiguous parses and whether it was useful to report them. In particular, I think someone suggested that it didn’t matter which parse the user receives in the case that there’s more than one valid parse. I don’t think I did a good job of explaining why I think that’s wrong, and I want to put my view about this on record.

Take the following string:

    "Father helps mother nurse shark bite patient."

And this grammar:

    sentence: subject, verb_phrase, -'.' .
    modifier: noun; adjective .
    noun_phrase: modifier*, noun .
    complement: non_finite_clause .
    non_finite_clause: nf_verb, object .
    object: noun_phrase .
    subject: noun_phrase .
    verb_phrase: f_verb, object, complement .
    noun: ("Father"; "mother"; "shark"; "bite"; "patient"), -' '? .
    adjective: "nurse", -' '? .
    f_verb: "helps", -' '? .
    nf_verb: ("nurse"; "bite"), -' '? .
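As a rough cross-check of how many ways this grammar can derive the example sentence, here is a minimal Python sketch that counts derivations. It is not an ixml processor and makes several assumptions that are not in the original message: the grammar is transliterated by hand to work on whitespace-separated tokens rather than characters, the optional trailing space (-' '?) is dropped, modifier* is rewritten as a helper nonterminal called modifiers, and the final full stop becomes a token matched by a made-up terminal class called stop.

```python
from functools import lru_cache

# Example sentence from the message; the full stop is split off as its own token.
TOKENS = tuple("Father helps mother nurse shark bite patient.".replace(".", " .").split())

# Terminal classes (hand transliteration; "stop" is a hypothetical name).
LEXICON = {
    "noun": {"Father", "mother", "shark", "bite", "patient"},
    "adjective": {"nurse"},
    "f_verb": {"helps"},
    "nf_verb": {"nurse", "bite"},
    "stop": {"."},
}

# Nonterminal rules; "modifiers" stands in for modifier* (an assumption).
RULES = {
    "sentence": (("subject", "verb_phrase", "stop"),),
    "subject": (("noun_phrase",),),
    "object": (("noun_phrase",),),
    "noun_phrase": (("modifiers", "noun"),),
    "modifiers": ((), ("modifier", "modifiers")),
    "modifier": (("noun",), ("adjective",)),
    "verb_phrase": (("f_verb", "object", "complement"),),
    "complement": (("non_finite_clause",),),
    "non_finite_clause": (("nf_verb", "object"),),
}


@lru_cache(maxsize=None)
def count(symbol, i, j):
    """Number of distinct derivations of `symbol` over TOKENS[i:j]."""
    if symbol in LEXICON:
        return int(j == i + 1 and TOKENS[i] in LEXICON[symbol])
    return sum(count_seq(alt, i, j) for alt in RULES[symbol])


@lru_cache(maxsize=None)
def count_seq(seq, i, j):
    """Number of ways the symbol sequence `seq` can cover TOKENS[i:j]."""
    if not seq:
        return int(i == j)
    head, rest = seq[0], seq[1:]
    total = 0
    for k in range(i, j + 1):
        left = count(head, i, k)
        if left:  # skip dead splits; this also avoids re-entering the same call
            total += left * count_seq(rest, k, j)
    return total


parse_count = count("sentence", 0, len(TOKENS))
print(parse_count)  # 2
```

Under those assumptions the count comes out as two: one derivation takes "bite" as the non-finite verb and one takes "nurse". Counting derivations says nothing about which one a processor would choose to serialize.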

If you parse the string with the grammar, the processor will find two parses:

    <sentence xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">
      <subject>
        <noun_phrase>
          <noun>Father</noun>
        </noun_phrase>
      </subject>
      <verb_phrase>
        <f_verb>helps</f_verb>
        <object>
          <noun_phrase>
            <modifier>
              <noun>mother</noun>
            </modifier>
            <modifier>
              <adjective>nurse</adjective>
            </modifier>
            <noun>shark</noun>
          </noun_phrase>
        </object>
        <complement>
          <non_finite_clause>
            <nf_verb>bite</nf_verb>
            <object>
              <noun_phrase>
                <noun>patient</noun>
              </noun_phrase>
            </object>
          </non_finite_clause>
        </complement>
      </verb_phrase>
    </sentence>

and

    <sentence xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">
      <subject>
        <noun_phrase>
          <noun>Father</noun>
        </noun_phrase>
      </subject>
      <verb_phrase>
        <f_verb>helps</f_verb>
        <object>
          <noun_phrase>
            <noun>mother</noun>
          </noun_phrase>
        </object>
        <complement>
          <non_finite_clause>
            <nf_verb>nurse</nf_verb>
            <object>
              <noun_phrase>
                <modifier>
                  <noun>shark</noun>
                </modifier>
                <modifier>
                  <noun>bite</noun>
                </modifier>
                <noun>patient</noun>
              </noun_phrase>
            </object>
          </non_finite_clause>
        </complement>
      </verb_phrase>
    </sentence>

The first of these represents the structure of a sentence with three actors: a father, a mother nurse-shark, and a patient. The father helps the mother nurse-shark to bite the patient.

The second represents a sentence with three actors: a father, a mother, and a shark-bite patient. The father helps the mother to nurse the shark-bite patient.

Both of these parses are valid, and both give structure to the content of the string. The two different structures represent the semantics of the string differently, and are not equivalent or interchangeable representations of the structure of the string. Parsing linguistic examples which have structural ambiguities is one very good use-case in which I might want to see every possible parse, and might not consider the different parses to be interchangeable.
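
Both serializations above carry ixml:state="ambiguous" on the document element, which is the only warning a consumer gets when a processor returns a single tree. A minimal Python sketch of how downstream code might check for that marker follows. The function name is_marked_ambiguous and the abbreviated sample documents are illustrative, not part of any ixml tool, and splitting the attribute value on whitespace is an assumption made in case a processor writes more than one word there.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the serializations in this message.
IXML_NS = "http://invisiblexml.org/NS"


def is_marked_ambiguous(xml_text):
    """Return True if the document element carries ixml:state containing 'ambiguous'."""
    root = ET.fromstring(xml_text)
    state = root.get(f"{{{IXML_NS}}}state", "")
    # Assumption: treat the value as a whitespace-separated list of words.
    return "ambiguous" in state.split()


# Abbreviated stand-ins for a processor's output (illustrative only).
MARKED = (
    '<sentence xmlns:ixml="http://invisiblexml.org/NS" ixml:state="ambiguous">'
    "<subject><noun_phrase><noun>Father</noun></noun_phrase></subject>"
    "</sentence>"
)
UNMARKED = "<sentence><subject/></sentence>"

print(is_marked_ambiguous(MARKED))    # True
print(is_marked_ambiguous(UNMARKED))  # False
```

A check like this only reports that other parses exist. It cannot recover them, which is why an option to return every parse matters for cases like this one.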

Importantly, my grammar is ambiguous on purpose, because natural language is ambiguous in this way. The fact that my grammar is ambiguous therefore doesn’t represent a hygiene issue. The ambiguity is the focus, not a side-effect.

In any case, I just wanted to note this as some background to explain why I think that the ability to report ambiguity, and to return all possible parses, may be very important. It also, I hope, explains why I’m very opposed to the idea that the parses returned by an ambiguous grammar are always essentially equivalent to each other.

Very best,

Bethan

___________________________________________________
Dr. Bethan Tovey-Walsh
Myfyrwraig PhD | PhD Student
CorCenCC
Prifysgol Abertawe | Swansea University
Croeso i chi ysgrifennu ataf yn y Gymraeg.

Received on Friday, 25 March 2022 10:30:00 UTC