- From: Fahad Khan <anasfkhan81@gmail.com>
- Date: Mon, 1 Jul 2019 19:10:04 +0200
- To: Christian Chiarcos <chiarcos@informatik.uni-frankfurt.de>
- Cc: klimek@informatik.uni-leipzig.de, public-ontolex <public-ontolex@w3.org>
- Message-ID: <CAK+N+9hadMus8M0rVPBZ6Y8AhHQb=O6x9eZZXrXHpqFwJQeb=g@mail.gmail.com>
Dear Christian,

Yes, working with SWRL rules directly in Turtle or XML is a pain, but
there are ways around this: for instance, the interface in Protege lets
you enter, view and edit the rules in the simpler Horn-clause-like
format.

Fahad

On Mon, 1 Jul 2019 at 17:05, Christian Chiarcos
<chiarcos@informatik.uni-frankfurt.de> wrote:

> On .07.2019, 15:51, Fahad Khan <anasfkhan81@gmail.com> wrote:
>
> Dear Christian,
>
> I'm not proposing to parse strings using SWRL (as Wilcock tried to do,
> and I understand why there wasn't much uptake for his idea at the
> time), only to describe morphological patterns in a way that allows
> the generation of morphological variants (word forms) of an entry (a
> far simpler task). The rules for doing this in languages like Italian
> or French are fairly straightforward and efficient enough to work on a
> reasonably sized lexicon (as our work with the SIMPLE lexicon has
> shown). The kinds of rules you get in this particular context aren't
> really much harder to understand than regular expressions, and might
> even be easier once you have a basic grasp of how Horn clauses are
> written in rule languages (not that hard to come by). At the same
> time, using SWRL ensures that we produce human-readable rules while
> remaining within the Semantic Web stack and using pre-existing
> technologies (SWRL functionality, including a pre-installed rule
> engine, is now bundled as a feature with Protege).
>
> However, I strongly suspect that modelling morphological patterns via
> SWRL rules (in order to generate forms) will not be viable for all
> languages (Semitic languages, for instance, though I haven't actually
> tried to model this myself, since I'm not really competent enough to
> do so), so I am not putting it forward as a general-purpose method for
> representing intensional morphological descriptions.
> In fact, I don't think there is one solution, one silver bullet, here
> (in the sense of being both descriptive and machine-actionable while
> allowing us to remain within the whole Semantic Web ecosystem).
> However, I think that whatever we come up with should be as compatible
> as possible with approaches like the SWRL one (which does work with a
> lot of languages, but maybe not all), while at the same time leaving
> open the possibility of using other approaches, such as finite state
> transducers in a more expressive logic programming language. The LMF
> way of doing this was interesting: they started by making a
> distinction between extensional and intensional morphological
> descriptions, and then came up with their own formalism to represent
> such patterns (as strings), which could later be translated into
> other machine-actionable formats.
>
>
> Sure. My personal feeling is that regex with capturing groups might be
> a means to achieve that to a large extent (for morphology). And if
> there were some existing vocabulary to represent left-hand sides and
> right-hand sides of regex-based transformations, it would be ideal for
> our purposes.
> SWRL actually does that, but then look at the Turtle rendering of a
> simple replacement rule:
>
> SWRL:
> myFeat(?x, ?y) , replace(?y, "ak$", "a", ?z) -> myFeat(?x, ?z)
> # this is nice, indeed
>
> TTL (as produced by Protege):
>
> [ rdf:type swrl:Imp ;
>   swrl:head [ rdf:type swrl:AtomList ;
>     rdf:rest rdf:nil ;
>     rdf:first [ rdf:type swrl:DatavaluedPropertyAtom ;
>       swrl:propertyPredicate :myFeat ;
>       swrl:argument1 :x ;
>       swrl:argument2 :z
>     ]
>   ] ;
>   swrl:body [ rdf:type swrl:AtomList ;
>     rdf:rest [ rdf:type swrl:AtomList ;
>       rdf:rest rdf:nil ;
>       rdf:first [ rdf:type swrl:BuiltinAtom ;
>         swrl:builtin swrlb:replace ;
>         swrl:arguments [ rdf:type rdf:List ;
>           rdf:first :y ;
>           rdf:rest [ rdf:type rdf:List ;
>             rdf:first "ak$" ;
>             rdf:rest [ rdf:type rdf:List ;
>               rdf:first "a" ;
>               rdf:rest ( :z )
>             ]
>           ]
>         ]
>       ]
>     ] ;
>     rdf:first [ rdf:type swrl:DatavaluedPropertyAtom ;
>       swrl:propertyPredicate :myFeat ;
>       swrl:argument1 :x ;
>       swrl:argument2 :y
>     ]
>   ]
> ] .
>
> We can get it a little bit more compact if we omit RDFS-inferrable
> triples and rdf:nils:
>
> [ rdf:type swrl:Imp ;
>   swrl:head [ rdf:type swrl:AtomList ;
>     rdf:first [ rdf:type swrl:DatavaluedPropertyAtom ;
>       swrl:propertyPredicate :myFeat ;
>       swrl:argument1 :x ;
>       swrl:argument2 :z ] ] ;
>   swrl:body [ rdf:type swrl:AtomList ;
>     rdf:rest [ rdf:type swrl:AtomList ;
>       rdf:first [ rdf:type swrl:BuiltinAtom ;
>         swrl:builtin swrlb:replace ;
>         swrl:arguments [ rdf:first :y ;
>           rdf:rest [ rdf:first "ak$" ;
>             rdf:rest [ rdf:first "a" ;
>               rdf:rest ( :z ) ] ] ] ] ] ;
>     rdf:first [ rdf:type swrl:DatavaluedPropertyAtom ;
>       swrl:propertyPredicate :myFeat ;
>       swrl:argument1 :x ;
>       swrl:argument2 :y ] ] ] .
>
> But I see no way for further reduction.
>
> I'm not too deep into SWRL; maybe there is a way to provide a more
> readable rendering, but better not to let this particular fragment get
> anywhere near your users.
> For the time being, all OntoLex examples have been Turtle-based, and
> shifting between different levels of representation (i.e., mixing SWRL
> and TTL) in the description will leave people highly confused. I think
> the best we can aim for is a vocabulary that approximately does the
> following:
>
> [ a :ReplacementRule;
>   :onProperty :myFeat;
>   :lhs "ak$";
>   :rhs "a" ]
>
> This is much more restricted than SWRL, of course, but such a
> mini-language can be processed with SPARQL Update, e.g., to generate
> proper SWRL (or anything else).
>
> Best,
> Christian
>
>
> Cheers,
> Fahad
>
> On Mon, 1 Jul 2019 at 15:07, Christian Chiarcos
> <christian.chiarcos@web.de> wrote:
>
>> Dear Fahad,
>>
>> thanks a lot for this update. In fact, it ties in quite neatly with
>> other approaches to parsing with SWRL/RIF, e.g., Graham Wilcock's HPSG
>> parser. On the other hand, we should keep in mind that Wilcock
>> basically failed (not in terms of expressivity or performance, but in
>> terms of adoption by the community), and he himself thus abandoned
>> the idea. So, while we *should* mention that rules can be implemented
>> in this way (in terms of SW technology, this is the "right" way of
>> implementing rules), I don't think we should prescribe either SWRL or
>> RIF.
>>
>> This is for two reasons:
>>
>> On a technological level, RIF is a high-level technology, operating on
>> top of OWL, so its proper handling requires a lot of expertise from
>> the user and is technically demanding. I'm not sure about the
>> popularity of either RIF or OWL beyond the core Semantic Web community
>> anymore, whereas plain RDF is relatively widely used.
>>
>> On a conceptual level, the dominating paradigm in morphology
>> generation is finite state transducers, and these can be reduced to
>> regular expressions, and as we have native support for regex in SPARQL
>> Update, SWRL and most programming languages, this would be more
>> generic and come with a lower entry barrier.
>> But then, regular expressions must not be the only way to populate a
>> paradigm (or a particular inflection type), as many lexicographers and
>> linguists will find this too technical and prefer to provide
>> representative examples rather than concrete rules -- and our
>> modelling should cover both uses.
>>
>> Just my 2ct,
>> Christian
>>
>> PS: I see drawbacks of the regex idea, too, in particular in that it
>> is string-based rather than concept-based.
>>
>> PPS: A compromise could be to use swrlb:replace to write
>> transformation rules with regular expressions. However, the SWRL
>> serialization in Turtle is close to a nightmare (because its bindings
>> are internally represented by lists), and we should probably use TTL
>> for illustrative examples. I doubt we could convincingly sell this to
>> anyone.
>>
>> On .07.2019, 12:13, Fahad Khan <anasfkhan81@gmail.com> wrote:
>>
>> Hi Bettina, All,
>> Here is the poster I presented at Euralex last year, which I mentioned
>> in the last telco and which describes the approach we took to
>> modelling Italian morphology using SWRL:
>>
>> https://docs.google.com/presentation/d/1pHt8IG0ni5x9AkoPCsCCccRPEFIeObW7eR-PxY1JN7A/edit?usp=sharing
>> Cheers,
>> Fahad
>>
>> On Tue, 25 Jun 2019 at 12:12, Bettina Klimek
>> <klimek@informatik.uni-leipzig.de> wrote:
>>
>>> Hi all,
>>>
>>> this is the link to the telco today at 1pm CEST:
>>>
>>> https://hangouts.google.com/call/UNgLuAFv3BfDfX7P5x8EAEEI
>>>
>>> We will continue to discuss the modelling of morphological patterns
>>> and paradigms.
>>>
>>> Regards,
>>>
>>> Bettina
>>>
>>> --
>>> Bettina Klimek
>>> PhD Student
>>> Department of Computer Science, University of Leipzig
>>> Institute for Applied Informatics (InfAI)
>>> Goerdelerring 9
>>> 04109 Leipzig
>>>
>>> Research Group: http://aksw.org/Groups/KILT
>>> Homepage: http://aksw.org/BettinaKlimek
>>> Projects: http://mmoon.org, http://linguistics.okfn.org
>>> Events: 12-17 May 2019 "3rd Summer Datathon on Linguistic Linked
>>>         Open Data (SD-LLOD 2019)"
>>>         https://datathon2019.linguistic-lod.org/
>>>         20-22 May 2019 "LDK 2019 – 2nd Conference on Language, Data
>>>         and Knowledge"
>>>         http://2019.ldk-conf.org/
>
> --
> Prof. Dr. Christian Chiarcos
> Applied Computational Linguistics
> Johann Wolfgang Goethe Universität Frankfurt a. M.
> 60054 Frankfurt am Main, Germany
>
> office: Robert-Mayer-Str. 11-15, #107
> mail: chiarcos@informatik.uni-frankfurt.de
> web: http://acoli.cs.uni-frankfurt.de
> tel: +49-(0)69-798-22463
> fax: +49-(0)69-798-28334
Received on Monday, 1 July 2019 17:10:41 UTC
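
The replacement-rule vocabulary proposed in the thread (a rule with
:onProperty, :lhs and :rhs) could be interpreted roughly as in the
sketch below. It is only an illustration under stated assumptions: it
uses Python rather than the SPARQL Update processing suggested in the
thread, the names ReplacementRule, apply_rule and to_swrl are
hypothetical, and the input string "batak" is invented. It shows two
things: applying the regex rewrite to a property value, and rendering
the same rule in the human-readable SWRL syntax quoted in the thread.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReplacementRule:
    """Hypothetical in-memory stand-in for the :ReplacementRule node."""
    on_property: str  # corresponds to :onProperty, e.g. "myFeat"
    lhs: str          # corresponds to :lhs, a regular expression
    rhs: str          # corresponds to :rhs, the replacement string


def apply_rule(rule: ReplacementRule, entry: dict) -> Optional[str]:
    """Return the rewritten value of the rule's property.

    Returns None when the entry lacks the property or the pattern does
    not match, i.e. when the rule does not produce a new form.
    """
    value = entry.get(rule.on_property)
    if value is None or re.search(rule.lhs, value) is None:
        return None
    return re.sub(rule.lhs, rule.rhs, value)


def to_swrl(rule: ReplacementRule) -> str:
    """Render the rule in the human-readable SWRL syntax from the thread.

    No escaping is done, so lhs and rhs are assumed to contain no
    double quotes.
    """
    p = rule.on_property
    return (f'{p}(?x, ?y) , replace(?y, "{rule.lhs}", "{rule.rhs}", ?z)'
            f' -> {p}(?x, ?z)')


if __name__ == "__main__":
    rule = ReplacementRule(on_property="myFeat", lhs="ak$", rhs="a")
    print(apply_rule(rule, {"myFeat": "batak"}))
    print(to_swrl(rule))
```

One caveat for anything beyond this simple case: Python writes
capture-group references in the replacement as \1, whereas XPath-style
replace functions (on which swrlb:replace is modelled) use $1, so a
real converter would have to translate the right-hand side.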