- From: Felix Sasaki <fsasaki@w3.org>
- Date: Fri, 25 Nov 2016 14:46:11 +0100
- To: public-rax@w3.org
See https://www.w3.org/2016/11/25-rax-minutes.html and below as text.

We discussed the two use cases from Christoph

https://www.w3.org/community/rax/wiki/Draft_Material#Data_acquisition_from_job_postings_via_GATE
https://www.w3.org/community/rax/wiki/Draft_Material#AutomationML_industry_automation_models_integration

and issues with converting from XML/HTML to RDF, potentially including back conversion (round tripping). From that we may derive some general patterns that may be worth documenting. We will provide examples of input and output in the GitHub repository - feel free to do the same.

Next call would be 9 December.

Best,

Felix

   [1]W3C

      [1] http://www.w3.org/

                               - DRAFT -

                                 rax cg

25 Nov 2016

   [2]Agenda

      [2] https://lists.w3.org/Archives/Public/public-rax/2016Nov/0008.html

   See also: [3]IRC log

      [3] http://www.w3.org/2016/11/25-rax-irc

Attendees

   Present
          philr, felix, timea, christoph

   Regrets
          christian, gerard, jose

   Chair
          phil

   Scribe
          fsasaki

Contents

     * Topics
         1. meeting start
         2. bdva summit
         3. AOB
     * Summary of Action Items
     * Summary of Resolutions
     __________________________________________________________

meeting start

   phil: did a review of use cases this morning. not too much change, missed one that christoph added.

   https://www.w3.org/community/rax/wiki/Draft_Material#Data_acquisition_from_job_postings_via_GATE

   phil: thanks a lot for adding this, christoph - can you give a brief description?

   christoph: sure. have not yet managed to share the descriptions, I have more material, and will get it done to share this
   ... will also add more concrete examples. Application setting is: we collect job postings in the form of plain text from the web
   ... we do named entity recognition with GATE, and we get XML output
   ... beginning and end of each token is annotated

   <clange> text text text <start/>recognised entity<end/> text text

   christoph: see above XML example. this has to be translated to RDF

   <clange> <start id="foo"/>

   <clange> <start href="#foo"/>

   christoph: start and end tags look like the above

   <clange> ids or refs (forgot which direction) are in these start/end tags

   christoph: we are using an XSLT based tool I developed (krextor) to create RDF. it is quite hard
   ... with XPath it is hard to select elements between start and end tags
   ... that is a bit tricky, you need a good knowledge of XPath, the sibling axes etc.
   ... in context of European project, in which another partner is doing the extraction

   phil: is this similar to Martynas' case?

   christoph: in terms of XPath complexity, yes
   ... general XML to RDF transformation issue?

   https://github.com/fsasaki/its20-extractor/tree/master/wikipedia-extractor

   <philr> felix: I've written various converters
   <philr> ...it is always special case issues
   <philr> ...XML has various ways to include content
   <philr> ...special purpose handling is somewhat unavoidable
   <philr> ...example documents with guidance would be useful

   scribe: may be useful to give guidance on how to handle various cases

   christoph: there are patterns, e.g. parent child relations in XML and RDF properties
   ... for this you can provide high level translation patterns

   <philr> clange: High level translation is possible with simple parent-child relationships
   <philr> felix: mixture of text and element nodes is challenging

   https://github.com/fsasaki/its20-extractor/blob/master/wikipedia-extractor/its-ta-2-nif-wikipedia.xsl#L43

   <clange> fsasaki: handling of specific links (specific to wiki markup)

   phil: in FREME project we are also doing named entity recognition on plain text. our services are capable of returning turtle files, but we can cover many formats

   https://api-dev.freme-project.eu/ckeditor-dev/ckeditor/samples/freme.html

   various types of output, inline or external using json-ld

   <scribe> ACTION: felix to provide examples of round tripping as done in the freme project [recorded in http://www.w3.org/2016/11/25-rax-minutes.html#action01]

bdva summit

   <philr> felix: to collect information on what better tooling is needed
   <philr> ...best practices and standardization
   <philr> ...1.5 hour session on requirements
   <philr> clange: is there more I can do if I do not attend the summit?
   <philr> felix: it would be good if someone from your organization could attend
   <philr> ...questionnaire to bdva members but want input from companies
   <philr> Is there a fee to join bdva?

   felix: yes, will send info on that

   <clange> fsasaki 14:29: EU is not necessarily interested in new standards being developed, but in existing standards to be _applied_ in a better way

   thanks, clange

   discussion on automationML use case

   felix will send further info on BDVA around

AOB

   next meeting 9th of December

   phil cannot make it, christian to chair

Summary of Action Items

   [NEW] ACTION: felix to provide examples of round tripping as done in the freme project [recorded in http://www.w3.org/2016/11/25-rax-minutes.html#action01]

Summary of Resolutions

   [End of minutes]
     __________________________________________________________

   Minutes formatted by David Booth's [16]scribe.perl version 1.148 ([17]CVS log)
   $Date: 2016/11/25 13:41:09 $

     [16] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
     [17] http://dev.w3.org/cvsweb/2002/scribe/
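
The milestone pattern christoph described (entity text sitting between an empty <start/> and an empty <end/> element, which then has to become RDF) can be sketched as follows. This is only a minimal Python illustration, not krextor and not the XSLT actually used. Assumptions: the wrapper element <doc>, the id value "foo", the choice of id on <start> and href on <end> (the minutes leave the direction open), the example.org base URI, and the use of rdfs:label as the property are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the snippet in the minutes.
# The <doc> wrapper and the id/href direction are assumptions.
SAMPLE = (
    '<doc>text text text <start id="foo"/>recognised entity'
    '<end href="#foo"/> text text</doc>'
)


def entities_between_milestones(xml_text):
    """Return (id, text) pairs for text between <start id="X"/> and <end href="#X"/>."""
    root = ET.fromstring(xml_text)
    results = []
    open_spans = {}  # id -> text fragments collected since the matching <start/>

    def collect(text):
        # Append a text fragment to every span that is currently open.
        if text:
            for parts in open_spans.values():
                parts.append(text)

    def walk(elem):
        if elem.tag == "start":
            open_spans[elem.get("id")] = []
        elif elem.tag == "end":
            ref = (elem.get("href") or "").lstrip("#")
            if ref in open_spans:
                results.append((ref, "".join(open_spans.pop(ref))))
        collect(elem.text)
        for child in elem:
            walk(child)
            # Text following a child belongs to the parent in document order.
            collect(child.tail)

    walk(root)
    return results


def to_ntriples(pairs, base="http://example.org/entity/"):
    """Serialise (id, text) pairs as N-Triples; base URI and property are assumptions."""
    label = "http://www.w3.org/2000/01/rdf-schema#label"
    lines = []
    for ident, text in pairs:
        literal = (
            text.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        lines.append(f'<{base}{ident}> <{label}> "{literal}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(entities_between_milestones(SAMPLE)))
```

Walking the tree in document order avoids the sibling-axis XPath expressions mentioned in the discussion, at the cost of holding open spans in a dictionary. Text inside nested elements between the two milestones is collected as well, which touches on the mixed text and element content issue felix raised.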
Received on Friday, 25 November 2016 13:46:26 UTC