- From: Timothy Holborn <timothy.holborn@gmail.com>
- Date: Wed, 25 Aug 2021 19:31:44 +1000
- To: Dave Raggett <dsr@w3.org>
- Cc: public-cogai <public-cogai@w3.org>
- Message-ID: <CAM1Sok0eie+TG4otxx7R7HZWkbkyRn=GUvbep3SSuky+LRmu8g@mail.gmail.com>
Cheers...

On Wed, 25 Aug 2021, 6:43 pm Dave Raggett, <dsr@w3.org> wrote:

> A well-defined subset of ISO 8601 is used in “chunks” as a convenient
> date/time format that is mapped into its constituent properties.
>
> If you want to model provenance, data quality, etc., you can use chunk
> models as appropriate. This also relates to the role of @context for
> expressing statements about statements, e.g. “John said Mary is at work”;
> see:
>
> https://github.com/w3c/cogai/blob/master/chunks-and-rules.md#statements-about-statements
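As a rough illustration of the date/time mapping Dave mentions, here is a
minimal Python sketch. It is not the chunks implementation, and the property
names are illustrative assumptions rather than the spec's vocabulary; it just
shows what "mapped into its constituent properties" could look like for a
restricted ISO 8601 string.

```python
# Minimal sketch (not the chunks implementation): map a restricted
# ISO 8601 timestamp into name/value properties. Property names are
# illustrative assumptions, not the spec's vocabulary.
from datetime import datetime

def iso8601_to_properties(text: str) -> dict:
    """Parse a restricted ISO 8601 string into constituent properties."""
    dt = datetime.fromisoformat(text)  # handles e.g. 2021-08-25T19:31:44+10:00
    props = {
        "year": dt.year,
        "month": dt.month,
        "day": dt.day,
        "hour": dt.hour,
        "minute": dt.minute,
        "second": dt.second,
    }
    if dt.tzinfo is not None:
        # express the timezone as an offset from UTC, in hours
        props["utcOffset"] = dt.utcoffset().total_seconds() / 3600
    return props

print(iso8601_to_properties("2021-08-25T19:31:44+10:00"))
# {'year': 2021, 'month': 8, 'day': 25, 'hour': 19, 'minute': 31,
#  'second': 44, 'utcOffset': 10.0}
```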
> On 24 Aug 2021, at 15:48, Timothy Holborn <timothy.holborn@gmail.com> wrote:
>
> I noted the defined statement about ISO 8601...
>
> How (when applicable) does the emerging standard seek to consider 'query
> state evidence', meaning: different sources (depended upon for query
> outcomes) having some sort of 'state' evidence / check-sum, and common
> compliance with ISO 8601 (date/time/timezone)?
>
> i.e. laws change... provenance (and insights) evolves. (e.g.
> https://twitter.com/BasilMarte/status/1188098436861743104 occurred in
> 2019, prior to the toilet-paper shortages of the 2020 pandemic.)
>
> Timothy Holborn.
>
> On Tue, 24 Aug 2021 at 22:22, Timothy Holborn <timothy.holborn@gmail.com> wrote:
>
>> Hi Dave (/list),
>>
>> I started working on an email response, but it's turning into a sort of
>> paper expanding upon my introduction to this group given the topic, so I
>> will post it separately; it's still a work in progress.
>>
>> On Tue, 24 Aug 2021 at 00:52, Dave Raggett <dsr@w3.org> wrote:
>>
>>> I hope you have had a pleasant summer. I've been doing some background
>>> reading, and am getting closer to starting some further implementation
>>> experiments. The aim is to develop a good-enough solution for
>>> transforming natural language into graph representations of the meaning.
>>> By good enough, I mean good enough to enable work on using natural
>>> language in experiments on human-like reasoning and learning.
>>>
>>> Natural language understanding can be broken down into sub-tasks, such
>>> as part-of-speech tagging, phrase-structure analysis, and semantic and
>>> pragmatic analysis. The difficulties mainly occur with the semantic
>>> processing. Many words can have multiple meanings, but humans
>>> effortlessly understand which meaning is intended in any given case.
>>> Semantic processing is also needed to figure out prepositional
>>> attachments, and to determine what pronouns, and other kinds of noun
>>> phrases, are referring to.
>>
>> I'm led to believe some of the inferencing aspects link to semiotics
>> (https://en.wikipedia.org/wiki/Semiotics).
>>
>> Back in 2001, I had a minor involvement in cataloguing digibetas and
>> making them searchable as phonetically transcribed MPEG-2 files in a
>> database. From memory,
>> https://en.wikipedia.org/wiki/Nuance_Communications was the leader in
>> phonetic analysis at that time.
>>
>> Early last decade I learned of https://www.mico-project.eu/ and, with
>> that, SPARQL-MM, which I hoped could provide an open-standards-based
>> methodology / tooling / reference platform. I am not presently aware of
>> more advanced work done since.
>>
>> QUESTION: How may the outcome support 'freedom of thought', and how does
>> that relate to the W3C patent-pool-related mandates / membership
>> interests, etc.?
>>
>>> Two decades ago, work on word sense disambiguation focused on n-gram
>>> statistics for word collocations. More recently, artificial neural
>>> networks have proved to be very effective at unsupervised learning of
>>> statistical language models for predicting what text is likely to
>>> follow on from a given text passage. Unfortunately, marvellous as this
>>> is, it isn't transferrable to tasks such as word sense disambiguation
>>> and measuring semantic consistency for deciding on prepositional
>>> attachments, etc.
>>>
>>> I am therefore still looking for practical ways to exploit natural
>>> language corpora to determine word senses in context. The intended
>>> sense of a word is correlated with the words with which it appears in
>>> any given utterance. The accompanying words vary in their specificity
>>> for discriminating particular word senses. However, strongly
>>> discriminating words may be found several words away from the word in
>>> question. A simple n-gram model would require an impractical amount of
>>> memory to capture such dependencies. We therefore need a way to learn
>>> which words/features to pay attention to, and what can safely be
>>> forgotten, as a means to limit the demand on memory.
>>>
>>> I rather like the 1995 paper by David Yarowsky, “Unsupervised Word
>>> Sense Disambiguation Rivaling Supervised Methods”. This assumes that
>>> words have one sense per discourse and one sense per collocation, and
>>> exploits this in an iterative bootstrapping procedure. Other papers
>>> exploit linguistic resources like WordNet. I am now hoping to
>>> experiment with Yarowsky's ideas, using loose parsing for longer-range
>>> dependencies, together with heuristics for discarding collocation data
>>> with weak discrimination.
>>>
>>> I've downloaded free samples of large corpora from www.corpusdata.org
>>> as a basis for experimentation.
>>
>> Perhaps creating some sort of GitHub file or solution that provides
>> references to an array of open resources could be useful?
>>
>>> Each word is given with its lemma and part of speech, e.g. "announced",
>>> "announce", "vvn". This will enable me to apply shift-reduce parsing to
>>> build phrase structures as an input to computing collocations. Further
>>> work would address the potential for utilising prior knowledge, e.g.
>>> from WordNet, and how to compute measures of semantic consistency for
>>> resolving noun phrases and attachment of prepositions as verb
>>> arguments. An open question is whether this can be done effectively
>>> without resorting to artificial neural networks.
>>>
>>> Anyone interested in helping?
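As a concrete toy illustration of the corpus format and collocation
statistics Dave describes above, here is a small Python sketch in the spirit
of Yarowsky (1995). It is a sketch only: the tab-separated word/lemma/PoS
column order follows Dave's example but is an assumption about the file
layout, and the seed-file names in the usage notes are hypothetical.

```python
# Toy sketch (not Dave's planned implementation) of collocation statistics
# for word sense discrimination in the spirit of Yarowsky (1995).
# Assumes tab-separated "word<TAB>lemma<TAB>pos" lines, one token per line;
# the column order follows Dave's example but is an assumption.
from collections import Counter
from math import log

WINDOW = 4  # how many tokens either side of the target to inspect

def load_tokens(path):
    """Yield (word, lemma, pos) triples from a corpus file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) == 3:
                yield tuple(parts)

def collocations(tokens, target_lemma, window=WINDOW):
    """Count lemmas appearing within +/-window of each target occurrence."""
    tokens = list(tokens)
    counts = Counter()
    for i, (_, lemma, _) in enumerate(tokens):
        if lemma == target_lemma:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j][1]] += 1
    return counts

def discrimination(counts_a, counts_b, lemma, smoothing=0.5):
    """Smoothed log-likelihood ratio of a collocate for sense A vs sense B."""
    a = counts_a[lemma] + smoothing
    b = counts_b[lemma] + smoothing
    return log(a / b)

# Usage sketch with hypothetical seed files: rank collocates of "plant" by
# how strongly they discriminate two seed-labelled senses, keeping the
# strongest ones as a Yarowsky-style decision list.
# sense_a = collocations(load_tokens("factory_seed.txt"), "plant")
# sense_b = collocations(load_tokens("botany_seed.txt"), "plant")
# decision_list = sorted(
#     set(sense_a) | set(sense_b),
#     key=lambda w: abs(discrimination(sense_a, sense_b, w)),
#     reverse=True,
# )
```

Yarowsky's iteration would then relabel unlabelled occurrences using the
decision list, recompute the counts, and repeat until the labels stabilise;
heuristics for discarding weakly discriminating collocates keep the table
small, which speaks to the memory concern raised above.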
>> Yes, but 'not ready yet' (personally)... noting I do not have all the
>> necessary skills to support the underlying scope of works without help /
>> cooperative collaboration with others, etc.
>>
>> Also, isn't this sort of stuff computationally intensive? How can
>> experiments be funded? Is there a schema for how projects are defined,
>> covering the scope that is incorporated and the aspects set aside?
>>
>> Q: How and what in the proposed specification supports temporal
>> considerations? Including, but not limited to, cases where an inference
>> has a dependency upon an API and/or third-party query service...
>>
>> Therein, inferences based on 'half-truths' (in simple language) are
>> likely to be different to inferences / determinations (causality) linked
>> to having a better means to form opinions, like mindfulness /
>> consciousness and related facets. This isn't necessarily about results
>> that are bad for purposes intended by others, but sometimes that is the
>> case. The underlying concept links to the idea of 'the status of the
>> observer':
>> https://www.youtube.com/watch?v=ZYPjXz1MVv0&list=PLCbmz0VSZ_voTpRK9-o5RksERak4kOL40&index=4&t=5s
>> My much longer piece of writing (I've been working on it for a couple of
>> hours so far, for the express purpose of this group's work) will go into
>> the considerations / deliberations in more detail; suffice to say for
>> now, IMO, it's quite complicated stuff...
>>
>> I see here:
>> https://github.com/w3c/cogai/blob/master/demos/decision-tree/rules.chk
>> a series of considerations about a 'way of thinking' for a particular
>> illustrated underlying concept. It seems obvious to consider that some
>> such examples are based on physics or similar (i.e. gravity, amongst
>> others); others may be more subjective (i.e. linked to religious /
>> worship-related / spiritual beliefs, or medical procedures, including
>> but not limited to OSCEs:
>> https://en.wikipedia.org/wiki/Objective_structured_clinical_examination ).
>> Does the present scope of works have a concept of 'libraries' or
>> 'sources' or similar? The sci-fi example would be Neo uploading
>> knowledge: https://www.youtube.com/watch?v=w_8NsPQBdV0 ; the more
>> pragmatic example would be virus-signature libraries uploaded (or
>> downloaded, depending on how you think about it) into anti-virus
>> programs...
>>
>> Part of the underlying thought is about 'computational load', which will
>> likely have an impact (various implications) on how solutions can be
>> deployed (how well they may be 'democratised', or similar).
>>
>> Also: what consideration has been given to storing resources on DLTs
>> (i.e. blockchains, DHTs, cryptographically signed (tamper-evident),
>> decentralised resources)? (A sketch of this idea follows below, after
>> the signatures.)
>>
>> Timothy Holborn.
>>
>>> Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
>>> W3C Data Activity Lead & W3C champion for the Web of things
>
> Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett
> W3C Data Activity Lead & W3C champion for the Web of things
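Picking up the 'state evidence / check-sum' and tamper-evidence questions in
this thread, here is a minimal Python sketch of recording what state a
queried source was in when an inference depended on it. The record fields,
the helper name, and the example URL are assumptions for illustration, not
anything defined by the chunks work.

```python
# Minimal sketch of 'query state evidence': record a cryptographic digest
# of a depended-upon source's response, with an ISO 8601 retrieval
# timestamp, so later inferences can show what state they relied upon.
# Field names, helper name, and the example URL are illustrative assumptions.
import hashlib
import json
from datetime import datetime, timezone

def evidence_record(source_url: str, response_body: bytes) -> dict:
    """Build a tamper-evident record for one query result."""
    return {
        "source": source_url,
        "retrievedAt": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(response_body).hexdigest(),
    }

# Usage sketch: attach the record to whatever inference consumed the data.
# Anchoring the digest on a DLT or in a signed log is a separate deployment
# choice; the digest alone already makes later tampering detectable.
record = evidence_record("https://example.org/laws/2021", b"...response bytes...")
print(json.dumps(record, indent=2))
```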
Received on Wednesday, 25 August 2021 09:32:09 UTC