RE: Sense work - update

Namaste Tim 

 

1. Thanks for the good work and the plan for the data format compilation, addressing:

       The container design/format, which would include:

         - all alphabetic characters, including vector representation and likely also Unicode character information, etc.

         - all words, including the phonetic representation, geo-spatial history of words and meanings, parts of speech, antonyms, etc.

 

Note: What we are discussing here is highly advanced research with multidisciplinary links, even by the standards of Ivy League institutions. Such research needs support and adequate, proactive funding for a ‘TEAM’ if the ideas are to become practical and commercially valuable, and to address the global concerns expressed in Paul’s slides. Corporate-funded research always carries tags of IP, NDA and commercial interests to share and explore. Even in an open-source or NGO approach, greedy teams will always find a way to circumvent the ‘Value Centricity’ of the work, irrespective of the fine print in the document. To that extent, this whole exchange is still ‘CONCERN-VOICING, NEEDING A STRONG BACKING’. I don’t have a clue about the ‘how to’ of this point.

 

Would ‘National Governance Regulation’ and ‘Nations Owning Responsibility for Localization of Technology in their Nation’ be an answer? I don’t know. It is too complex.

 

Who, then, should be the proactive funding agencies here? The institutions that suffer identity loss through inappropriate language modelling by techno-linguists.

 

In other words, technology is good for ‘Scripture-Prayer-Language’ as a vehicle for outreach convenience. But the ‘Techno-linguist’ cannot use ‘Technology’ to usurp the power of the ‘Religion-Scholar’ to interpret, and thereby destroy the essential message of sacred text and create and spread confusion and chaos through unfiltered, unaudited, damaging search outputs. This is where ‘techno-corporate glo-co-nomics’ [global-corporate-economics] is at war with ‘ALL RELIGIONS’: the spiritual, sacred, scriptural traditions at the base of the Peace Infrastructure.

    

I am an insider of a specific ‘Religion-Tradition Community’ using the sacred text Srimad Bhagavad-Gita (Book of Yogas) in the ancient language Samskrutham. I have seen, and continue to see, the damage caused by the ISO 639 model approach and the root design challenge of ‘ROMANIZED REPRESENTATION OF NON-ENGLISH LANGUAGES USED IN PRAYERS, RITUALS AND SPIRITUAL PRACTICES’. The technology of language is NOT limited to a partial side view of ‘CHARACTER SCRIPT’ (the visual of a character: its display form on screen, for print and transmission needs) and ‘CHARACTER SOUND’ (the voice of a character: its speech form in conversation, for audio transmission and dictation needs). OCR has serious limitations, just like screen readers, voice transcribers and MT exercises.

 

2. I notice that medical professionals and linguists use a slightly different model of speech anatomy, linked to a character set; in this case, with the ‘International Phonetic Alphabet / Roman character set’ as anchor.

 

3. I am thinking deeply about the first issue of ‘Human Language Modelling’

        - as ‘Thought to Articulation’ [I speak my mind / I say my thought],

              which finds its mechanistic manifestation in ‘Language elements and Processes’

                    per the anatomy of the speaker (in this case, humans), and

                           branching out to specific linguistic diversity and applications.

    Is this not the primary expectation of a ‘Humanized Robo’ running AGI: understanding and responding by voiced speech in a language context?

    This is ‘Language Modelling used as Cognitive Linguistics: the Paninian Samskruth language-grammar design’, which historically stands at the mother root of classical Greek and Latin, and is deeply connected to Classical Hebrew. [The Tower of Babel theory of languages has no relevance as far as this discussion is concerned. I prefer to hold on to the ‘Logos/Word’ concept from the New Testament: John 1:1.] The sacredness of the ‘alphabet’ needs to be respected as much as the ‘sound of the alphabet’. This cannot be handled in isolated lanes.

 

4. Maybe we need a little extended thinking on what aspects and elements of human languages we would be able to ‘place in the container’, as proposed, to mimic a ‘Human Agent: Conversational, Intelligent response’.

     The elevation

          from A.I. built on the primary equation ‘Word processing = Character set processing’

            to AGI seeking ‘SENSE, SENSITIVITY, INTELLIGENT AND ETHICAL DECISION GUIDANCE’

              needs ‘INTEGRATION BEYOND CONTAINERISED POOLING’

                of LANGUAGE ELEMENTS, handling <phonetic representation, geo-spatial history of words and meanings, parts of speech, antonyms>,

                  as APPROPRIATE TECHNOLOGY TO COLLECT, CONNECT AND LINK LINGUISTIC UNIVERSALS.

 

      I may still be dreaming or making a wish list. The work has been ongoing for several decades.

      But one thing I am certain of, from the language design side, concerns the ‘Social English Language Framework’ used in Large Language Modelling (with roots going back to the Chomsky narrative of Universal Language Grammar).

 

What is Chomsky’s universal grammar model? How does it connect to consciousness research and disciplines like neuroscience, physical biology, mind-brain systems, the language gene and the like?

 

Universal grammar - Wikipedia: https://en.wikipedia.org/wiki/Universal_grammar

 

Universal grammar (UG), in modern linguistics, is the theory of the innate biological component of the language faculty, usually credited to Noam Chomsky. The basic postulate of UG is that there are innate constraints on what the grammar of a possible human language could be. When linguistic stimuli are received in the course of language acquisition, children then adopt specific syntactic rules that conform to UG. The advocates of this theory emphasize and partially rely on the poverty of the stimulus (POS) argument and the existence of some universal properties of natural human languages. However, the latter has not been firmly established, as some linguists have argued languages are so diverse that such universality is rare, and the theory of universal grammar remains controversial among linguists.

 

What is the ‘language to genes’ connection, by way of ‘Carbon-Consciousness’?

 

Language and genetics | Max-Planck-Gesellschaft (mpg.de): https://www.mpg.de/19395/language-genetics

Neuroscientists identify key role of language gene | MIT News | Massachusetts Institute of Technology: https://news.mit.edu/2014/language-gene-0915

 

 

5. Right now the base reference for techno-linguistics is still the ‘given social-historical English language characters’ placed in the Unicode standards, and ‘Language Modelling’ is done on spoken English in ‘social media’ mode. This is a working business need in the social-usage economics of data with digital devices, defined to work in a narrowly defined window. Would such a design be able to sustain scaling and diversity, through multilingualism and the diversity of language applications, as ‘Intelligent Use of Language’? I would ask: would the current ‘Watson’ make its choices by ‘value-ethics’?

 

Can ‘Watson’ be programmed for ‘Responsible Response: Human Centric AI, Ethics, Values’?

 

What is IBM Watson supercomputer? | Definition from TechTarget: https://www.techtarget.com/searchenterpriseai/definition/IBM-Watson-supercomputer

 

Thinking…. 

 

Regards

BVK Sastry 

     

 

From: Timothy Holborn [mailto:timothy.holborn@gmail.com] 
Sent: 24 June 2023 22:26
To: BVK Sastry
Cc: The Peace infrastructure Project; public-humancentricai@w3.org
Subject: Re: Sense work - update

 

Hi BVK,

 

I've made this video playlist about HDF5: https://www.youtube.com/watch?v=S74Kc8QYDac&list=PLCbmz0VSZ_vox6DMC33Jzo0suvoakmduY&index=1

 

I've had a bit more of a look at the NetCDF-related works, and I'm not sure whether it's the right path, but I have a few things going on atm; so I thought I'd cover it all via my reply.

 

RE: the HDF5 file structure / format,

 

I'll be using the English language as an example to create a POC that can be used for testing, examples & apps; I'm just starting to think about how to define the mapping file.

 

The container would include:

- all alphabetic characters, including vector representation and likely also Unicode character information, etc.

- all words, including the phonetic representation, geo-spatial history of words and meanings, parts of speech, antonyms, etc.
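
As a rough illustration only (the mapping file isn't defined yet), the two kinds of entries listed above could be sketched as plain Python records. Every class and field name here is an assumption made for the sketch; none of it comes from the sensedocs notes or from HDF5 itself.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class CharacterEntry:
    """One alphabetic character (hypothetical field names)."""
    char: str                                    # the character itself, e.g. "a"
    codepoint: int                               # Unicode code point
    outline: list = field(default_factory=list)  # vector representation as (x, y) points


@dataclass
class WordEntry:
    """One word (hypothetical field names)."""
    text: str
    phonetic: str = ""                           # e.g. an IPA transcription
    parts_of_speech: list = field(default_factory=list)
    antonyms: list = field(default_factory=list)
    history: list = field(default_factory=list)  # (year, place, meaning) tuples


def make_character(char: str) -> CharacterEntry:
    """Fill in the code point from the character itself."""
    return CharacterEntry(char=char, codepoint=ord(char))


# Illustrative values only.
a = make_character("a")
peace = WordEntry(text="peace", parts_of_speech=["noun"], antonyms=["war"])
container = {"characters": [asdict(a)], "words": [asdict(peace)]}
```

Plain dictionaries like `container` are only a convenient intermediate shape; how they would map onto groups and datasets in an HDF5 file is exactly the open question.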

 

There are a few draft notes at https://github.com/WebizenAI/sensedocs but I've also got some updates that I haven't pushed (I had a computer failure; I'll not go into detail). To some degree, I'm not sure how useful those notes are anymore in any case, as the works have advanced a lot since I wrote them & set up humancentricai.org so as to ensure my efforts (webizen) didn't seek to own language or some such related field of moral concerns... then WSIS, UN, establishing this - it's been a bit of a snowball... in any case,

 

There are already various existing resources for producing this type of resource for English, as well as for many other languages; however, the process of defining the structure of this format would have the bigger implication for languages that are not already covered by Unicode / web or 'ai' support generally. Addressing English will end up going into Latin, Old Norse, Celtic and various other stem languages, and indeed I'm also looking forward to encoding heraldry. My cultural journey is somewhat inspired by the consequence of works with Australian indigenous efforts since 2009/10 (related to: https://www.virtualsonglines.org/ ) for people with many languages, but none where books had anything to do with it. It will in turn be much like other use-cases I know you to be very focused on, alongside others in India and elsewhere: the English system (English is used by W3C) can be employed to advance works for other languages and applications... seeking to advance standards, but also seeking to ensure we work to deliver solutions that support human dignity, ASAP, regardless...

 

In effect, this HDF5 methodology should be able to provide structured context about the different ways the dataset (in this case, the English language) may be employed in relation to other documents, applications, programs and sensors. Supporting various comprehensive representations of that dataset is the objective the file format seeks to address, notwithstanding the desirability of also looking at how to produce decentralised protocols as a complementary companion and/or alternative...

 

In turn, that file can be downloaded and, importantly, the contents of these files can be used without having to load the entire file into memory, which would otherwise be a massive barrier. However, the exact method for constructing these files optimally has only barely been started... It'll take some iterations, unless we find some sage, wise, existing experts willing to lend a hand and help accelerate the research to evaluate whether this is indeed a worthy pursuit / solution, or whether there's a better alternative approach. Also, the ecosystem components envisaged make a significant difference as to whether or not there's any useful purpose for these works at all. If natural persons are defined by a wallet and all words are streamed in real time, then arguably there's nothing needed at 'the edge', as we're often called...
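
The point about not loading the whole file can be illustrated without HDF5 at all. The sketch below is a stand-in using only the Python standard library; it is not HDF5 and not the planned format. It writes fixed-width records to a file and then reads back a single record by seeking to its offset. HDF5 libraries offer the same kind of partial access on datasets. The record width, file name and helper names are all assumptions made for the illustration.

```python
import os
import tempfile

RECORD_SIZE = 16  # bytes per word record (assumed width for this sketch)


def write_words(path, words):
    """Write each word as a fixed-width, NUL-padded UTF-8 record."""
    with open(path, "wb") as f:
        for word in words:
            raw = word.encode("utf-8")
            if len(raw) > RECORD_SIZE:
                raise ValueError(f"{word!r} does not fit in {RECORD_SIZE} bytes")
            f.write(raw.ljust(RECORD_SIZE, b"\x00"))


def read_word(path, index):
    """Read only record `index`; the rest of the file is never loaded."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        raw = f.read(RECORD_SIZE)
    return raw.rstrip(b"\x00").decode("utf-8")


# Illustrative data only.
path = os.path.join(tempfile.mkdtemp(), "words.bin")
write_words(path, ["peace", "language", "dignity"])
second = read_word(path, 1)
```

A real container would need variable-length and multi-dimensional data, which is where a format like HDF5 earns its keep; the seek-and-read idea stays the same.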

 

 

Historically, alternatives have included:

- RDF Documents (various notation formats)

- SQL or RDBMS databases 

- Graph Databases

- Vector Databases (emergent)

 

However, I have not found a solution that appears to be as fit-for-purpose as these HDF5-related R&D outcomes, which still need to be advanced and tested. The n-dimensionality of representing social contexts (multi-agent systems) becomes incredibly complex... If it can't be represented, then it can't be processed by systems that depend upon the data (or evidence) being provided to them in order to do their job, whatever that may be...

 

Other use-cases beyond languages (noting that they could be interdependent) will serve objective purposes, like a person's entire 'life log' as may be usefully required for medical purposes, in a court of law, or in various other complex, high-stakes situations where failure to comprehensively provide situational awareness, so as to communicate context, may lead to severe injustices, harms and indeed untimely deaths, and/or outcomes that have unnecessarily negative impacts upon the souls and minds of clinicians, judges & other persons of 'trust'...

 

But overall, there are a lot of use-cases...  many...

 

With respect to logic representation, I haven't further developed any fixed view about it; other than considerations relating to,

 

- to support a 'backwards compatibility' requirement as a safety protocol (inc. social security, digital prisons, etc.), the output should be backwards compatible with Solid.  This appears to be viable atm.

- Prolog, Julia, Matlab, etc.  all provide sophisticated capabilities in fields relating to logic programming...

 

The need to support both subjective and objective realities is a critically important factor. In my opinion, attempts to institute 'thought controls' upon people have the implication of enslavement, effectively, even if the barriers are up for purely commercial reasons (ie: like a tollgate), as they impair the ability to define or communicate 'truth' (objective reality), whether in relation to a dispute that may end up in a court of law or, more broadly, to determine rights, responsibilities, character, context, meaning, values, etc...  This is often also linked with business systems that seek to mute accountability, so as to ensure gainful results without negative repercussions, and at worst thereafter also act to ensure that no other alternatives are allowed.  This is also, to some degree, a social and/or ideological position held by various groups, for various reasons...

 

So, I'm strongly opposed, and I believe that there is a sufficiently significant market of others who are also very interested in solutions that can support 'reality check tech' features. I believe these become essential for providing safe, useful, 'dignity enhancing' foundations to support private & personal AI systems and, by extension, bigger social / community (ie: commercial) systems, which in turn support the social fabric formed by the interwoven dependencies critical for productivity, common law, human rights, etc...

 

There are alternative ideologies out there, and interoperability / portability (of human beings / souls) is an essential safety requirement for human centric ai systems, imo...  

 

Similar to these use-cases, I wish they had deployed 'mental health checks' for 'workers' in that field far earlier, so as to ensure that people who might fail that test with their doctor, or not be able to go and get the test, can be distinguished from those who do have them done...  The actual point is about 'the journey out': the means to ensure that there is capacity to support human rights when the reality of whether agents do support them, or in fact do not, is in real terms able to be tested...
https://www.youtube.com/watch?v=EV1NFYTwM3k

https://www.unodc.org/unodc/en/human-trafficking/2009/anti-human-trafficking-manual.html

https://twitter.com/theprojecttv/status/1670361008760651777

 

like 'fair weather sailors' vs. those you can depend upon in a storm...  something many understand well when writing their wills just before their second deployment on behalf of their countries...  alongside the importance of ensuring best efforts are made to produce the peace infrastructure we need, to transformationally improve the lives of others, everywhere. Life on earth.

 

SO,

 

The objective of this constituency is to form a sufficiently comprehensive means to communicate the full, n-dimensional requirements of datasets requiring these complexities, including languages, which are particularly important as a foundational requirement to support the development of personal ontology support systems. That starts with the tooling needed to improve support for the means through which we may then be better able to understand one another...  via human centric ai systems...

 

These can thereby be employed for defining various ontological systems and, in turn, also support far richer foundational dataset requirements for other AI models that are likely to be 'plugged in' via Python, etc...  These may in turn support 'transformer' models (ie: like ChatGPT) or other neural net, deep learning, machine learning, etc...  packages, and give others the means to produce those packages, by developing them so that they can employ these sorts of underlying data-packages, which are more comprehensive than WordNet and other similar large language datasets; which, as noted, is thought to be one application of many...

 

I also note that, whilst POC work could more easily be done by simply using something like https://www.wordsapi.com/ , as we have discussed, there is a massive issue with respect to the challenges of ensuring support for all languages of prayer and all mother-tongue languages, particularly for private & personal AI agents, as thought important for supporting human rights, the right to self-determination and the right to be heard.

 

Given the very important nature of this problem, I have been working towards figuring out how to define a solution at this early stage, before I've got something far simpler that could otherwise demonstrate various other important considerations / qualities, etc...  This is due to the level of importance that I consider should be afforded to ensuring human centric ai works act to support the human dignity of all members of our human family, for whom language is such a foundational construct of consciousness, selfhood and indeed also the means to support personhood...

 

noting;
sparsity and 'location': https://www.youtube.com/watch?v=6VQILbDqaI4&list=PLCbmz0VSZ_voTpRK9-o5RksERak4kOL40&index=69&t=2407s

quantum language processing: https://www.youtube.com/watch?v=X9uSV1YcOy4&list=PLCbmz0VSZ_voTpRK9-o5RksERak4kOL40&index=67

plausibility vs. understanding: https://www.youtube.com/watch?v=31VRbxAl3t0&list=PLCbmz0VSZ_voTpRK9-o5RksERak4kOL40&index=72&t=1919s

 

Many are very focused on ensuring all members of our human family are issued a 'key' (that, if lost, can be replaced) as the means to define their identity via a 'wallet', and thereby provide support for systems intended to be deployed for purposes relating to health, commerce, education and interactions between natural persons and incorporated entities, particularly governmental entities...  Such alternative 'visions' of the future are believed incapable of supporting some of the detailed requirements considered both before, and from the beginnings of, the W3C works in ~2012-3 that led to my involvement in creating some of those tools. Whilst it is most certainly important to ensure interoperability and portability, akin to the right of persons in relation to faith / religion (as defined by UDHR), there are also many, many factors that still require so much work, notwithstanding testimonials by others in former years declaring that they were doing it all already. And now, well...  it is what it is, whilst various international stakeholders move swiftly to define frameworks for digital transformation based upon what it is they know now, based upon what is available; not ideas that might happen sometime in the future...  As such, the hope is that in future, so long as solutions improve support for human rights, rule of law, etc., these future alternatives be allowed...

 

As such, there's a lot that seemingly isn't best done in the W3C groups and requires a follow-up on the old work I did earlier on forming a means to support these considerations via a global ISOC Topic SIG, which could in turn work with the existing regional chapters around the world (~120 atm??). To be part of this process, you've got to join; here's the link:

https://portal.internetsociety.org/622619/form/join

 

Here are some links to some of the older work relating to these considerations:
Feb 2016 - Knowledge Banking SIG
https://docs.google.com/document/d/1DM3IW6xS2OIT5-OoHYZv3ra2BbGfma0l8EMVauT8KqU/edit?usp=sharing

30 Oct 2017 - Internet Society: Personhood and the Infosphere (A Human Centric Infosphere) Special Interest Group Terms of Reference v0.1
https://docs.google.com/document/d/1RpfRN3hFvmt1GQWdrnQeC060Wr439YELbeDuNJiePX0/edit?usp=sharing

May 2018 - Internet Australia Knowledge Banking SIG Slides
https://docs.google.com/presentation/d/1W-JcGcOZM8JfICTrJyolP3Iw9wWvrfk9UUgwu0gBe30/edit?usp=sharing

July 2018 - Knowledge Banking SIG TOR Draft
https://docs.google.com/document/d/1xKHONGoepiq29r7NMB9T6yd6kPcfWY2JsaDzK6OqnHE/edit?usp=sharing

April 2019 - Web Civics - Global SIG application
https://drive.google.com/file/d/1o1FrGelPmWfA6olhKik--UzSBGH1Rz4o/view?usp=sharing

 

I am also actively looking for support (resources) to advance the works; however, it is very difficult to find people with the required skills who can work without compensation, as has been the case for a decade or so. Indeed, this is one of the many reasons seemingly contributing to technologies developing yet still lacking the functional capacity to better empower people to protect and support their own human rights via lawful means. That is in turn linked with the problems associated with corruption, which the UN suggests is around 5% of GDP: https://press.un.org/en/2018/sc13493.doc.htm and I'm not presently sure how best to calculate the CO2 impacts, the productivity impacts, or the impacts on our ability to better strive to achieve the SDGs.

 

In summary;  

 

Work on this 'hdf5' research, which as far as I'm aware has not been done before, will take some weeks to advance; indeed, it may take some months, if not longer, to get to a point where a download link to working software can be sent to you...  so that your social experience, the way you experience and interact with the world online, your conscious experience of life, becomes far more greatly defined by you; and so that, should you need to 'explain yourself' to whomever, contextually, your means to do so, irrespective of the wealth that may or may not be found in your wallet, so as to ensure peace, can be found and best supported by law.

 

As noted, sadly, there's still a lot to do, but I'm working on it and I hope this helps.  The means to transform these works into something that properly defines software in a way that relates to the Microsoft link you have forwarded requires context; and the most important context to ensure is supported for you is your context, not that of others who could create an alternative reality for you to live in, as a human resource for different sorts of socio-economic models, which may be very difficult to spot unless we can figure out the 'human' level personal ontology stuff, imo...

 

Best.

 

Timothy Holborn.

 

 

 

 

 

 

On Sat, 24 Jun 2023 at 23:30, BVK Sastry <yogasamskrutham@gmail.com> wrote:

Namaste

 

I came across a 1995 link from Microsoft Research which could be worth revisiting in the context of the current discussion.

 

https://www.microsoft.com/en-us/research/publication/the-death-of-computer-languages-the-birth-of-intentional-programming/ 

 

Regards

 

BVK Sastry 

On Wed, 21 Jun 2023, 5:33 am Timothy Holborn, <timothy.holborn@gmail.com> wrote:

 

Received on Monday, 26 June 2023 08:33:49 UTC