RE: ideas for standard JSON-based semantic representation

Since there are no objections to publishing this draft, I’ll go ahead and publish it as a Community Group Report. 

Thanks to those of you who provided comments, and I’m looking forward to getting more public feedback.

 

 

From: Deborah Dahl <Dahl@conversational-Technologies.com> 
Sent: Thursday, February 07, 2019 10:07 AM
To: 'public-voiceinteraction@w3.org' <public-voiceinteraction@w3.org>
Subject: RE: ideas for standard JSON-based semantic representation

 

I think we should go ahead and publish this draft (https://w3c.github.io/voiceinteraction/voice%20interaction%20drafts/emmaJSON.htm), even if there are still some changes we’d like to make. That way we can get some broader feedback.

Does anyone object to publishing this as a Community Group Draft? If you do object, please send a message to the group by Monday, February 11 describing your concerns, and we will discuss. If no one objects, we’ll proceed with publishing the draft.

Thanks,

Debbie

 

From: Deborah Dahl <Dahl@conversational-Technologies.com>
Sent: Friday, December 07, 2018 10:23 AM
To: 'Dirk Schnelle-Walka' <dirk.schnelle@jvoicexml.org>
Cc: 'McTear, Mike' <mf.mctear@ulster.ac.uk>; public-voiceinteraction@w3.org
Subject: RE: ideas for standard JSON-based semantic representation

 

Hi Dirk,

I think it’s a good idea to go into a little more depth on the details of the format. I didn’t want to include a complete definition (like a schema) because I wanted it to be clear that, at this point, this is a strawperson suggestion being proposed for comment. I can add some detail, though. Did you have any ideas about what you would like to see more specifics on?

Thanks,

Debbie

 

From: Dirk Schnelle-Walka <dirk.schnelle@jvoicexml.org>
Sent: Wednesday, December 05, 2018 4:07 PM
To: Deborah Dahl <Dahl@conversational-Technologies.com>
Cc: 'McTear, Mike' <mf.mctear@ulster.ac.uk>; public-voiceinteraction@w3.org
Subject: RE: ideas for standard JSON-based semantic representation

 

Debbie,

 

I am wondering whether it would make sense to define the expected format and data upfront, before describing the example.

 

What do you think?

 

Dirk

 

On 05.12.2018 at 21:48, Deborah Dahl <Dahl@conversational-Technologies.com> wrote:

Are there any more thoughts on this? We can publish this as a Community Group report when we agree on what to say, and then we can get some wider feedback.

 

From: McTear, Mike <mf.mctear@ulster.ac.uk>
Sent: Wednesday, November 28, 2018 9:57 AM
To: Deborah Dahl <Dahl@conversational-Technologies.com>; 'Dirk Schnelle-Walka' <dirk.schnelle@jvoicexml.org>
Cc: public-voiceinteraction@w3.org
Subject: Re: ideas for standard JSON-based semantic representation

 

Hi Debbie,

Yes, that is an excellent idea. It is essential to clear up these terminology issues, and it would be useful to extend the terms to include the greater level of detail provided by EMMA.

 

Regards,

 

Michael McTear

Emeritus Professor of Knowledge Engineering

Ulster University 

https://www.ulster.ac.uk/staff/mf-mctear

http://www.spokenlanguagetechnology.com/

 

Conversational Interaction Conference, San Jose, March 11-12, 2019

http://www.conversationalinteraction.com/program

 

 

From: Deborah Dahl <Dahl@conversational-Technologies.com>
Date: Wednesday, 28 November 2018 at 14:40
To: 'Dirk Schnelle-Walka' <dirk.schnelle@jvoicexml.org>
Cc: "public-voiceinteraction@w3.org" <public-voiceinteraction@w3.org>
Subject: RE: ideas for standard JSON-based semantic representation
Resent-From: <public-voiceinteraction@w3.org>
Resent-Date: Wednesday, 28 November 2018 at 14:40

 

Thanks for your comments. Yes, we thought about a JSON serialization of EMMA in the last WD of EMMA, but we never got around to actually proposing what it might look like. This proposal is an attempt to move that idea forward, with the addition of terminology for “intents” and “entities”, because all the toolkits use something like that (they might call “entities” “slots” or “concepts”, but I would argue that all those terms refer to basically the same thing). Because EMMA is explicitly agnostic as to the actual application-specific semantics, this proposal is a bit of an extension to EMMA.
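
A minimal sketch of the terminology point, assuming two hypothetical toolkit results that differ only in whether they say "slots" or "concepts". All key names here (intent, entities, confidence, tokens) and the sample values are assumptions for illustration; they are not taken from the draft or from any real toolkit's schema.

```python
import json

# Hypothetical toolkit-specific names for the same idea (illustration only).
ENTITY_ALIASES = ("entities", "slots", "concepts")


def normalize_result(raw: dict) -> dict:
    """Map a toolkit-style result onto one illustrative intent/entity shape."""
    entities = {}
    for key in ENTITY_ALIASES:
        # Whatever the toolkit calls them, collect the values under "entities".
        entities.update(raw.get(key, {}))
    return {
        "intent": raw.get("intent"),
        "entities": entities,
        # EMMA-style metadata; these key names are assumptions.
        "confidence": raw.get("confidence"),
        "tokens": raw.get("tokens"),
    }


# Two made-up results that say the same thing with different vocabulary.
toolkit_a = {"intent": "book_flight", "slots": {"destination": "Boston"},
             "confidence": 0.9, "tokens": "fly to Boston"}
toolkit_b = {"intent": "book_flight", "concepts": {"destination": "Boston"},
             "confidence": 0.9, "tokens": "fly to Boston"}

normalized_a = normalize_result(toolkit_a)
normalized_b = normalize_result(toolkit_b)
print(json.dumps(normalized_a, indent=2))
```

Once the vocabulary is unified, the two results are identical, which is the sense in which the terms refer to basically the same thing.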

 

From: Dirk Schnelle-Walka <dirk.schnelle@jvoicexml.org <mailto:dirk.schnelle@jvoicexml.org> > 
Sent: Tuesday, November 27, 2018 7:02 PM
To: Deborah Dahl <dahl@conversational-technologies.com>
Cc: public-voiceinteraction@w3.org
Subject: Re: ideas for standard JSON-based semantic representation

 

Thank you, Deborah. This looks like a great start.

 

Conceptually, I fully agree that the EMMA format is suited to transferring the semantic interpretation from the various NLU toolkits that are available. In contrast, these toolkits usually rely on the JSON format.

 

So, a mapping would be helpful and was already thought of when authoring the EMMA standard:

 

"Not addressed in this draft, but planned for a later Working Draft of EMMA 2.0, is a JSON serialization of EMMA documents for use in contexts were JSON is better suited than XML for representing user inputs and system outputs."

 

One of the questions that ought to be addressed is indeed how to come up with an easy bridge between these two formats. Technically, this is pretty easy.

 

But how much of this mapping do we really need? Although EMMA documents cannot yet be fully represented in JSON, as stated above, EMMA is already prepared to carry a JSON-formatted semantic interpretation via emma:result-format="application/json".
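
A rough sketch of what carrying JSON inside EMMA could look like, using only Python's standard library. It assumes that emma:result-format is placed on emma:interpretation and that the JSON text travels inside emma:literal; the helper names, the id value, and the sample payload are hypothetical and should be checked against the EMMA 2.0 draft.

```python
import json
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"  # EMMA namespace URI
ET.register_namespace("emma", EMMA_NS)


def wrap_json_in_emma(payload: dict, interpretation_id: str = "int1") -> str:
    """Build an EMMA document whose interpretation carries a JSON payload."""
    root = ET.Element(f"{{{EMMA_NS}}}emma", {"version": "2.0"})
    interp = ET.SubElement(root, f"{{{EMMA_NS}}}interpretation", {
        "id": interpretation_id,
        # Assumption: the annotation sits on emma:interpretation.
        f"{{{EMMA_NS}}}result-format": "application/json",
    })
    # Assumption: the JSON text is carried as the content of emma:literal.
    literal = ET.SubElement(interp, f"{{{EMMA_NS}}}literal")
    literal.text = json.dumps(payload)
    return ET.tostring(root, encoding="unicode")


def unwrap_json_from_emma(xml_text: str) -> dict:
    """Read the JSON payload back out of the EMMA wrapper."""
    root = ET.fromstring(xml_text)
    interp = root.find(f"{{{EMMA_NS}}}interpretation")
    if interp.get(f"{{{EMMA_NS}}}result-format") != "application/json":
        raise ValueError("interpretation does not declare a JSON result format")
    return json.loads(interp.find(f"{{{EMMA_NS}}}literal").text)


# Made-up payload in the style of a toolkit result.
payload = {"intent": "book_flight", "entities": {"destination": "Boston"}}
xml_text = wrap_json_in_emma(payload)
print(xml_text)
```

With this approach the EMMA metadata stays in XML and only the application semantics are JSON, so it is a partial bridge rather than a JSON serialization of EMMA itself.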

 

Just some first basic ideas in a sleepless night.

 

Dirk

 

 

On 27.11.2018 at 20:47, Deborah Dahl <dahl@conversational-technologies.com> wrote:

There are currently quite a few cloud-based natural language application development toolkits, all with their own proprietary result formats, even though their functionality doesn’t differ too much. Proprietary formats shouldn’t be necessary. It would be extremely useful to have a standard representation for natural language results for many reasons; for example, to make it easier to switch vendors and to encourage the development of third-party natural language development tools. The EMMA standard (https://w3c.github.io/emma/emma2_0/emma_2_0_editor_draft.html) was developed for representing semantic results and has the ability to represent a rich set of metadata about semantic processing. EMMA would be a good option for use as a standard with current toolkits. However, EMMA is an XML format and all of the current toolkit result formats are based on JSON, which is very popular with developers. I think it should be possible to develop a JSON format that captures the kind of information that’s contained in EMMA. To that end, I put together a writeup with some suggestions for representing natural language results using JSON syntax and added it to the Voice Interaction GitHub repository.

HTML rendered version: https://w3c.github.io/voiceinteraction/voice%20interaction%20drafts/emmaJSON.htm 

Repository: https://github.com/w3c/voiceinteraction/tree/master/voice%20interaction%20drafts/emmaJSON.htm

 

Please take a look and send comments to this list, or post them in the group wiki, https://github.com/w3c/voiceinteraction/wiki/Home/_edit 

We have the option to eventually publish some version of this as a Community Group report.

 

 




 

Received on Tuesday, 12 February 2019 16:32:29 UTC