- From: Rob Manson <roBman@mob-labs.com>
- Date: Fri, 03 Sep 2010 10:05:17 +1000
- To: jacques.lemordant@inria.fr, "Public POI @ W3C" <public-poiwg@w3.org>
Hi Jacques,

I was really pleased to read your ARA/Sound Object presentations [1] [2],
as I think audio is really being treated as the poor cousin in the AR
space at the moment.

I'd be interested to hear your thoughts on how A2ML or other related
options [3] could or could not work in the broader HTML, SVG, WebGL open
stack I've outlined below. I'd especially be interested to hear about
broadening it out from just sound files to include sound streams.

Personally, if I look around, the white headphones I see everyone on the
street wearing suggest to me that the iPod is currently the dominant
form of Augmented Reality/Wearable Computing. It's just that this audio
has no awareness of the context around it and is not integrated with
sensor data in any way.

I'll look forward to hearing your thoughts on this...

roBman

[1] http://www.w3.org/2010/06/w3car/basic_concepts_in_ar_audio.pdf
[2] http://www.w3.org/2010/06/w3car/pres/ARA_Lemordant.pdf
[3] http://createdigitalmusic.com/2010/05/03/real-sound-synthesis-now-an-open-standard-in-the-browser/

On Fri, 2010-09-03 at 09:40 +1000, Rob Manson wrote:
> Hi Alex,
>
> one more small point I forgot to make too.
>
> The bulk of what I've raised is not directly about the POI standard
> itself. My starting point is to try to clarify the high level value
> chain or the new environment that I think this new standard will be
> operating in.
>
> So, I'm aware that at first read my initial proposal [1] and the diagram
> [2] I attached to my last email [3] may seem like it is opening a can of
> worms and covering a really broad range of new technologies that could
> be distracting the list from the discussion of a simple and tractable
> solution. This is not my goal at all.
>
> If we can agree on and document the general structure of this new
> environment (which I think there is quite some agreement on already)
> then we will all be in a better position to focus in on a "fit for
> purpose" standard.
>
> roBman
>
> [1] http://lists.w3.org/Archives/Public/public-poiwg/2010Aug/0053.html
> [2] http://lists.w3.org/Archives/Public/public-poiwg/2010Sep/att-0006/OpenARClientStack-01.png
> [3] http://lists.w3.org/Archives/Public/public-poiwg/2010Sep/0006.html
>
> On Wed, 2010-09-01 at 10:01 -0400, Alex Hill wrote:
> > If I understand correctly, you are suggesting that "triggers" should
> > be formulated in a flexible pattern language that can deal with and
> > respond to any form of sensor data.
> > This would be in contrast to the strictly defined "onClick" type of
> > events in JavaScript or the existing VRML trigger types such as
> > CylinderSensor [1].
> > I think this idea has merit and agree that some significant
> > flexibility in the way authors deal with the multiple visual and
> > mechanical sensors at their disposal is vital to creating compelling
> > AR content.
> > However, the flexibility that this approach would give seems, at
> > first glance, to take away some of the ease of authoring that
> > "baked in" inputs/triggers give.
> > And it is not obvious to me now how one incorporates more general
> > computation into this model.
> > Take the aforementioned CylinderSensor; how would you describe the
> > behavior of this trigger using patterns of interest?
> > While there may be standards that will eventually support this (i.e.
> > the W3C Sensor Incubator Group [2]), I wonder if this type of "sensor
> > filtering language" is beyond our scope.
> >
> > The second main point you make is that we should reconsider the
> > request-response nature of the internet in the AR context.
> > Again, this is an important idea and one worth seriously considering.
> > But in a similar fashion to my concerns about pattern of interest
> > filtering, I worry that this circumvents an existing model that has
> > merit.
> > The data-trigger-response-representation model you suggest already
> > happens routinely in rich Web 2.0 applications.
> > The difference is that it happens under the programmatic control of
> > the author where they have access to a multitude of libraries and
> > resources (e.g. jQuery, database access, hardware, user settings,
> > etc.)
> > (this point is related to another thread about (data)<>-(criteria) [3]
> > where I agree with Jens that we are talking about multiple
> > data-trigger-responses)
> > I may need some tutoring on what developing standards means, but in my
> > view, things like ECMA scripting are an unavoidable part of complex
> > interactivity.
> > Perhaps you can give an example where the cutoff between the current
> > request-response model ends and automatic
> > data-POI-response-presentation begins?
> >
> > On Aug 20, 2010, at 10:19 AM, Rob Manson wrote:
> >
> > > Hi,
> > >
> > > great to see we're onto the "Next Steps" and we seem to be discussing
> > > pretty detailed structures now 8) So I'd like to submit the following
> > > proposal for discussion. This is based on our discussion so far and
> > > the ideas I think we have achieved some resolution on.
> > >
> > > I'll look forward to your replies...
> > >
> > > roBman
> > >
> > > PS: I'd be particularly interested to hear ideas from the linked data
> > > and SSN groups on what parts of their existing work can improve this
> > > model and how they think it could be integrated.
> > >
> > >
> > > What is this POI proposal?
> > > A simple extension to the "request-response" nature of the HTTP
> > > protocol to define a distributed Open AR (Augmented Reality) system.
> > > This sensory based pattern recognition system is simply a structured
> > > "request-response-link-request-response" chain. In this chain the
> > > link is a specific form of transformation.
> > >
> > > It aims to extend the existing web to be sensor aware and
> > > automatically event driven while encouraging the presentation layer
> > > to adapt to support dynamic spatialised information more fluidly.
> > >
> > > One of the great achievements of the web has been the separation of
> > > data and presentation. The proposed Open AR structure extends this to
> > > separate out: sensory data, triggers, response data and presentation.
> > >
> > > NOTE1: There is a wide range of serialisation options that could be
> > > supported and many namespaces and data structures/ontologies that can
> > > be incorporated (e.g. Dublin Core, geo, etc.). The focus of this
> > > proposal is purely at a systemic "value chain" level. It is assumed
> > > that the definition of serialisation formats, namespace support and
> > > common data structures would make up the bulk of the work that the
> > > working group will collaboratively define. The goal here is to define
> > > a structure that enables this to be easily extended in defined and
> > > modular ways.
> > >
> > > NOTE2: The example JSON-like data structures outlined below are purely
> > > to convey the proposed concepts. They are not intended to be realised
> > > in this format at all and there is no attachment at this stage to
> > > JSON, XML or any other representational format. They are purely
> > > conceptual.
> > >
> > > This proposal is based upon the following structural evolution of
> > > devices and client application models:
> > >
> > > PC Web Browser (Firefox, MSIE, etc.):
> > >   mouse      -> sensors -> dom -> data
> > >   keyboard   ->                -> presentation
> > >
> > > Mobile Web Browser (iPhone, Android, etc.):
> > >   gestures   -> sensors -> dom -> data
> > >   keyboard   ->                -> presentation
> > >
> > > Mobile AR Browser (Layar, Wikitude, Junaio, etc.):
> > >   gestures   -> sensors -> custom app -> presentation [*custom]
> > >   keyboard   ->                       -> data [*custom]
> > >   camera     ->
> > >   gps        ->
> > >   compass    ->
> > >
> > > Open AR Browser (client):
> > >   mouse      -> sensors -> triggers -> dom -> presentation
> > >   keyboard   ->                            -> data
> > >   camera     ->
> > >   gps        ->
> > >   compass    ->
> > >   accelerom. ->
> > >   rfid       ->
> > >   ir         ->
> > >   proximity  ->
> > >   motion     ->
> > >
> > > NOTE3: The key next step from Mobile AR to Open AR is the addition of
> > > many more sensor types, migrating presentation and data to open web
> > > based standards and the addition of triggers. Triggers are explicit
> > > links from a pattern to 0 or more actions (web requests).
> > >
> > > Here is a brief description of each of the elements in this high
> > > level value chain.
> > >
> > > clients:
> > > - handle events and request sensory data then filter and link it to
> > >   0 or more actions (web requests)
> > > - clients can cache trigger definitions locally or request them from
> > >   one or more services that match one or more specific patterns.
> > > - clients can also cache response data and presentation states.
> > > - since sensory data, triggers and response data are simply HTTP
> > >   responses all of the normal cache control structures are already in
> > >   place.
> > >
> > > infrastructure (The Internet Of Things):
> > > - networked and directly connected sensors and devices that support
> > >   the Patterns Of Interest specification/standard
> > >
> > >
> > > patterns of interest:
> > > The standard HTTP request response processing chain can be seen as:
> > >
> > >   event -> request -> response -> presentation
> > >
> > > The POI (Pattern Of Interest) value chain is slightly extended.
> > > The most common Mobile AR implementation of this is currently:
> > >
> > >   AR App event -> GPS reading -> get nearby info request ->
> > >     Points Of Interest response -> AR presentation
> > >
> > > A more detailed view clearly splits events into two to create possible
> > > feedback loops. It also splits the request into sensor data and
> > > trigger:
> > >
> > >               +- event -+               +-------+-- event --+
> > > sensor data --+-> trigger -> response data -> presentation -+
> > >
> > > - this allows events that happen at both the sensory and presentation
> > >   ends of the chain.
> > > - triggers are bundles that link a pattern to one or more actions
> > >   (web requests).
> > > - events at the sensor end request sensory data and filter it to find
> > >   patterns that trigger or link to actions.
> > > - these triggers or links can also fire other events that load more
> > >   sensory data that is filtered and linked to actions, etc.
> > > - actions return data that can then be presented. As per standard web
> > >   interactions supported formats can be defined by the requesting
> > >   client.
> > > - events on the presentation side can interact with the data or the
> > >   presentation itself.
> > >
> > > sensory data:
> > > Simple (xml/json/key-value) representations of sensors and their
> > > values at a point in time. These are available via URLs/HTTP requests
> > > e.g. sensors can update these files on change, at regular intervals or
> > > serve them dynamically.
> > > {
> > >   HEAD : {
> > >     date_recorded : "Sat Aug 21 00:10:39 EST 2010",
> > >     source_url : "url"
> > >   },
> > >   BODY : {
> > >     gps : { // based on standard geo data structures
> > >       latitude : "n.n",
> > >       longitude : "n.n",
> > >       altitude : "n"
> > >     },
> > >     compass : {
> > >       orientation : "n"
> > >     },
> > >     camera : {
> > >       image : "url",
> > >       stream : "url"
> > >     }
> > >   }
> > > }
> > > NOTE: All sensor values could be presented inline or externally via a
> > > source URL which could then also reference streams.
> > >
> > > trigger:
> > > structured (xml/json/key-value) filter that defines a pattern and
> > > links it to 0 or more actions (web requests)
> > > {
> > >   HEAD : {
> > >     date_created : "Sat Aug 21 00:10:39 EST 2010",
> > >     author : "roBman@mob-labs.com",
> > >     last_modified : "Sat Aug 21 00:10:39 EST 2010"
> > >   },
> > >   BODY : {
> > >     pattern : {
> > >       gps : [
> > >         {
> > >           name : "iphone",
> > >           id : "01",
> > >           latitude : {
> > >             value : "n.n"
> > >           },
> > >           longitude : {
> > >             value : "n.n"
> > >           },
> > >           altitude : {
> > >             value : "n.n"
> > >           }
> > >         }
> > >         // NOTE: GPS value patterns could have their own ranges defined
> > >         // but usually the client will just set its own at the filter
> > >         // level
> > >         // range : "n",
> > >         // range_format : "metres"
> > >         // This is an area where different client applications can add
> > >         // their unique value
> > >       ],
> > >       cameras : [
> > >         {
> > >           name : "home",
> > >           id : "03",
> > >           type : "opencv_haar_cascade",
> > >           pattern : {
> > >             ...
> > >           }
> > >         }
> > >       ]
> > >     },
> > >     actions : [
> > >       {
> > >         url : "url",
> > >         // Support for referring to sensor values
> > >         // $sensors.gps.latitude & $sensors.compass.orientation
> > >         data : {..},
> > >         method : "POST"
> > >       }
> > >     ]
> > >   }
> > > }
> > >
> > > data:
> > > HTTP Responses
> > >
> > > presentation:
> > > client rendered HTML/CSS/JS/RICH MEDIA (e.g. Images, 3D, Video, Audio,
> > > etc.)
> > >
> > >
> > > At least the following roles are supported as extensions of today's
> > > common "web value chain" roles.
> > >
> > > publishers:
> > > - define triggers that map specific sensor data patterns to useful
> > >   actions (web requests)
> > > - manage the ACL to drive traffic in exchange for value creation
> > > - customise the client apps and content to create compelling
> > >   experiences
> > >
> > > developers:
> > > - create sensor bundles people can buy and install in their own
> > >   environment
> > > - create server applications that allow publishers to register and
> > >   manage triggers
> > > - enable the publishers to make their triggers available to an open
> > >   or defined set of clients
> > > - create the web applications that receive the final actions (web
> > >   requests)
> > > - create the client applications that handle events and map sensor
> > >   data to requests through triggers (Open AR browsers)
> > >
> >
> > [1] http://www.web3d.org/x3d/wiki/index.php/CylinderSensor
> > [2] http://www.w3.org/2005/Incubator/ssn/charter
> > [3]
> >
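
To make the trigger idea in the quoted proposal a little more concrete, here is a minimal Python sketch of how a client might filter a GPS reading against a trigger pattern and turn a match into web requests. Only the HEAD/BODY field names and the $sensors.* reference syntax come from the proposal; the function names (distance_metres, gps_matches, resolve, fire), the 50 metre client-side range, the coordinates and the example.org URLs are all assumptions made up for illustration. The proposal itself stresses that its data structures are purely conceptual, so this is one possible reading rather than a definitive implementation.

```python
import math
import re

# Matches references such as $sensors.gps.latitude inside action data.
SENSOR_REF = re.compile(r"\$sensors\.(\w+)\.(\w+)")


def distance_metres(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    radius = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def gps_matches(patterns, reading, range_metres):
    """True if the reading lies within range of any GPS pattern entry."""
    lat, lon = float(reading["latitude"]), float(reading["longitude"])
    return any(
        distance_metres(lat, lon,
                        float(p["latitude"]["value"]),
                        float(p["longitude"]["value"])) <= range_metres
        for p in patterns
    )


def resolve(value, sensors):
    """Substitute $sensors.<sensor>.<field> references in string values."""
    if not isinstance(value, str):
        return value
    return SENSOR_REF.sub(lambda m: str(sensors[m.group(1)][m.group(2)]), value)


def fire(trigger, sensor_data, range_metres=50.0):
    """Return the web requests a trigger links to, or [] if no match.

    The range is set by the client, as the proposal suggests is usual.
    """
    sensors = sensor_data["BODY"]
    gps_patterns = trigger["BODY"]["pattern"].get("gps", [])
    if gps_patterns and not gps_matches(gps_patterns, sensors["gps"], range_metres):
        return []
    return [
        {
            "url": action["url"],
            "method": action["method"],
            "data": {k: resolve(v, sensors) for k, v in action["data"].items()},
        }
        for action in trigger["BODY"]["actions"]
    ]


# Hypothetical example data: the coordinates and URLs are made up.
sensor_data = {
    "HEAD": {"source_url": "http://example.org/sensors/phone"},
    "BODY": {
        "gps": {"latitude": "-33.8690", "longitude": "151.2095", "altitude": "20"},
        "compass": {"orientation": "90"},
    },
}
trigger = {
    "HEAD": {"author": "someone@example.org"},
    "BODY": {
        "pattern": {
            "gps": [{"name": "iphone", "id": "01",
                     "latitude": {"value": "-33.8688"},
                     "longitude": {"value": "151.2093"}}],
        },
        "actions": [{
            "url": "http://example.org/nearby",
            "data": {"lat": "$sensors.gps.latitude",
                     "heading": "$sensors.compass.orientation"},
            "method": "POST",
        }],
    },
}
actions = fire(trigger, sensor_data)
```

The sketch only builds request descriptions and leaves sending them to the caller, which keeps the pattern filtering separate from the HTTP layer. Camera patterns and the altitude field are not handled, and a reference to a sensor that is missing from the reading would raise a KeyError.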
Received on Friday, 3 September 2010 00:08:43 UTC