Re: thoughts towards a draft AR WG charter from Jens de Smit on 2010-07-28 (public-poiwg@w3.org from July 2010)

From: Jens de Smit <jens.desmit@surfnet.nl>
Date: Wed, 28 Jul 2010 14:13:32 +0200
To: public-poiwg@w3.org
Message-ID: <4C501EEC.4090904@surfnet.nl>
On 09/07/2010 18:23, Christine Perey wrote:
> I feel that a new WG is what is needed in order for the momentum we have
> established to really build up steam among the AR community members and
> to achieve the contribution it has the potential to make for the
> publishers and the consumers of digital content.
> 
> The focus of a W3C WG on defining an AR data format (I am a little
> uncomfortable calling this just "POI") is (I recommend):
> 
>  to define new specifications or extensions to specifications (which
> exist and already work on the Web) in the area of metadata for AR and
> ensuring that when a publisher associates content with "triggers" (of
> any kind here: geo-spatial, visual/image, audio, etc), alone or in
> combination, there is the most efficient "serving up" of the most
> appropriate form of the associated content.
> 
> This is a *MANY possible triggers* (sensed by sensors in any
> device, fixed or mobile, and very soon these will be the results of
> sensor fusion, which will make matters more complicated) to *MANY
> possible outputs* problem.
> 
> For example, one possible output could be a 3D object, if that is what
> was published and the device can display it, and here there are many
> resolutions possible. If the device can only produce text or sounds to
> augment the experience, and there is a sound file published in
> association with that trigger, then it would be the output displayed for
> the user.
> 
> At the end of the day the WG's work must assure three outcomes:
> 
> 1. any publisher can "prepare" content in a data format which is "AR
> ready" or AR enhanced and
> 
> 2. any user can have (reach the data for) the AR experience which is
> most appropriate given the limitations of device, network, license
> agreement with the publisher, etc.
> 
> 3. the AR experience produced is a "true AR" outcome, meaning that the
> digital is fully inserted or otherwise overlaying/sensed by the user in
> the real world, not a visual search.
> 
> To achieve the above means creating specifications which are usable by
> the existing AR SDKs and AR development platforms, with modifications,
> of course.
> 
> In parallel, the work in the graphics community around X3D and Web3D
> Consortium will focus on the "representation problem" of making sure
> that objects look and "act" correctly in the real world.
> 
> There would also be liaisons with ISO, OMA and other organizations.
> 

Hello all,

Since it's been a bit quiet after Christine's nice opening e-mail I'd
like to talk about the aspects that I think should be covered by the AR
data format (which, mind you, will be only part of the WG's charter but
also the most substantial deliverable).

I think it is important to realize that, as AR is a fusion of many
fields, a data format for AR will be a fusion of many existing
specifications already available. What _we_ need to define is a formal
way to describe the relation between contextual input and contextual
output. More concretely, I feel the following items need to be addressed:

- a way to identify and work with triggers. I'm having some trouble
putting this into words but Christine has been using the word trigger a
few times and I think AR is indeed all about triggers: any input that
warrants augmentation of the environment is a trigger in my book. If the
trigger is a geolocated point we're talking about a POI, but it can also
be a fiducial marker or something else (a more sophisticated image, an
object (which includes humans), perhaps a sound, a set of coordinates
delineating a 2D or 3D space, heliocentric coordinates, etc...). I agree
with Christine that we should probably not use the term POI as much as
we did, even though when we start working on a spec I can see POI being
one of the first triggers to get specified completely.

- a way to link data to triggers. Like triggers, data can take many
forms: text, 2D images, 3D objects, animated images (aka silent movies
:), sounds and any combination thereof. I would vote for allowing both
linked data and inline data.

- data formatting itself seems to be well covered by available standards
and is really not our concern anyway.

- a way of aggregating triggers. Aggregation by grouping is used to
create layers/worlds/watchamacallits as well as a way to package the
result of searches and a way to store data sets before transport to a
user agent. Another important form of aggregation may be nesting:
multiple triggers combined to form a new trigger, such as a "person"
with a "red shirt" being one trigger, or a "doorway" near "(54.2863,4.29346)".

- interactions, presumably in the form of event triggers like classical
onfocus and onclick, but perhaps also onnearby and onviewed, triggering
flow of control in a programming language; JavaScript comes to mind but
we need not be limited to that if browsers decide to support other stuff
as well
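
To make the items above a bit more concrete, here is a minimal sketch in
Python of how triggers, linked or inline data, grouping, nesting and event
handlers could fit together. It is only an illustration under stated
assumptions: every class and function name (GeoTrigger, FeatureTrigger,
CompositeTrigger, Content, Augmentation, Layer, Context, is_active,
dispatch), the 50 m "nearby" radius and the example URL are hypothetical
and not taken from ARML/KARML or any existing specification.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple, Union


@dataclass(frozen=True)
class GeoTrigger:
    """A geolocated point, i.e. what the list has been calling a POI."""
    lat: float
    lon: float


@dataclass(frozen=True)
class FeatureTrigger:
    """Any recognised non-geographic input (marker, image, object, sound)."""
    kind: str
    label: str


@dataclass(frozen=True)
class CompositeTrigger:
    """Nesting: a new trigger that requires all of its parts at once."""
    parts: Tuple["Trigger", ...]


Trigger = Union[GeoTrigger, FeatureTrigger, CompositeTrigger]


@dataclass(frozen=True)
class Content:
    """Data attached to a trigger, either inline or linked by URI."""
    media_type: str
    inline: Optional[str] = None
    href: Optional[str] = None


@dataclass
class Augmentation:
    """Links one trigger to its content and optional event handlers."""
    trigger: Trigger
    content: List[Content]
    handlers: Dict[str, Callable[[], None]] = field(default_factory=dict)


@dataclass
class Layer:
    """Aggregation by grouping: a named set of augmentations."""
    name: str
    items: List[Augmentation] = field(default_factory=list)


@dataclass(frozen=True)
class Context:
    """What the user agent currently senses (assumed, simplified)."""
    lat: float
    lon: float
    features: frozenset = frozenset()


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Equirectangular approximation; adequate for short distances."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return 6371000.0 * math.hypot(x, y)


def is_active(trigger: Trigger, ctx: Context, radius_m: float = 50.0) -> bool:
    """True if the trigger (or every part of a nested one) is sensed."""
    if isinstance(trigger, GeoTrigger):
        return distance_m(trigger.lat, trigger.lon, ctx.lat, ctx.lon) <= radius_m
    if isinstance(trigger, FeatureTrigger):
        return trigger in ctx.features
    if isinstance(trigger, CompositeTrigger):
        return all(is_active(part, ctx, radius_m) for part in trigger.parts)
    raise TypeError(f"unknown trigger type: {type(trigger).__name__}")


def dispatch(layer: Layer, ctx: Context, event: str) -> int:
    """Call the handler for `event` on each active augmentation; return count."""
    fired = 0
    for item in layer.items:
        handler = item.handlers.get(event)
        if handler is not None and is_active(item.trigger, ctx):
            handler()
            fired += 1
    return fired


# Usage: a "doorway" near (54.2863, 4.29346) as one nested trigger.
doorway = FeatureTrigger("object", "doorway")
combo = CompositeTrigger((doorway, GeoTrigger(54.2863, 4.29346)))
entry = Augmentation(
    combo,
    [Content("text/plain", inline="Entrance"),
     Content("model/x3d+xml", href="http://example.org/door.x3d")],
    {"onnearby": lambda: print("doorway is nearby")},
)
layer = Layer("demo", [entry])
dispatch(layer, Context(54.2863, 4.29346, frozenset({doorway})), "onnearby")
```

The nested trigger only becomes active when all of its parts are sensed
together, so the same "doorway" seen far from the given coordinates would
not fire the handler. A real format would of course be declarative rather
than Python objects; the sketch is only meant to show the relations that
would need describing.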


I realize I may have gone overboard here because this is probably a big
step away from "defining a POI format, perhaps based on the work done in
ARML/KARML" which has also been suggested, but I've also sensed some
sentiment on the list to very consciously broaden the WG's scope beyond
geolocation-based POI listing, especially after the offer of the
Geolocation WG to incorporate our proposed work into their next charter.

So, I would dearly like to hear from you all how _you_ feel about this
group's future and direction.

Best regards,

Jens
Received on Wednesday, 28 July 2010 12:14:01 UTC