
Re: thoughts towards a draft AR WG charter

From: Thomas Wrobel <darkflame@gmail.com>
Date: Mon, 9 Aug 2010 17:09:48 +0200
Message-ID: <AANLkTikgUidSLx_vuDKoXyoq4XYEncXA1hvc7jUBmkog@mail.gmail.com>
To: Dan Brickley <danbri@danbri.org>
Cc: Matt Womer <mdw@w3.org>, public-poiwg@w3.org
Sorry for my lack of much thought in this thread. There's stuff I'd
like to comment on but haven't had the time.
Things are generally moving in ways I agree with, however.
Just a few thoughts:


>> Shape information, on the other hand, is more complex.  For starters, not all POIs have a shape (e.g. points such as 'center of
>> this room', 'corner of Mass Ave and Vassar St'), or maybe a point is a shape here?  Some POIs may have a shape that's
>> sufficiently described by a simple 2d circle, bounding box or polygon (e.g. "Empire State Building" may have a rectangle
>> representing its base at ground level), or 2d with a height (e.g. "the Empire State Building" is a rectangle that is 443 meters tall).
>> Some POIs might best be represented by a three-dimensional model of varying levels of complexity and detail (e.g. "the
>> Empire State Building" is a rectangle with a pyramid on top or a complex CAD-like model).  I'm not sure we can standardize
>> each of those cases immediately, but a polygon with a height seems doable, and again could be something that is available
>> elsewhere.
>
> The Empire State Building isn't itself a rectangle with a pyramid on
> top; but in some circumstances it's useful to describe it as such. I
> suggest one thing we'll run into quite early (due to the generally
> aggregation-friendly nature of geo data) is that we'll encounter
> multiple independent descriptions of the same real-world thing, but
> [awkwardly] at different levels of abstraction. So a blob on a map
> might 'be' the empire state building; but so might a detailed 3d model
> + floor plan. Should it be a requirement that the POI data format
> handle the ability to express this same-ness across levels of detail?

I'm not sure we need to worry about this separately.

I think there are certainly many ways to describe the same data, both
in terms of form and precision. But these would come under three
different cases:
a) When the data is fundamentally different, potentially from a different source.
b) When the data is the same, but dynamically switched in form because
the viewing device is different.
c) When it's the same data at a different detail level.

So "a" would be two different 3d models made by different people,
associated with the same building.
Which one the user sees would simply come down to the feeds they are
subscribed to. (If they are subscribed to both, I think it should be
left to the client software to pick a way to display both at
once... we don't want to get too deep into dictating user-interface
cases here; we should merely say "both these things are here" and let
the software decide how it wants to represent that.)

"b" would be an example of two different bits of data being associated
with the same source, but designed to be viewed under different
conditions. So you could have a nice 3d model of the Empire State
Building for use on a mobile AR device, but also a simple static
overhead image for use on a map-style view. As we talked about
earlier, this could just be picked by conditional specifications, much
like @media in normal HTML/CSS.
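To make the @media analogy concrete, here is a hypothetical sketch of what such conditional representations could look like: one POI carrying several forms, each tagged with the viewing condition it is meant for. All the field names and media labels here are invented for illustration, not proposals.

```javascript
// Hypothetical sketch: one POI with alternate representations, each
// tagged with the display condition it suits -- loosely analogous to
// @media in CSS. Field names and media labels are invented.
const poi = {
  name: "Empire State Building",
  representations: [
    { media: "ar-3d",    href: "esb-model.x3d"    }, // full 3D model for AR view
    { media: "map-2d",   href: "esb-overhead.png" }, // static overhead image for maps
    { media: "fallback", href: "esb-icon.png"     }  // anything else
  ]
};

// The client picks the representation matching its own display mode,
// falling back if nothing matches -- the feed just lists the options.
function pickRepresentation(poi, mode) {
  return poi.representations.find(r => r.media === mode)
      || poi.representations.find(r => r.media === "fallback");
}
```

The point being that the data format only declares the alternatives; which one is shown stays a client decision, just as with CSS media queries.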

"c" would be one 3d file from one source, but with different
levels of detail specified within it. I'm *sure* there must already be
plenty of LOD solutions out there in existing 3d formats, no?
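There are indeed: VRML97 and X3D, for instance, have an LOD node that switches between child models based on viewer distance against a list of range thresholds. A minimal sketch of that switching logic (names illustrative, not from any spec):

```javascript
// Sketch of distance-based LOD switching, the same idea the VRML/X3D
// LOD node uses: `ranges` are distance thresholds, and the level index
// rises (detail drops) as the viewer moves further away.
function lodLevel(distance, ranges) {
  // ranges = [r0, r1, ...]: level 0 (most detailed) while distance < r0,
  // level 1 while distance < r1, ..., last level beyond the final range.
  let level = 0;
  for (const r of ranges) {
    if (distance < r) break;
    level++;
  }
  return level;
}
```

So a POI format would likely not need to invent anything here; it could defer LOD to whatever 3d format the model is delivered in.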


I think that sums up all the cases, and I think all of those would
have to be handled anyway (or are already handled).

Certainly I hope at least we can all agree (with "a") that anyone
should be allowed to associate their data with any location or image,
and not have one authority or database dictating what goes where.
And because of that, it follows that end-users should be able to pick
which sources' data they see in their FOV/display at the same time. So
from that there might naturally emerge different providers with
different detail levels of models.

============
>> The first one could, perhaps, be built into the browser (although it would be "alert me when my lat, long and altitude is within
>> these parameters"). Equally, this could be accomplished with a bit of JavaScript that could be made into an efficient, minified,
>> re-usable library, just using the GeoLoc API.

It could be, but I'd really think that should only be an optional
thing that JavaScript code can tap into if it wishes.
I don't like the idea that anyone wanting to contribute AR works would
be required to mess about with JavaScript.

For that matter, I'm not even sure JavaScript should be a requirement
at all for an AR viewer. The client end should be able to know how to
handle various sorts of criteria checking based on static associations
it gets from its feeds. (I'm using "feed" to mean any source for the
[criteria]<>[data] links here.) Given the diversity of clients and
their varying processing power/bandwidth, I think they would be better
suited to pick the "update interval". (This also gives them more
flexibility as regards caching.)
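As a rough sketch of what I mean by declarative criteria checking: the feed supplies a static [criteria]<>[data] association, and the client alone decides how often to re-evaluate it. The criteria shape here is entirely invented for illustration.

```javascript
// Minimal sketch: a feed-supplied criteria record ("alert when my
// lat, long and altitude are within these parameters") evaluated
// client-side. The field names are hypothetical, not a proposal.
function withinCriteria(pos, c) {
  return Math.abs(pos.lat - c.lat) <= c.latTolerance &&
         Math.abs(pos.lon - c.lon) <= c.lonTolerance &&
         Math.abs(pos.alt - c.alt) <= c.altTolerance;
}

// The client, not the feed, picks the update interval to suit its own
// processing power, bandwidth, and caching strategy.
function watchCriteria(getPosition, criteria, onMatch, intervalMs) {
  return setInterval(() => {
    if (withinCriteria(getPosition(), criteria)) onMatch();
  }, intervalMs);
}
```

Nothing here requires the content author to write a line of script; the association is static data, and the evaluation loop lives in the client.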

Of course, ways for JavaScript to tap into various geolocation data
are a must for traditional web browsers, but I think things are moving
nicely there already, aren't they?

>> The other one - tell me when object X is seen - is achievable if you have a universal database of things with an agreed identifier
>> structure that is used by all image recognition software.

Indeed.
Preferably more than one database too, with people having the option
of picking which they are subscribed to.
The two ways of doing it I see are either:

a) Some sort of API standard where the client sends an image to the
server and the server gives a reply of what it is. (Basically Google
Goggles style... they were going to have an API at some point.)

b) Some way to encode an image into a hash string that the browser
itself could compare? Not sure of the science behind this really, but
if there's some way of publishing images as a list of strings to
compare against, then the client could evaluate "similarity" and above
a certain threshold consider it a match?
This is obviously more scalable and independent, but probably not
achievable on the current spec of mobile devices?
Certainly not against a global database anyway! (However, this could
be very useful against a smaller feed of image<>data associations. If
you and your friends have tagged just a few dozen things, or you're
looking at a known collection of image markers, you probably don't
want to use a global database at all.)

In either of these two cases, the association between the image and
the data needs to be a standard, and if there needs to be a remote
database for the look-up, the communication to/from the database
would also need to be a standard.
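For what option "b" could look like in practice, here is a very rough sketch using an average hash, one known perceptual-hashing technique: downsample an image to a tiny grayscale grid, emit one bit per cell (above/below the mean), then count differing bits between two hashes. Real recognition would need far more than this; it only illustrates how a client could match against a small published list of strings without a server round-trip.

```javascript
// Average-hash sketch: `pixels` is a flat array of 0-255 grayscale
// values from an already-downsampled image. One bit per pixel:
// 1 if brighter than the mean, 0 otherwise.
function averageHash(pixels) {
  const mean = pixels.reduce((a, b) => a + b, 0) / pixels.length;
  return pixels.map(p => (p > mean ? "1" : "0")).join("");
}

// Similarity = how few bits differ between two equal-length hashes.
function hammingDistance(a, b) {
  let d = 0;
  for (let i = 0; i < a.length; i++) if (a[i] !== b[i]) d++;
  return d;
}

// Client-side match against a feed entry: similar enough if the
// number of differing bits is under a chosen threshold.
function matches(hashA, hashB, threshold) {
  return hammingDistance(hashA, hashB) <= threshold;
}
```

Against a few dozen tagged things this kind of comparison is cheap; it is the global-database case where it stops scaling on the client.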


=====



Apart from that, I'd like to say +1 for the "Patterns Of
Interest" concept in the other thread of emails.
:)
Received on Monday, 9 August 2010 15:10:22 UTC
