- From: Phil Archer <phila@w3.org>
- Date: Wed, 04 Aug 2010 09:33:35 +0100
- To: roBman@mob-labs.com
- CC: public-poiwg@w3.org
Very interesting, thanks, Rob. I think I've said enough for now and am keen to step back into the shadows again. All I would emphasise is that there is "violent agreement" that POIs are not necessarily geolocated. A POI, be it an object, a building or the car in front, can become relevant through any number of sensory inputs.

Cheers

Phil

Rob Manson wrote:
> On Tue, 2010-08-03 at 15:42 +0100, Phil Archer wrote:
>> A location, even one that is defined relative to something else as opposed to a fixed point in space, is not a trigger. The trigger is that a device has arrived at that location (or that a recognisable object/marker has come into view, or that a time has arrived or whatever).
>
> Hi Phil...I think I have quite a different cultural perspective...so forgive me while I dance around a bit and try to clarify my language.
>
> A location from my perspective is an abstract concept that can be represented in a coordinate system such as lat/lon. Without other related information this is simply an abstract concept.
>
> The number of "or"s used in the second part of your sentence is [I believe] a clear hint that a broader "sensor bundle" model is needed. It's such early days that I'd hate for a standard to get fixated just on POIs.
>
>> In Web terms, we're talking about events, no? Things like onclick, onchange etc. /Those/ are the triggers. A Web App that's triggered by absolute geolocation might query the GeoLoc API regularly (say every second) and then you'd query a POI database with a machine version of "what do we know about this place?" That could be enriched with directional and temporal info etc. of course. But, as you say, Matt, that's a /query/ against a dataset.
>
> Well...to step back for a second...what I really honestly think AR is, is a form of "digital sensory perception". The term "sensory perception" can be broken down into two clear concepts.
>
> 1. sensory data
> This is the raw data collected from an environment by sensors/sensory organs.
>
> 2. perception
> After the data is processed, a number of "recognised" features, affordances or "things" are extracted and re-presented. Perception is fundamentally a "representational" process that turns raw data into meaning. It's also important/relevant to note that in our biological world "perception" very likely occurs in at least 2 places.
>
> 1. in our sensory organs
> 2. in our mind
>
> Some would even say it may happen IN the environment before we sense it too. This multi-layered approach maps well to what we're discussing. Raw data may be turned into perceptible [triggers] either in the sensors, in the application, in an online service or really anywhere within the data processing chain.
>
> So I think this is a completely new approach to events. I would hate to think we had to keep stacking on different "onVerb" bindings every time someone wanted to add a new type of sensor interaction/event.
>
>> The term 'trigger' implies dynamism and these are covered by existing working groups who are in various stages of looking at things like GeoLoc, camera, clock etc.
>
> True...however many of these are not looking at it from the perspective we are discussing at all (at least that's how it appears from the outside). For example, the camera/capture API [1] simply seems to deal with embedding a raw camera view into a browser.
>
> The API itself has a gaping hole from my perspective. There's a call to start capturing video and then a call/callback when that is complete. From my experience, AR happens almost exclusively BETWEEN those 2 calls.
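>
> To make that gap concrete, here's a rough sketch of the kind of per-frame hook that's missing (every name in this sketch is made up; none of it is in the draft):
>
>     // hypothetical API -- the current draft has no equivalent of
>     // onFrame, which is exactly the gap being described
>     var session = camera.startCapture();     // call 1: start capturing
>     session.onFrame = function (frame) {
>       // AR lives here, BETWEEN the two calls: raw frames in,
>       // recognised features ("perception") out
>       overlay.render(recognise(frame));      // both hypothetical
>     };
>     session.onComplete = function (video) {
>       // call 2: capture is complete -- too late for AR
>     };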
>
> NOTE: I'm not criticising this group's work, just pointing out our cultural differences. This is why I listed them as one of the groups that I think needs to be intimately engaged in this discussion [2]
>
>> I believe Point of Interest data should be thought of as static. What data you take from that data is triggered by the user's actions, location, time zone or whatever.
>
> I agree with the general point you're making...but perhaps the word "static" is a bit misleading here. "Tends to be static"...but not necessarily. e.g. a User can just as easily be a POI as a building can be. We've done this in systems we've built. And this User may be moving, or even be in the past or future (or even through time!).
>
> [1] http://www.w3.org/TR/2010/WD-capture-api-20100401/#uiexamples
> [2] http://lists.w3.org/Archives/Public/public-poiwg/2010Jul/0048.html
>
> NOTE: I've included 2 responses in 1 email to reduce my overall SPAM rating 8)
>
> On Tue, 2010-08-03 at 17:14:54 +0100, Phil Archer wrote:
>> I've not heard it used elsewhere (that doesn't mean that it's not used elsewhere of course, just that I've not heard it!)
>
> I'm really not tied to the word [trigger]. I've been putting it in [] where possible to denote I'm just using it as a placeholder.
>
>> It's clear to me that a point of interest does not and should not necessarily refer to a location on the Earth's surface (or any other planet for that matter). Recognising an image (such as a label on a bottle of wine) does not imply a geographical location other than one's proximity to said bottle.
>
> This is the key point about [sensor data bundle] vs [location]. Location coordinates are just the archetypal AR [sensor data bundle] but definitely should not be the only ones.
>
> If this point is accepted then POI is relegated to just the archetypal AR content type, with the door being left open to a rich new set of content/concept types after that. As I said before, AR that only supports POIs would be like HTTP that only supports HTML documents. That's useful/necessary for the web, but not sufficient.
>
>> The trigger here is that an image captured by the device has been passed to some sort of image recognition software that has been able to associate it with an identifier that can then be used to access the (static) data about that particular bottle.
>
> See my point above about "sensory perception". And again I'd call out your use of the word "static".
>
>>> You could, I suppose, think of them as "auto triggers". Triggered by the user moving, or things coming into their FOV rather than a click, which is a more active form of user triggering. As you say, these would involve querying a database at an interval, but it would be something automatically done by the browser, and not something specifically coded for like with onClick style javascript events.
>
>> Not sure I quite agree here. The query (to the browser) might be "where am I and which way am I pointing?" That's what GeoLoc does/will enable.
>
> This depends upon your cultural perspective. You could see this as:
>
> - user behaviour (e.g. movement) drives
> - a query to the browser using the GeoLoc API
> - that returns lat/lon/orientation
>
> But that's simply the first step in the sensory perception process. This is extracting raw data from the environment at a point in time. This raw data then needs to be processed into perception in some way to make "meaning". This is essentially what any of the Layar developers do when they create a service that responds to a getPointsOfInterest() request.
>
> So I think the GeoLoc API fits perfectly into the chain...but again this is only useful/necessary for AR but not sufficient.
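>
> As a rough sketch of that chain (the getPointsOfInterest() call and render() are illustrative placeholders; only the Geolocation API itself is real):
>
>     navigator.geolocation.watchPosition(function (pos) {
>       // step 1: raw sensor data extracted from the environment
>       var lat = pos.coords.latitude;
>       var lon = pos.coords.longitude;
>       var heading = pos.coords.heading; // may be null
>       // step 2: "perception" -- a remote service turns the raw
>       // data into meaningful, recognised "things" (POIs)
>       getPointsOfInterest(lat, lon, heading, function (pois) {
>         render(pois); // illustrative placeholder
>       });
>     });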
>
>> I might like to have a slightly different query that said "alert me when we get to Trafalgar Square" or "alert me if a bottle of 2000 St Emilion Grand Cru passes within camera range."
>
> I think "alertness" is a great layer on top of what we have discussed so far that matches "sensory perception" to "goal seeking behaviour". But this is definitely something layered on top. To apply this to the [sensor data bundle]/[trigger] model discussed so far, this would be a [trigger] defined by the user as opposed to by the content creator.
>
> BTW: I really strongly agree with Andy Braun's point that the term "publisher" should be defined as broadly as possible and that the [trigger] creator may be separate from that as well.
>
>> The other one - tell me when object X is seen - is achievable if you have a universal database of things with an agreed identifier structure that is used by all image recognition software. The internet of things and all that. The browser API could then have two functions: "can you give me an identifier for the thing the camera is looking at?" and "tell me when an object with x identifier comes into view."
>
> I'm honestly not trying to be argumentative here 8)
>
> Perhaps I'm reading your language too literally, but it seems to be hardcoding in a form of tunnel vision. While we could allow people to set "alertness" for when a certain object is "seen" within a "camera view"...the near-term real-world likelihood is that object recognition will be correlated against a number of parallel sensor input streams.
>
> e.g. The stereo cameras on my wearable display (2 cameras) and the CCTV camera in the Walmart I'm standing in (1 camera) and the RFID sensor in my belt (yes...I am batman 8) - 1 sensor) and the RFID sensors on the shelves (n sensors) all collaborate to tell me that the bottle in front of me is "over-priced plonk".
>
> This is exactly how our biological sensory perception works.
>
>>> I think Rob Manson expressed it well as a form of triplet:
>>>> "Well...if we did use the "trigger" model then I'd express this as the following RDFa style triplet:
>>>>
>>>> this [location] is a [trigger] for [this information]
>
>> I agree with the sentiment here if not the syntax.
>
>>>> POIs in this format would then become the archetypal AR relationship. The most critical and common subset of the broader relationship:
>>>>
>>>> this [sensor data bundle] is a [trigger] for [this information]
>>>>
>>>> In the standard POIs case the minimum [sensor data bundle] is "lat/lon" and then optionally "relative magnetic orientation"."
>
>> That doesn't work as a triple as you have a literal (the sensor data) as a subject.
>
> Hrm...first...I was really just using this as a simplified way to convey the linking concept 8)
>
> Second...isn't that a matter of perspective? If the location extracted as sensory data (lat/lon) from the user is seen as a property of the user (or object) then I agree with you. In this case it would be:
>
> [user] [is at] [location]
>
> But linguistically it would be equally valid to see this from a sensory perception perspective, where an idealised [sensory data bundle] IS the subject. It is literally turned into a "thing". A pattern to be recognised. It is turned from raw data into a specific pattern.
>
> I do agree however that this would often be a range or a more complex pattern definition, which is where the triplet analogy probably falls over. Anyway...based on where our discussion is, I think it's the linking concept that's important, not this specific syntax.
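>
> As plain data rather than a strict triple, the linking concept might look something like this (every name and value here is illustrative only):
>
>     // "this [sensor data bundle] is a [trigger] for [this information]"
>     var trigger = {
>       sensorDataBundle: {       // the pattern to be recognised
>         lat: 51.508,            // Trafalgar Square, roughly
>         lon: -0.128,
>         radius: 200,            // metres -- a range, not a point
>         orientation: null       // optional relative magnetic orientation
>       },
>       information: "http://example.org/poi/trafalgar-square"
>     };
>
> Note the radius: the [sensor data bundle] really is a pattern/range to be matched, not a single point, which is where a bare triple starts to strain.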
>
>> Yes. I agree. We're linking criteria to data. The data is static. The trigger is that the criteria have been met.
>
> There's that "s" word again 8)
>
>> Thinking about aggregated streams of triggers might be useful in future. i.e. a way to say "tell me when my location is within 200 metres of a shop that sells 2000 St Emilion Grand Cru for less than €50". What's aggregated here is the list of criteria and they might only be accessible by different sensors and data sources.
>
> This is an excellent example of distributing the "perception" part of the "sensory perception" across multiple points in the data processing chain.
>
>> I have no problem at all with the word trigger, I think it's useful. My only real point is that data about an object, be it geographically fixed or otherwise, is not a trigger. The trigger is that the user has moved into a context in which a defined subset of the available data is relevant and needs to be displayed.
>
> I was with you right up to the point you said "user". [sensory data bundle] is flexible and also covers/enables User-less agents that are just as feasible and relevant.
>
> I see how some people could see my points as a dilution of the concept of POIs, and some may even see this type of discussion as a distraction from "the goal".
>
> My simple perspective is that with a little reframing we can get all of the benefits of the POI model while leaving the door open for really innovative new concepts that are only just now starting to be explored.
>
> But I would re-iterate my point:
>
> "AR is digitally mediated sensory perception"
>
> And NOT just:
>
> "A geolocated or camera based POI system"
>
> Looking forward to hearing people's responses 8)
>
> roBman

--
Phil Archer
W3C Mobile Web Initiative
http://www.w3.org/Mobile
http://philarcher.org
@philarcher1
Received on Wednesday, 4 August 2010 08:34:19 UTC