- From: Rob Manson <roBman@mob-labs.com>
- Date: Wed, 04 Aug 2010 15:58:22 +1000
- To: public-poiwg@w3.org
On Tue, 2010-08-03 at 15:42 +0100, Phil Archer wrote:
> A location, even one that is defined relative to something else as
> opposed to a fixed point in space, is not a trigger. The trigger is
> that a device has arrived at that location (or that a recognisable
> object/marker has come into view, or that a time has arrived or
> whatever).

Hi Phil...I think I have quite a different cultural perspective...so forgive me while I dance around a bit and try to clarify my language.

A location, from my perspective, is an abstract concept that can be represented in a coordinate system such as lat/lon. Without other related information it remains just that: an abstraction. The number of "or"s used in the second part of your sentence is [I believe] a clear hint that a broader "sensor bundle" model is needed. It's such early days that I'd hate for a standard to get fixated just on POIs.

> In Web terms, we're talking about events, no? Things like onclick,
> onchange etc. /Those/ are the triggers. A Web App that's triggered by
> absolute geolocation might query the GeoLoc API regularly (say every
> second) and then you'd query a POI database with a machine version of
> "what do we know about this place?" That could be enriched with
> directional and temporal info etc. of course. But, as you say, Matt,
> that's a /query/ against a dataset.

Well...to step back for a second...what I honestly think AR is, is a form of "digital sensory perception". The term "sensory perception" can be broken down into two clear concepts:

1. sensory data
   This is the raw data collected from an environment by sensors/sensory organs.

2. perception
   After the data is processed, a number of "recognised" features, affordances or "things" are extracted and re-presented. Perception is fundamentally a "representational" process that turns raw data into meaning.

It's also important/relevant to note that in our biological world "perception" very likely occurs in at least 2 places:

1. in our sensory organs
2. in our mind

Some would even say it may happen IN the environment before we sense it, too.

This multi-layered approach maps well to what we're discussing. Raw data may be turned into perceptible [triggers] in the sensors, in the application, in an online service or really anywhere within the data processing chain. So I think this is a completely new approach to events. I would hate to think we had to keep stacking on different "onVerb" bindings every time someone wanted to add a new type of sensor interaction/event.

> The term 'trigger' implies dynamism and these are covered by existing
> working groups who are in various stages of looking at things like
> GeoLoc, camera, clock etc.

True...however many of these are not looking at it from the perspective we are discussing at all (at least that's how it appears from the outside). For example, the camera/capture API [1] simply seems to deal with embedding a raw camera view into a browser. The API itself has a gaping hole from my perspective: there's a call to start capturing video and then a call/callback when that is complete. From my experience, AR happens almost exclusively BETWEEN those 2 calls.

NOTE: I'm not criticising this group's work, just pointing out our cultural differences. This is why I listed them as one of the groups that I think need to be intimately engaged in this discussion [2].
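To make the "between those 2 calls" point a bit more concrete, here's a very rough sketch of the kind of per-frame hook I mean. It's plain JavaScript, not anything from the Capture API draft, and processFrame/startAR are just names I've made up for illustration:

  // Sketch only: grab the current frame from a playing <video> element so
  // the pixels can actually be inspected while capture is still running.
  function processFrame(video, canvas) {
    var ctx = canvas.getContext("2d");
    // copy the current video frame onto a canvas so we can read its pixels
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    var pixels = ctx.getImageData(0, 0, canvas.width, canvas.height).data;
    // ...marker/feature recognition over "pixels" would go here, i.e. the
    // "perception" step that turns raw sensor data into a [trigger]...
  }

  function startAR(video, canvas) {
    // poll roughly every frame; this is the bit the current API has no hook for
    return setInterval(function () { processFrame(video, canvas); }, 40);
  }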
> I believe Point of Interest data should be thought of as static. What
> data you take from that data is triggered by the user's actions,
> location, time zone or whatever.

I agree with the general point you're making...but perhaps the word "static" is a bit misleading here. "Tends to be static"...but not necessarily. e.g. a user can just as easily be a POI as a building can be. We've done this in systems we've built. And this user may be moving, or even be in the past or future (or even moving through time!).

[1] http://www.w3.org/TR/2010/WD-capture-api-20100401/#uiexamples
[2] http://lists.w3.org/Archives/Public/public-poiwg/2010Jul/0048.html

NOTE: I've included 2 responses in 1 email to reduce my overall SPAM rating 8)

On Tue, 2010-08-03 at 17:14:54 +0100, Phil Archer wrote:
> I've not heard it used elsewhere (that doesn't mean that it's not used
> elsewhere of course, just that I've not heard it!)

I'm really not tied to the word [trigger]. I've been putting it in [] where possible to denote that I'm just using it as a placeholder.

> It's clear to me that a point of interest does not and should not
> necessarily refer to a location on the Earth's surface (or any other
> planet for that matter). Recognising an image (such as a label on a
> bottle of wine) does not imply a geographical location other than
> one's proximity to said bottle.

This is the key point about [sensor data bundle] vs [location]. Location coordinates are just the archetypal AR [sensor data bundle], but they definitely should not be the only ones. If this point is accepted, then POI is relegated to just the archetypal AR content type, with the door left open to a rich new set of content/concept types after that. As I said before, AR that only supports POIs would be like HTTP that only supports HTML documents. That's useful/necessary for the web, but not sufficient.

> The trigger here is that an image captured by the device has been
> passed to some sort of image recognition software that has been able
> to associate it with an identifier that can then be used to access the
> (static) data about that particular bottle.

See my point above about "sensory perception". And again I'd call out your use of the word "static".

> > You could, I suppose, think of them as "auto triggers". Triggered
> > by the user moving, or things coming into their FOV rather than a
> > click, which is a more active form of user triggering. As you say,
> > these would involve querying a database at an interval, but it
> > would be something automatically done by the browser, and not
> > something specifically coded for like with onClick style javascript
> > events.
>
> Not sure I quite agree here. The query (to the browser) might be
> "where am I and which way am I pointing?" That's what GeoLoc
> does/will enable.

This depends upon your cultural perspective. You could see this as:

- user behaviour (e.g. movement) drives
- a query to the browser using the GeoLoc API
- that returns lat/lon/orientation

But that's simply the first step in the sensory perception process. This is extracting raw data from the environment at a point in time. That raw data then needs to be processed into perception in some way to make "meaning". This is essentially what any of the Layar developers do when they create a service that responds to a getPointsOfInterest() request. So I think the GeoLoc API fits perfectly into the chain...but again, this is only useful/necessary for AR, not sufficient.
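Just to make that chain concrete, a rough sketch: the GeoLoc part is the real API, while fetchPointsOfInterest() is purely a placeholder for whatever service a developer exposes (along the lines of a Layar getPointsOfInterest service), not part of any spec:

  // Sketch only: raw sensor data in, "perceived" POIs out.
  function watchForPOIs(onPOIs) {
    return navigator.geolocation.watchPosition(function (position) {
      var bundle = {                        // the minimal [sensor data bundle]
        lat: position.coords.latitude,
        lon: position.coords.longitude,
        heading: position.coords.heading    // direction of travel, when available
      };
      // the "perception" step: a service turns the raw data into meaning
      fetchPointsOfInterest(bundle, onPOIs);
    });
  }

  // Hypothetical perception service -- the part each AR provider defines.
  function fetchPointsOfInterest(bundle, callback) {
    var xhr = new XMLHttpRequest();
    xhr.open("GET", "http://example.com/poi?lat=" + bundle.lat + "&lon=" + bundle.lon);
    xhr.onreadystatechange = function () {
      if (xhr.readyState === 4 && xhr.status === 200) {
        callback(JSON.parse(xhr.responseText));
      }
    };
    xhr.send();
  }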
I think "alertness" is a great layer on top of what we have discussed so far that matches "sensory perception" to "goal seeking behaviour". But this is definitely something layered on top. To apply this to the [sensory data bundle]/[trigger] model discussed so far this would be a [trigger] defined by the user as opposed to by the content creator. BTW: I really strongly agree with Andy Braun's point that the term "publisher" should be defined as broadly as possible and that the [trigger] creator may be separate from that as well. > The other one - tell me when object X is seen - is achievable if you > have a universal database of things with an agreed identifier > structure that is used by all image recognition software. The internet > of things and all that. The browser API could then have two functions > "can you give me an identifier for the thing the camera is looking > at?" and "tell me when an object with x identifier comes into view." I'm honestly not trying to be argumentative here 8) Perhaps I'm reading your language too literally, but it seems to be hardcoding in a form of tunnel vision. While we could allow people to set "alertness" for when a certain object is "seen" within a "camera view"...the near-term real-world likelihood is that object recognition will be correlated against a number of parallel sensor input streams. e.g. The stereo cameras on my wearable display (2 cameras) and the CCTV camera in the Walmart I'm standing in (1 camera) and the RFID sensor in my belt (yes...I am batman 8) - 1 sensor) and the RFID sensors on the shelves (n sensors) all collaborate to tell me that the bottle in front of me is "over priced plonk". This is exactly how our biological sensory perception works. > > I think Rob Manson expressed it well as a form of triplet; > >> "Well...if we did use the "trigger" model then I'd express this > >> as the following RDFa style triplet: > >> this [location] is a [trigger] for [this information] > I agree with the sentiment here if not the syntax. > >> POIs in this format would then become the archetypal AR > relationship. > >> The most critical and common subset of the broader relationship: > > > this [sensor data bundle] is a [trigger] for [this > information] > >> In the standard POIs case the minimum [sensor data bundle] is > >> "lat/lon" and then optionally "relative magnetic orientation"." > > That doesn't work as a triple as you have a literal (the sensor data) > as a subject. Hrm...first...I was really just using this as a simplified way to convey the linking concept 8) Second...isn't that a matter of perspective. If the location extracted as sensory data (lat/lon) from the user is seen as a property of the user (or object) then I agree with you. In this case it would be: [user] [is at] [location] But linguistically it would be equally valid to see this from a sensory perception perspective. Where an idealised [sensory data bundle] IS the subject. It is literally turned into a "thing". A pattern to be recognised. It is turned from raw data into a specific pattern. I do agree however that this would often be a range or a more complex pattern definition which is where the triplet analogy probably falls over. Anyway...based on where our discussion is, I think it's the linking concept that's important not this specific syntax. > Yes. I agree. We're linking criteria to data. The data is static. The > trigger is that the criteria have been met. There's that s word again 8) > Thinking about aggregated streams of triggers might be useful in > future. 
Anyway...based on where our discussion is, I think it's the linking concept that's important, not this specific syntax.

> Yes. I agree. We're linking criteria to data. The data is static. The
> trigger is that the criteria have been met.

There's that s word again 8)

> Thinking about aggregated streams of triggers might be useful in
> future, i.e. a way to say "tell me when my location is within 200
> metres of a shop that sells 2000 St Emillion Grand Cru for less than
> €50". What's aggregated here is the list of criteria and they might
> only be accessible by different sensors and data sources.

This is an excellent example of distributing the "perception" part of the "sensory perception" across multiple points in the data processing chain.

> I have no problem at all with the word trigger, I think it's useful.
> My only real point is that data about an object, be it geographically
> fixed or otherwise, is not a trigger. The trigger is that the user has
> moved into a context in which a defined subset of the available data
> is relevant and needs to be displayed.

I was with you right up to the point you said "user". [sensory data bundle] is flexible and also covers/enables user-less agents that are just as feasible and relevant.

I can see how some people might read my points as a dilution of the concept of POIs, and some may even see this type of discussion as a distraction from "the goal". My simple perspective is that with a little reframing we can get all of the benefits of the POI model while leaving the door open for really innovative new concepts that are only just now starting to be explored.

But I would reiterate my point:

  "AR is digitally mediated sensory perception"

And NOT just:

  "A geolocated or camera based POI system"

Looking forward to hearing people's responses 8)

roBman
Received on Wednesday, 4 August 2010 06:01:14 UTC