- From: Rotan Hanrahan <rotan.hanrahan@mobileaware.com>
- Date: Thu, 8 Jan 2009 10:20:22 -0000
- To: <public-bpwg-ct@w3.org>
- Message-ID: <D5306DC72D165F488F56A9E43F2045D301DA4DB1@FTO.mobileaware.com>
According to Brian Wilson [1], MAMA identified itself as an Opera 9.10 browser when gathering data from 3.5 million Web sites. This effectively eliminates all mobile content from adaptive sites, such as those that might use MobileAware technology or equivalent. Brian further reveals [2] that the URLs were initially derived from the DMoz collection, a publicly edited collection of Web addresses consisting predominantly of desktop sites. He then added the URLs of the W3C members (also desktop sites, though a few are also adaptive). Finally, he added the top Alexa sites, a list generated by a toolbar installed on desktop browsers. I therefore see this collection as being somewhat representative of the desktop Web, and completely ignorant of anything that may be present in the mobile aspect of the Web.

Luca Passani's recent suggestion [3] to examine the almost 1000 "xml+xhtml" Web sites from the collection is interesting. If the collection were a more representative sample of the whole Web (including an appropriate proportion of mobile sites) and the analysis had been conducted with a range of User Agents (weighted according to prevalence), then perhaps Luca's examination of the "xml+xhtml" sites would be revealing. And if a statistically supported correlation between "xml+xhtml" and "mobile" were thereby demonstrated, then "Luca's Heuristic" would have more solid grounds for adoption. Unfortunately, the MAMA statistics and analysis are insufficient to provide the justification that Luca's Heuristic requires. In fact, given the way the URL set was created and pages retrieved, an analysis of the data might reveal that they are not as mobile as Luca suggests (or hopes).

Perhaps if someone were to provide Brian with additional resources, or collaboration, then we might get what we need. He would need to create a more representative URL set and a weighted sample of User Agents, then cross the two. We should acknowledge that Opera have already been very generous in providing Brian with resources to date, and in making the results available to all, but we cannot expect Opera to carry the whole burden, no matter how beneficial the results might be.

It may be in the best interest of the providers of transcoding proxies to conduct and publish this analysis. If they can demonstrate one or more clear heuristics for identifying mobile sites (without requiring any changes to be made by site administrators, page authors or Web application developers), then they can use this as a fairly concrete means of supporting their operating models. Whether or not they should actually intervene in the traffic between site and client without the approval of the site/author/user is a separate matter that is beyond technology or statistics, but at least with some solid data and viable, acceptable heuristics they would have a basis for executing a policy that would be seen as "best practice". (A rough sketch of the kind of crawl I have in mind appears below.)

Regarding the Web technology itself, we must accept that (absent any security mechanism) any URL that is accessible via HTTP essentially makes the corresponding resources fair game for public use. We have higher layers of control in the form of copyright (not universally accepted, but at least generally understood) that can be associated with the resources to limit what polite society can do with the "property". Sometimes copyright is unambiguous, and sometimes it is merely inferred. Detecting that the authored work was intended for desktop, mobile, automobile, or whatever, has no bearing on the protection afforded by copyright.
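To make the detection question concrete, here is the sketch promised above: fetch each URL once with a desktop User-Agent and once with a mobile one, and compare what the server sends back. To be clear, this is my own minimal illustration, not MAMA's methodology; the User-Agent strings, the MOBILE_TYPES list and the three-way classification are illustrative assumptions, not a definitive taxonomy.

    import urllib.request

    DESKTOP_UA = "Opera/9.10 (Windows NT 5.1; U; en)"  # the identity MAMA used
    MOBILE_UA = "Nokia6680/1.0 (4.04.07) SymbianOS/8.0 Series60/2.6"  # example handset identity

    # Media types conventionally associated with mobile delivery
    # (an illustrative list, not a complete one).
    MOBILE_TYPES = {
        "application/vnd.wap.xhtml+xml",   # XHTML Mobile Profile
        "text/vnd.wap.wml",                # WML
    }

    def served_type(url, user_agent):
        """Fetch the URL and return the media type the server serves to this UA."""
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(req, timeout=10) as resp:
            # Strip any ";charset=..." parameter from the Content-Type.
            return resp.headers.get("Content-Type", "").split(";")[0].strip().lower()

    def classify(url):
        desktop = served_type(url, DESKTOP_UA)
        mobile = served_type(url, MOBILE_UA)
        if mobile in MOBILE_TYPES:
            return "mobile-aware"       # serves mobile markup to a handset
        if mobile != desktop:
            return "adaptive"           # representation varies with the UA
        return "single-representation"  # one response for everyone

Crossing a representative URL set with a prevalence-weighted range of such User-Agents, and tallying the classifications, is essentially the experiment I am suggesting. A crawl that only ever presents a desktop identity, as MAMA's did, can never see the "mobile-aware" branch at all.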
So even if we do identify good site-type detection mechanisms, this argument about whether or not a transcoding proxy should intervene is irrelevant. What we need are more general guidelines about what is appropriate/permitted once a representation of the authored work has been extracted from a machine via HTTP.

As a Web author, I would have an expectation that my work would be somewhat adapted by the browser used to view it. The user may change the dimensions of the window, change the zoom, maybe even change the fonts if the browser has some limitations. If I were a more knowledgeable Web author, I would probably also be aware of adaptation of my text to spoken form via synthesis, as a service to people with visual impairments. Would I see this as a derivative work? (If I published a written version of my music and then someone synthesised it back to audio, would I see this as a derivative work competing with my published music CD?)

Things get a little grey when we acknowledge that some adaptation is going to happen when we publish via the Web. As Web authors, we should know that this is the case, and should accept it as part of the normal process of Web publication. Therefore there should be no concerns regarding copyright for such basic forms of adaptation.

But what if the browser were to make major changes to my published work? What if it removed major pieces that I initially considered essential to the intellectual property in my published work? Would I still be as tolerant, or would I feel that somehow my work was being violated? I might, or I might not. But if I didn't anticipate that this could happen, then I might never have taken steps to make my feelings known. I might never have thought it necessary to say "do not mangle this content". Yet this is exactly the situation we find with some mobile browsers and, of course, transcoding proxies. Content is being seriously adapted, often without consultation with the authors or users. (The underlying motivations for doing this are irrelevant to the argument, so I won't pursue them.)

The heuristics we seek might enable us to detect that a site was designed for use on mobile devices. This is essentially a signal from the author to say "my content is this way because this is how I want it delivered to mobile devices". The question now is whether we should use this signal to prevent further adaptation. Given that my company provides adaptive technology to deliver content to mobile devices, I would certainly like this signal to be respected, fully.

However, I am going to briefly take the side of the transcoding proxy and present an argument in their favour. Until fairly recently, when an author placed content on a Web site, this was a signal that "my content is this way because I expect it to be viewed by Web browsers". Following an earlier argument, there is also the inference that "I also expect that a few people will resize, zoom or otherwise adapt it to their particular requirements". This is quite normal, and basically how people expected the Web to work. Now (perhaps via Luca's Heuristic) the authors can say "my content is this way because I expect it to be viewed on mobile browsers", and perhaps it is also reasonable to infer that "I also expect that a few people will resize, zoom or otherwise adapt it to their particular requirements." Exactly the same inference we had in the legacy Web.
Given that adaptation of retrieved resources is an anticipated possibility, and given that some users will not be able to perceive the content acceptably without adaptation, the role of the transcoding proxy is acceptable. So perhaps what is missing is not just a signal that "this is mobile content" but also a signal to adjust the inference. An author should be able to add "and I do not want any adaptation to be performed" (or possibly some refinements of this).

I conclude therefore that we have two requirements:

1. A reliable means of identifying the author's delivery intentions (i.e. "this is mobile/desktop/non-visual/... content").

2. A reliable means of identifying additional constraints (i.e. "do not further adapt").

To date, these requirements have been somewhat conflated, particularly as we may have assumed that declaring content to be mobile implies that we are also declaring that no further adaptation is allowed. Perhaps a better inference is that further adaptation of mobile content should be avoided if it is being delivered to a mobile device. That's possibly something that I could live with, as both an author and a provider of adaptive technology.

Finally, as a Web user, I would still like to have a say in any adaptation. On the desktop I have direct control. I can resize the window, zoom the page, add a speech synthesis plug-in, etc. But on mobile devices I have less control. Sometimes my browser gives me some optional features, such as "fit to page". I'm sure the page authors won't mind me doing a little bit of fit-to-page with their sites. What I don't like is someone in the middle doing adaptation without my knowledge. If I have the option to permit/prevent this "feature", then I'm happy. I would use it when necessary. And again, I'm sure the content authors won't mind me doing this. I'm just not happy if some uninvited gooseberry* decides to make it a threesome.

---Rotan

[1] http://dev.opera.com/articles/view/mama-methodology/
[2] http://dev.opera.com/articles/view/mama-the-url-set/
[3] http://lists.w3.org/Archives/Public/public-bpwg-ct/2009Jan/0022.html

* Possibly a particularly Irish phrase... :)
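P.S. It occurs to me that HTTP/1.1 already gives us a partial mechanism for requirement 2: the no-transform directive of the Cache-Control header (RFC 2616), which tells intermediaries not to modify the message body. As a rough illustration of how the two requirements stay separate, a well-behaved transcoding proxy might combine the signals as below. This is a minimal sketch of the decision logic only, not any vendor's actual implementation, and the MOBILE_TYPES list and function shape are my own assumptions.

    # Media types taken here to signal "authored for mobile delivery"
    # (an illustrative assumption, not a definitive list).
    MOBILE_TYPES = {
        "application/vnd.wap.xhtml+xml",   # XHTML Mobile Profile
        "text/vnd.wap.wml",                # WML
    }

    def may_transform(response_headers, client_is_mobile):
        """Return True if the proxy may adapt this HTTP response."""
        # Requirement 2: an explicit "do not further adapt" constraint.
        cache_control = response_headers.get("Cache-Control", "").lower()
        if "no-transform" in cache_control:
            return False
        # Requirement 1: the author's delivery intention. Mobile content
        # headed for a mobile device should be left alone.
        ctype = response_headers.get("Content-Type", "")
        ctype = ctype.split(";")[0].strip().lower()
        if client_is_mobile and ctype in MOBILE_TYPES:
            return False
        return True

The point being that the author can declare intent (via the media type) without forbidding adaptation, and can forbid adaptation without declaring intent; the proxy only refrains when one of the two signals actually applies.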
Received on Thursday, 8 January 2009 10:21:11 UTC