Re: Initial research into installable web apps from Marcos Caceres on 2013-11-12 (public-web-mobile@w3.org from November 2013)

From: Marcos Caceres <w3c@marcosc.com>
Date: Tue, 12 Nov 2013 15:21:58 +0000
To: Ernesto Jiménez <erjica@gmail.com>
Cc: public-web-mobile@w3.org
Message-ID: <C3372BA068104863BB65A4B9CBDC9B4B@marcosc.com>
On Tuesday, November 12, 2013 at 12:36 PM, Ernesto Jiménez wrote:

> On Mon, Nov 11, 2013 at 8:27 PM, Marcos Caceres <w3c@marcosc.com> wrote:
> >  
> > On Saturday, November 9, 2013 at 11:50 PM, Ernesto Jiménez wrote:
> > >  
> > > I did a quick & dirty extraction from the webdevdata.org (http://webdevdata.org) data to have a go at it. I just extracted the top meta header names to see how popular the app-specific ones are, such as application-name or apple-mobile-web-app-capable. Given that the viewport tag seems to be extended, I also extracted the top properties in viewport.
> > >  
> > > https://gist.github.com/ernesto-jimenez/7390115
> > >  
> > > It was quick work, so the numbers are not accurate.
> >  
> > This is great, but yeah… I’m getting different results on the same data set (you are using the Oct 30 set, right?). For apple-mobile-web-app-capable, I get 1163 sites. Results are here:
> > https://gist.github.com/marcoscaceres/7419589
> >  
> > Searches I’m doing are just grepping:
> >  
> > find ./ -name "*ml.txt" | xargs grep -l "apple-mobile-web-app-capable"
> >  
> > What method are you using to get your results?  
>  
> I used a different method. Rather than just using grep, I wrote 60 lines of Go for a command-line tool that reads a file and prints all the meta tag names it finds.
>  
> Then I do:
> $ find ./ -name "*ml.txt" | parallel "print_meta_tags {} >> meta_tags"


It would be great to share the Go script with WebDevData. I can set up a repo for you in that org. If that sounds good, please send me your GitHub username.
>  
> That results in a 316,535 line long CSV containing filename and meta tag name.  
>  
> $ cat meta_tags | grep apple-mobile-web-app-capable | wc -l
> 1149
>  
> I've checked your method and the discrepancies seem to come mainly from:
> * Commented meta tags. e.g: 16/nationchannel.com_167cb1ae269bf0c09ae5fd3496e26848.html.txt
> * Meta tags added in JS. e.g: d9/harristeeter.com_d977ea7bae0fa6a33de5fef5c5e1efd7.html.txt
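
To illustrate the two sources of discrepancy listed above, here is a minimal Python sketch. It is not the Go tool mentioned in this thread (that code is not shown here); it only assumes that a parser-based count looks at real `<meta>` start tags, while a plain text search also matches the string inside comments and script bodies. The names `MetaNameCollector`, `meta_names`, and `SAMPLE` are made up for the example.

```python
from html.parser import HTMLParser


class MetaNameCollector(HTMLParser):
    """Collect the name attribute of every real <meta> tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # Commented-out markup goes to handle_comment, and <script> bodies
        # are treated as raw text, so neither reaches this handler.
        if tag == "meta":
            name = dict(attrs).get("name")
            if name:
                self.names.append(name.lower())


def meta_names(html_text):
    """Return the meta tag names found in html_text, in document order."""
    parser = MetaNameCollector()
    parser.feed(html_text)
    parser.close()
    return parser.names


# Hypothetical page: one real tag, one commented out, one written by JS.
SAMPLE = """<html><head>
<meta name="apple-mobile-web-app-capable" content="yes">
<!-- <meta name="apple-mobile-web-app-capable" content="yes"> -->
<script>
document.write('<meta name="apple-mobile-web-app-capable" content="yes">');
</script>
<meta name="viewport" content="width=device-width">
</head></html>"""

if __name__ == "__main__":
    print(meta_names(SAMPLE))                            # parsed tags only
    print(SAMPLE.count("apple-mobile-web-app-capable"))  # raw text matches
```

On this sample the parser reports the tag once, while the raw text contains the string three times. A file whose only occurrence is in a comment or a script would therefore be listed by `grep -l` but not by a parser-based tool.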
>  
>  
>  
> BTW, what user-agent was used to create the dataset? Some sites might be doing user-agent sniffing to switch between their mobile and desktop sites and we might be missing some data.

None. I also raised this as an issue. However, I’m inclined to have us use this data, at least for now.

At the end of November we are set to do another round of collection (it’s my turn to do it; we are taking it in turns). I’m thinking we will then do one run with no user agent and one as Safari/iPhone. We are also trying to get the dataset to hold 100K sites in each round, so we can do better longitudinal analysis. We are hoping to get the HTTP Archive to hold the data for us.
  
>  
>  
> > > I'm happy to help with the draft, but I'm not doing a pull request yet. I should probably fix my W3C account first, since it's still linked to my previous company. I did send an application to join the group as invited expert, in case I can help out.
> >  
> >  
> > That would be awesome if you could. My idea is to now take a sample of about 250 sites (for a confidence level of 95% given the dataset size) and see if they are using the tag “properly”. I strongly suspect that very few sites that claim to be “installable” actually function as installable web apps.
> >  
> > I would really need help with this. I would like to split this task among 2-5 people, each of us checking whether these sites actually work as applications once installed. We would need to come up with some simple criteria for that… it’s pretty self-evident, but with some caveats. For example:
> >  
> > 1. forecast.io (http://forecast.io) - yes, works as an application, but it’s not usable as a website on the iPhone!
> > 2. variety.com (http://variety.com) - only the “installed” page “works”; clicking on *any link* (even same domain) breaks the “installed app” illusion.
> > 3. squawka.com (http://squawka.com) - declares itself capable, but presents the desktop site.
> > …
> >  
> > WDYT?
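
As a rough check on the sample size discussed above, here is a small Python sketch using Cochran's formula with a finite population correction. The message does not say how the figure of about 250 was derived, so the margin of error (5%), the worst-case proportion (p = 0.5), and the use of 1163 matching sites as the population are all assumptions made for illustration; `sample_size` is a made-up helper name.

```python
import math


def sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Sample size for estimating a proportion in a finite population.

    z=1.96 corresponds to a 95% confidence level; margin and p are
    assumptions, not values stated in the thread.
    """
    # Cochran's formula for an unbounded population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction, rounded up to whole sites.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


if __name__ == "__main__":
    print(sample_size(1163))                # 5% margin
    print(sample_size(1163, margin=0.055))  # slightly wider margin
```

Under these assumptions a 5% margin gives 289 sites, and a margin of roughly 5.5% gives 250, so a sample of about 250 is in the right range for a 95% confidence level on a population of this size.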
>  
> That sounds good, but it's going to be time-consuming. I would rather do a first pass on the data to review stats on what tags are in use and how. After that, based on the data, we can get into more time-consuming research based on what we observe.  
>  
> In my opinion, the first pass on data will already give us some info on what developers intend to support, even if they don't implement it properly. Then we can dig deeper on how they are actually implementing it.  
OK, let’s get the high-level stuff first.

Icons are going to be tricky (it’s basically favicon vs Apple’s icon + the “sizes” attribute they added to the platform). We need to describe the problem more fully in the document.
Received on Tuesday, 12 November 2013 15:22:32 UTC