Re: Provbench submission request - wikipedia edits

On 27/06/2014 12:53, Tim Lebo wrote:
> That's Tim's 'XSL' (not SSL).
of course. No acrimony over acronyms, I hope :-)

I do get a clean mvn install from within eclipse, but obviously my local repo may be hiding the key libs.  In my earlier mail to Tim
I pointed to the relevant maven repo entries though?

--Paolo

>
> Anyone should be able to join the comments list now. Give it a try.
>
> Please let us know if you get the crawler working, Jun. I haven't *lost* interest, it's just been displaced ;)
>
> Tim
>
>> On Jun 27, 2014, at 7:40, "Zhao, Jun" <j.zhao5@lancaster.ac.uk> wrote:
>>
>> Hi Paolo,
>>> i started documenting (1) but then Tim lost interest :-) and it went back to the back burner. Should I resume? I would love to see
>>> this used/tested!
>> Yes, please continue. I am building Matt’s atm.
>>
>> ps. guys, I think our old prov-comments are back to life. I will start to use that in the future. I am not sure whether Hugo and Age are on the list though.
>>
>> Cheers,
>>
>> Jun
>>
>>> --Paolo
>>>
>>>
>>>> On 27/06/2014 11:30, Zhao, Jun wrote:
>>>> Hi guys,
>>>>
>>>> I’ve been keeping an eye on this because I also want to reproduce the wikipedia prov. 
>>>>
>>>> I am at a loss as to which one I should go for now. Spoiled for choice :)
>>>>
>>>> btw, I am cc’ing prov-comments to see whether it is now working.
>>>>
>>>> Cheers,
>>>>
>>>> — Jun
>>>>
>>>>
>>>>> On 25 Jun 2014, at 20:37, Timothy Lebo <lebot@rpi.edu> wrote:
>>>>>
>>>>> Hi, Paolo.
>>>>>
>>>>> I just read the paper, and I think the method of crawling is clear.
>>>>> Interesting approach.
>>>>>
>>>>> I’ve passed my need for a crawl at the moment, but I’ll let you know when I find another excuse to try out your stuff.
>>>>>
>>>>> Best,
>>>>> Tim
>>>>>
>>>>>
>>>>>> On Jun 23, 2014, at 3:16 PM, Paolo Missier <Paolo.Missier@ncl.ac.uk> wrote:
>>>>>>
>>>>>> Tim,
>>>>>>
>>>>>>> On 19/06/2014 13:38, Timothy Lebo wrote:
>>>>>>> Paolo,
>>>>>>>
>>>>>>>> On Jun 19, 2014, at 2:32 AM, Paolo Missier <Paolo.Missier@ncl.ac.uk> wrote:
>>>>>>>>
>>>>>>>> Hi Tim
>>>>>>>>
>>>>>>>> haven't looked at the source -- is this your own code or a fork of Matt's?  (or our old one?)
>>>>>>> It’s my own home-grown code.
>>>>>>> (I couldn’t get either of your repositories to work.)
>>>>>> I got Matt's version to build and work with just maven (from within eclipse)
>>>>>> what was the problem?
>>>>>>
>>>>>>>> in fact is this a prov-o of a wiki dump (as opposed to a crawler?)
>>>>>>> Yes, it’s an XSL of the wiki XML dump to produce PROV-O.
>>>>>>> You feed it the page names to grab, so no crawling.
>>>>>>>
>>>>>>> It’d be nice to understand how your crawler works. It’s more insightful than my approach.
>>>>>>> Wiki page? :-)
>>>>>> I got Matt's version to work but I need a little time to document -- there is a GUI (by the student) and a command line (which I remember adding myself).
>>>>>> I shall document... but basically you can control
>>>>>> - the max number of revisions you traverse for each page
>>>>>> - the max number of contributions by any editor
>>>>>> - and a "depth" field which I forgot about :-)
>>>>>>
>>>>>> have you read the short paper? https://github.com/provbench/Wikipedia-PROV/blob/master/wikipediaTraces.pdf
>>>>>>
>>>>>> Three parameters are used to control the extent of the user/article space visited by the crawler. Firstly, the revision length determines the max. number of wasRevisionOf relations traversed, towards the past, from a landing revision page. Secondly, the max users parameter determines the max number of wasAssociatedWith relations, i.e., the max number of contributions explored per user. Thirdly, the depth parameter determines how many times the switchover between article space and user space may occur. For example, setting depth = 3 results in the exploration of revisions for articles that are connected to the original seed article through at most 2 intermediate users: base article → user1 → article2 → user2 → article3.
>>>>>>
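[Illustrative sketch, not the crawler's actual code.] The three parameters described in the paper excerpt above can be read as bounds on a breadth-first walk that alternates between article space and user space. The Python below is only one possible interpretation: the function name `crawl`, the in-memory `revisions` and `contributions` dictionaries, and the toy article/user ids are all hypothetical, and the real crawler talks to Wikipedia rather than to a dict. Depth is counted here as the number of article levels reached, with the seed article as level 1, which matches the "depth = 3" example in the excerpt.

```python
from collections import deque


def crawl(seed, revisions, contributions, revision_length, max_users, depth):
    """Bounded walk over a toy article/user graph (hypothetical data model).

    revisions:     article -> list of (revision_id, user), newest first
    contributions: user -> list of articles that user has edited
    Returns a list of (article, [revision ids explored]) in visiting order.
    """
    visited_articles = {seed}
    visited_users = set()
    queue = deque([(seed, 1)])  # the seed article is level 1
    explored = []

    while queue:
        article, level = queue.popleft()
        # Landing revision plus at most `revision_length` wasRevisionOf hops
        # towards the past.
        history = revisions.get(article, [])[: revision_length + 1]
        explored.append((article, [rev_id for rev_id, _ in history]))

        # Stop switching over to user space once the depth bound is reached.
        if level >= depth:
            continue

        for _, user in history:
            if user in visited_users:
                continue
            visited_users.add(user)
            # Explore at most `max_users` contributions for this user.
            for other in contributions.get(user, [])[:max_users]:
                if other not in visited_articles:
                    visited_articles.add(other)
                    queue.append((other, level + 1))
    return explored


# Toy data (made up for illustration): A1 -> u1 -> A2 -> u2 -> A3 -> u3 -> A4
revisions = {
    "A1": [("A1r2", "u1"), ("A1r1", "u1")],
    "A2": [("A2r1", "u2")],
    "A3": [("A3r1", "u3")],
    "A4": [("A4r1", "u3")],
}
contributions = {"u1": ["A1", "A2"], "u2": ["A2", "A3"], "u3": ["A3", "A4"]}

result = crawl("A1", revisions, contributions,
               revision_length=1, max_users=2, depth=3)
print([article for article, _ in result])
```

With depth = 3 the walk stops at A3, i.e. two intermediate users away from the seed; raising depth to 4 would also pull in A4, and setting revision_length = 0 restricts each article to its landing revision.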
>>>>>>
>>>>>>>> sorry I am rushing to ask before I look :-)
>>>>>>> No worries. I’d do the same ;-)
>>>>>>>
>>>>>>> Best,
>>>>>>> Tim
>>>>>>>
>>>>>>> p.s. are we ready to “go live” with this kind of discussion on prov-comments?
>>>>>> sure why not -- 
>>>>>>
>>>>>> -Paolo
>>>
>>> -- 
>>> Paolo Missier - Paolo.Missier@newcastle.ac.uk, pmissier@acm.org 
>>> School of Computing Science, Newcastle University,  UK
>>> professional: http://www.cs.ncl.ac.uk/people/Paolo.Missier
>>> photography: http://scattidistratti.smugmug.com/
>>> PGP Public key: 0x45596549  - key servers: pool.sks-keyservers.net
>>> =--= Tempus fugit =--=



Received on Friday, 27 June 2014 12:08:18 UTC