RE: Introduction / MSDN-js

Thanks, Max!

If you have any questions that are specific to the MSDN source material, please let me know.



-----Original Message-----
From: Doug Schepers
Sent: Sunday, April 28, 2013 10:59 PM
To: Max Polk
Subject: Re: Introduction / MSDN-js

Hi, Max-

Welcome to the WebPlatform project!

On 4/28/13 6:50 PM, Max Polk wrote:
> I can help with the script if it's XSLT or Python or Perl, or if 
> Alexander is doing it I could be a tester.  Using XSLT you could turn 
> it directly into MediaWiki markup.


I don't think Alex (I assume you're talking about Alex Komoroske?) is interested in doing the conversion script. He's definitely in the critical loop for the template structure, though.

So, if you're willing to take charge of making a conversion script, and iterating with it to refine the output, that would be most welcome!

> However, I foresee lots of
> problems with links and references, so even if you start with XSLT
> there will likely end up being a need for script code like Python to
> do hard things like cross reference the whole bundle, finding dups,
> errors, missing links, rewriting footnotes and anchors and whatnot.

I don't have a strong feeling about what language we use.

If we maintain a consistent link structure, I would expect that the 
links wouldn't cause much problem, but maybe I'm overlooking something.
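
For what it's worth, the kind of cross-reference pass you describe might look something like the rough Python sketch below. The check_bundle name and the in-memory dict of pages are assumptions for illustration only, not an existing tool; it flags internal links whose target isn't in the bundle, and pages that share a title.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag and the text of <title>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def check_bundle(pages):
    """pages: dict mapping a page name to its HTML source (hypothetical layout).

    Returns (missing, dups): a list of (page, target) pairs for internal
    links whose target is not in the bundle, and a dict of titles that
    are used by more than one page.
    """
    missing = []
    by_title = defaultdict(list)
    for name, html in sorted(pages.items()):
        parser = LinkCollector()
        parser.feed(html)
        parser.close()
        by_title[parser.title.strip()].append(name)
        for href in parser.links:
            parts = urlsplit(href)
            # Skip links with a scheme or host (external) and pure #anchors.
            if parts.scheme or parts.netloc or not parts.path:
                continue
            if parts.path not in pages:
                missing.append((name, parts.path))
    dups = {t: names for t, names in by_title.items() if len(names) > 1}
    return missing, dups
```

Running something like this over the whole bundle before each import would tell us early whether the links really are a problem.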

In any case, one of the additional requirements is not just converting 
the HTML to wikitext, but writing it into the wiki itself. Boris Smus 
wrote a command-line tool that could help with that [1].

> Instead of a one-by-one conversion, an exactly opposite idea is an
> all-at-once conversion, a processing pipeline, where you benefit from
> one or more intermediate formats, that always takes the pristine
> original and always creates a completed finished product to be
> reviewed.  When any error at all is found, open an issue, and the
> script or scripts are tweaked, we redo the whole thing, and continue
> iteratively until it's all perfectly converted.

I think this is a good approach, and we used it once before (in the 
original MSDN import before launch). There are a couple of downsides to it:

1) it may require more review cycles (which are an unpopular task)

2) it doesn't empower individuals much to take ownership of a page, and 
that deprives us of getting multiple people involved who might have 
interesting ideas and variations for improvements we can bring back and 
make to other pages as well.

That said, I certainly wouldn't turn down an offer to do an iterative 
mass-conversion and insertion.
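
To make the pipeline idea concrete, here is a minimal Python sketch. It is only an illustration: the stage names (extract, render), the intermediate dict, and the toy regex-based HTML handling are all assumptions, not anything we've agreed on. The point is just that every run starts from the untouched originals and regenerates all of the output, so tweaking a stage and rerunning is always safe.

```python
import re


def extract(html):
    """Stage 1: reduce raw HTML to a hypothetical intermediate dict."""
    title = re.search(r"<h1>(.*?)</h1>", html, re.S)
    paragraphs = re.findall(r"<p>(.*?)</p>", html, re.S)
    return {
        "title": title.group(1).strip() if title else "",
        "paragraphs": [p.strip() for p in paragraphs],
    }


def render(doc):
    """Stage 2: render the intermediate dict as MediaWiki markup."""
    def bold(text):
        # <b>x</b> becomes '''x''' (wikitext bold)
        return re.sub(r"<b>(.*?)</b>", r"'''\1'''", text, flags=re.S)

    parts = ["== %s ==" % doc["title"]]
    parts.extend(bold(p) for p in doc["paragraphs"])
    return "\n\n".join(parts)


STAGES = [extract, render]


def convert_all(originals, stages=STAGES):
    """Run every original through every stage; the originals are never modified."""
    results = {}
    for name, source in originals.items():
        data = source
        for stage in stages:
            data = stage(data)
        results[name] = data
    return results
```

When a reviewer finds an error, the fix goes into one of the stages and convert_all is run again over the full set, rather than patching individual output pages by hand.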

Either way, we should do the test runs in the test wiki [2].

> As to the three points made at
> :
> 1.  URL structure - nice to have contributed content stay together as
> a cohesive unit in the wiki and not try to merge it into something
> else for now.  You're not doing a rewrite and mass edits just yet.
> Keep it together at first.  Maybe contributed content can be subpages,
> such as mjs/page1 and mjs/page2, etc.

I like the structure that MSDN used [3], under these topic clusters:
* Objects
* Constants
* Properties
* Functions
* Methods
* Operators
* Statements
* Directives
* Errors
* Reserved Words

So, it could look like:


We could also look at how MDN [4] did it, and see if there are tweaks we 
could use from that.

> 2.  MediaWiki templates - anything the XSLT or Python? script produces
> is easily template aware, i.e., it can create content that is
> template-syntax-correct.  Not focusing on this just yet.

Yeah, to some degree, it's orthogonal for first steps, but it will 
become critical before too long.

> 3.  Methodology to convert the HTML content into the wiki - that's
> what I'm addressing mainly, a potentially multi-step conversion.

Yup, we're certainly open to experimentation.

Is there anything you need to get started, or any way we can help? 
Please don't hesitate to check back frequently with updates or questions.





Received on Monday, 29 April 2013 15:13:15 UTC