On Mar 27, 2013 1:31 AM, "Teresa Cho" <[email protected]> wrote:
>
> Hi everyone,
>
> My name is Teresa (or terrrydactyl if you've seen me on IRC) and I've
> been interning at Wikimedia for the last few months through the
> Outreach Program for Women[1]. My project, Git2Pages[2], is an
> extension to pull snippets of code/text from a git repository. I've
> been working hard on learning PHP and the MediaWiki
> framework/development cycle. My internship is ending soon and I wanted
> to reach out to the community and ask for feedback.
>

Cool stuff!

> Here's what the program currently does:
> - User supplies (git) url, filename, branch, startline, endline using
> the #snippet tag
> - Git2Pages.body.php will validate the information and then pass on
> the inputs into my library, GitRepository.php
> - GitRepository will do a sparse checkout on the information, that is,
> it will clone the repository but only keep the specified file (this
> was implemented to save space)
> - The repositories will be cloned into a folder that is a md5 hash of
> the url + branch to make sure that the program isn't cloning a ton of
> copies of the same repository

Why hash it, rather than keep the url + branch encoded in some charset that
is valid in a path? That would avoid rare yet hairy collisions.

> - If the repository already exists, the file will be added to the
> sparse-checkout file and the program will update the working tree

Will a duplicate request trigger a re-checkout? Will the cache of files ever
be cleaned?

> - Once the repo is cloned, the program will go and yank the lines that
> the user requested and it'll return the text encased in a <pre> tag.
>
> This is my baseline program. It works (for me at least). I have a few
> ideas of what to work on next, but I would really like to know if I'm
> going in the right direction. Is this something you would use? How
> does my code look, is the implementation up to the MediaWiki coding
> standard? You can find the progression of the code on
> gerrit[3].
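To make the directory-naming question above concrete, here is a rough sketch of the two options side by side. It is in Python only for brevity (the extension itself is PHP), the function names and the example.org URL are made up for illustration, and I have not checked how GitRepository.php actually builds the hash input, so treat the plain url + branch concatenation as an assumption.

```python
import hashlib
from typing import Tuple
from urllib.parse import quote, unquote


def hashed_dir_name(url: str, branch: str) -> str:
    """MD5 hex digest of url + branch, as described in the quoted mail.

    Assumption: the two strings are simply concatenated. Without a
    separator, ("ab", "c") and ("a", "bc") give the same digest.
    """
    return hashlib.md5((url + branch).encode("utf-8")).hexdigest()


def encoded_dir_name(url: str, branch: str) -> str:
    """Reversible alternative: percent-encode both parts.

    safe="" also escapes "/" and "@", so the result is a single path
    component and "@" can serve as an unambiguous separator.
    """
    return quote(url, safe="") + "@" + quote(branch, safe="")


def decode_dir_name(name: str) -> Tuple[str, str]:
    """Recover (url, branch) from a name built by encoded_dir_name."""
    url, branch = name.split("@", 1)
    return unquote(url), unquote(branch)


if __name__ == "__main__":
    # Hypothetical repository URL, for illustration only.
    example = ("https://example.org/repo.git", "master")
    print(hashed_dir_name(*example))
    print(encoded_dir_name(*example))
    print(decode_dir_name(encoded_dir_name(*example)))
```

The encoded name can be mapped back to the repository it came from, which might also help with cache cleanup. The trade-off is length: a hash always has a fixed size, while an encoded long URL could run into filesystem limits on file name length.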
>
> Here are some ideas of what I might want to implement while still on
> the internship:
> - Instead of a <pre> tag, encase it in a <syntaxhighlight lang> tag if
> it's code, maybe add a flag for user to supply the language
> - Keep a database of all the repositories that a wiki has (though not
> sure how to handle deletions)
>
> Here are some problems I might face:
> - If I update the working tree each time a file from the same
> repository is added, then the line numbers may not match the old file
> - Should I be periodically updating the repositories or perhaps keep
> multiple snapshots of the same repository
> - Cloning an entire repository and keeping only one file does not seem
> ideal, but I've yet to find a better solution, the more repositories
> being used concurrently the bigger an issue this might be
> - I'm also worried about security implications of my program. Security
> isn't my area of expertise, and I would definitely appreciate some
> input from people with a security background
>
> Thanks for taking the time to read this and thanks in advance for any
> feedback, bug reports, etc.
>
> Have a great day,
> Teresa
> http://www.mediawiki.org/wiki/User:Chot
>
> [1] https://www.mediawiki.org/wiki/Outreach_Program_for_Women
> [2] http://www.mediawiki.org/wiki/Extension:Git2Pages
> [3] https://gerrit.wikimedia.org/r/#/q/project:mediawiki/extensions/Git2Pages,n,z
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
