Thank you for your feedback. I think we have a very large and open
corpus of documented incidents. As mentioned, the Wikidata workshop is
not a good target venue. Following the reference

[4]: <https://doi.org/10.1109/ICSE.2013.6606583>

I think a paper on that would be a better fit for
https://conf.researchr.org/track/icse-2024/icse-2024-software-engineering-in-practice

I will continue updating the paper on Overleaf

https://www.overleaf.com/read/swswtbdyyhmg



All the best
Moritz

On Thu, Jun 8, 2023 at 7:57 PM Tyler Cipriani <[email protected]> wrote:
>
>
>
> On Thu, Jun 8, 2023 at 12:40 AM Physikerwelt <[email protected]> wrote:
>>
>> Hi all,
>>
>> is there any research on common causes of Wikimedia production errors?
>>
>> Based on recent examples, I plan to analyze and discuss how production
>> errors could be avoided. I am considering submitting a short paper on
>> that to the Wikidata workshop, with the deadline
>> Thursday, 20 July 2023
>> Website: https://wikidataworkshop.github.io/2023/
>> However, there might be more suitable venues.
>
>
> We (Release Engineering) file production-error tasks as part of the weekly 
> train and collect some data in the "train-stats" repo on GitLab[0]. 
> Additionally, Timo Tijhof's "production excellence" blog posts and emails to 
> this list may be of interest to you[1].
>
> The "train-stats" repo collects data for "software defect prediction" based 
> on the use of "FixCaches" or "BugCaches."[2] Each week, we record changes 
> that fix bugs (i.e., the change uses the git trailer `Bug: TXXX` and gets 
> backported to a currently deployed branch). The theory (per the paper in 
> [2]) is that the more often a file needs a fix, the more likely it is to 
> cause future bugs. I have an extremely convoluted query to show the list of 
> commonly backported files[3].
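
For readers who do not want to decode the URL in [3]: the query there counts, per project and file, how many recorded bug-fixing patches touched that file. The sketch below runs the same SQL against a tiny in-memory SQLite database. The table names come from the query itself; the column layout (for example, that `project` lives on `bug_patch` and `filename` on `bug_file`), the project names, and all sample rows are assumptions for illustration, not the actual train-stats schema.

```python
import sqlite3

# Query decoded from the URL in [3]; only whitespace was changed.
QUERY = """
select
  filename,
  project,
  count(*) as bug_count
from
  bug b
  join bug_bug_patch bbp on bbp.bug_id = b.id
  join bug_patch bp on bp.id = bbp.bug_patch_id
  join bug_file bf on bp.id = bf.bug_patch_id
group by
  project, filename
order by
  bug_count desc;
"""

# Hypothetical minimal schema: just enough columns for the query to run.
SCHEMA = """
create table bug (id integer primary key);
create table bug_patch (id integer primary key, project text);
create table bug_bug_patch (bug_id integer, bug_patch_id integer);
create table bug_file (bug_patch_id integer, filename text);
"""


def demo_connection():
    """Build an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("insert into bug (id) values (?)", [(1,), (2,), (3,)])
    conn.executemany(
        "insert into bug_patch (id, project) values (?, ?)",
        [(10, "core"), (11, "core"), (12, "ext")],
    )
    conn.executemany(
        "insert into bug_bug_patch (bug_id, bug_patch_id) values (?, ?)",
        [(1, 10), (2, 11), (3, 12)],
    )
    conn.executemany(
        "insert into bug_file (bug_patch_id, filename) values (?, ?)",
        [
            (10, "composer.json"),
            (11, "composer.json"),
            (11, "includes/Foo.php"),
            (12, "src/Bar.php"),
        ],
    )
    return conn


def backport_counts(conn):
    """Return (filename, project, bug_count) rows, most-fixed files first."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in backport_counts(demo_connection()):
        print(row)
```

In this toy data, `composer.json` is touched by two bug-fixing patches and therefore sorts first, which mirrors the first problem listed in the quoted message below.
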
>
> Problems with this data:
> - Many of these files are frequently touched files vs error-prone files 
> (e.g., "composer.json")
> - Looking at the count of backports for each file means newer files are less 
> likely to be represented
> - "Lower level" files may be overrepresented (although, that's probably to be 
> expected)
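
One way to make the first two problems concrete, as a sketch rather than anything the train-stats repo is known to do: compare the raw backport count with the share of a file's changes that were bug fixes. The `FileStats` type, the ratio metric, the `change_count` input, and all numbers below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    filename: str
    fix_count: int     # backported bug-fix changes touching the file
    change_count: int  # all changes touching the file (hypothetical input)


def fix_ratio(stats: FileStats) -> float:
    """Share of changes to a file that were bug fixes (0.0 if never changed)."""
    if stats.change_count == 0:
        return 0.0
    return stats.fix_count / stats.change_count


def rank_by_count(files):
    """File names ordered by raw number of bug fixes, highest first."""
    ordered = sorted(files, key=lambda f: f.fix_count, reverse=True)
    return [f.filename for f in ordered]


def rank_by_ratio(files):
    """File names ordered by fix ratio, highest first."""
    ordered = sorted(files, key=fix_ratio, reverse=True)
    return [f.filename for f in ordered]


# Invented numbers: one frequently touched file, one older file, one new file.
SAMPLE = [
    FileStats("composer.json", fix_count=10, change_count=200),
    FileStats("includes/Foo.php", fix_count=5, change_count=8),
    FileStats("src/NewFile.php", fix_count=2, change_count=3),
]

if __name__ == "__main__":
    print("by count:", rank_by_count(SAMPLE))
    print("by ratio:", rank_by_ratio(SAMPLE))
```

With these numbers, the raw count puts `composer.json` first, while the ratio puts it last and moves the newer file up. A ratio over very few changes is noisy, so this does not solve the problems listed above; it only makes them visible.
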
>
> In 2013, a case study used data like this inside Google and found it to be 
> fairly accurate at predicting future bugs[4].
>
> Also, in the case study, whenever a developer edited a file that was present 
> in their fixCache, researchers added a bot-generated note to the patch in 
> their code review tool.Their developers found this note unhelpful: developers 
> already knew these files were problematic—the warning just caused confusion.
>
> Based on that, in March 2020, we created the "Risky Change Template"[5]. My 
> thinking was: if developers already know what's risky, then they can flag it 
> in the train task for the week[6]. At the time, I hoped this would reduce the 
> total version deployment time (although I have no data on that).
>
> I hope some of this helps!
>
> – Tyler
>
> [0]: <https://gitlab.wikimedia.org/repos/releng/train-stats>
> [1]: <https://phabricator.wikimedia.org/phame/post/view/296/production_excellence_46_july_august_2022/>
> [2]: <https://people.csail.mit.edu/hunkim/images/3/37/Papers_kim_2007_bugcache.pdf>
> [3]: <https://data.releng.team/train?sql=select%0D%0A++filename%2C%0D%0A++project%2C%0D%0A++count%28*%29+as+bug_count%0D%0Afrom%0D%0A++bug+b%0D%0A++join+bug_bug_patch+bbp+on+bbp.bug_id+%3D+b.id%0D%0A++join+bug_patch+bp+on+bp.id+%3D+bbp.bug_patch_id%0D%0A++join+bug_file+bf+on+bp.id+%3D+bf.bug_patch_id%0D%0Agroup+by%0D%0A++project%2C+filename%0D%0Aorder+by%0D%0A++bug_count+desc%3B>
> [4]: <https://doi.org/10.1109/ICSE.2013.6606583>
> [5]: <https://wikitech.wikimedia.org/wiki/Deployments/Risky_change_template>
> [6]: <https://train-blockers.toolforge.org/> (here's an example this week: 
> <https://phabricator.wikimedia.org/T337526#8901982>)
>
>>
>>
>> I am also open to collaboration on this effort. If you are interested
>> in a joint paper, drop me an email by the end of this week.
>>
>> All the best
>> Moritz
>> _______________________________________________
>> Wikitech-l mailing list -- [email protected]
>> To unsubscribe send an email to [email protected]
>> https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
>