Ladsgroup added a comment.
So I did a little bit of statistics. First I rebuilt the old model with the old features multiple times to build a distribution of roc_auc and the other metrics it produces.
- For roc_auc, the mean is 0.965 and the std is 0.000655
- For accuracy, the mean is 0.921 and the std is 0.000663

The z value for the change caused by the new features is 9.98 for roc_auc and 11.1 for accuracy. These are so big that no z table lists p-values for them (and online tools give plain zero for those z scores), meaning that statistically it's practically impossible for the new features to improve accuracy just by chance.

For PS, OTOH: the z score is 2.11 for roc_auc and 0.69 for accuracy, which according to z tables corresponds to one-tailed p-values of about 1.7% and 24% respectively. The accuracy change is therefore entirely consistent with chance, and the roc_auc change is borderline at best: it passes a 5% threshold but not a 1% one (the p-value is usually considered good enough if it's lower than 5% or 1%). So it's quite plausible that PS has no real effect on the model performance and that the changes are by chance. This makes a lot of sense given that in some places adding PS seems to decrease the performance instead.

TASK DETAIL
https://phabricator.wikimedia.org/T261850
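
As a rough check of the z-score arithmetic in the comment above, here is a minimal sketch that turns a z score into a one-tailed p-value under a normal approximation. The helper names `z_score` and `one_tailed_p` are made up for illustration and are not part of the model code; the z values are the ones quoted in the comment, and the one-tailed convention is an assumption.

```python
import math


def z_score(observed, mean, std):
    """Number of standard deviations between an observed metric and the baseline mean."""
    return (observed - mean) / std


def one_tailed_p(z):
    """P(Z >= z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# z scores as quoted in the comment (labels are just for printing)
quoted = [
    ("new features, roc_auc", 9.98),
    ("new features, accuracy", 11.1),
    ("PS, roc_auc", 2.11),
    ("PS, accuracy", 0.69),
]

for label, z in quoted:
    print(f"{label}: z = {z}, one-tailed p = {one_tailed_p(z):.3g}")
```

With these inputs the two PS values come out at roughly 0.017 and 0.245, while the two values for the new features are far below anything a printed z table covers. A two-tailed test would double each p-value.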
