Ladsgroup added a comment.

  So I did a little bit of statistics. First I rebuilt the old model with the 
old features multiple times to build a distribution of roc_auc and other 
metrics it produced.
  
  - For roc_auc the mean is 0.965, the std is 0.000655
  - For accuracy, the mean is 0.921 and the std is 0.000663
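
  A minimal sketch of how a z-score can be derived from such a distribution. 
The function name and the sample numbers are placeholders for illustration, not 
the actual runs, and it assumes the sample standard deviation is the one being 
used:

```python
import statistics


def z_score(baseline_scores, new_score):
    """Distance of new_score from the baseline mean, in baseline standard deviations."""
    mean = statistics.mean(baseline_scores)
    std = statistics.stdev(baseline_scores)  # sample std; an assumption
    return (new_score - mean) / std


# Placeholder values only: metrics from repeated rebuilds of the old model,
# followed by the metric of a single model trained with the new features.
baseline_roc_auc = [0.9644, 0.9652, 0.9657, 0.9649, 0.9648]
print(z_score(baseline_roc_auc, 0.9700))
```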
  
  The z-score for the change caused by the new features is 9.98 for roc_auc and 
11.1 for accuracy. These are so large that no z-table lists p-values for them 
(and online tools give plain zero for that z-score), meaning it's statistically 
all but impossible for the new features to have improved accuracy just by chance.
  
  For PS, on the other hand, the z-score is 2.11 for roc_auc and 0.69 for 
accuracy, which according to z-tables means p-values of 17% and 24% 
respectively. So it's very likely that PS has no effect on model performance at 
all and that all the changes are due to chance (a p-value is usually considered 
good enough only if it's below 5% or 1%). This makes a lot of sense given that 
in some places adding PS seems to decrease performance instead.
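
  As a cross-check on the table lookups, the z-scores can be converted to 
p-values directly. This is a minimal sketch that assumes a one-tailed test 
against a standard normal distribution; the helper name is made up for 
illustration. Table lookups are easy to misread, so it may be worth running the 
quoted z-scores through something like this:

```python
import math


def one_tailed_p(z):
    """Upper-tail probability P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# z-scores quoted in this comment
for z in (9.98, 11.1, 2.11, 0.69):
    print(f"z = {z:5.2f}  one-tailed p = {one_tailed_p(z):.3g}")
```

  Using erfc instead of 1 - cdf keeps precision for very large z-scores, where 
a table or a naive subtraction would just round to zero.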

TASK DETAIL
  https://phabricator.wikimedia.org/T261850
