AndrewTavis_WMDE added a comment.

  I'll check these with @Manuel later today, but here are the metrics I'm 
getting from the data dated `20230717` in `discovery.wikibase_rdf` (note that I 
don't have access to later dates because of the permission restrictions 
documented in T342416 <https://phabricator.wikimedia.org/T342416>):
  
    get_num_str_with_commas(total_triples)
    # 15,043,483,216
    
    total_sa_triples = total_sa_direct_triples + total_sa_val_triples + total_sa_ref_triples
    # 7,188,746,257 + 200,337 + 332,476,964
    get_num_str_with_commas(total_sa_triples)
    # 7,521,423,558
    
    percent_sa_triples = round(total_sa_triples / total_triples * 100, 4)
    percent_sa_triples
    # 49.9979
    
    total_only_sa_triples = total_sa_direct_triples + total_only_sa_val_triples + total_only_sa_ref_triples
    # 7,188,746,257 + 13,651 + 332,466,067
    get_num_str_with_commas(total_only_sa_triples)
    # 7,521,225,975
    
    percent_only_sa_triples = round(total_only_sa_triples / total_triples * 100, 4)
    percent_only_sa_triples
    # 49.9966
  
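  For anyone who wants to sanity-check the arithmetic without Spark access, 
here is a minimal standalone sketch that plugs in the counts quoted above. The 
`get_num_str_with_commas` body is a hypothetical stand-in (the notebook's real 
helper isn't shown here); I'm assuming it only adds thousands separators.

```python
# Hypothetical stand-in for the notebook helper; assumed to only add
# thousands separators to an integer.
def get_num_str_with_commas(num: int) -> str:
    return f"{num:,}"


# Counts copied from the comments above (data dated 20230717).
total_triples = 15_043_483_216
total_sa_direct_triples = 7_188_746_257
total_sa_val_triples = 200_337
total_sa_ref_triples = 332_476_964
total_only_sa_val_triples = 13_651
total_only_sa_ref_triples = 332_466_067

# Same sums and percentages as in the snippet above.
total_sa_triples = (
    total_sa_direct_triples + total_sa_val_triples + total_sa_ref_triples
)
total_only_sa_triples = (
    total_sa_direct_triples + total_only_sa_val_triples + total_only_sa_ref_triples
)
percent_sa_triples = round(total_sa_triples / total_triples * 100, 4)
percent_only_sa_triples = round(total_only_sa_triples / total_triples * 100, 4)

print(get_num_str_with_commas(total_sa_triples), percent_sa_triples)
print(get_num_str_with_commas(total_only_sa_triples), percent_only_sa_triples)
```

  This only re-derives the totals and percentages from the reported component 
counts; it doesn't touch `discovery.wikibase_rdf` itself.
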
  I did end up using PySpark so I could follow @dcausse's example as well as I 
could :) Should I upload the finished notebook to people.wikimedia.org?

TASK DETAIL
  https://phabricator.wikimedia.org/T342111

To: AndrewTavis_WMDE
Cc: mpopov, JAllemandou, Lydia_Pintscher, dcausse, Gehel, dr0ptp4kt, 
AndrewTavis_WMDE, Aklapper, Manuel, Danny_Benjafield_WMDE, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Mbch331