AndrewTavis_WMDE added a comment.

  Notes from the call that @dcausse and I had:
  
  - Notebook seems alright to him, so I'm moving this into review
  - Our plan to cover the same subclasses as AKhatun, as well as the direct 
subclasses, also made sense to him, so work will proceed as planned in T342123 
<https://phabricator.wikimedia.org/T342123>
  - With regard to the Presto vs. Spark outputs of nested columns, he 
suggested that nested values in Presto might also be referenceable in a 
dictionary-like fashion, even if that isn't apparent at first glance from the 
output in Pandas.
    - In testing this a bit, I found that Presto has an `UNNEST` function that 
serves a similar purpose to `LATERAL VIEW EXPLODE` in Spark
    - The following allows us to get sub-entries within the `claims` column of 
`wmf.wikidata_entity`:
  
    SELECT
        a, b, c, d, e, f, g
    
    FROM 
        wmf.wikidata_entity
    
    CROSS JOIN 
        UNNEST(claims) AS claims_explode (a, b, c, d, e, f, g)
    
    WHERE 
        snapshot = '2023-07-24'
        AND id = 'Q1895685'
  
  This doesn't appear to allow direct dictionary-like key-to-value references, 
but it does allow arrays to be unnested and their fields assigned to output 
columns (here `a` through `g`). Might be of use in the future 😊
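
  A minimal sketch of one way to get dictionary-like access after the query 
returns, by pairing each positional value with a field name on the Python 
side. The field names in `CLAIM_FIELDS`, the helper `unnest_claims`, and the 
sample row are all hypothetical placeholders, not taken from the table schema 
or from real Wikidata content; the real names should be checked against 
`wmf.wikidata_entity` before use.

```python
from typing import Any, Dict, Iterator, Sequence

# ASSUMPTION: hypothetical names for the seven fields of one `claims` element
# (the columns called a..g in the query). Verify against the real schema.
CLAIM_FIELDS = (
    "id", "mainSnak", "type", "rank",
    "qualifiers", "qualifiersOrder", "references",
)


def unnest_claims(
    row: Dict[str, Any], fields: Sequence[str] = CLAIM_FIELDS
) -> Iterator[Dict[str, Any]]:
    """Yield one dict per claim, similar in spirit to CROSS JOIN UNNEST(claims)."""
    for claim in row["claims"]:
        if len(claim) != len(fields):
            raise ValueError(
                f"expected {len(fields)} values per claim, got {len(claim)}"
            )
        # Pair positional values with names to allow key-based lookups.
        record = dict(zip(fields, claim))
        # Keep the parent entity id alongside each unnested claim.
        record["entity_id"] = row["id"]
        yield record


# Made-up sample row for illustration only (not real query output).
sample_row = {
    "id": "Q1895685",
    "claims": [
        ("claim-1", {"property": "P31"}, "statement", "normal", None, None, None),
        ("claim-2", {"property": "P279"}, "statement", "normal", None, None, None),
    ],
}

records = list(unnest_claims(sample_row))
```

  Each record can then be read by key, for example 
`records[0]["mainSnak"]`, rather than by column position.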

TASK DETAIL
  https://phabricator.wikimedia.org/T342111

To: AndrewTavis_WMDE
Cc: mpopov, JAllemandou, Lydia_Pintscher, dcausse, Gehel, dr0ptp4kt, 
AndrewTavis_WMDE, Aklapper, Manuel, Danny_Benjafield_WMDE, Astuthiodit_1, 
karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, 
Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, 
Wikidata-bugs, aude, Mbch331