JAllemandou added a comment.
In T342416#9091146 <https://phabricator.wikimedia.org/T342416#9091146>, @EBernhardson wrote:

> I looked into these, the attached patch should fix it but it leaves an open question (@JAllemandou):
>
> The `core-site.xml`, along with puppet which writes it out, has the default umask of 027 since at least 2021, which prevents world readability. So why do we have the following permissions for historical dumps:
>
> drwxr-xr-x /wmf/data/discovery/wikidata/rdf/date=20230710
> drwxr-xr-x /wmf/data/discovery/wikidata/rdf/date=20230716
> drwxr-xr-x /wmf/data/discovery/wikidata/rdf/date=20230717
> drwxr-x--- /wmf/data/discovery/wikidata/rdf/date=20230723
> drwxr-x--- /wmf/data/discovery/wikidata/rdf/date=20230724
> drwxr-x--- /wmf/data/discovery/wikidata/rdf/date=20230730
> drwxr-x--- /wmf/data/discovery/wikidata/rdf/date=20230731
> drwxr-x--- /wmf/data/discovery/wikidata/rdf/date=20230806

The world-readable changes were made manually by me to unblock @AndrewTavis_WMDE - I logged my change in the analytics IRC chan but didn't ping on the search IRC chan - I should have, please excuse me on this :)

> Similarly we have other jobs that still run today and emit world readable dumps without explicitly setting the umask, what is causing the difference?
>
> drwxrwxr-x /wmf/data/discovery/cirrus/index/cirrus_replica=codfw/cirrus_group=chi/wiki=enwiki/snapshot=20230716
> drwxrwxr-x /wmf/data/discovery/cirrus/index/cirrus_replica=codfw/cirrus_group=chi/wiki=enwiki/snapshot=20230723
> drwxrwxr-x /wmf/data/discovery/cirrus/index/cirrus_replica=codfw/cirrus_group=chi/wiki=enwiki/snapshot=20230730
> drwxrwxr-x /wmf/data/discovery/cirrus/index/cirrus_replica=codfw/cirrus_group=chi/wiki=enwiki/snapshot=20230806

My guess about those is that they are still generated by a Hive job. Hive and Spark behave differently with regard to permissions when generating files: Spark uses the configured umask, while Hive reproduces the parent-dir pattern.
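For reference, here is a minimal sketch of the arithmetic behind those listings. It assumes the usual rule that a newly created directory gets mode 0777 with the umask bits cleared, which as far as I know is also how HDFS applies its configured umask. The helper name `dir_mode_from_umask` is made up for illustration and does not come from any of our job code.

```python
import stat


def dir_mode_from_umask(umask: int) -> str:
    """Return the ls-style mode string of a directory created under `umask`.

    Assumption: the base mode for a new directory is 0777 and the umask
    bits are simply cleared from it.
    """
    mode = 0o777 & ~umask
    # S_IFDIR adds the directory type bit so filemode() prints the leading "d".
    return stat.filemode(stat.S_IFDIR | mode)


if __name__ == "__main__":
    for umask in (0o027, 0o022, 0o002):
        print(f"umask {umask:03o} -> {dir_mode_from_umask(umask)}")
```

With umask 027 this gives `drwxr-x---`, matching the rdf directories from 20230723 onwards. `drwxr-xr-x` and `drwxrwxr-x` would correspond to umasks 022 and 002 respectively, so under a 027 default they have to come from something else, such as a manual chmod or permissions copied from the parent directory.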
I'd be interested to know whether my guess is correct :)

TASK DETAIL
  https://phabricator.wikimedia.org/T342416

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: EBernhardson, JAllemandou
Cc: dcausse, BTullis, AndrewTavis_WMDE, Aklapper, JAllemandou, Danny_Benjafield_WMDE, Mohamed-Awnallah, Astuthiodit_1, AWesterinen, lbowmaker, karapayneWMDE, Invadibot, Ywats0ns, maantietaja, ItamarWMDE, Akuckartz, Nandana, Namenlos314, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331
_______________________________________________
Wikidata-bugs mailing list -- [email protected]
To unsubscribe send an email to [email protected]
