akosiaris added a comment.
In T280485#7510193 <https://phabricator.wikimedia.org/T280485#7510193>, @Gehel wrote:

> In T280485#7506072 <https://phabricator.wikimedia.org/T280485#7506072>, @akosiaris wrote:
>
>> Is T280485#7275149 <https://phabricator.wikimedia.org/T280485#7275149> related to blazegraph and not flink ? I am not sure what 13B triplets vs 2.8T triples means storage wise and in which context.
>
> Oh, I now see the confusion! Wrong units (typo) in the initial message. The current Flink updater takes data from Wikidata, which has ~13B triples.

😆😆. OK, thanks for clearing that up. That 200 times increase had me worried.

> The new Flink updater will add support for getting data from Commons, which has ~2.8B triples. So the new updater will add ~20% more resource consumption (assuming a linear cost).

OK, that's nothing then.

> This will mean:
>
> - additional storage on Swift (I assume this is trivial given the size of Swift and can be ignored)
> - additional CPU / RAM usage on k8s
> - additional local storage (/tmp) on the containers

We got enough on all of those 3, no worries there.

> It isn't super clear to me if our strategy is to increase the size of the current Flink cluster, or have a new cluster dedicated to the Commons updater (to be decided later today).

Cool. Let us know what you decide. On our side, it probably isn't much more than 1 more deployment on the k8s cluster. That being said, and assuming my memory is up to date with how flink works in session cluster mode, I'd expect it's able to handle this internally without needing another k8s deployment. It's also fine to increase the number of worker pods if that's something that would make things easier for you.

> Duplicate the existing cluster would provide additional isolation between the 2 workflows. This is also the worst case scenario in terms of resource needed. The additional estimated resources are:
>
> - manager: 1 more pod at 1.6G, cpu: 500m
> - workers: 3 pods at 2.1G ram, cpu: 1000m

Even in this case, we got that capacity.

TASK DETAIL
  https://phabricator.wikimedia.org/T280485

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Gehel, akosiaris
Cc: akosiaris, Zbyszko, Aklapper, RKemper, Gehel, MPhamWMF, wkandek, JMeybohm, CBogen, Namenlos314, jijiki, Gq86, Lucas_Werkmeister_WMDE, EBjune, merbst, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Dzahn
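
For anyone who wants to double-check the numbers quoted in this comment, here is a rough back-of-the-envelope sketch. It only uses the figures from the thread (~13B and ~2.8B triples, the mistyped 2.8T, and the worst-case pod estimates). The names (`ratio`, `worst_case_totals`, `EXTRA_PODS`) are made up for illustration, and the linear-cost assumption is the one stated in the quoted message, not something measured.

```python
# Back-of-the-envelope check of the figures quoted in the thread.
# All names here are hypothetical; the numbers come from the comment itself.

WIKIDATA_TRIPLES = 13e9        # ~13B triples (current Flink updater)
COMMONS_TRIPLES = 2.8e9        # ~2.8B triples (new Commons updater)
COMMONS_TRIPLES_TYPO = 2.8e12  # the mistyped "2.8T" from the initial message

# Worst case (duplicating the cluster): name -> (pod count, RAM in G, CPU in millicores)
EXTRA_PODS = {
    "manager": (1, 1.6, 500),
    "workers": (3, 2.1, 1000),
}


def ratio(new_triples: float, base_triples: float) -> float:
    """Relative size of the new dataset, assuming cost scales linearly with triples."""
    return new_triples / base_triples


def worst_case_totals(pods: dict) -> tuple:
    """Sum RAM (G) and CPU (millicores) over all additional pods."""
    ram = sum(count * ram_g for count, ram_g, _ in pods.values())
    cpu = sum(count * cpu_m for count, _, cpu_m in pods.values())
    return ram, cpu


if __name__ == "__main__":
    print(f"Commons vs Wikidata: {ratio(COMMONS_TRIPLES, WIKIDATA_TRIPLES):.1%}")
    print(f"With the typo: {ratio(COMMONS_TRIPLES_TYPO, WIKIDATA_TRIPLES):.0f}x")
    ram, cpu = worst_case_totals(EXTRA_PODS)
    print(f"Worst-case extra capacity: {ram:.1f}G RAM, {cpu}m CPU")
```

Under those assumptions the Commons dataset is about 21.5% of Wikidata's size (the "~20%" above), the mistyped figure would have been about 215 times larger (the "200 times increase"), and the worst-case duplicate cluster adds up to 7.9G of RAM and 3500m of CPU in total.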