Addshore added a comment.
Before diving into the individual parts of the comments above, the main points here seem to be:

1. Is there a precedent for calling services that are external to Wikimedia production from Wikimedia production?
2. Can we treat a service on WMCS in the same way (as a service external to Wikimedia production)?
3. If so: what was / is the process for such a call to an external service?

----------

In T285098#7262291 <https://phabricator.wikimedia.org/T285098#7262291>, @Joe wrote:

> First of all, I want to say that IMHO things would have gone smoother if you asked SRE for an opinion about the plan before it was put in motion. Keep this in mind for the future.

I did touch base back in April, and we were guided toward potentially using the https://www.mediawiki.org/wiki/Technical_Decision_Forum. But then the powers that be opted for direct WMDE -> WMF communication via email, which ultimately led us to https://wikitech.wikimedia.org/wiki/Cross-Realm_traffic_guidelines. Since this ticket came up, that page has had some clarifying alterations: https://wikitech.wikimedia.org/w/index.php?title=Cross-Realm_traffic_guidelines&type=revision&diff=1920750&oldid=1900570

> Having said that, we don't usually allow any request to flow from production services to services running in WMCS for a few good reasons, regarding reliability, privacy, and security. I don't think we've ever made an exception to this rule, and I don't think we should make one in this case - but this is my own personal opinion.

The precedent mentioned in the task description that we were directed to was calling other services external to WMF production from WMF production. The one example of this that springs to mind is the vision API, though I believe at least one more example has been offered in the past year. We totally agree that this is not the usual thing to do, but would like this case to be considered for one of the exceptions.
Speaking specifically to the three good reasons you raise for not doing this, we believe that our case presents minimal risk on each of them.

> I would be interested in seeing if we can have a path forward that allows you to get what you want with minimum effort while deploying in production.
>
> I would say that **a security review cannot be skipped** even if you're running the code from WMCS - as it's involved in serving production traffic. The other things standing in your way are a production deployment and a performance review, AIUI.

I totally agree that if this service is to be deployed to production, then it should have a full security review and performance review, and go through the regular rigorous process to get deployed. However, tying this into the precedent mentioned above, I highly doubt that the external services we call get a security review of their code etc., though perhaps one of the requests & responses and general risk. Services running on WMCS (the service we want to use in A/B testing) and routine Gerrit changes (which were made to the PropertySuggester extension) are also listed as things unlikely to get a review: https://www.mediawiki.org/wiki/Security/SOP/Security_Readiness_Reviews

> Given you already have a docker image coming from the pipeline, creating a dedicated chart shouldn't be much harder than running the scaffolding script, and we can create an LVS endpoint for it in a relatively short timescale, so that you can get to production (a few weeks I think). I can't speak for the performance team, but I think you can ask for the performance review to happen with the service already deployed for the initial phases of your A/B testing.
>
> This path seems more reasonable to me than creating an exception to serve production traffic from WMCS. Does it sound unreasonable / unrealistic to you?
If we spend the latter half of 2021 going through the full production deployment flow for the service:

- Security & performance review of the Golang service (using resources from 2 WMF teams)
- Making needed changes to the service based on the feedback
- Resourcing and provisioning of the service in production with WMF SREs

only to then run a 1-month A/B test, turn the service off / undeploy it, evaluate the A/B result, and potentially never deploy the service again, then I feel we would have unnecessarily spent a whole lot of resources for the year and probably extended the timeline of this A/B test by 6 months or so (I could be wrong).

But I do agree that having some sign-off on the payloads being exchanged could still make sense if we are to try and move forward with calling WMCS, even if we deem it to be low risk.

> In the future, we plan to have a much easier process for small experiments to run on kubernetes as "lambdas", but that will take some time to come to fruition - we're just working right now on introducing the technologies that will make it possible.

I imagine this would probably leave us in the same situation for this case, as it is the security and performance review of a service that we are not currently planning on deploying to production that we are trying to avoid, by framing it as an external service. If such a service were deployed as a lambda etc., it would still be getting deployed to production and would 100% need to have the full service code reviewed.

--------

In T285098#7262378 <https://phabricator.wikimedia.org/T285098#7262378>, @Ladsgroup wrote:

> (Not speaking on behalf of the team, completely personal):
> I see three ways out that we could talk about and decide:
>
> - Get SRE/Security/Legal approval for a temporary deployment of reading for wmcs. One idea I have to ease and compromise is to have a fixed deadline, e.g. "This will stay in production no more than 30 days". This would reduce the risk.
> The actual number should be decided by PM and the rest.

To me this seems quite reasonable, and probably a much smaller investment of people's time into such an A/B test than trying to fulfil a full deployment to production that we may just undo. I believe this is probably something like the process for previous such external calls (if we can frame a WMCS call as external).

> - Do the A/B testing outside production. Get the requests made in production (from hadoop). Make a list and find differences between the old and the new system and then decide for its future.

This would indeed be partly possible, but one part of the A/B test measures user selection and timing of results from the suggester. So although some sort of A/B testing would probably be possible, it would not be the full A/B test that is actually currently planned.

> - Do an actual security review, performance review, get it properly deployed.

Indeed, this is also a path forward, and if it is truly the only way forward then we will have to take it. But as noted above, this ends up spending lots of resources on something that we may not want to spend said resources on; the whole point of the A/B test is to answer that question.

> My ideal solution would be a mixture of two and three. e.g. Just do a basic check outside production to make sure the suggestions are not off beyond repair, then if it's fine, we can deploy and fix and iterate.

I believe 2 has already happened to some degree. My ideal solution would be 1, then, if we want to deploy the service, 3.
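As a rough illustration of the offline comparison in option 2: after replaying logged production requests (e.g. from Hadoop) against both the old and the new suggester, the per-request diffing could look something like the sketch below. This is a hypothetical sketch only; the property IDs, result shapes, and function names are invented for illustration and are not taken from the actual services.

```python
# Hypothetical sketch: compare suggestion lists returned by the old and
# new suggester for the same replayed production request. The request
# replay itself (fetching both result lists) is assumed to happen
# elsewhere; this only summarises how the top-N results differ.

def diff_suggestions(old: list[str], new: list[str], top_n: int = 5) -> dict:
    """Summarise how the top-N suggestions differ between two systems."""
    old_top, new_top = old[:top_n], new[:top_n]
    shared = set(old_top) & set(new_top)
    return {
        "overlap": len(shared) / max(len(old_top), 1),  # fraction of shared items
        "only_old": [p for p in old_top if p not in shared],
        "only_new": [p for p in new_top if p not in shared],
    }

# Fake replayed results for one request (Wikidata property IDs):
old_result = ["P31", "P21", "P569", "P19", "P106"]
new_result = ["P31", "P569", "P106", "P734", "P735"]

report = diff_suggestions(old_result, new_result)
print(report["overlap"])   # 0.6 - three of the five top suggestions are shared
print(report["only_new"])  # ['P734', 'P735']
```

Aggregating such per-request reports over a sample of production traffic would give a basic "not off beyond repair" check, though, as noted above, it cannot capture user selection behaviour or result timing.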
TASK DETAIL
https://phabricator.wikimedia.org/T285098
_______________________________________________
Wikidata-bugs mailing list -- [email protected]
