Joe added a comment.

In https://phabricator.wikimedia.org/T114443#1703097, @Eevans wrote:
> In https://phabricator.wikimedia.org/T114443#1701296, @Joe wrote:
>
> > Apart from the concerns on a practical use case which I agree with, I have
> > a big doubt about the implementation idea:
> >
> > I am in general a fan of the paradigm that it's better to beg for
> > forgiveness than to ask for permission, and of Postel's robustness
> > principle, so I don't really see what use a service in front of kafka would
> > serve us, apart from introducing another software that could fail and some
> > latency.
> >
> > Messages we send onto kafka will be anyways verified on the receiving end
> > (considering them "trusted" would be foolish), so we will need to write
> > validation libraries in basically all the languages we will consume our
> > data from; this is the standard way to build communications protocols and I
> > don't see a good reason for introducing a level of indirection here.
>
> Why is this, why would they //need// to be verified on the receiving end?

yes, unless we prevent kafka from speaking to anything but our rest service.
We can do it, of course, but we already have a counterexample I guess from
what I read a few comments ago.

> I see this as being somewhat analogous to a database. In any database you
> //could// store your data opaquely, allow each client to marshal it according
> to some shared notion of schema, and then have every client validate (the
> untrustworthy input) on read, but how is that better? If the data is
> structured according to a well defined schema, why not let the system
> persisting it apply those constraints on write? Assuming the goal is to
> disseminate these events to an arbitrary number of independently implemented
> systems, it seems the latter approach would provided better guarantees about
> the integrity of the data, and eliminate a lot of redundancy among
> implementations.

A message queue is not a database, it's a router.
What you want to validate is the content of the messages kafka is routing. I
stand by the idea that doing that is important, but it must be done at the app
level anyways.

> > So, I have two questions I'd like an answer to:
> >
> > - What is the advantage of having a service validate messages before they
> > get into the queue (Kafka or other doesn't really matter)
>
> It assures a single consistent set of constraints on events, independent of
> the various producer/consumer implementations.

At the cost of reducing a complex and rich queue system to a REST paradigm,
and introducing yet another layer of chained calls that can fail
independently. Have you already evaluated what would be lost in translation,
if anything?

> > - Why building libraries that do the validations based on shared schemas
> > not enough?
>
> A service provides a single high level abstraction that hides the details of
> the underlying implementation (allowing said implementation to be
> transparently changed), eliminates redundancy among implementations, and
> prevents a single buggy consumer from propagating corrupt events to all
> consumers.

Well, as I said above, if the implementations (we're talking about 2, max 3 of
them... which you should still do against your rest service, vs the 1 you'd
put in your service) need to verify the messages they're receiving, which is
sensible, you don't duplicate effort, you just simplify an architecture.

I have worked before with systems that purposely added a "sane indirection"
in front of backend technologies, and it always turned out to be a worse idea
than using libraries. But well, this might just be the case in which that
won't happen, I just don't see it.
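
For illustration only, here is a minimal sketch of what "validation at the app level" with a shared schema could look like on the consuming side. Nothing here comes from the task itself: the schema `PAGE_EDIT_SCHEMA`, its field names, and the helpers `validate_event` and `decode_and_validate` are all hypothetical, and a real library would more likely use a proper schema language than a dict of Python types.

```python
import json

# Hypothetical shared schema: field name -> required Python type.
# In practice this would be loaded from a schema repository shared by
# every producer and consumer implementation.
PAGE_EDIT_SCHEMA = {"title": str, "rev_id": int, "user": str}


def validate_event(event, schema):
    """Return a list of problems; an empty list means the event is valid."""
    if not isinstance(event, dict):
        return ["event is not a JSON object"]
    problems = []
    for field, expected in schema.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif type(event[field]) is not expected:
            # Exact type check, so that True is not accepted as an int.
            problems.append(f"{field} should be {expected.__name__}")
    return problems


def decode_and_validate(raw, schema):
    """Consumer side: parse a raw message and validate it before use."""
    try:
        event = json.loads(raw)
    except ValueError as exc:  # covers invalid JSON and bad encodings
        return None, [f"invalid JSON: {exc}"]
    problems = validate_event(event, schema)
    return (event if not problems else None), problems
```

A producer could call `validate_event` before publishing, and each consumer could call `decode_and_validate` on every message it reads, so a corrupt event is rejected where it is consumed rather than trusted.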
TASK DETAIL
  https://phabricator.wikimedia.org/T114443
