GWicke added a comment.

I guess we have slightly different ideas about what a message bus should be:
1. a way to get blobs from a to b, and
2. a way to expose a stream of events in a defined format that can be consumed easily by a range of clients.

The use cases I care about require 2). Applying my interpretation of the Robustness Principle <https://en.wikipedia.org/wiki/Robustness_principle> to that use case means thoroughly checking / coercing things on the way in, and keeping promises on the way out.

I also agree that it is possible to implement 2) by writing directly to Kafka, provided that *each* producer

- emits only events satisfying the expected (current) schema,
- never writes to queues it shouldn't write to (access control), and
- is fully aware of internal optimizations such as binary encodings and compression specific to the event queue implementation and topic.

Add to this requirements like emitting per-topic metrics, and I think it becomes clear why limiting the number of implementations is desirable.

I also think that we should look at actual data before making assumptions about latency. For example, simple Kafka clients establish a new TCP connection per write, and might even fetch metadata for each connection. The simple REST service (120 lines) processes 1100+ req/s with a mean enqueue latency of around 10ms, with both Kafka and the service running on a single-core labs instance. At production load, low single-digit ms should be typical. There will be use cases where sub-ms latency or extremely high volume is needed and REST is not a good fit, but let's base decisions around that on actual data.

Regarding the monolog backend, my understanding based on https://phabricator.wikimedia.org/T108618 and conversations is that this is primarily aiming to ship events to Hadoop for later analysis. As such, its message format is geared towards that use case, and no effort has been made to generalize events and their representation for general use.
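
To make the producer-side requirements listed earlier in this comment (schema conformance, access control, per-topic metrics) more concrete, here is a minimal sketch of a single gatekeeper that checks and coerces events on the way in. Everything in it is an assumption for illustration: the `EventGate` class, the `SCHEMAS` and `ACL` tables, the `page_edit` topic and its fields are hypothetical, and an in-memory list stands in for Kafka. It is not the REST service mentioned above.

```python
import json
from collections import Counter, defaultdict

# Hypothetical schema table: topic -> {field name: type to coerce to}.
SCHEMAS = {"page_edit": {"title": str, "rev_id": int}}

# Hypothetical access-control table: producer -> topics it may write to.
ACL = {"mediawiki": {"page_edit"}}


class RejectedEvent(Exception):
    """Raised when an event fails access control or schema checks."""


class EventGate:
    def __init__(self, schemas, acl):
        self.schemas = schemas
        self.acl = acl
        self.queues = defaultdict(list)  # stand-in for Kafka topics
        self.metrics = Counter()         # per-topic counters

    def _reject(self, topic, reason, message):
        self.metrics[(topic, reason)] += 1
        raise RejectedEvent(message)

    def enqueue(self, producer, topic, event):
        # Access control: a producer may only write to its own topics.
        if topic not in self.acl.get(producer, set()):
            self._reject(topic, "denied", f"{producer} may not write to {topic}")
        schema = self.schemas.get(topic)
        if schema is None:
            self._reject(topic, "invalid", f"no schema for {topic}")
        # Check / coerce on the way in; fields outside the schema are dropped.
        coerced = {}
        for field, ftype in schema.items():
            if field not in event:
                self._reject(topic, "invalid", f"missing field {field}")
            try:
                coerced[field] = ftype(event[field])
            except (TypeError, ValueError):
                self._reject(topic, "invalid", f"bad value for {field}")
        # The encoding is decided in one place, not by each producer.
        payload = json.dumps(coerced, sort_keys=True).encode("utf-8")
        self.queues[topic].append(payload)
        self.metrics[(topic, "accepted")] += 1
        return coerced
```

With this in place, a producer only needs to call `EventGate(SCHEMAS, ACL).enqueue("mediawiki", "page_edit", {...})`; the schema check, the access check, the wire encoding and the metrics all live in one implementation.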
That said, we *could* consider using the monolog integration for emitting more general events from MediaWiki, but would then also need to implement support for alternative backends, and ensure that schemas agree.

TASK DETAIL https://phabricator.wikimedia.org/T114443

To: GWicke
Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, chasemp, Krenair
