GWicke added a comment.

I guess we have slightly different ideas about what a message bus should be:

1. a way to get blobs from a to b, and
2. a way to expose a stream of events in a defined format that can be consumed 
easily by a range of clients.

The use cases I care about require 2). Applying my interpretation of the 
Robustness Principle <https://en.wikipedia.org/wiki/Robustness_principle> to 
that use case means thoroughly checking / coercing things on the way in, and 
keeping promises on the way out.
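
As a rough illustration of "checking / coercing on the way in", here is a 
minimal Python sketch. The schema layout, the field names (`title`, `rev_id`, 
`ts`) and the `coerce_event` helper are all made up for this example; they do 
not describe an existing schema or API.

```python
# Hypothetical schema: field name -> (type, required). Not a real event schema.
EXAMPLE_SCHEMA = {
    "title": (str, True),
    "rev_id": (int, True),
    "ts": (float, False),
}


def coerce_event(raw, schema=EXAMPLE_SCHEMA):
    """Return a cleaned copy of `raw`, or raise ValueError.

    On the way in: values are coerced to the declared type where possible.
    On the way out: unknown fields are dropped and required ones are enforced,
    so consumers only ever see events that match the schema.
    """
    event = {}
    for field, (ftype, required) in schema.items():
        if field not in raw:
            if required:
                raise ValueError("missing required field: " + field)
            continue
        try:
            event[field] = ftype(raw[field])
        except (TypeError, ValueError):
            raise ValueError(
                "cannot coerce field %r to %s" % (field, ftype.__name__)
            )
    return event
```

With this sketch, `coerce_event({"title": "Foo", "rev_id": "42", "extra": 1})` 
yields `{"title": "Foo", "rev_id": 42}`: the string revision id is coerced to 
an integer and the unknown field is dropped.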

I also agree that it is possible to implement 2) by writing directly to Kafka, 
provided that *each* producer

- emits only events satisfying the expected (current) schema,
- never writes to queues it shouldn't write to (access control), and
- is fully aware of internal optimizations such as binary encodings and 
compression specific to the event queue implementation and topic.

Add to this requirements like emitting per-topic metrics, and I think it 
becomes clear why limiting the number of implementations is desirable.
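
To sketch what a single shared implementation would centralize, the toy 
Python class below puts access control and per-topic counters in one place. 
`EventGateway`, the `acl` mapping and the `send` callable are hypothetical 
names; `send` merely stands in for whatever actually writes to the queue, and 
no real Kafka client API is used here.

```python
from collections import Counter


class EventGateway:
    """Toy single entry point in front of an event queue (illustrative only)."""

    def __init__(self, send, acl):
        self.send = send            # callable(topic, event); stand-in for a real producer
        self.acl = acl              # producer name -> set of topics it may write to
        self.accepted = Counter()   # per-topic count of enqueued events
        self.rejected = Counter()   # per-topic count of refused writes

    def publish(self, producer, topic, event):
        # Access control: refuse writes to topics the producer does not own.
        if topic not in self.acl.get(producer, set()):
            self.rejected[topic] += 1
            raise PermissionError(
                "%s may not write to topic %s" % (producer, topic)
            )
        self.send(topic, event)
        self.accepted[topic] += 1
```

Each producer that writes to the queue directly would have to reimplement all 
of this, whereas a shared service does it once.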

I also think that we should look at actual data before making assumptions about 
latency. For example, simple Kafka clients establish a new TCP connection per 
write, and might even fetch metadata for each connection. The simple REST 
service (120 lines) processes 1100+ req/s with a mean enqueue latency of around 
10ms, with both Kafka and the service running on a single-core labs instance. 
At production load, low single-digit ms should be typical. There will be use 
cases where sub-ms latency or extremely high volume is needed & REST is not a 
good fit, but let's base decisions about that on actual data.
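
One simple way to collect such data is a small timing harness like the Python 
sketch below. `measure_latency_ms` and its `enqueue` argument are hypothetical: 
`enqueue` is any callable that performs one write (for example an HTTP POST to 
the REST service or a direct producer call), and `payloads` is assumed to be 
non-empty. This is not the benchmark behind the numbers quoted above.

```python
import statistics
import time


def measure_latency_ms(enqueue, payloads):
    """Call enqueue(payload) once per payload and summarize latency in ms."""
    samples = []
    for payload in payloads:
        start = time.perf_counter()
        enqueue(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Approximate 99th percentile by index into the sorted samples.
    p99_index = min(len(samples) - 1, int(len(samples) * 0.99))
    return {
        "count": len(samples),
        "mean_ms": statistics.mean(samples),
        "p99_ms": samples[p99_index],
        "max_ms": samples[-1],
    }
```

Running the same harness against both the REST path and a direct client, on 
comparable hardware, would give numbers that can actually be compared.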

Regarding the monolog backend, my understanding based on 
https://phabricator.wikimedia.org/T108618 and conversations is that this is 
primarily aiming to ship events to Hadoop for later analysis. As such, its 
message format is geared towards that use case, and no effort has been made to 
generalize events and their representation for general use. That said, we 
*could* consider using the monolog integration for emitting more general events 
from MediaWiki, but would then also need to implement support for alternative 
backends, and ensure that schemas agree.


TASK DETAIL
  https://phabricator.wikimedia.org/T114443

To: GWicke
Cc: EBernhardson, bd808, Joe, dr0ptp4kt, madhuvishy, Nuria, ori, faidon, aaron, 
GWicke, mobrovac, Halfak, Eevans, Ottomata, Matanya, Aklapper, JAllemandou, 
jkroll, Smalyshev, Hardikj, Wikidata-bugs, Jdouglas, RobH, aude, Deskana, 
Manybubbles, mark, JanZerebecki, RobLa-WMF, fgiunchedi, Dzahn, jeremyb, 
chasemp, Krenair


