created: 1245140668|%e %B %Y, %H:%M
Internal accounting
I've not made the engine API yet, still working through some design concepts. Let's look at credit-based flow control. This is needed to make MOVE work safely. As I've said previously, for COPYs I don't want to use credit-based flow control, indeed any kind of flow control at all.
I'll define a "credit" unit, so we can say that one credit corresponds to one message. In AMQP I originally designed credit based on message volume as well but this is nasty because the semantics of credit are fundamentally not about network capacity but rather about how much the upstream trusts the downstream. We cannot trust a recipient with fractions of a message. (This is also why I think COPY does not need credits, because it does not imply any kind of trust relationship.)
As messages move down the directed graph of queues, credits move up the graph. Let's look at a typical scenario:
- We have a queue with two selectors that each move messages to respective external destinations.
- Each selector starts with one credit.
- Two messages arrive, and are passed respectively to each selector.
- Each selector moves the message it receives to its destination.
- Each selector now has a credit of zero.
- A new message arrives in the queue, and waits there, since there are no available selectors.
- One of the destinations sends an acknowledgment, which is passed to its selector as a credit.
- That selector is now able to process one more message.
To some extent credits and acknowledgements are the same concept, and we might use "credit" everywhere, especially since it's easier to spell. Thus applications send messages to queues, and credits to selectors.
Providing a credit to a selector is in effect paying for a message, and credits are like an internal currency. It makes sense that we aim to balance the books. When a queue is itself fed by one or more MOVE selectors, and it receives credits from a selector, it must pass these credits to its parent MOVE selector(s) in turn, i.e. pay for the messages that it received.
Selectors can, at any point, have outstanding messages. That is, they may have one or more messages they provided to a destination but did not yet receive payment for. If the selector is then deleted, it must return its outstanding messages to its source queue. These messages can then be routed to other selectors, or bounced.
Queue size limits
It makes sense in several scenarios to set a limit on the size of a queue. In fact there are two possible limits:
- On the number of unprocessed messages;
- On the history of processed messages.
When the history gets too large, there is only one sensible strategy, namely to discard the oldest messages. We can hardcode this, there is no need to allow for alternative strategies.
When the unprocessed size gets too large, the sensible strategy is to reject the new message as an exception. Discarding the message is an acceptable default (presumably the sender will realize, eventually, and retry or complain).
Message exceptions
There are several situations where a queue may find itself with a troublesome message that it has to deal with:
- When a queue has reached its limit of unprocessed messages;
- When a queue does not have any active selector that can accept a message.
Dropping the message is a good default action but in cases where it matters, we want to send the message to someone who knows how to handle it. That is often, but not always, the original sender. In AMQP we discussed this question extensively and the best design we came up with (which is not in the current protocol draft, which has taken a left turn to the Moon) was to route the message on a "return address".
Pushing the message forwards on an alternative route rather than returning it to the original source is as far as I can tell more robust and useful. It does not require us to track the message's routing path. It does not require that the source still exists. It lets us centralize exception handling for critical queues. And it lets us handle exceptions with the same mechanisms as deliveries.
When exactly is a message "unroutable"? It depends, but these are the cases I know of:
- When a message arrives in a queue that has no selectors. E.g. if I send a document to a print queue that has no printers attached to it.
- When a message arrives in a queue that has no matching selectors. E.g. if I send a document that requires color printing to a queue that has only black and white printers.
In the case when a queue has selectors but they are all busy (i.e. don't have the credit to accept a new message), we can't tell if the message would be routed or not and it does not make much sense to treat it as an exception, yet.
The very simplest way to handle exceptions is to move the affected messages to a 'dead-letter' queue, that is defined on a per-queue basis. On the dead-letter queue we can put selectors that route exceptions however we like. To make this work properly, we might want to mangle the message envelope so that the return address, if specified, is used as the address, and the original address is saved somewhere. This is a familiar pattern from the real world.
So here is a sketch for exception handling on queues:
- By default, if there are no (ready) selectors, hold the message as unprocessed.
- By default, if the unprocessed size exceeds a per-queue limit, discard new messages (and log some error).
- By default, if there are selectors but none match, discard the message.
- Add a BOUNCE operation that is like MOVE, but mangles the address and return addresses.
- Allow a "default selector" that is invoked if no other selectors match.
- Allow an "overflow selector" that is invoked if the unprocessed size hits a limit.
When applications try to publish messages to non-existent queues, we can bounce those messages to some per-engine dead-letter queue. This, when used with self-destructive queues, lets us implement a service lookup test.
Self-destructive queues
A pattern that has been very useful in AMQP and elsewhere is the "self-destructive queue". This is a queue that deletes itself when its last selector is deleted. Self-destructive queues are typically used for Wolfpack scenarios.
Rate this post:
