Sagas

Timeouts

If a process involves a request/response exchange with an external system (say, credit card authorization) over one-way messaging, you don’t know when, or even if, you’ll get the response. You still have SLA requirements around your process, so you need timeouts. The problem with in-process threading timers is that if the server process dies, you lose track of time, possibly causing your process to behave incorrectly even if you fail over to a different server.

Therefore, we make use of messaging: we send a TimeoutMessage to another process, the Timeout Manager. That process is responsible for its own availability and has its own SLA (usually per business domain). Those messages can be durable. When the Timeout Manager notices that time is up, it sends the message back to the original process (strictly speaking, to its return address). The TimeoutMessage carries a saga id, so when the original process receives it, it can look up the saga object and call its Timeout method.

The saga, which at that point has all of its state, can check whether it has already received a response when it is notified about the timeout. Timeouts are therefore really more of a “wakeup call”, and the saga figures out what it should be doing. The saga maintains the state “I’ve sent a request, have I received a response yet?” and relies on an external process to maintain the state of “how much time has gone by”. As messages come in, the saga makes local decisions based on its current state and can send out other messages. For instance, in the credit card authorization scenario, if we haven’t gotten an answer “in time” (a 5-second SLA), we can send back a response saying “we’ll get back to you via email”, possibly containing a link to a page that will show the results. The saga can thus continue working even after a timeout.
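The “wakeup call” idea can be sketched in a few lines. This is a minimal, self-contained model rather than NServiceBus code: AuthorizationSaga, AuthorizationSagaData, the Sent list (a stand-in for messages sent through the bus) and the message names are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the saga's persisted state (not an NServiceBus type).
public class AuthorizationSagaData
{
    public bool ResponseReceived { get; set; }
}

// Hypothetical saga model; Sent stands in for messages sent through the bus.
public class AuthorizationSaga
{
    public readonly AuthorizationSagaData Data = new AuthorizationSagaData();
    public readonly List<string> Sent = new List<string>();

    // Called when the authorization response arrives, before or after the timeout.
    public void HandleResponse(bool approved)
    {
        Data.ResponseReceived = true;
        Sent.Add(approved ? "OrderAuthorized" : "OrderRejected");
    }

    // Called when the Timeout Manager returns the timeout message.
    public void Timeout()
    {
        if (Data.ResponseReceived)
            return; // the response beat the timeout; nothing to do

        // The SLA expired without an answer: tell the caller we will follow up.
        Sent.Add("WillNotifyByEmail");
    }
}

public static class TimeoutDemo
{
    public static void Main()
    {
        var saga = new AuthorizationSaga();
        saga.Timeout();            // the timeout arrives first
        saga.HandleResponse(true); // the late response is still handled
        Console.WriteLine(string.Join(", ", saga.Sent));
        // prints: WillNotifyByEmail, OrderAuthorized
    }
}
```

The saga never measures time itself; it only records whether the response has arrived and reacts to whichever message comes in next.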

That other process, the Timeout Manager, can be found under /build/timeout. It is not started on demand; rather, it is part of the system deployment and listens for messages on a queue just like any other nServiceBus endpoint. When a saga sends a timeout message, it can include some extra state that it will get back in the Timeout(object state) method. This is useful when you send out messages in parallel to multiple parties and would like to manage a timeout for each one. You might use the id of each party as the state object, or possibly the message you sent them.
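As a rough illustration of per-party timeouts, the sketch below passes each party’s id as the timeout state. It is a self-contained model under stated assumptions: MultiPartySaga and the requestTimeout delegate (a stand-in for sending a timeout message to the Timeout Manager) are hypothetical, and only the Timeout(object state) shape comes from the text above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical saga model that tracks one outstanding request per party.
public class MultiPartySaga
{
    private readonly HashSet<string> awaiting = new HashSet<string>();
    public readonly List<string> TimedOutParties = new List<string>();

    // requestTimeout stands in for asking the Timeout Manager for a wakeup call;
    // the party id travels along as the state object.
    public void SendRequests(IEnumerable<string> partyIds, Action<object> requestTimeout)
    {
        foreach (var partyId in partyIds)
        {
            awaiting.Add(partyId);
            requestTimeout(partyId);
        }
    }

    public void HandleResponse(string partyId)
    {
        awaiting.Remove(partyId);
    }

    // The state tells the saga which party this particular timeout is about.
    public void Timeout(object state)
    {
        var partyId = (string)state;
        if (awaiting.Remove(partyId))
            TimedOutParties.Add(partyId); // still no answer from this party
    }
}

public static class MultiPartyDemo
{
    public static void Main()
    {
        var requestedTimeouts = new List<object>();
        var saga = new MultiPartySaga();
        saga.SendRequests(new[] { "bank-a", "bank-b" }, requestedTimeouts.Add);

        saga.HandleResponse("bank-a"); // only bank-a answers in time
        foreach (var state in requestedTimeouts)
            saga.Timeout(state);       // both timeouts fire

        Console.WriteLine(string.Join(", ", saga.TimedOutParties));
        // prints: bank-b
    }
}
```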

Model behind sagas

As part of my efforts to clarify nServiceBus’s place in the Microsoft .NET ecosystem, I’ve decided to retire the term “workflow”. In almost every conversation about nServiceBus where the term “workflow” came up, the reaction was nearly identical: “What’s wrong with Workflow Foundation?”

There’s nothing wrong with Workflow Foundation. The thing is that nServiceBus doesn’t really need workflow in the general sense of the term. An older term from the DBMS community might make more sense: “long-lived transactions”. You see, nServiceBus requires state management over many messaging interactions, and thus some kind of long-lived transaction to maintain consistency.

In 1987, a different model was introduced for handling these scenarios, the saga [GARCIA-MOLINA 87], and it was further expanded in 1992 [CHRYSANTHIS 92] to allow the saga to commit even if a non-vital subset of its sub-transactions aborts. This is the model used in nServiceBus.

When used in a distributed manner on top of one-way messaging, this results in a solution where each service runs its own “mini-workflow” and coordinates its actions with other services via messages. This integration style differs from the traditional broker, man-in-the-middle approach found in products like BizTalk. It is known as “choreography” and is in the process of standardization in the WS-* specs as WS-Choreography.

So, the bottom line is that the source, examples, and documentation of nServiceBus are being moved in this direction. The source and examples on the SourceForge site have already been brought forward. This is a breaking change.

I hope that this will make it clearer that nServiceBus is a higher-level set of abstractions than WCF/WF: it limits the generality found in those frameworks to enable one cohesive way of working that results in a solution that is both scalable and robust. On the other hand, nServiceBus is not a fully integrated product like BizTalk and is intended to tackle different kinds of problems.

Bigger than WCF/WF, smaller than a breadbox (or rather, BizTalk).

No orchestration

"I also have a question for you. I understand that nServiceBus implements workflow using sagas. When thinking of workflow I imagine something sitting on top of an "application" orchestrating a number of service calls. So far this makes sense to me. What puzzles me is that a single workflow will undoubtedly cross multiple database transaction boundaries. This raises the question of compensation: what should happen if the workflow fails? The book on MS WF is rather vague on this; it merely says that you will have to perform some compensation activity. This raises two obvious questions for me: 1) Surely data that an earlier step in the workflow has committed to the database is now visible to other users, which seems to compromise the isolation level of the workflow as a whole? 2) How can the bus, in its role as manager of the workflow, perform compensation actions? Does this not imply that it needs to hold state on what it has successfully done and how to undo it? Where does this state go? Who does it "belong" to?"

NServiceBus doesn’t actually deal with “workflow as something sitting on top of an application orchestrating a number of service calls”. Sagas are about managing the state of multi-message exchanges from the perspective of one party in a robust way.

Sagas don’t inherently fail. The handling of a single message by a saga may be unsuccessful – but just like regular message handlers, exceptions result in retrying that message. So “unsuccessful” could mean that the saga hasn’t received a response in time – but this is a business logic interpretation of regular data.

If you’re worried about state “bleeding” out and being seen by other activities, in other words, you need transactional isolation – then there’s a good chance you haven’t got your boundaries right – things are separate that should be together.

BTW – the bus is not “the manager of the workflow”, it just dispatches messages to it. The bus doesn’t perform any compensating actions – the saga itself has to do that based on its own logic.

I don’t know if that helps much – but most problems are usually solved at an architectural level. This is what keeps nServiceBus so simple. If there’s one rule I could give it would be this – if nServiceBus makes it hard, revisit your architecture; you probably can avoid those issues entirely.

Transactions

Let us take OrderSaga from the samples and the following scenario:

1. The Create Order message arrives.
2. The saga sends authorization requests.
3. Authorization1 arrives.
4.1. The Timeout message arrives.
4.2. The Authorization2 message arrives.

Race condition

In this scenario, there is a possible race condition between 4.1 and 4.2. The endpoint has received two messages: one is the Timeout, the other is the Authorization2 response (a rejection, say). The endpoint is not single-threaded, so it will dispatch them both at the same time.

Right. Both threads will try to get the saga within the transactional context of the endpoint. One thread will start doing its thing; the other will block until the first one's transaction completes.

How? The endpoint itself would be marked transactional:

Bus.IsTransactional = true;

or its configuration equivalent. This is part of the context in which saga persisters operate: they need to be transaction-aware, that is, able to auto-enlist in the ambient transaction.

This explains the usage of System.Transactions.
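The effect of that serialization can be sketched without a database. In the self-contained model below, a plain lock stands in for the lock that the saga persister’s transaction would hold on the saga’s record; that substitution, along with OrderSagaState, RaceDemo and the message names, is an assumption made for illustration and is not how NServiceBus itself is implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical saga state; Gate stands in for the record lock taken by the
// transaction in which the saga is loaded and saved.
public class OrderSagaState
{
    public readonly object Gate = new object();
    public bool Completed;
    public readonly List<string> Sent = new List<string>();
}

public static class RaceDemo
{
    // Handles either the Timeout or the Authorization2 message.
    public static void Handle(OrderSagaState saga, string messageName)
    {
        lock (saga.Gate) // the second thread blocks here until the first is done
        {
            if (saga.Completed)
                return; // the other message got there first and decided already

            saga.Completed = true;
            saga.Sent.Add("Decision based on " + messageName);
        }
    }

    public static void Main()
    {
        var saga = new OrderSagaState();

        // Both messages are dispatched at the same time on different threads.
        Task.WaitAll(
            Task.Run(() => Handle(saga, "Timeout")),
            Task.Run(() => Handle(saga, "Authorization2")));

        Console.WriteLine(saga.Sent.Count);
        // prints: 1
    }
}
```

Whichever message wins, the other one sees the saga's updated state and exactly one decision is sent.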

Sequenced Message Handling

"I need to call the saga's Handle method synchronously for a particular saga. How do I do it? For a particular saga, I receive multiple messages of the same type, and I need to handle them synchronously. I mean the saga itself shouldn't be created again until I have processed the previous one. Hope I make sense. Please advise."

If you aren’t using the distributor and aren’t running multiple threads on the endpoint that hosts the saga, and you can get the messages into the queue in order, you’re done. The messages will be handled sequentially, in order.

If the throughput gains of multi-threading / distributor are required, here’s what to do.

If there is a way for you to write business logic in your saga that detects whether the current message being processed has possibly jumped ahead of another message, write that logic in your saga method as follows. If your messages have some sequence number in them, you can store the last sequence number in your saga data and check against that:

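using System;
using NServiceBus;
using NServiceBus.Saga;

public class MyMessage : IMessage
{
    // Position of this message in the sender's sequence (1, 2, 3, ...).
    public int SequenceNumber { get; set; }
    // all other data
}

public class MySagaData : ISagaEntity
{
    public virtual Guid Id { get; set; }
    public virtual string Originator { get; set; }
    // Sequence number of the last message handled; 0 for a new saga.
    public virtual int LastSequenceNumber { get; set; }
    // all other data
}

public class MySaga : Saga<MySagaData>, ISagaStartedBy<MyMessage>
{
    public void Handle(MyMessage message)
    {
        if (IsOutOfOrder(message))
        {
            // Put the message back on the queue and try again later.
            this.Bus.HandleCurrentMessageLater();
            return;
        }

        this.Data.LastSequenceNumber = message.SequenceNumber;

        // rest of your handling logic
    }

    // True unless the message is exactly the next one expected. Note that a
    // duplicate of an already handled message also counts as out of order.
    private bool IsOutOfOrder(MyMessage message)
    {
        return (this.Data.LastSequenceNumber + 1 != message.SequenceNumber);
    }
}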
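To see why deferring out-of-order messages restores the sequence, here is a small self-contained simulation of the check above. It is only a sketch: an in-memory Queue stands in for the endpoint's queue, re-enqueueing stands in for HandleCurrentMessageLater, and SequencingDemo is a hypothetical name. It assumes every sequence number from 1 upward arrives exactly once; with a gap or a duplicate, the loop (like the real message) would cycle forever.

```csharp
using System;
using System.Collections.Generic;

public static class SequencingDemo
{
    // Returns the order in which sequence numbers end up being handled.
    public static List<int> Process(IEnumerable<int> arrivals)
    {
        var queue = new Queue<int>(arrivals);
        var handled = new List<int>();
        var lastSequenceNumber = 0; // a new saga has handled nothing yet

        while (queue.Count > 0)
        {
            var sequenceNumber = queue.Dequeue();

            if (lastSequenceNumber + 1 != sequenceNumber)
            {
                // Out of order: send it to the back of the queue for later.
                queue.Enqueue(sequenceNumber);
                continue;
            }

            lastSequenceNumber = sequenceNumber;
            handled.Add(sequenceNumber);
        }

        return handled;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Process(new[] { 2, 3, 1 })));
        // prints: 1, 2, 3
    }
}
```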


Since the same message can start the saga and can continue a pre-existing saga, you should implement a SagaFinder like so:

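public class MySagaFinder : IFindSagas<MySagaData>.Using<MyMessage>
{
    public MySagaData FindBy(MyMessage message)
    {
        // Go to the DB (or wherever these sagas are stored) and look up the
        // saga instance that this message belongs to.
        throw new NotImplementedException("Look up the saga for this message here.");
    }
}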
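What the lookup inside a finder might look like can be sketched with an in-memory store. Everything here is hypothetical and independent of NServiceBus: OrderMessage, OrderSagaData, the OrderId correlation property and InMemoryOrderSagaFinder are illustrative names, and returning null to mean "no saga exists yet for this message" is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message and saga data that share a correlation id.
public class OrderMessage
{
    public string OrderId { get; set; }
}

public class OrderSagaData
{
    public Guid Id { get; set; }
    public string OrderId { get; set; }
}

// Hypothetical finder backed by a dictionary instead of a database.
public class InMemoryOrderSagaFinder
{
    private readonly Dictionary<string, OrderSagaData> sagasByOrderId =
        new Dictionary<string, OrderSagaData>();

    public void Save(OrderSagaData saga)
    {
        sagasByOrderId[saga.OrderId] = saga;
    }

    // Returns the saga correlated with the message, or null if none is stored.
    public OrderSagaData FindBy(OrderMessage message)
    {
        OrderSagaData saga;
        return sagasByOrderId.TryGetValue(message.OrderId, out saga) ? saga : null;
    }
}

public static class FinderDemo
{
    public static void Main()
    {
        var finder = new InMemoryOrderSagaFinder();
        finder.Save(new OrderSagaData { Id = Guid.NewGuid(), OrderId = "order-1" });

        var existing = finder.FindBy(new OrderMessage { OrderId = "order-1" });
        var missing = finder.FindBy(new OrderMessage { OrderId = "order-2" });

        Console.WriteLine(existing != null); // prints: True
        Console.WriteLine(missing == null);  // prints: True
    }
}
```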

Links