Saga Pattern: Distributed Transactions with Choreography and Orchestration
In microservices there is no such thing as BEGIN TRANSACTION ... COMMIT that crosses service boundaries.
When an operation involves the Order Service, the Inventory Service and the Payment Service,
How do we ensure that either all three are successful or none are effective?
The Saga Pattern is the answer: each service performs its own local transaction
and, in case of failure, the compensating transactions to undo what has already been done.
Why 2PC Doesn't Work in Microservices
Il Two-Phase Commit (2PC) is the classic protocol for distributed transactions: a Coordinator asks all participants to prepare (prepare phase), then order the commit. This works well for homogeneous databases (e.g. two PostgreSQL instances), but is problematic in microservices:
- Coupling: All services must be available during the transaction
- Performance: resources remain locked until committed (distributed locks)
- Limited support: Many cloud services, REST APIs, and NoSQL databases do not support 2PC
- Scalability: the coordinator becomes a single point of failure
The Saga Pattern replaces one large distributed transaction with one sequence of local transactions, each completed independently, coordinated through events or an orchestrator.
Example: E-commerce Purchasing Saga
Saga steps for a purchase:
- Create Order (Orders Service) → OrdineCreato
- Reserve Inventory (Inventory Service) → Reserved Inventory
- Process Payment (Payment Service) → PaymentConfirmed
- Confirm Order (Order Service) → OrderConfirmed
If the payment fails, the compensating transactions are:
- Release Inventory (Inventory Service)
- Cancel Order (Order Service)
Choreography Saga: Service Autonomy
In the Choreography Saga, there is no central orchestrator. Each service listens to the events produced by the previous service and reacts accordingly, in turn producing new events. The coordination logic is distributed between services: each knows only its own step and the events to which it reacts.
Pro: maximum decoupling, no single point of failure, simple to implement. Against: hard to debug (saga logic is distributed), no visibility central to the state of the saga, risk of cycles of poorly managed events.
// CHOREOGRAPHY SAGA con Spring Kafka
// ===== STEP 1: Servizio Ordini - crea l'ordine =====
@Service
public class OrdiniService {
private final KafkaTemplate<String, Object> kafkaTemplate;
public String creaOrdine(CreaOrdineCommand cmd) {
String ordineId = UUID.randomUUID().toString();
// Salva l'ordine in stato PENDING
Ordine ordine = Ordine.builder()
.id(ordineId)
.clienteId(cmd.getClienteId())
.prodotti(cmd.getProdotti())
.stato(StatoOrdine.PENDING)
.build();
ordineRepository.save(ordine);
// Pubblica evento: il Servizio Inventario lo ascolterà
OrdineCreato event = new OrdineCreato(ordineId, cmd.getProdotti());
kafkaTemplate.send("ordini-events", ordineId, event);
return ordineId;
}
// Ascolta la conferma del pagamento per finalizzare l'ordine
@KafkaListener(topics = "pagamenti-events", groupId = "ordini-service")
public void onPagamento(ConsumerRecord<String, Object> record) {
if (record.value() instanceof PagamentoConfermato event) {
ordineRepository.updateStato(event.getOrdineId(), StatoOrdine.CONFERMATO);
kafkaTemplate.send("ordini-events", event.getOrdineId(),
new OrdineConfermato(event.getOrdineId()));
}
if (record.value() instanceof PagamentoFallito event) {
// Compensating: annulla l'ordine
ordineRepository.updateStato(event.getOrdineId(), StatoOrdine.ANNULLATO);
}
}
}
// ===== STEP 2: Servizio Inventario - riserva i prodotti =====
@Service
public class InventarioService {
@KafkaListener(topics = "ordini-events", groupId = "inventario-service")
public void onOrdinCreato(ConsumerRecord<String, Object> record) {
if (record.value() instanceof OrdineCreato event) {
try {
// Tenta la prenotazione dell'inventario
prenotaInventario(event.getOrdineId(), event.getProdotti());
// Successo: pubblica evento per il Servizio Pagamenti
kafkaTemplate.send("inventario-events", event.getOrdineId(),
new InventarioRiservato(event.getOrdineId(), event.getTotale()));
} catch (InventarioInsufficienterException e) {
// Fallimento: pubblica evento di fallimento
// Il Servizio Ordini annullerà l'ordine
kafkaTemplate.send("inventario-events", event.getOrdineId(),
new InventarioNonDisponibile(event.getOrdineId(), e.getMessage()));
}
}
// Compensating: rilascia l'inventario se il pagamento fallisce
if (record.value() instanceof PagamentoFallito event) {
rilasciaInventario(event.getOrdineId());
}
}
}
// ===== STEP 3: Servizio Pagamenti - processa il pagamento =====
@Service
public class PagamentiService {
@KafkaListener(topics = "inventario-events", groupId = "pagamenti-service")
public void onInventarioRiservato(ConsumerRecord<String, Object> record) {
if (record.value() instanceof InventarioRiservato event) {
try {
processaPagamento(event.getOrdineId(), event.getTotale());
kafkaTemplate.send("pagamenti-events", event.getOrdineId(),
new PagamentoConfermato(event.getOrdineId()));
} catch (PagamentoRifiutatoException e) {
kafkaTemplate.send("pagamenti-events", event.getOrdineId(),
new PagamentoFallito(event.getOrdineId(), e.getMessage()));
}
}
}
}
Orchestration Saga: Centralized Control
In the'Orchestration Saga, a central component called Saga Orchestrator (or Process Manager) coordinates the entire sequence. The orchestrator knows all the steps, send Command to services and wait for their response events. In case of failure, the orchestrator explicitly sends compensation commands.
Pro: centralized and visible logic, easy to debug, explicit saga state, more precise error handling. Against: major coupling, the orchestrator knows all the services involved.
// ORCHESTRATION SAGA con Axon Framework (Java)
// Axon gestisce automaticamente la persistenza dello stato della saga
import org.axonframework.saga.*;
import org.axonframework.eventhandling.*;
@Saga
public class AcquistoSaga {
@Autowired
private transient CommandGateway commandGateway;
// Lo stato della saga persiste automaticamente tra gli step
private String ordineId;
private List<ProdottoOrdinato> prodotti;
private BigDecimal totale;
// STEP 1: Avvio della saga quando viene creato un ordine
@StartSaga
@SagaEventHandler(associationProperty = "ordineId")
public void on(OrdineCreato event) {
this.ordineId = event.getOrdineId();
this.prodotti = event.getProdotti();
this.totale = event.getTotale();
// Invia comando al servizio inventario
commandGateway.send(new RiservaInventarioCommand(
ordineId,
prodotti
));
}
// STEP 2: Inventario riservato con successo - procedi con il pagamento
@SagaEventHandler(associationProperty = "ordineId")
public void on(InventarioRiservato event) {
commandGateway.send(new ProcessaPagamentoCommand(
ordineId,
totale,
"CARTA_CREDITO"
));
}
// STEP 3: Pagamento confermato - finalizza l'ordine
@SagaEventHandler(associationProperty = "ordineId")
public void on(PagamentoConfermato event) {
commandGateway.send(new ConfermaOrdineCommand(ordineId));
}
// Fine saga (successo)
@EndSaga
@SagaEventHandler(associationProperty = "ordineId")
public void on(OrdineConfermato event) {
System.out.println("Saga completata con successo per ordine: " + ordineId);
}
// ===== COMPENSATING TRANSACTIONS =====
// Pagamento fallito: rilascia inventario, poi annulla ordine
@SagaEventHandler(associationProperty = "ordineId")
public void on(PagamentoFallito event) {
// Compensating step 1: rilascia inventario
commandGateway.send(new RilasciaInventarioCommand(ordineId, prodotti));
}
// Inventario rilasciato - annulla l'ordine
@SagaEventHandler(associationProperty = "ordineId")
public void on(InventarioRilasciato event) {
// Compensating step 2: annulla l'ordine
commandGateway.send(new AnnullaOrdineCommand(ordineId, "Pagamento fallito"));
}
// Inventario non disponibile - annulla direttamente
@SagaEventHandler(associationProperty = "ordineId")
public void on(InventarioNonDisponibile event) {
commandGateway.send(new AnnullaOrdineCommand(ordineId, "Inventario insufficiente"));
}
// Fine saga per percorso di fallimento
@EndSaga
@SagaEventHandler(associationProperty = "ordineId")
public void on(OrdineAnnullato event) {
System.out.println("Saga fallita/annullata per ordine: " + ordineId + ". Motivo: " + event.getMotivo());
}
}
Saga with AWS Step Functions (Serverless)
For serverless architectures on AWS, Step Functions it is the managed service to implement Orchestration Saga. Each step is a Lambda function or an AWS action, and Step Functions automatically manages state, retry and compensations by definition in Amazon States Language (ASL).
{
"Comment": "Saga acquisto e-commerce con compensating transactions",
"StartAt": "CreaOrdine",
"States": {
"CreaOrdine": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-west-1:123456:function:crea-ordine",
"Next": "RiservaInventario",
"Catch": [
{
"ErrorEquals": ["States.ALL"],
"Next": "FallimentoSaga"
}
]
},
"RiservaInventario": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-west-1:123456:function:riserva-inventario",
"Next": "ProcessaPagamento",
"Catch": [
{
"ErrorEquals": ["InventarioInsufficienterException"],
"Next": "CompensaAnnullaOrdine"
}
]
},
"ProcessaPagamento": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-west-1:123456:function:processa-pagamento",
"Retry": [
{
"ErrorEquals": ["TransientPaymentError"],
"IntervalSeconds": 2,
"MaxAttempts": 3,
"BackoffRate": 2
}
],
"Next": "ConfermaOrdine",
"Catch": [
{
"ErrorEquals": ["PagamentoRifiutatoException"],
"Next": "CompensaRilasciaInventario"
}
]
},
"ConfermaOrdine": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-west-1:123456:function:conferma-ordine",
"End": true
},
"CompensaRilasciaInventario": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-west-1:123456:function:rilascia-inventario",
"Next": "CompensaAnnullaOrdine"
},
"CompensaAnnullaOrdine": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-west-1:123456:function:annulla-ordine",
"Next": "FallimentoSaga"
},
"FallimentoSaga": {
"Type": "Fail",
"Error": "SagaFailed",
"Cause": "La transazione distribuita e fallita, compensazioni eseguite"
}
}
}
Idempotence in Compensating Transactions
Compensating transactions must be idempotent: if invoked several times (due to network or orchestrator retry), the result must be the same. The most common technique is to use a idempotency key: the orderId + the operation type form a unique key that prevents duplicate operations.
// RilasciaInventarioHandler.java - Compensating transaction idempotente
@CommandHandler
public void handle(RilasciaInventarioCommand cmd) {
// Controllo idempotenza: questa compensazione e gia stata eseguita?
String idempotencyKey = cmd.getOrdineId() + ":RILASCIA_INVENTARIO";
if (compensazioniEseguite.contains(idempotencyKey)) {
System.out.println("Compensazione gia eseguita per " + idempotencyKey + ", skip");
return;
}
// Esegui la compensazione
for (ProdottoOrdinato prodotto : cmd.getProdotti()) {
inventarioRepository.incrementaDisponibilita(
prodotto.getProductId(),
prodotto.getQuantita()
);
}
// Segna la compensazione come eseguita
compensazioniEseguite.add(idempotencyKey);
// Pubblica evento per continuare la catena di compensazione
eventBus.publish(new InventarioRilasciato(cmd.getOrdineId()));
}
Saga State Tracking: Visibility into the State
One of the criticisms of Orchestration Saga is the difficulty of tracking the state in high volume systems. An explicit saga tracking table solves this problem:
-- Tabella di tracking per le saga in corso
CREATE TABLE saga_state (
saga_id VARCHAR(36) PRIMARY KEY,
saga_type VARCHAR(100) NOT NULL,
ordine_id VARCHAR(36) NOT NULL,
current_step VARCHAR(100) NOT NULL,
stato VARCHAR(50) NOT NULL, -- IN_CORSO, COMPLETATA, FALLITA, COMPENSATA
started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
completed_at TIMESTAMPTZ,
error_message TEXT
);
-- Query utili per monitoring
-- Saga in corso da piu di 5 minuti (potenzialmente bloccate)
SELECT * FROM saga_state
WHERE stato = 'IN_CORSO'
AND started_at < NOW() - INTERVAL '5 minutes';
-- Distribuzione per step corrente
SELECT current_step, COUNT(*) as count
FROM saga_state WHERE stato = 'IN_CORSO'
GROUP BY current_step;
Comparison: Choreography vs Orchestration
| I wait | Choreography | Orchestration |
|---|---|---|
| Coupling | Low (events) | Medium (orchestrator knows services) |
| Status visibility | Hard (distributed) | High (centralized) |
| Debugging | Hard (distributed logic) | Easy (explicit state) |
| Complexity of services | High (each service knows the compensations) | Low (simple services, logic in orchestrator) |
| Scalability | High (no single points) | Medium (Orchestrator to scale) |
| Ideal use case | Few steps, autonomous services, independent teams | Many steps, complex logic, necessary visibility |
Next Steps in the Series
- Article 6 – AWS EventBridge: Choreography Saga uses a message bus to exchange events. EventBridge is the ideal AWS managed bus for intelligent routing of events between Lambda and SQS.
- Article 10 – Outbox Pattern: Ensure atomic dispatch of Saga events together with the local database update.
Link with Other Series
- Event Sourcing + CQRS (Article 4): each step of the Saga produces events which can be used as a source of truth to reconstruct the aggregate state. Axon Framework natively supports the Saga + Event Sourcing combination.
- Apache Kafka (Series 38): Kafka is the ideal message bus for Choreography Saga in production, with guarantees of durability and ordering per key partition.







