Game Backend Observability: Latency, Tickrate and Player Experience
A game backend can be technically perfect on paper - distributed architecture, auto-scaling, multi-zone replication - and at the same time be a disaster for players. Latency spikes of 300ms lasting 2 seconds, tickrate dropping from 128 to 64 under peak load, a server zone unable to complete matches for 20 minutes: these problems exist, but without the right tools you will not see them until players flood you with negative tweets.
Observability in gaming is not simply applying Prometheus and Grafana to any server. It requires a deep understanding of domain-specific metrics: what a degraded tickrate means for gameplay experience, how p99 latency correlates with match abandonment rate, why the Player Experience Score (PES) is the most important metric of all.
In this article we build a complete observability system for game backends, from the technical stack (Prometheus, Grafana, OpenTelemetry, Loki) to gaming-specific metrics, all the way to SLOs that correlate technical performance with player experience.
What You Will Learn
- Gaming-specific metrics: tickrate, latency, packet loss, server utilization
- Observability stack: Prometheus, Grafana, OpenTelemetry, Loki, Jaeger
- Instrumenting a Go game server with custom metrics
- Grafana dashboard for game backend: latency heatmap, tickrate, active matches
- Smart alerting: SLO-based vs threshold-based
- Distributed tracing for debugging match lifecycle issues
- Player Experience Score (PES): composite metric for QoE
- Correlating technical performance with business metrics (retention, abandonment)
1. Gaming-Specific Metrics
Standard web backend metrics (HTTP latency, RPS throughput, error rate) are necessary but insufficient for a game backend. There are metrics that only make sense in a gaming context:
Game Backend Metrics Taxonomy
| Category | Metric | Unit | Target | Impact |
|---|---|---|---|---|
| Networking | Round-Trip Time (RTT) | ms | < 80ms | Gameplay responsiveness |
| Networking | Packet Loss Rate | % | < 0.1% | Teleportation, rubber-banding |
| Networking | Jitter | ms | < 20ms | Erratic interpolation |
| Game Loop | Server Tickrate | tick/s | Target +/-5% | Gameplay precision |
| Game Loop | Tick Processing Time | ms | < tick_period | If exceeded: gameplay hiccup |
| Match | Abandonment Rate | % | < 5% | User frustration |
| Match | Matchmaking Time | s | < 30s | Pre-match engagement |
2. Game Server Instrumentation in Go
The game server must expose Prometheus metrics on a dedicated HTTP endpoint. In Go, the
prometheus/client_golang library is the de facto standard. Here we implement critical
metrics: tickrate, per-player latency, and active match state.
// metrics/game_metrics.go - Prometheus metrics definitions
package metrics
import (
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promauto"
)
var (
ServerTickRate = promauto.NewGaugeVec(prometheus.GaugeOpts{
Namespace: "gameserver",
Subsystem: "loop",
Name: "tickrate_hz",
Help: "Actual server tickrate in Hz",
}, []string{"match_id", "server_id", "region"})
TickProcessingTime = promauto.NewHistogramVec(prometheus.HistogramOpts{
Namespace: "gameserver",
Subsystem: "loop",
Name: "tick_processing_seconds",
Help: "Time to process a single game tick",
// Granular buckets to detect hiccups
Buckets: []float64{0.001, 0.005, 0.010, 0.015, 0.020, 0.025, 0.050, 0.100},
}, []string{"match_id", "server_id"})
PlayerRTT = promauto.NewHistogramVec(prometheus.HistogramOpts{
Namespace: "gameserver",
Subsystem: "network",
Name: "player_rtt_milliseconds",
Help: "Per-player round-trip time in milliseconds",
Buckets: []float64{10, 20, 40, 60, 80, 100, 150, 200, 300, 500},
}, []string{"player_id", "region", "platform"})
ActiveMatches = promauto.NewGaugeVec(prometheus.GaugeOpts{
Namespace: "gameserver",
Subsystem: "match",
Name: "active_count",
Help: "Number of active game matches",
}, []string{"region", "mode"})
MatchAbandonment = promauto.NewCounterVec(prometheus.CounterOpts{
Namespace: "gameserver",
Subsystem: "match",
Name: "abandonment_total",
Help: "Total match abandonments",
}, []string{"region", "mode", "reason"})
MatchmakingWaitTime = promauto.NewHistogramVec(prometheus.HistogramOpts{
Namespace: "gameserver",
Subsystem: "matchmaking",
Name: "wait_seconds",
Help: "Time players wait in the matchmaking queue, in seconds",
Buckets: []float64{5, 10, 15, 20, 30, 45, 60, 120, 300},
}, []string{"region", "mode"})
)
// game_loop.go - Game loop with metrics instrumentation
func (g *GameLoop) Run(ctx context.Context) error {
ticker := time.NewTicker(g.tickPeriod)
defer ticker.Stop()
var tickCount int64
loopStart := time.Now()
for {
select {
case <-ctx.Done():
return nil
case tickTime := <-ticker.C:
tickStart := time.Now()
g.processTick(tickTime)
tickDuration := time.Since(tickStart)
metrics.TickProcessingTime.WithLabelValues(
g.matchID, g.serverID,
).Observe(tickDuration.Seconds())
tickCount++
if elapsed := time.Since(loopStart).Seconds(); elapsed >= 1.0 {
actualRate := float64(tickCount) / elapsed
metrics.ServerTickRate.WithLabelValues(
g.matchID, g.serverID, g.region,
).Set(actualRate)
if actualRate < float64(g.tickRate)*0.90 {
log.Warnf("Tickrate degraded: %.1f Hz (target %d)", actualRate, g.tickRate)
}
tickCount = 0
loopStart = time.Now()
}
}
}
}
3. SLO-Based Alerting: Beyond Fixed Thresholds
Alerts based on fixed thresholds produce too many false positives, or miss real incidents entirely. Game backends have strongly time-varying behavior: nighttime latency is far lower than at peak hours, so a threshold tuned for one regime fails in the other. SLO-based alerting instead measures the percentage of time the service meets its objectives and fires only when the error budget is close to exhaustion.
# Prometheus: SLO definitions and alerting rules
# File: prometheus/rules/game_slo.yaml
groups:
- name: game_backend_slos
rules:
# SLO 1: 99.5% of players must have RTT < 100ms
- record: job:gameserver_rtt_slo:ratio_rate5m
expr: |
sum(rate(gameserver_network_player_rtt_milliseconds_bucket{le="100"}[5m]))
/
sum(rate(gameserver_network_player_rtt_milliseconds_count[5m]))
- alert: GameRTTSLOBreach
expr: job:gameserver_rtt_slo:ratio_rate5m < 0.995
for: 2m
labels:
severity: warning
annotations:
summary: "RTT SLO breach: {{ $value | humanizePercentage }} compliance"
description: "Only {{ $value | humanizePercentage }} of players have RTT < 100ms."
# SLO 2: Tickrate must be >= 90% of target
- alert: GameTickRateDegraded
expr: |
(gameserver_loop_tickrate_hz / on(match_id) gameserver_loop_target_tickrate_hz)
< 0.90
for: 30s
labels:
severity: critical
annotations:
summary: "Tickrate degraded on match {{ $labels.match_id }}"
# SLO 3: Match abandonment rate < 5%
- alert: HighMatchAbandonmentRate
expr: |
rate(gameserver_match_abandonment_total[15m])
/
rate(gameserver_match_start_total[15m])
> 0.05
for: 5m
labels:
severity: warning
annotations:
summary: "High abandonment in region {{ $labels.region }}"
# Alert: Stagnant matchmaking queue (possible bug)
- alert: MatchmakingQueueStagnant
expr: |
gameserver_matchmaking_queue_depth > 50
AND
rate(gameserver_match_start_total[5m]) == 0
for: 2m
labels:
severity: critical
annotations:
summary: "Matchmaking stagnant: {{ $value }} players waiting, zero matches started"
4. Distributed Tracing with OpenTelemetry
Distributed tracing is essential for debugging complex issues in the match lifecycle: why a matchmaking request takes 8 seconds instead of 2, or which component introduces latency on the game loop's critical path. OpenTelemetry (OTEL) has become the open-source standard for tracing, with export to Jaeger or Grafana Tempo.
// matchmaker.go - Tracing the matchmaking flow
func (m *Matchmaker) FindMatch(ctx context.Context, ticket MatchTicket) (*Match, error) {
tracer := otel.Tracer("matchmaker")
ctx, span := tracer.Start(ctx, "matchmaker.FindMatch")
defer span.End()
span.SetAttributes(
attribute.String("ticket.id", ticket.ID),
attribute.String("ticket.mode", ticket.Mode),
attribute.Float64("ticket.mmr", ticket.MMR),
attribute.String("ticket.region", ticket.Region),
)
// Phase 1: Fetch compatible players from pool
// Start the child span from the parent ctx without overwriting ctx,
// so later spans attach to FindMatch rather than nesting under FetchPool.
poolCtx, poolSpan := tracer.Start(ctx, "matchmaker.FetchPool")
pool, err := m.fetchCompatiblePool(poolCtx, ticket)
poolSpan.SetAttributes(attribute.Int("pool.size", len(pool)))
poolSpan.End()
if err != nil {
span.RecordError(err)
return nil, err
}
// Phase 2: Run matching algorithm
algoCtx, algoSpan := tracer.Start(ctx, "matchmaker.RunAlgorithm")
match, err := m.runGlicko2Algorithm(algoCtx, ticket, pool)
algoSpan.SetAttributes(
attribute.Int("candidates.evaluated", len(pool)),
attribute.Bool("match.found", match != nil),
)
algoSpan.End()
if match != nil {
span.SetAttributes(attribute.String("match.id", match.ID))
}
return match, err
}
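The spans above still need to leave the process: typically the OTEL SDK exports OTLP to a Collector, which batches and forwards to the tracing backend. A minimal Collector config sketch; the `tempo:4317` endpoint and the single-pipeline layout are assumptions for this setup, swap in a Jaeger endpoint if that is your backend.

```yaml
# otel-collector.yaml - minimal sketch, endpoints are assumptions
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch: {}   # batch spans before export to reduce backend load

exporters:
  otlp/tempo:
    endpoint: tempo:4317   # assumed Grafana Tempo hostname
    tls:
      insecure: true       # fine inside a private cluster network

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/tempo]
```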
5. Player Experience Score (PES): The Metric That Matters
The Player Experience Score is a composite metric that aggregates multiple technical signals into a single value (0-100) representing the quality of experience from the player's perspective.
-- ClickHouse: Player Experience Score calculation per match
CREATE VIEW game_analytics.match_pes AS
SELECT
match_id, server_id, region,
toStartOfMinute(server_ts) AS minute,
round(
-- RTT Score (45% weight): most perceived by players
avg(multiIf(
toFloat64OrZero(payload['rtt_ms']) <= 40, 100,
toFloat64OrZero(payload['rtt_ms']) <= 80,
100 - (toFloat64OrZero(payload['rtt_ms']) - 40) * 1.5,
toFloat64OrZero(payload['rtt_ms']) <= 150,
40 - (toFloat64OrZero(payload['rtt_ms']) - 80) * 0.57,
0
)) * 0.45 +
-- Tickrate Score (35% weight)
avg(multiIf(
toFloat64OrZero(payload['actual_tickrate']) /
toFloat64OrZero(payload['target_tickrate']) >= 0.95, 100,
toFloat64OrZero(payload['actual_tickrate']) /
toFloat64OrZero(payload['target_tickrate']) >= 0.70,
(toFloat64OrZero(payload['actual_tickrate']) /
toFloat64OrZero(payload['target_tickrate']) - 0.70) * 400,
0
)) * 0.35 +
-- Packet Loss Score (20% weight)
avg(multiIf(
toFloat64OrZero(payload['packet_loss_pct']) <= 0, 100,
toFloat64OrZero(payload['packet_loss_pct']) <= 2,
100 - toFloat64OrZero(payload['packet_loss_pct']) * 50,
0
)) * 0.20,
1
) AS pes
FROM game_analytics.events_all
WHERE event_type = 'system.server_stats'
AND server_ts >= now() - INTERVAL 5 MINUTE
GROUP BY match_id, server_id, region, minute;
PES Interpretation
| PES Range | Classification | Expected Abandonment | Action |
|---|---|---|---|
| 90-100 | Excellent | < 2% | None |
| 75-89 | Good | 2-5% | Monitor |
| 60-74 | Acceptable | 5-10% | Investigate |
| 40-59 | Degraded | 10-20% | Alert + intervene |
| 0-39 | Critical | > 20% | Rollback or migrate |
6. Log Aggregation with Loki: Structured Logging
Game server logging must be structured (JSON) and correlated with metrics via
match_id, server_id, and trace_id. Loki allows searching logs
by label without indexing all content (unlike Elasticsearch), making it much cheaper at high volume.
// logger.go - Structured logging with zap + Loki labels
func NewMatchLogger(matchID, serverID, region string) *GameLogger {
logger, _ := zap.NewProduction()
return &GameLogger{
base: logger.With(
// Structured JSON fields on every log line; the shipping agent
// (e.g. Promtail) can promote them to Loki labels for filtering
zap.String("match_id", matchID),
zap.String("server_id", serverID),
zap.String("region", region),
zap.String("service", "game-server"),
),
}
}
// Loki queries for investigation:
// {match_id="match_789xyz"} |= "player.kill"
// {region="eu-west"} | json | rtt_ms > 150
// sum(rate({service="game-server"} | json | level="error" [5m])) > 10
// Correlate with traces via trace_id field:
// {service="game-server"} | json | trace_id="abc123..."
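How do `match_id` and `region` become queryable as labels? The zap fields are just JSON in the log line; the shipping agent must extract and promote them. A Promtail sketch under assumed paths and job names; note that promoting a high-cardinality field like `match_id` to a label is expensive in Loki, so consider filtering it with `| json | match_id="..."` instead if match volume is large:

```yaml
# promtail.yaml excerpt - paths and job names are assumptions
scrape_configs:
  - job_name: game-server
    static_configs:
      - targets: [localhost]
        labels:
          job: game-server
          __path__: /var/log/game-server/*.log   # assumed log path
    pipeline_stages:
      # Parse the zap JSON line and pull out fields of interest
      - json:
          expressions:
            service: service
            region: region
            match_id: match_id
      # Promote them to Loki labels (match_id: high cardinality, see note above)
      - labels:
          service:
          region:
          match_id:
```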
Conclusions
Game backend observability requires a domain-specific approach: applying standard web patterns is not enough. Gaming-specific metrics (tickrate, per-player RTT, packet loss, match abandonment) must be combined into composite metrics like the Player Experience Score that correlate technical performance with actual player behavior.
The Prometheus + Grafana + Loki + Jaeger/Tempo stack has become the open-source standard for this need. The key is deep instrumentation of the game server from the beginning, not as an afterthought: an uninstrumented game server is like an airplane without flight instruments.
Next Steps in the Game Backend Series
- Previous: Cloud Gaming: Streaming with WebRTC and Edge Nodes
- This is the final article in the Game Backend series
- Related series: MLOps for Business - AI Models in Production
- Related series: DevOps Frontend - CI/CD and Monitoring