Hi! I'm

Federico Calò

Software Developer | Technical Writer

I create modern web applications and custom digital tools to help businesses grow through technological innovation. My passion is combining computer science and economics to generate real value.

Contact Me

About Me

My passion for computer science was born at the Technical Commercial Institute of Maglie, where I discovered the power of programming and the fascination of creating digital solutions. From the start, I understood that computer science was not just code, but an extraordinary tool for turning ideas into reality.

During my studies in Business Information Systems, I began to interweave computer science and economics, understanding how technology can be the engine of growth for any business. This vision accompanied me to the University of Bari, where I obtained my degree in Computer Science, deepening my technical skills and passion for software development.

Today I put this experience at the service of businesses, professionals and startups, creating tailor-made digital solutions that automate processes, optimize resources and open new business opportunities. Because true innovation begins when technology meets the real needs of people.

My Skills

Data Analysis & Predictive Models

I transform data into strategic insights with in-depth analysis and predictive models for informed decisions

Process Automation

I create custom tools that automate repetitive operations and free up time for value-added activities

Custom Systems

I develop tailor-made software systems, from platform integrations to customized dashboards

const federico = {
  nome: "Federico Calò",
  ruolo: "Sviluppatore Software",
  città: "Bari, Italia",
  missione: "Aiutare attraverso l'informatica",
  passioni: [
    "Codice Pulito",
    "Innovazione",
    "Crescita Continua"
  ]
};

La Mia Missione

Credo fermamente che l'informatica sia lo strumento più potente per trasformare le idee in realtà e migliorare la vita delle persone.

🚀

Democratizzare la Tecnologia

La mia missione è rendere l'informatica accessibile a tutti: dalle piccole imprese locali alle startup innovative, fino ai professionisti che vogliono digitalizzare la propria attività. Ogni realtà merita di sfruttare le potenzialità del digitale.

💡

Unire Informatica ed Economia

Non è solo questione di scrivere codice: è capire come la tecnologia possa generare valore reale. Intrecciando competenze informatiche e visione economica, aiuto le attività a crescere, ottimizzare processi e raggiungere nuovi traguardi di efficienza e redditività.

🎯

Creare Soluzioni su Misura

Ogni attività è unica, e così devono esserlo le soluzioni. Sviluppo strumenti personalizzati che rispondono alle esigenze specifiche di ciascun cliente, automatizzando processi ripetitivi e liberando tempo per ciò che conta davvero: far crescere il business.

Trasforma la Tua Attività con la Tecnologia

December 2024

View

Master SQL

RoadMap.sh

Novembre 2024

View

Oracle Certified Foundations Associate

Oracle

October 2024

View

People Leadership Credential

Connect

Settembre 2024

💻 Languages & Technologies

☕Java

🐍Python

📜JavaScript

🅰️Angular

⚛️React

🔷TypeScript

🗄️SQL

🐘PHP

🎨CSS/SCSS

🔧Node.js

🐳Docker

🌿Git

💼

12/2024 - Presente

Custom Software Engineering Analyst

Accenture

Bari, Puglia, Italia · Ibrida Analisi e sviluppo di sistemi informatici attraverso l'utilizzo di Java e Quarkus in Health and Public Sector. Formazione continua su tecnologie moderne per la creazione di soluzioni software personalizzate ed efficienti e sugli agenti.

💼

06/2022 - 12/2024

Analista software e Back End Developer Associate Consultant

Links Management and Technology SpA

Esperienza nell'analisi di sistemi software as-is e flussi ETL utilizzando PowerCenter. Formazione completata su Spring Boot per lo sviluppo di applicazioni backend moderne e scalabili. Sviluppatore Backend specializzato in Spring Boot, con esperienza in progettazione di database, analisi, sviluppo e testing dei task assegnati.

💼

02/2021 - 10/2021

Programmatore software

Adesso.it (prima era WebScience srl)

Esperienza nell'analisi AS-IS e TO-BE, evoluzioni SEO ed evoluzioni website per migliorare le performance e l'engagement degli utenti.

🎓

2018 - 2025

Laurea in Informatica

Università degli Studi di Bari Aldo Moro

Bachelor's degree in Computer Science, focusing on software engineering, algorithms, and modern development practices.

📚

2013 - 2018

Diploma - Sistemi Informativi Aziendali

Istituto Tecnico Commerciale di Maglie

Technical diploma specializing in Business Information Systems, combining IT knowledge with business management.

Contattami

Hai un progetto in mente? Parliamone! Compila il form qui sotto e ti risponderò al più presto.

* Campi obbligatori. I tuoi dati saranno utilizzati solo per rispondere alla tua richiesta.

EXPLAIN ANALYZE: Read and Optimize PostgreSQL Query Plans

A query that takes 8 seconds in production and can often be optimized to 50ms with the right index or a little rewriting. EXPLAIN ANALYZE and the tool that reveals exactly why PostgreSQL runs a query in a certain way: how many rows it scans, what indexes it uses, where it spends 90% of the time. Learning to read query plans is the most impactful skill for a PostgreSQL developer.

What You Will Learn

Read the output of EXPLAIN ANALYZE and identify the expensive nodes
Difference between Sequential Scan, Index Scan and Index-Only Scan
When PostgreSQL chooses Nested Loop, Hash Join or Merge Join
How to fix incorrect planner estimates with updated statistics
Use auto_explain to automatically catch slow queries

The Structure of a Query Plan

A query plan and a tree of operations: each node represents an operation (scan, join, sort) and child nodes provide data to the parent node. PostgreSQL runs the tree from bottom to top.

-- EXPLAIN ANALYZE: aggiunge il tempo reale e le righe effettive
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT
  o.id AS ordine_id,
  u.email,
  SUM(ol.quantita * ol.prezzo_unitario) AS totale
FROM ordini o
JOIN utenti u ON u.id = o.utente_id
JOIN ordini_linee ol ON ol.ordine_id = o.id
WHERE o.stato = 'completato'
  AND o.creato_il >= '2026-01-01'
GROUP BY o.id, u.email
ORDER BY totale DESC
LIMIT 100;

Typical output (annotated):

Limit  (cost=45823.12..45823.37 rows=100 width=48)
       (actual time=892.341..892.363 rows=100 loops=1)
  ->  Sort  (cost=45823.12..46073.12 rows=100000 width=48)
            (actual time=892.340..892.350 rows=100 loops=1)
        Sort Key: (sum((ol.quantita * ol.prezzo_unitario))) DESC
        Sort Method: top-N heapsort  Memory: 32kB
        ->  HashAggregate  (cost=40000.00..42500.00 rows=100000 ...)
                           (actual time=820.123..880.234 rows=98432 loops=1)
              Group Key: o.id, u.email
              Batches: 5  Memory Usage: 4096kB  Disk Usage: 1024kB  <- PROBLEMA!
              ->  Hash Join  (cost=1200.00..35000.00 rows=500000 ...)
                             (actual time=45.123..650.234 rows=512000 loops=1)
                    Hash Cond: (ol.ordine_id = o.id)
                    ->  Seq Scan on ordini_linee ol           <- Sequential Scan!
                             (actual time=0.012..280.123 rows=2100000 loops=1)
                    ->  Hash  (cost=800.00..800.00 rows=32000 ...)
                              (actual time=44.123..44.123 rows=31845 loops=1)
                          Buckets: 32768  Batches: 1  Memory: 2048kB
                          ->  Hash Join  (cost=100.00..800.00 ...)
                                         (actual time=2.123..38.123 ...)
                                Hash Cond: (o.utente_id = u.id)
                                ->  Index Scan using idx_ordini_stato_data on ordini o
                                         (actual time=0.234..15.123 rows=31845 loops=1)
                                    Index Cond: ((stato = 'completato') AND ...)
                                ->  Hash  (cost=50.00..50.00 rows=4000 ...)
                                          ...
Planning Time: 3.456 ms
Execution Time: 892.987 ms

Read Key Metrics

Cost: Estimate vs Reality

Each node shows two pairs of values: (cost=startup..total rows=stima width=N) I am le planner estimates, While (actual time=startup..total rows=reali loops=N) they are the values really measure yourself. When there is a large discrepancy between estimate and reality, the planner has incorrect information on statistics.

-- Discrepanza grave: stima 1000 righe, realta 500000
-- Questo causa scelta errata del tipo di join e degli indici
--
-- (cost=... rows=1000 ...)    <- planner si aspettava 1000 righe
-- (actual ... rows=500000 ...) <- ma ne ha trovate 500000
--
-- Soluzione: aggiornare le statistiche
ANALYZE VERBOSE ordini_linee;

-- Per controllare le statistiche attualmente in uso
SELECT
  tablename,
  attname,
  n_distinct,
  correlation,
  most_common_vals,
  most_common_freqs
FROM pg_stats
WHERE tablename = 'ordini_linee'
  AND attname IN ('ordine_id', 'stato');

Seq Scan vs Index Scan vs Index-Only Scan

The type of scan has a huge impact on performance. PostgreSQL chooses type based to the estimated selectivity and table size.

-- Quando PostgreSQL sceglie Sequential Scan nonostante un indice esista:
-- 1. Troppe righe restituite (>5-10% della tabella): seq scan e piu veloce
-- 2. Statistiche obsolete: il planner crede ci siano poche righe
-- 3. work_mem insufficiente per materializzare il result set
-- 4. L'indice non copre le colonne del WHERE

-- Forzare l'Index Scan per debugging (NON usare in produzione)
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT ...;
SET enable_seqscan = on;

-- Index-Only Scan: il piu efficiente (nessun accesso alla tabella heap)
-- Richiede che tutte le colonne nel SELECT siano nell'indice
-- E che le pagine siano nel visibility map (dopo autovacuum)

-- Esempio: creare un covering index per abilitare index-only scan
CREATE INDEX idx_ordini_covering
  ON ordini (stato, creato_il)
  INCLUDE (utente_id, totale);  -- colonne extra non usate nel filtro

-- Verificare se il visibility map e aggiornato
VACUUM ordini;  -- o aspettare autovacuum

Identifying Costly Nodes

The fastest way to find the bottleneck is to use FORMAT JSON with tools visual, or manually search for the node with actual time taller.

-- Usare EXPLAIN con FORMAT JSON e pgMustard / explain.dalibo.com
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT ...;

-- Pattern da cercare nell'output:
-- 1. "Disk Usage" > 0 in HashAggregate o Sort: work_mem troppo basso
-- 2. "Rows Removed by Filter" molto alto: indice mancante o inefficiente
-- 3. "loops" > 1 su Seq Scan: Nested Loop con tabella grande
-- 4. Stima vs reale molto diversa: statistiche obsolete

-- Calcolo del "costo effettivo" di un nodo:
-- actual_total_time * loops = tempo totale del nodo
-- Se un nodo con loops=50000 ha actual_time=0.05ms,
-- il costo totale e 2500ms!

Types of Joins and When to Use Them

Nested Loop Join

Iterate over each row of the "outer" table and for each one look in the "inner" table. Excellent when the inner is small and indexed. Disastrous when both tables are large.

-- Nested Loop: ideale per join su PK con pochi record
-- outer: 100 ordini recenti -> inner: users con indice su id
-- Costo: 100 * O(log N) = molto efficiente

-- Quando diventa un problema:
-- outer: 50.000 righe -> inner: 50.000 righe -> 2.5 miliardi di confronti!
-- Il planner dovrebbe preferire Hash Join in questo caso

-- Se vedi Nested Loop su tabelle grandi, verifica:
-- 1. Statistiche: il planner credeva di trovare poche righe
-- 2. join_collapse_limit: aumentare se il planner non esplora tutte le alternative
-- 3. enable_nestloop = off per testare se Hash/Merge Join sarebbe meglio
SET enable_nestloop = off;
EXPLAIN ANALYZE ...;
SET enable_nestloop = on;

Hash Join

Constructs a hash table from the smaller table, then scans the larger one looking for the correspondences. Efficient for joins on medium/large tables without usable indexes.

-- Hash Join con spill to disk (Batches > 1): aumentare work_mem
-- "Batches: 5  Memory Usage: 4096kB  Disk Usage: 1024kB"
-- significa che la hash table non entra in memoria

-- Calcolare la work_mem necessaria
-- La hash table occupa circa: righe * larghezza_media * 2
-- Aumentare per sessione specifica (non globalmente)
SET work_mem = '256MB';
EXPLAIN ANALYZE ...;
SET work_mem = DEFAULT;

-- Oppure aumentare solo per le query analitiche
ALTER ROLE analyst_user SET work_mem = '256MB';

Merge Join

Requires both inputs to be sorted on the join key. PostgreSQL chooses it when both tables have indexes that guarantee order.

Correcting Planner Estimates

Incorrect estimates are the most common cause of suboptimal query plans. Here are the techniques to correct them.

-- 1. Aumentare il target di statistiche per colonne ad alta cardinalita
ALTER TABLE ordini ALTER COLUMN stato SET STATISTICS 500;  -- default: 100
ANALYZE ordini;

-- 2. Creare statistiche estese per colonne correlate
-- (PostgreSQL non sa che stato='completato' implica certi range di date)
CREATE STATISTICS stat_ordini_stato_data (dependencies, ndistinct)
  ON stato, creato_il FROM ordini;
ANALYZE ordini;

-- 3. Verificare le statistiche con pg_statistic_ext
SELECT
  stxname,
  stxkeys::int[] AS colonne,
  stxdependencies,
  stxndistinct
FROM pg_statistic_ext
JOIN pg_statistic_ext_data ON pg_statistic_ext_data.stxoid = pg_statistic_ext.oid
WHERE stxrelid = 'ordini'::regclass;

-- 4. Forzare il re-planning disabilitando la plan cache
-- (utile per query parametriche con distribuzione skewed)
SET plan_cache_mode = force_generic_plan;

auto_explain: Catch Slow Queries Automatically

In production you can't run EXPLAIN manually on every query. auto_explain automatically logs the plan of queries that exceed a time threshold.

-- Abilitare auto_explain in postgresql.conf
# shared_preload_libraries = 'auto_explain'  # richiede riavvio

# auto_explain parametri
auto_explain.log_min_duration = '500ms'  # logga query > 500ms
auto_explain.log_analyze = true           # mostra i tempi reali
auto_explain.log_buffers = true           # mostra I/O
auto_explain.log_timing = true            # timing dettagliato
auto_explain.log_triggers = false         # non loggare trigger
auto_explain.log_verbose = false          # output compatto
auto_explain.log_format = 'text'          # json per tools

-- In alternativa: abilitare per sessione (senza riavvio)
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '200ms';
SET auto_explain.log_analyze = true;

-- Ora ogni query > 200ms nel log di PostgreSQL includera il piano
-- Cercare in /var/log/postgresql/postgresql-17-main.log:
-- duration: 892.987 ms  plan:
-- ...Seq Scan on ordini_linee  (actual time=0.012..280.123)...

Practical Optimization Workflow

Here is the systematic process to optimize a slow query identified in production.

-- STEP 1: Identificare la query con pg_stat_statements
SELECT
  LEFT(query, 100) AS q,
  calls,
  ROUND(mean_exec_time::numeric, 1) AS mean_ms,
  ROUND(max_exec_time::numeric, 1) AS max_ms,
  ROUND(total_exec_time::numeric / 1000, 1) AS total_s
FROM pg_stat_statements
WHERE calls > 10
ORDER BY mean_exec_time DESC
LIMIT 10;

-- STEP 2: Ottenere il piano con EXPLAIN ANALYZE
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) ;

-- STEP 3: Identificare il nodo piu costoso
-- Cercare: actual time alto, Seq Scan su tabelle grandi, Batches > 1

-- STEP 4: Verificare gli indici esistenti
SELECT
  indexname,
  indexdef,
  pg_relation_size(indexrelid) AS size_bytes
FROM pg_indexes
WHERE tablename = 'ordini_linee';

-- STEP 5: Creare l'indice mancante
-- CONCURRENTLY: non blocca le scritture durante la creazione
CREATE INDEX CONCURRENTLY idx_ordini_linee_ordine_id
  ON ordini_linee (ordine_id)
  INCLUDE (quantita, prezzo_unitario);

-- STEP 6: Verificare che il nuovo piano usi l'indice
EXPLAIN (ANALYZE, BUFFERS)
SELECT ol.ordine_id, SUM(ol.quantita * ol.prezzo_unitario)
FROM ordini_linee ol
WHERE ol.ordine_id IN (SELECT id FROM ordini WHERE stato = 'completato')
GROUP BY ol.ordine_id;

-- STEP 7: Misurare il miglioramento con pg_stat_statements
-- Attendere almeno 100 esecuzioni, poi confrontare mean_exec_time
SELECT pg_stat_statements_reset();
-- ... attendere ...
SELECT calls, mean_exec_time, max_exec_time FROM pg_stat_statements WHERE ...;

Anti-Pattern: Do not change the planner in production

Avoid SET enable_seqscan = off o SET enable_hashjoin = off in production. These settings disable strategies that are optimal in other contexts. The correct solution and always update the statistics, create appropriate indexes, or rewrite the query.

Visual Tools for Query Plans

For complex query plans, visual tools accelerate analysis. The best ones are free.

explain.dalibo.com: Paste the JSON output of EXPLAIN, get an interactive graph. The slowest node is highlighted in red.
pgMustard: guided analysis with specific suggestions ("consider an index on column X").
pspg: pager for psql that colors EXPLAIN output readably.

-- Copiare l'output JSON per explain.dalibo.com
\pset format unaligned
\pset tuples_only on
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT ...;

-- Output e un JSON singolo che puoi incollare direttamente

Conclusions

EXPLAIN ANALYZE and the lens that reveals the inner workings of PostgreSQL. Learn to read Sequential Scan vs Index Scan, understand when Hash Join is preferable to Nested Loop, and identifying discrepancies between estimates and reality transforms a developer from "write SQL" to "optimize database".

In the next article we will delve deeper into advanced indexing: partial index, covering index, BRIN for time-series and GIN for full-text search. These are the tools that, combined with an understanding of query plans, they enable dramatic optimizations.

Series: Database Engineering and Optimization

Article 1: PostgreSQL 17/18 - News and Performance
Article 2 (this): EXPLAIN ANALYZE - Read Query Plans
Article 3: Advanced Indexing - Partial, Covering, BRIN, GIN
Article 4: Table Partitioning - Managing Billions of Rows
Article 5: Connection Pooling - PgBouncer vs Pgpool-II