Persistent Storage in Kubernetes: CSI, PV, StorageClass and StatefulSet
Kubernetes was born as a platform for stateless workloads, but the reality of enterprise applications is very different: databases, message queues, persistent caches, shared file systems. All of these require storage that survives the lifecycle of a Pod. Managing this storage in a way that is reliable, high-performance and portable across different cloud providers is one of the most concrete day-to-day challenges in production.
In this article we will explore the entire Kubernetes storage layer: from the Container Storage Interface (CSI), which standardizes integration with providers, through PersistentVolume and StorageClass for dynamic provisioning, up to StatefulSet for managing databases such as PostgreSQL, Cassandra and Redis on Kubernetes.
What You Will Learn
- The Kubernetes storage model: Volume, PersistentVolume, PersistentVolumeClaim
- How the Container Storage Interface (CSI) works and the most used drivers
- StorageClass and dynamic provisioning: configuration for AWS EBS, GCE PD, Azure Disk
- Access Mode: ReadWriteOnce, ReadOnlyMany, ReadWriteMany - when to use which
- StatefulSet: stable identity, automatic PVCs, orderly rolling update
- How to run PostgreSQL on Kubernetes with StatefulSet
- Backup and restore of PersistentVolumes with Velero
The Kubernetes Storage Model
Kubernetes defines an abstraction hierarchy for storage that separates "what is needed" (PersistentVolumeClaim) from "how it is provided" (PersistentVolume and StorageClass). This allows developers to request storage without knowing the details of the underlying cloud provider.
Storage Primitives
| Resource | Scope | Who manages it | Description |
|---|---|---|---|
| Volume | Pod | Developer | Ephemeral storage tied to the lifecycle of the Pod |
| PersistentVolume (PV) | Cluster | Admin / Provisioner | A piece of storage in the cluster, with a lifecycle independent of Pods |
| PersistentVolumeClaim (PVC) | Namespace | Developer | A request for storage made by a Pod |
| StorageClass | Cluster | Admin | Defines the storage "type" and the provisioner |
Lifecycle of a PersistentVolume
The life cycle of a PV goes through different states. Understanding them is essential for troubleshooting:
- Available: the PV exists and is free, not bound to any PVC
- Bound: the PV has been bound to a PVC that meets its requirements
- Released: the PVC has been deleted, but the PV is not yet available (the data is still there)
- Failed: automatic reclamation of the PV failed
# Check the state of the PersistentVolumes in the cluster
kubectl get pv -o wide
# Typical output:
# NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS
# pv-db-001 100Gi RWO Retain Bound production/postgres fast-ssd
# pv-db-002 100Gi RWO Retain Available fast-ssd
# Detailed description of a PV
kubectl describe pv pv-db-001
# Check the PVCs in a namespace
kubectl get pvc -n production
kubectl describe pvc postgres-data -n production
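A PV stuck in the Released state can be returned to Available by clearing its claimRef, a common manual step when reclaimPolicy is Retain. A minimal sketch, reusing the PV name from the example above:

```shell
# Inspect the claimRef that keeps the PV in the Released state
kubectl get pv pv-db-001 -o jsonpath='{.spec.claimRef}'

# Remove the claimRef so the PV becomes Available again.
# WARNING: the old data is still on the volume; wipe it first if needed.
kubectl patch pv pv-db-001 --type json \
  -p '[{"op": "remove", "path": "/spec/claimRef"}]'

# The PV should now show STATUS=Available
kubectl get pv pv-db-001
```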
Container Storage Interface (CSI)
Before CSI, every storage provider had to maintain plugins built into the Kubernetes source code (in-tree plugins). This created strong coupling and made it difficult to update plugins independently of Kubernetes releases. CSI solves this with a standard gRPC interface that allows third parties to ship storage drivers as independent Pods running in the cluster.
Main CSI Drivers
| CSI Driver | Provider | Storage type | ReadWriteMany |
|---|---|---|---|
| aws-ebs-csi-driver | AWS | Block (gp3, io2) | No |
| aws-efs-csi-driver | AWS | NFS (EFS) | Yes |
| gce-pd-csi-driver | GCP | Block (pd-ssd, pd-balanced) | No (use Filestore for RWX) |
| azuredisk-csi-driver | Azure | Block (Premium SSD) | No |
| azurefile-csi-driver | Azure | SMB/NFS (Azure Files) | Yes |
| csi-rook-ceph | Rook/Ceph | Block/FS/Object | Yes (CephFS) |
| longhorn | Rancher | Distributed block | Yes (via NFS) |
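To see which CSI drivers are actually installed in a given cluster, the API exposes dedicated objects. A quick check (the label selector in the last command is the one commonly used by the EBS CSI driver charts and may differ in your installation):

```shell
# List the CSI drivers registered in the cluster
kubectl get csidrivers

# Per-node view: which drivers are available on each node
kubectl get csinodes

# The driver Pods themselves usually run in kube-system
kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-ebs-csi-driver
```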
Installation of the CSI EBS Driver on EKS
# Install the EBS CSI driver on Amazon EKS
# First, create an IAM Role with the required policies
aws eks create-addon \
--cluster-name my-cluster \
--addon-name aws-ebs-csi-driver \
--service-account-role-arn arn:aws:iam::ACCOUNT_ID:role/EBSCSIRole
# Verify the CSI driver DaemonSet and controller Deployment
kubectl get daemonset -n kube-system ebs-csi-node
kubectl get deployment -n kube-system ebs-csi-controller
StorageClass: Dynamic Provisioning
Static provisioning (manually creating PVs) is impractical in production. With dynamic provisioning, Kubernetes automatically creates the PV when a PVC is created, using the CSI driver configured in the StorageClass.
StorageClass for AWS EBS
# storage-classes-aws.yaml
# StorageClass for gp3 disks (optimal performance)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast-ssd
annotations:
storageclass.kubernetes.io/is-default-class: "false"
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer # IMPORTANT: avoids cross-AZ mounting
reclaimPolicy: Retain # Protects production data
allowVolumeExpansion: true
parameters:
type: gp3
iops: "3000"
throughput: "125"
encrypted: "true"
kmsKeyId: "arn:aws:kms:eu-west-1:ACCOUNT:key/KEY_ID"
---
# StorageClass for io2 (high-IOPS databases)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: ultra-fast-ssd
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
allowVolumeExpansion: true
parameters:
type: io2
iops: "32000"
encrypted: "true"
---
# StorageClass for EFS (ReadWriteMany, NFS)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: shared-storage
provisioner: efs.csi.aws.com
reclaimPolicy: Retain
parameters:
provisioningMode: efs-ap
fileSystemId: fs-XXXXXXXX
directoryPerms: "700"
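Once applied, the StorageClasses can be inspected and, if desired, one of them promoted to cluster default. A sketch using the names defined above:

```shell
# Apply the StorageClasses and list them
kubectl apply -f storage-classes-aws.yaml
kubectl get storageclass

# Inspect provisioner, binding mode and reclaim policy
kubectl describe storageclass fast-ssd

# Optionally make fast-ssd the default StorageClass
kubectl patch storageclass fast-ssd \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```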
StorageClass for GKE (Google Kubernetes Engine)
# storage-classes-gke.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast-ssd-gke
provisioner: pd.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
allowVolumeExpansion: true
parameters:
type: pd-ssd
replication-type: regional-pd # replicated across 2 zones
availability-class: regional-hard-failover
volumeBindingMode: Why WaitForFirstConsumer
Always use WaitForFirstConsumer instead of Immediate when
the cluster spans multiple Availability Zones. With Immediate, the volume is
provisioned as soon as the PVC is created, in a zone chosen without any
knowledge of where the Pod will be scheduled. Result: the Pod can land in a
different zone and fail to mount the volume. WaitForFirstConsumer delays
provisioning until a Pod uses the PVC, so the volume is created in the same zone as the Pod.
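This behavior is visible directly on the cluster: with a WaitForFirstConsumer StorageClass, the PVC stays Pending until a Pod references it, and the events say so explicitly (output abbreviated):

```shell
# A PVC bound to a WaitForFirstConsumer StorageClass stays Pending...
kubectl get pvc postgres-data -n production
# NAME            STATUS    VOLUME   CAPACITY   STORAGECLASS
# postgres-data   Pending                       fast-ssd

# ...and the events explain why
kubectl describe pvc postgres-data -n production
# Events:
#   Normal  WaitForFirstConsumer  waiting for first consumer to be created before binding
```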
PersistentVolumeClaim in Practice
A PVC is how a Pod requests storage. The PVC specifies size, access mode and StorageClass. Kubernetes finds or creates a compatible PV and binds it to the PVC.
# pvc-database.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: postgres-data
namespace: production
labels:
app: postgres
tier: database
spec:
accessModes:
- ReadWriteOnce
storageClassName: fast-ssd
resources:
requests:
storage: 100Gi
# Expand to 200Gi in the future with:
# kubectl patch pvc postgres-data -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
Access Mode: Choose the Right One
| Access Mode | Abbreviation | Meaning | Typical use |
|---|---|---|---|
| ReadWriteOnce | RWO | One node, read/write | Databases, single-instance applications |
| ReadOnlyMany | ROX | Many nodes, read-only | Shared static data, configurations |
| ReadWriteMany | RWX | Many nodes, read/write | Shared NFS, upload handlers |
| ReadWriteOncePod | RWOP | One Pod, read/write | Exclusive storage for a single Pod (K8s 1.29+) |
StatefulSet: Workload with Stable Identity
Deployments are great for stateless workloads, but databases and applications that require a stable identity (predictable hostname, dedicated volume, ordered startup) need a StatefulSet. The key differences compared to a Deployment:
- Stable identity: Pods have predictable names: myapp-0, myapp-1, myapp-2
- Ordered startup: Pods are started in order (0, then 1, then 2) and shut down in reverse order
- Dedicated PVCs: each Pod gets its own PVC via volumeClaimTemplates
- Headless Service: each Pod has a stable DNS entry: myapp-0.myapp.namespace.svc.cluster.local
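These stable DNS entries can be verified from inside the cluster with a throwaway Pod; a sketch using the PostgreSQL names from the next section (the busybox image is just a convenient choice, not a requirement):

```shell
# Resolve a specific replica via the headless Service
kubectl run -it --rm dns-test --image=busybox:1.36 --restart=Never -- \
  nslookup postgres-0.postgres.production.svc.cluster.local

# The headless Service itself resolves to ALL Pod IPs
kubectl run -it --rm dns-test --image=busybox:1.36 --restart=Never -- \
  nslookup postgres.production.svc.cluster.local
```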
PostgreSQL on Kubernetes with StatefulSet
Here is a complete and production-ready setup for PostgreSQL with StatefulSet, including ConfigMap for configuration, Secret for credentials, and headless Service:
# postgres-statefulset.yaml
apiVersion: v1
kind: Service
metadata:
name: postgres
namespace: production
labels:
app: postgres
spec:
ports:
- port: 5432
name: postgres
clusterIP: None # Headless Service - enables DNS for individual Pods
selector:
app: postgres
---
# Service for access to the master (read/write)
apiVersion: v1
kind: Service
metadata:
name: postgres-master
namespace: production
spec:
ports:
- port: 5432
selector:
app: postgres
role: master
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgres
namespace: production
spec:
serviceName: postgres # must match the name of the headless Service
replicas: 3
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
# Anti-affinity: spread the replicas across different nodes
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app: postgres
topologyKey: kubernetes.io/hostname
initContainers:
# init container that sets the volume permissions
- name: init-postgres
image: postgres:16
command:
- bash
- "-c"
- |
chown -R 999:999 /var/lib/postgresql/data
volumeMounts:
- name: postgres-data
mountPath: /var/lib/postgresql/data
containers:
- name: postgres
image: postgres:16
ports:
- containerPort: 5432
name: postgres
env:
- name: POSTGRES_DB
value: myapp
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: postgres-credentials
key: username
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-credentials
key: password
- name: PGDATA
value: /var/lib/postgresql/data/pgdata
volumeMounts:
- name: postgres-data
mountPath: /var/lib/postgresql/data
- name: postgres-config
mountPath: /etc/postgresql/postgresql.conf
subPath: postgresql.conf
resources:
requests:
memory: "2Gi"
cpu: "1000m"
limits:
memory: "4Gi"
cpu: "2000m"
livenessProbe:
exec:
command:
- pg_isready
- -U
- $(POSTGRES_USER)
- -d
- $(POSTGRES_DB)
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
exec:
command:
- pg_isready
- -U
- $(POSTGRES_USER)
- -d
- $(POSTGRES_DB)
initialDelaySeconds: 5
periodSeconds: 5
volumes:
- name: postgres-config
configMap:
name: postgres-config
# PVC template: each Pod gets its own 100Gi volume
volumeClaimTemplates:
- metadata:
name: postgres-data
spec:
accessModes:
- ReadWriteOnce
storageClassName: fast-ssd
resources:
requests:
storage: 100Gi
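The StatefulSet above references a postgres-credentials Secret via secretKeyRef, which must exist before the Pods start. A minimal sketch (the values shown are placeholders, not recommendations):

```yaml
# postgres-credentials.yaml - referenced by the StatefulSet's env section
apiVersion: v1
kind: Secret
metadata:
  name: postgres-credentials
  namespace: production
type: Opaque
stringData:
  username: myapp_user   # placeholder - choose your own
  password: CHANGE_ME    # placeholder - generate a strong password
```

The same Secret can be created imperatively with kubectl create secret generic postgres-credentials -n production --from-literal=username=... --from-literal=password=..., which keeps credentials out of version control.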
ConfigMap for PostgreSQL Configuration
# postgres-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: postgres-config
namespace: production
data:
postgresql.conf: |
# Performance tuning for 4GB RAM
shared_buffers = 1GB
work_mem = 64MB
maintenance_work_mem = 256MB
effective_cache_size = 3GB
# WAL settings
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1GB
# Checkpoint
checkpoint_completion_target = 0.9
max_wal_size = 4GB
min_wal_size = 1GB
# Connection settings
max_connections = 200
# Logging
log_min_duration_statement = 1000 # log slow queries >1s
log_checkpoints = on
log_connections = on
log_disconnections = on
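One caveat worth knowing: the official postgres image generates its own postgresql.conf inside PGDATA, so the file mounted at /etc/postgresql/postgresql.conf only takes effect if the server is explicitly pointed at it. A sketch of the extra container args this would require:

```yaml
# Added to the postgres container in the StatefulSet spec:
# start the server with the mounted configuration file
args:
  - "-c"
  - "config_file=/etc/postgresql/postgresql.conf"
```

To confirm the settings are live, query the running server, e.g. kubectl exec -n production postgres-0 -- psql -U <username> -d myapp -c "SHOW shared_buffers;" (the username is whatever the postgres-credentials Secret contains).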
Volume Snapshot and Backup
Volume Snapshots allow you to create point-in-time backups of PersistentVolumes using the cloud provider's native capabilities (EBS snapshot, GCE PD snapshot, etc.).
# Install the Volume Snapshot CRDs (if not already present)
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
# VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
name: ebs-snapshot-class
driver: ebs.csi.aws.com
deletionPolicy: Retain
---
# Create a snapshot of the PostgreSQL volume
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
name: postgres-snapshot-20260801
namespace: production
spec:
volumeSnapshotClassName: ebs-snapshot-class
source:
persistentVolumeClaimName: postgres-data-postgres-0
---
# Check the snapshot status
kubectl get volumesnapshot -n production
# Restore from snapshot: create a new PVC from the snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: postgres-data-restored
namespace: production
spec:
accessModes:
- ReadWriteOnce
storageClassName: fast-ssd
resources:
requests:
storage: 100Gi
dataSource:
name: postgres-snapshot-20260801
kind: VolumeSnapshot
apiGroup: snapshot.storage.k8s.io
Velero for Complete Cluster Backup
Velero is the reference tool for backup and restore of entire Kubernetes clusters, including PersistentVolumes:
# Install Velero with the AWS plugin
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.8.0 \
--bucket my-velero-backups \
--backup-location-config region=eu-west-1 \
--snapshot-location-config region=eu-west-1 \
--secret-file ./credentials-velero
# Create a backup of the production namespace including volumes
velero backup create production-backup-20260801 \
--include-namespaces production \
--snapshot-volumes \
--wait
# Verify the backup
velero backup describe production-backup-20260801
velero backup logs production-backup-20260801
# Schedule: daily backup at midnight
velero schedule create daily-production \
--schedule="0 0 * * *" \
--include-namespaces production \
--snapshot-volumes \
--ttl 720h # keep for 30 days
# Restore into a new cluster
velero restore create --from-backup production-backup-20260801 \
--namespace-mappings production:production-restored
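After a restore, verify both the Velero objects and the restored workloads. The restore name is auto-generated from the backup name, so copy it from the list (shown here as a placeholder):

```shell
# List restores and inspect details and warnings
velero restore get
velero restore describe <restore-name> --details

# Check the remapped namespace defined by --namespace-mappings
kubectl get pods,pvc -n production-restored
```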
Storage for AI/ML Workloads
ML training workloads have special storage requirements: high-throughput parallel access to large datasets, often from multiple GPU workers simultaneously.
# PVC with ReadWriteMany for distributed training
# Use EFS (AWS) or CephFS (on-premise) for RWX
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: ml-dataset-storage
namespace: ml-training
spec:
accessModes:
- ReadWriteMany
storageClassName: shared-storage # EFS o CephFS
resources:
requests:
storage: 10Ti # 10TB for an ImageNet-scale dataset, etc.
---
# Training Job that accesses the data in parallel
apiVersion: batch/v1
kind: Job
metadata:
name: distributed-training
namespace: ml-training
spec:
parallelism: 8 # 8 GPU workers in parallel
completions: 8
template:
spec:
containers:
- name: trainer
image: pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime
resources:
limits:
nvidia.com/gpu: "1"
volumeMounts:
- name: dataset
mountPath: /data
readOnly: true # all workers read, none write
- name: checkpoints
mountPath: /checkpoints
volumes:
- name: dataset
persistentVolumeClaim:
claimName: ml-dataset-storage
readOnly: true
- name: checkpoints
persistentVolumeClaim:
claimName: ml-checkpoints-rwx # RWX for shared checkpoints
Best Practices for Kubernetes Storage
Production Storage Checklist
- Always use reclaimPolicy: Retain for production data: Delete removes the data automatically when the PVC is deleted
- volumeBindingMode: WaitForFirstConsumer: avoids cross-AZ binding issues in multi-zone clusters
- allowVolumeExpansion: true: Configure StorageClasses to allow volume expansion without downtime
- Monitor disk usage: Configure alerts on Prometheus when a PVC exceeds 80% capacity
- Automatic snapshots: Configure VolumeSnapshotClass and scheduled backups
- Test the restore: an untested backup is a useless backup. Do monthly restore tests
- Separate PVCs by role: One PVC for data, one for logs, one for temporary backups
- StatefulSet with anti-affinity: Distribute replicas across different nodes and zones
Anti-Pattern: Don't Do This
- Don't use hostPath in production: Ties the Pod to a specific node and is not portable
- Don't use emptyDir for persistent data: it is deleted when the Pod is removed from its node
- Do not use reclaimPolicy: Delete for production data: You can lose everything by mistake
- Do not mount the same PVC (RWO) on multiple Pods: Causes data corruption
Storage Monitoring with Prometheus
# Key storage metrics come from the kubelet (volume stats) and kube-state-metrics (PVC phase)
# Add these alerts to Prometheus
# Alert: PVC close to maximum capacity
groups:
- name: kubernetes-storage
rules:
- alert: PVCStorageUsageHigh
expr: |
kubelet_volume_stats_used_bytes /
kubelet_volume_stats_capacity_bytes > 0.80
for: 5m
labels:
severity: warning
annotations:
summary: "PVC {{ $labels.persistentvolumeclaim }} is at 80% of capacity"
description: "Namespace: {{ $labels.namespace }}"
- alert: PVCStorageFull
expr: |
kubelet_volume_stats_used_bytes /
kubelet_volume_stats_capacity_bytes > 0.95
for: 2m
labels:
severity: critical
annotations:
summary: "PVC {{ $labels.persistentvolumeclaim }} is almost full!"
- alert: PVCNotBound
expr: |
kube_persistentvolumeclaim_status_phase{phase="Pending"} == 1
for: 10m
labels:
severity: warning
annotations:
summary: "PVC {{ $labels.persistentvolumeclaim }} has been Pending for 10 minutes"
Conclusions and Next Steps
Kubernetes storage is one of the most critical layers for enterprise applications in production. The Container Storage Interface standardizes integration with any provider, dynamic provisioning with StorageClasses eliminates manual work, and StatefulSets provide the primitives needed to manage databases with stable identities.
The key to robust storage in production is a combination of correct architectural choices (reclaimPolicy Retain, WaitForFirstConsumer, anti-affinity), proactive monitoring with Prometheus, and a regularly tested backup strategy with Velero or VolumeSnapshots.