
Federico Calò


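Introduction: Compressing Data Without Losing Information

Real-world datasets often have hundreds or thousands of features, and many of them are redundant or correlated with one another. Dimensionality reduction compresses the data into a lower-dimensional space while retaining most of the useful information. The most widely used algorithm is PCA (Principal Component Analysis), which rests entirely on the eigenvalues and eigenvectors of the covariance matrix.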

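What You Will Learn

The curse of dimensionality and why reducing dimensions helps
The covariance matrix: understanding correlations
PCA: finding the directions of maximum variance
Explained variance and choosing the number of components
t-SNE and UMAP for nonlinear visualization
A complete implementation in NumPy and scikit-learn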

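The Curse of Dimensionality

In high-dimensional spaces the data becomes sparse: all points end up roughly equidistant from one another. This makes distance measures less useful and encourages model overfitting. PCA addresses the problem by projecting the data onto its most informative directions.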
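The concentration of distances can be observed numerically. The following is a minimal sketch, assuming points drawn uniformly from the unit hypercube; the helper name `relative_contrast`, the sample size, and the chosen dimensions are illustrative assumptions, not part of the original text.

```python
import numpy as np


def relative_contrast(d, n=200, seed=0):
    """(max - min) / min of the distances from one point to the n - 1 others."""
    rng = np.random.default_rng(seed)
    X = rng.random((n, d))  # assumption: uniform samples in [0, 1]^d
    dists = np.linalg.norm(X[1:] - X[0], axis=1)
    return (dists.max() - dists.min()) / dists.min()


# As d grows, the nearest and farthest neighbours become almost equally far away.
for d in (2, 10, 100, 1000):
    print(f"d = {d:4d}: relative contrast = {relative_contrast(d):.3f}")
```

A relative contrast close to zero means that "nearest" and "farthest" carry little information, which is why distance-based methods tend to degrade in high dimensions.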

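The Covariance Matrix

The covariance matrix $\mathbf{C}$ captures the covariance between every pair of features. For a centered (zero-mean) dataset $\mathbf{X} \in \mathbb{R}^{n \times d}$:

$$\mathbf{C} = \frac{1}{n-1} \mathbf{X}^T \mathbf{X}$$

Each entry $C_{ij}$ is the covariance between features $i$ and $j$:

$$C_{ij} = \frac{1}{n-1} \sum_{k=1}^{n} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)$$

The diagonal contains the variance of each feature, while the off-diagonal entries contain the covariances. If $C_{ij} > 0$, the two features are positively correlated; if $C_{ij} < 0$, they are negatively correlated; if $C_{ij} = 0$, they are uncorrelated.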
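As a quick check of the formula, the sketch below builds the covariance matrix by hand and compares it with `np.cov`. The synthetic data and the coefficients 0.8 and 0.5 are assumptions chosen only to produce one positive and one negative covariance.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 3
X = rng.normal(size=(n, d))
X[:, 1] += 0.8 * X[:, 0]   # feature 1 moves with feature 0
X[:, 2] -= 0.5 * X[:, 0]   # feature 2 moves against feature 0

# C = X^T X / (n - 1) on the centered data
X_centered = X - X.mean(axis=0)
C_manual = X_centered.T @ X_centered / (n - 1)
C_numpy = np.cov(X, rowvar=False)

print(np.round(C_manual, 3))
print("Matches np.cov:", np.allclose(C_manual, C_numpy))
print("Diagonal equals the per-feature variance:",
      np.allclose(np.diag(C_manual), X.var(axis=0, ddof=1)))
```

Note the `ddof=1` in the variance: it matches the $n-1$ denominator used in the definition.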

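PCA: Mathematical Derivation

PCA searches for the directions along which the variance of the data is largest. The first principal component $\mathbf{w}_1$ is the unit vector that maximizes the variance of the projection:

$$\mathbf{w}_1 = \arg\max_{\|\mathbf{w}\| = 1} \text{Var}(\mathbf{X}\mathbf{w}) = \arg\max_{\|\mathbf{w}\| = 1} \mathbf{w}^T \mathbf{C} \mathbf{w}$$

Using a Lagrange multiplier for the constraint $\|\mathbf{w}\| = 1$ and setting the gradient of $\mathbf{w}^T \mathbf{C} \mathbf{w} - \lambda(\mathbf{w}^T \mathbf{w} - 1)$ to zero, the solution is the eigenvector of $\mathbf{C}$ corresponding to its largest eigenvalue:

$$\mathbf{C} \mathbf{w}_i = \lambda_i \mathbf{w}_i$$

where $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_d \geq 0$ are the eigenvalues sorted in decreasing order. The eigenvalue $\lambda_i$ is exactly the variance captured by the $i$-th principal component.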
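Both claims (the top eigenvector maximizes the projected variance, and that variance equals $\lambda_1$) can be checked numerically. This is a small sketch on synthetic data; the mixing matrix and the number of random directions are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
mixing = np.array([[3.0, 1.0, 0.0],
                   [0.0, 1.0, 0.5],
                   [0.0, 0.0, 0.2]])       # assumption: arbitrary mixing matrix
X = rng.normal(size=(500, 3)) @ mixing
X_centered = X - X.mean(axis=0)
C = np.cov(X_centered, rowvar=False)

eigenvalues, eigenvectors = np.linalg.eigh(C)   # eigh returns ascending order
w1, lambda1 = eigenvectors[:, -1], eigenvalues[-1]

# Variance of the data projected onto w1 equals the largest eigenvalue
var_w1 = np.var(X_centered @ w1, ddof=1)

# No random unit direction captures more variance than w1
W_random = rng.normal(size=(1000, 3))
W_random /= np.linalg.norm(W_random, axis=1, keepdims=True)
random_vars = np.var(X_centered @ W_random.T, axis=0, ddof=1)

print(f"lambda_1 = {lambda1:.4f}, Var(X w_1) = {var_w1:.4f}")
print(f"Best of 1000 random directions: {random_vars.max():.4f}")
```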

Projection and Reconstruction

To reduce the data to $k$ dimensions, we project it onto the first $k$ eigenvectors:

$$\mathbf{Z} = \mathbf{X} \mathbf{W}_k \quad \text{where} \quad \mathbf{W}_k = [\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_k] \in \mathbb{R}^{d \times k}$$

The approximate reconstruction is:

$$\hat{\mathbf{X}} = \mathbf{Z} \mathbf{W}_k^T$$
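A useful consequence, sketched below on synthetic data, is that the reconstruction error is not arbitrary: the total squared error divided by $n-1$ equals the sum of the discarded eigenvalues. The per-feature scales used to generate the data are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
scales = np.array([3.0, 2.0, 0.5, 0.1])          # assumption: illustrative scales
X = rng.normal(size=(300, 4)) * scales
X_centered = X - X.mean(axis=0)

C = np.cov(X_centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(C)
order = np.argsort(eigenvalues)[::-1]            # sort in decreasing order
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

k = 2
W_k = eigenvectors[:, :k]                        # shape (d, k)
Z = X_centered @ W_k                             # projection, shape (n, k)
X_hat = Z @ W_k.T                                # reconstruction, shape (n, d)

squared_error = np.sum((X_centered - X_hat) ** 2) / (len(X_centered) - 1)
discarded = eigenvalues[k:].sum()
print(f"Squared error / (n - 1): {squared_error:.6f}")
print(f"Sum of discarded eigenvalues: {discarded:.6f}")
```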

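Explained Variance

The fraction of variance explained by the first $k$ components is:

$$\text{Explained variance} = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{d} \lambda_i}$$

In practice, $k$ is chosen so as to retain, for example, 95% or 99% of the total variance.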
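The selection rule can be written as a small helper. This is a sketch: `choose_k` is a hypothetical function name, and the eigenvalues are made-up values chosen so the ratios are easy to follow (0.6, 0.2, 0.1, 0.1).

```python
import numpy as np


def choose_k(eigenvalues, threshold=0.95):
    """Smallest k whose cumulative explained-variance ratio reaches threshold."""
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    k = int(np.searchsorted(cumulative, threshold) + 1)
    return min(k, len(eigenvalues))


eigenvalues = [6.0, 2.0, 1.0, 1.0]   # illustrative values
print(choose_k(eigenvalues, 0.75))   # 2: the first two components explain 80%
print(choose_k(eigenvalues, 0.95))   # 4: all components are needed
```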


import numpy as np

# Synthetic dataset: 200 samples, 5 (correlated) features
np.random.seed(42)
n, d = 200, 5
X = np.random.randn(n, 2) @ np.array([[2, 1, 0.5, 0.3, 0.1],
                                        [0.5, 1.5, 1, 0.2, 0.8]])
X += np.random.randn(n, d) * 0.3  # Noise

# PCA from scratch
# 1. Center the data
X_centered = X - X.mean(axis=0)

# 2. Covariance matrix
C = np.cov(X_centered, rowvar=False)
print(f"Covariance matrix:\n{np.round(C, 3)}\n")

# 3. Eigenvalues and eigenvectors (eigh: C is symmetric)
eigenvalues, eigenvectors = np.linalg.eigh(C)
# Sort in decreasing order
idx = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]

print(f"Eigenvalues: {np.round(eigenvalues, 4)}")

# 4. Explained variance
var_explained = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(var_explained)
for i in range(d):
    print(f"PC{i+1}: {var_explained[i]*100:.1f}% (cumulative: {cumulative[i]*100:.1f}%)")

# 5. Projection to 2D
k = 2
W_k = eigenvectors[:, :k]
Z = X_centered @ W_k
print(f"\nOriginal shape: {X.shape} -> Reduced: {Z.shape}")

# 6. Reconstruction error (add the mean back to return to the original space)
X_reconstructed = Z @ W_k.T + X.mean(axis=0)
reconstruction_error = np.mean((X - X_reconstructed)**2)
print(f"Reconstruction error (MSE): {reconstruction_error:.6f}")

PCA with Scikit-Learn


from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
import numpy as np

# X is the (n, d) data array built in the from-scratch example above

# Standardization (important: PCA is sensitive to scale)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Automatic PCA: a float n_components selects the number of components
pca = PCA(n_components=0.95)  # Keep 95% of the variance
X_pca = pca.fit_transform(X_scaled)

print(f"Selected components: {pca.n_components_}")
print(f"Explained variance: {pca.explained_variance_ratio_}")
print(f"Shape: {X.shape} -> {X_pca.shape}")
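To see why standardization matters, the sketch below fits PCA on three independent features where one is expressed in a much larger unit. The data and the factor 1000 are assumptions chosen to exaggerate the effect.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 3))   # three independent features
X_demo[:, 0] *= 1000.0               # feature 0 expressed in a much larger unit

# Without scaling, the first component simply follows the large-unit feature
ratio_raw = PCA().fit(X_demo).explained_variance_ratio_

# After standardization, the three features contribute comparably
X_demo_scaled = StandardScaler().fit_transform(X_demo)
ratio_scaled = PCA().fit(X_demo_scaled).explained_variance_ratio_

print("Raw data:         ", np.round(ratio_raw, 4))
print("Standardized data:", np.round(ratio_scaled, 4))
```

On raw data the first component absorbs almost all of the variance purely because of the unit of measurement, not because the feature is more informative.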

Beyond PCA: t-SNE and UMAP

PCA is limited to linear transformations. For nonlinear structure in the data, methods such as t-SNE and UMAP are used.

t-SNE

t-SNE (t-distributed Stochastic Neighbor Embedding) preserves local distances: points that are close in the original space stay close in the 2D representation. It minimizes the KL divergence between the similarity distribution in the original space and the one in the reduced space.

$$p_{j|i} = \frac{\exp(-\|\mathbf{x}_i - \mathbf{x}_j\|^2 / 2\sigma_i^2)}{\sum_{k \neq i} \exp(-\|\mathbf{x}_i - \mathbf{x}_k\|^2 / 2\sigma_i^2)}$$

$$q_{ij} = \frac{(1 + \|\mathbf{y}_i - \mathbf{y}_j\|^2)^{-1}}{\sum_{k \neq l} (1 + \|\mathbf{y}_k - \mathbf{y}_l\|^2)^{-1}}$$
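The two similarity formulas translate directly into NumPy. The sketch below is illustrative only and makes two simplifying assumptions: a single fixed bandwidth `sigma` for every point (the actual algorithm tunes each $\sigma_i$ from the perplexity), and a random 2D embedding instead of an optimized one. The symmetrization $p_{ij} = (p_{j|i} + p_{i|j}) / 2n$ comes from the standard t-SNE formulation rather than from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
X_hi = rng.normal(size=(6, 10))   # 6 points in the original 10-D space
Y_lo = rng.normal(size=(6, 2))    # a candidate 2-D embedding (random here)
sigma = 1.0                       # assumption: one fixed bandwidth for all points


def squared_distances(A):
    """Matrix of squared Euclidean distances between the rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sum(diff ** 2, axis=-1)


# p_{j|i}: Gaussian similarities, normalized over k != i (each row sums to 1)
P_cond = np.exp(-squared_distances(X_hi) / (2 * sigma ** 2))
np.fill_diagonal(P_cond, 0.0)
P_cond /= P_cond.sum(axis=1, keepdims=True)

# q_{ij}: Student-t similarities, normalized over all pairs k != l
Q = 1.0 / (1.0 + squared_distances(Y_lo))
np.fill_diagonal(Q, 0.0)
Q /= Q.sum()

# Symmetrized P, then KL(P || Q) over the off-diagonal pairs
P = (P_cond + P_cond.T) / (2 * len(X_hi))
mask = ~np.eye(len(X_hi), dtype=bool)
kl = np.sum(P[mask] * np.log(P[mask] / Q[mask]))
print(f"KL(P || Q) for the random embedding: {kl:.4f}")
```

t-SNE moves the low-dimensional points by gradient descent to make this KL value as small as possible.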

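UMAP

UMAP (Uniform Manifold Approximation and Projection) is typically faster than t-SNE and tends to preserve global structure better. It builds on algebraic topology and fuzzy graph theory.

When to use which: PCA for preprocessing (dimensionality reduction before a classifier, noise removal); t-SNE/UMAP for 2D/3D visualization (exploring clusters and outliers). PCA can be inverted (approximately) and its components are interpretable, whereas t-SNE/UMAP embeddings are not.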
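The practical difference shows up in the scikit-learn API. The sketch below, which uses a 300-sample subset of the digits dataset as an arbitrary choice to keep t-SNE quick, maps data back from a PCA embedding and checks which methods the `TSNE` estimator exposes.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X_small = load_digits().data[:300]   # small subset to keep t-SNE quick

# PCA: a linear map that can be applied to new data and (approximately) inverted
pca = PCA(n_components=2).fit(X_small)
Z_pca = pca.transform(X_small)
X_back = pca.inverse_transform(Z_pca)      # approximate reconstruction in 64-D

# t-SNE: the embedding of the fitted points is the only output
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
Z_tsne = tsne.fit_transform(X_small)

print("PCA embedding:", Z_pca.shape, "-> reconstructed:", X_back.shape)
print("t-SNE embedding:", Z_tsne.shape)
print("TSNE has inverse_transform:", hasattr(tsne, "inverse_transform"))
print("TSNE has transform (for new points):", hasattr(tsne, "transform"))
```

Because scikit-learn's `TSNE` only provides `fit` and `fit_transform`, it is suited to visualizing a fixed dataset rather than serving as a preprocessing step in a train/test workflow.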

Application: PCA as Preprocessing for ML


from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import numpy as np

# Digits dataset: 1797 images of 8x8 pixels = 64 features
digits = load_digits()
X, y = digits.data, digits.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Without PCA (64 features); the scaler is fitted on the training set only
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

clf_full = LogisticRegression(max_iter=5000)
clf_full.fit(X_train_s, y_train)
acc_full = clf_full.score(X_test_s, y_test)

# With PCA (keep 95% of the variance); PCA is also fitted on the training set only
pca = PCA(n_components=0.95)
X_train_pca = pca.fit_transform(X_train_s)
X_test_pca = pca.transform(X_test_s)

clf_pca = LogisticRegression(max_iter=5000)
clf_pca.fit(X_train_pca, y_train)
acc_pca = clf_pca.score(X_test_pca, y_test)

print(f"Without PCA: {X_train_s.shape[1]} features, Accuracy: {acc_full:.4f}")
print(f"With PCA:    {X_train_pca.shape[1]} features, Accuracy: {acc_pca:.4f}")
print(f"Reduction:   {(1 - X_train_pca.shape[1]/X_train_s.shape[1])*100:.0f}% of the features")
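The same workflow can be expressed more compactly with a scikit-learn pipeline, which refits the scaler and PCA inside each cross-validation fold so that no information from the held-out data leaks into preprocessing. This is one possible arrangement rather than the only one; the choice of 5 folds is an assumption.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_digits, y_digits = load_digits(return_X_y=True)

# Scaling, PCA, and the classifier are fitted together on each training fold
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95),
    LogisticRegression(max_iter=5000),
)
scores = cross_val_score(pipeline, X_digits, y_digits, cv=5)
print(f"Cross-validated accuracy: {scores.mean():.4f} +/- {scores.std():.4f}")
```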

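Summary and Connections to ML

Key Points to Remember

PCA: projection onto the first $k$ eigenvectors of the covariance matrix
Eigenvalues $\lambda_i$: the variance captured by each component
Explained variance: choose $k$ to retain at least 95% of the variance
Standardization: essential before PCA (it is sensitive to scale)
t-SNE/UMAP: for nonlinear 2D/3D visualization
PCA as preprocessing: can reduce overfitting and speed up training

In the next article: we will explore loss functions in detail: MSE, cross-entropy, focal loss, hinge loss, and how to choose and create custom losses.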