
Federico Calò

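Sentiment Analysis with Transformers: Techniques and Implementation

Sentiment analysis is one of the most requested NLP tasks in enterprise settings. Every day, companies analyze large volumes of product reviews, social media posts, support tickets, and customer feedback to understand what people really think. The arrival of BERT and Transformer models has substantially raised the quality of these systems compared with traditional lexicon-based or TF-IDF approaches.

In this article we build a complete sentiment analysis system: from the dataset, through fine-tuning with HuggingFace, to production. Along the way we cover imbalanced classes, evaluation metrics, and strategies for borderline cases such as irony, negation, and ambiguous language.

This is the third article in the series Modern NLP: From BERT to LLMs. It assumes familiarity with the BERT fundamentals (Article 2). For Italian specifically, see Article 4 on the Feel-It and AlBERTo models.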

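What You Will Learn

Classical approaches to sentiment analysis versus BERT: VADER, lexicon-based methods, fine-tuned Transformers
Public sentiment datasets: SST-2, IMDb, Amazon Reviews, SemEval
A complete implementation with HuggingFace Transformers and the Trainer API
Handling class imbalance in sentiment data
Metrics: accuracy, F1, precision, recall, AUC-ROC
Fine-grained sentiment: aspects (ABSA) and intensity
Hard cases: irony, negation, ambiguous language
A production pipeline with FastAPI and batch inference
Latency optimization: quantization and ONNX export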

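1. The Evolution of Approaches: From VADER to BERT

Before diving into the Transformers implementation, it helps to understand how approaches to sentiment analysis have evolved. In production, the right choice is often the simplest method that meets the requirements.

1.1 Lexicon-Based Approach: VADER

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon-based analyzer optimized for social media. It requires no training, is very fast, and works surprisingly well on informal text.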

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

# Basic examples
texts = [
    "This product is absolutely AMAZING!!!",         # strongly positive
    "The service was okay I guess",                   # ambiguous neutral
    "Worst purchase I've ever made. Complete waste.", # negative
    "The food wasn't bad at all",                     # tricky negation
    "Yeah right, as if this would work :)",           # sarcasm
]

for text in texts:
    scores = analyzer.polarity_scores(text)
    print(f"Text: {text[:50]}")
    print(f"  neg={scores['neg']:.3f}, neu={scores['neu']:.3f}, "
          f"pos={scores['pos']:.3f}, compound={scores['compound']:.3f}")
    label = 'POSITIVE' if scores['compound'] >= 0.05 else \
            'NEGATIVE' if scores['compound'] <= -0.05 else 'NEUTRAL'
    print(f"  Label: {label}\n")

# VADER handles well: capitalization, punctuation, emoji
# VADER handles poorly: sarcasm, complex context
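
To make the lexicon-based idea concrete, here is a minimal sketch of how such a scorer can work. The tiny `LEXICON`, the `NEGATIONS` set, the `NEGATION_FACTOR` value, and the `toy_compound` function are illustrative assumptions, not VADER's real lexicon or rule set. The only part taken from VADER is the final squashing step, x / sqrt(x^2 + 15), which maps the summed valence into the (-1, 1) range of the compound score.

```python
import math

# Hypothetical mini-lexicon: word -> valence (illustrative values only)
LEXICON = {"amazing": 3.0, "good": 2.0, "okay": 0.5,
           "bad": -2.5, "worst": -3.0, "waste": -2.0}
NEGATIONS = {"not", "never", "no", "wasn't", "isn't"}
NEGATION_FACTOR = -0.74  # assumed: flips and slightly dampens the valence


def toy_compound(text: str, alpha: float = 15.0) -> float:
    """Sum word valences, flip them after a negation, squash into (-1, 1)."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    total = 0.0
    for i, token in enumerate(tokens):
        valence = LEXICON.get(token, 0.0)
        # Look back up to two tokens for a negation word
        if valence and any(t in NEGATIONS for t in tokens[max(0, i - 2):i]):
            valence *= NEGATION_FACTOR
        total += valence
    return total / math.sqrt(total * total + alpha)


def to_label(compound: float) -> str:
    """Same +/-0.05 thresholds as in the VADER example above."""
    if compound >= 0.05:
        return "POSITIVE"
    if compound <= -0.05:
        return "NEGATIVE"
    return "NEUTRAL"


for sentence in ["This product is amazing!",
                 "The food wasn't bad at all",
                 "Worst purchase ever"]:
    print(sentence, "->", to_label(toy_compound(sentence)))
```

A scorer like this has no notion of context beyond its hand-written rules, which is why sarcasm and indirect phrasing defeat it.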

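1.2 Classical Machine Learning Approach

Before Transformers, the most widely used approach was TF-IDF combined with logistic regression or an SVM. It is still useful today as a quick baseline or when little data is available.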

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report

# Toy dataset (Italian reviews; 1 = positive, 0 = negative)
train_texts = [
    "Ottimo prodotto, lo consiglio a tutti",
    "Pessima esperienza, non lo ricomprero",
    "qualità eccellente, spedizione veloce",
    "Totale spreco di soldi",
    "Servizio clienti impeccabile",
    "Prodotto difettoso, deluso"
]
train_labels = [1, 0, 1, 0, 1, 0]

# TF-IDF + logistic regression pipeline
pipe = Pipeline([
    ('tfidf', TfidfVectorizer(
        ngram_range=(1, 2),      # unigrams and bigrams
        max_features=50000,
        sublinear_tf=True        # replace tf with 1 + log(tf) to dampen high counts
    )),
    ('clf', LogisticRegression(C=1.0, max_iter=1000))
])

pipe.fit(train_texts, train_labels)

# Quick check on unseen texts
test_texts = ["Prodotto fantastico!", "Pessimo, non funziona"]
preds = pipe.predict(test_texts)
probs = pipe.predict_proba(test_texts)

for text, pred, prob in zip(test_texts, preds, probs):
    label = 'POSITIVO' if pred == 1 else 'NEGATIVO'
    confidence = max(prob)
    print(f"{text}: {label} ({confidence:.2f})")
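
To show what the vectorizer computes for each term, the sketch below reproduces the weighting by hand for a three-document toy corpus. It assumes scikit-learn's documented defaults (smoothed idf and L2 normalization) together with `sublinear_tf=True`; the `tfidf_vector` helper and the corpus are hypothetical, and tokenization is reduced to whitespace splitting.

```python
import math
from collections import Counter


def tfidf_vector(doc_tokens, corpus_tokens):
    """L2-normalized TF-IDF weights for one document (sublinear tf, smoothed idf)."""
    n_docs = len(corpus_tokens)
    # Document frequency: in how many documents each term appears
    doc_freq = Counter(term for doc in corpus_tokens for term in set(doc))
    weights = {}
    for term, tf in Counter(doc_tokens).items():
        sublinear_tf = 1.0 + math.log(tf)  # dampens repeated terms
        idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0  # rare terms weigh more
        weights[term] = sublinear_tf * idf
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()}


corpus = [
    "ottimo prodotto ottimo prezzo".split(),
    "pessimo prodotto".split(),
    "prodotto difettoso".split(),
]
vec = tfidf_vector(corpus[0], corpus)
for term, weight in sorted(vec.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

The word "prodotto" appears in every document, so it receives the lowest weight, while the repeated and rarer "ottimo" dominates the vector. This is the signal the logistic regression then learns from.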

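1.3 Why BERT Is Better

Comparison of Sentiment Analysis Approaches

Approach	Accuracy (SST-2)	Latency	Training data	Hard cases
VADER	~71%	<1ms	None	Poor
TF-IDF + LR	~85%	~5ms	Required	Medium
DistilBERT	~91%	~50ms	Required	Good
BERT-base	~93%	~100ms	Required	Very good
RoBERTa	~96%	~100ms	Required	Excellent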

2. Datasets for Sentiment Analysis

The quality of fine-tuning depends heavily on the quality and size of the dataset. Here are the most important datasets for English; the next article covers Italian.

from datasets import load_dataset

# SST-2: Stanford Sentiment Treebank (binary: positive/negative)
sst2 = load_dataset("glue", "sst2")
print(sst2)
# train: 67,349 examples, validation: 872, test: 1,821

# IMDb reviews (binary: positive/negative)
imdb = load_dataset("imdb")
print(imdb)
# train: 25,000, test: 25,000

# Amazon Polarity (binary labels derived from 1-5 star Amazon reviews)
amazon = load_dataset("amazon_polarity")
print(amazon)
# train: 3,600,000, test: 400,000

# Explore a few examples
print("\nSST-2 examples:")
for i, example in enumerate(sst2['train'].select(range(3))):
    label = 'POSITIVE' if example['label'] == 1 else 'NEGATIVE'
    print(f"  [{label}] {example['sentence']}")

# Class distribution
from collections import Counter
labels = sst2['train']['label']
print("\nSST-2 train distribution:", Counter(labels))
# Counter({1: 37569, 0: 29780}) - slight imbalance

3. Complete Fine-Tuning with HuggingFace

We build a complete sentiment classifier, from data preparation to saving the trained model.

3.1 Data Preparation

from transformers import AutoTokenizer
from datasets import load_dataset, DatasetDict
import numpy as np

# DistilBERT for speed (about 97% of BERT's quality, about 60% faster)
MODEL_NAME = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# Load SST-2 from GLUE
dataset = load_dataset("glue", "sst2")

def tokenize_function(examples):
    return tokenizer(
        examples["sentence"],
        padding="max_length",
        truncation=True,
        max_length=128,
        return_tensors=None   # return Python lists, not tensors
    )

# Tokenize the full dataset (results are cached)
tokenized = dataset.map(
    tokenize_function,
    batched=True,
    batch_size=1000,
    remove_columns=["sentence", "idx"]  # drop columns the model does not need
)

# PyTorch format
tokenized.set_format("torch")
print(tokenized)
print("Train columns:", tokenized['train'].column_names)
# ['label', 'input_ids', 'attention_mask']
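
The two tokenizer options that shape the tensors, `padding="max_length"` and `truncation=True`, can be sketched in a few lines. The `pad_and_truncate` helper and the token IDs below are hypothetical, and the sketch is simplified: a real tokenizer also keeps the closing special token when it truncates.

```python
def pad_and_truncate(token_ids, max_length=128, pad_id=0):
    """Return fixed-length input_ids plus the matching attention_mask."""
    ids = token_ids[:max_length]                 # truncation=True
    mask = [1] * len(ids)                        # 1 = real token
    padding = max_length - len(ids)
    return {
        "input_ids": ids + [pad_id] * padding,   # padding="max_length"
        "attention_mask": mask + [0] * padding,  # 0 = padding, ignored by attention
    }


# Hypothetical token IDs for a short sentence, padded to length 8 for readability
example = pad_and_truncate([101, 2023, 3185, 2003, 2307, 102], max_length=8)
print(example)
```

Padding every example to 128 tokens is simple but wastes compute on short SST-2 sentences. A common alternative is to pad per batch, for instance with `DataCollatorWithPadding` from Transformers.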

3.2 Model Definition and Training

from transformers import (
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer
)
import evaluate
import numpy as np

# Model with a classification head
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=2,
    id2label={0: "NEGATIVE", 1: "POSITIVE"},
    label2id={"NEGATIVE": 0, "POSITIVE": 1}
)

# Evaluation metrics
accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy.compute(
            predictions=predictions, references=labels)["accuracy"],
        "f1": f1.compute(
            predictions=predictions, references=labels,
            average="binary")["f1"]
    }

# Training configuration
training_args = TrainingArguments(
    output_dir="./results/distilbert-sst2",
    num_train_epochs=3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    warmup_ratio=0.1,
    weight_decay=0.01,
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    evaluation_strategy="epoch",  # named eval_strategy in recent transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1",
    greater_is_better=True,
    logging_dir="./logs",
    logging_steps=100,
    fp16=True,          # mixed precision; requires a CUDA GPU (set to False on CPU)
    dataloader_num_workers=4,
    report_to="none",   # disable wandb/tensorboard for simplicity
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    compute_metrics=compute_metrics,
)

# Start training
train_result = trainer.train()
print(f"Training loss: {train_result.training_loss:.4f}")

# Final evaluation
metrics = trainer.evaluate()
print(f"Validation accuracy: {metrics['eval_accuracy']:.4f}")
print(f"Validation F1: {metrics['eval_f1']:.4f}")

# Save the model and tokenizer
trainer.save_model("./models/distilbert-sst2")
tokenizer.save_pretrained("./models/distilbert-sst2")
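
The combination of `warmup_ratio=0.1` and `lr_scheduler_type="linear"` means the learning rate climbs linearly from zero to `learning_rate` over the first 10% of steps and then decays linearly back to zero. The sketch below works through the numbers for the configuration above. The `linear_schedule_lr` function is a hypothetical re-implementation for illustration, and the step counts assume a single device with no gradient accumulation.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed setup: single device, no gradient accumulation
steps_per_epoch = math.ceil(67_349 / 32)   # SST-2 train size / batch size
total_steps = steps_per_epoch * 3           # num_train_epochs=3
warmup_steps = math.ceil(total_steps * 0.1)

print("total steps:", total_steps, "warmup steps:", warmup_steps)
for step in (0, warmup_steps // 2, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{linear_schedule_lr(step, total_steps):.2e}")
```

The warmup phase keeps early updates small while the freshly initialized classification head settles, which tends to make fine-tuning more stable.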

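3.3 Handling Imbalanced Classes

In many real-world datasets (for example, customer support reviews) the classes are strongly imbalanced: 90% negative, 10% positive. Without countermeasures, the model learns to always predict the majority class.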

import torch
from torch import nn
from transformers import Trainer

# Option 1: weighted loss function
class WeightedTrainer(Trainer):
    def __init__(self, class_weights, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.class_weights = torch.tensor(class_weights, dtype=torch.float)

    # **kwargs absorbs extra arguments (e.g. num_items_in_batch) that
    # newer transformers releases pass to compute_loss
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.get("labels")
        outputs = model(**inputs)
        logits = outputs.get("logits")

        # Cross-entropy with weights inversely proportional to class frequency
        loss_fct = nn.CrossEntropyLoss(
            weight=self.class_weights.to(logits.device)
        )
        loss = loss_fct(logits.view(-1, self.model.config.num_labels),
                       labels.view(-1))
        return (loss, outputs) if return_outputs else loss

# Compute weights from the dataset's class frequencies
from sklearn.utils.class_weight import compute_class_weight
import numpy as np

# np.asarray works whether the column comes back as a tensor or a list
labels = np.asarray(tokenized['train']['label'])
weights = compute_class_weight(
    class_weight='balanced',
    classes=np.unique(labels),
    y=labels
)
print("Class weights:", weights)  # e.g. roughly [2.3, 0.64] when the negative class is rare

# Option 2: oversampling with imbalanced-learn
# pip install imbalanced-learn
from imblearn.over_sampling import RandomOverSampler
# (operates on feature matrices, not directly on tensors)

# Option 3: metrics suited to imbalance
from sklearn.metrics import classification_report
# Report macro F1 or the minority-class F1, not accuracy alone
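
The `'balanced'` setting follows a simple formula: each class weight is n_samples / (n_classes * count of that class). The sketch below applies it by hand to the 90/10 scenario described at the start of this section; `balanced_class_weights` is a hypothetical helper written for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """weight_c = n_samples / (n_classes * count_c), as in class_weight='balanced'."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in sorted(counts)}


# 90% negative (0), 10% positive (1)
imbalanced_labels = [0] * 900 + [1] * 100
class_weights = balanced_class_weights(imbalanced_labels)
print(class_weights)
```

With these weights, a mistake on the rare class costs about nine times more than a mistake on the common one, so the model can no longer minimize the loss by ignoring the minority.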

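4. Fine-Grained Sentiment: Aspect-Based (ABSA)

Binary (positive/negative) sentiment analysis does not capture the complexity of real opinions. A customer may be satisfied with the product but dissatisfied with the shipping. Aspect-Based Sentiment Analysis (ABSA) identifies the sentiment for each aspect mentioned.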

from transformers import pipeline

# Zero-shot classification for ABSA
# Note: facebook/bart-large-mnli was trained on English NLI data, so results
# on Italian text are only indicative
classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli"
)

review = "Il prodotto è eccellente ma la spedizione ha impiegato tre settimane. Il servizio clienti non ha risposto."

# Classify each aspect separately
aspects = ["prodotto", "spedizione", "servizio clienti"]
sentiments_per_aspect = {}

for aspect in aspects:
    # The aspect is baked into the template; the pipeline fills the single {}
    # placeholder with each candidate label in turn
    result = classifier(
        review,
        candidate_labels=["positivo", "negativo", "neutro"],
        hypothesis_template=f"In questa recensione, il giudizio su {aspect} è {{}}."
    )
    sentiments_per_aspect[aspect] = result['labels'][0]
    print(f"{aspect}: {result['labels'][0]} ({result['scores'][0]:.2f})")

# Illustrative output (actual labels and scores depend on the model):
# prodotto: positivo (0.89)
# spedizione: negativo (0.92)
# servizio clienti: negativo (0.87)
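
Under the hood, zero-shot classification turns each candidate label into a natural-language hypothesis and asks an NLI model how strongly the review entails it. In single-label mode, the entailment scores are then normalized with a softmax across the labels. The sketch below walks through that flow with made-up logits; `build_hypotheses`, the English template, and the logit values are assumptions for illustration, and no model is called.

```python
import math


def build_hypotheses(aspect, labels, template="The sentiment about {aspect} is {label}."):
    """One NLI hypothesis per candidate label."""
    return [template.format(aspect=aspect, label=label) for label in labels]


def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


candidate_labels = ["positive", "negative", "neutral"]
hypotheses = build_hypotheses("shipping", candidate_labels)

# Hypothetical entailment logits an NLI model might return, one per hypothesis
entailment_logits = [-1.2, 2.9, 0.3]
scores = softmax(entailment_logits)

for hypothesis, score in sorted(zip(hypotheses, scores), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {hypothesis}")
```

Because the template wording changes what the model is asked, it is worth trying a few phrasings per aspect and checking them against a handful of labeled reviews.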

5. Hard Cases: Irony, Negation, Ambiguity

BERT models handle many hard cases better than classical methods, but they are not infallible. Here are the most common cases and how to mitigate them.

5.1 Handling Negation

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english"
)

# Test on negation cases
negation_examples = [
    "This is not bad at all",        # negated negative = positive
    "I wouldn't say it's terrible",  # softening negation
    "Not the worst, but not great",  # ambiguous
    "Far from perfect",              # implicit negation
    "Could have been worse",         # negative wording, mildly positive meaning
]

for text in negation_examples:
    result = classifier(text)[0]
    print(f"'{text}'")
    print(f"  -> {result['label']} ({result['score']:.3f})\n")

# BERT usually handles "not bad" -> POSITIVE well
# but it can still fail on complex or indirect negations

5.2 Error Analysis

import pandas as pd
from sklearn.metrics import confusion_matrix, classification_report

def analyze_errors(texts, true_labels, predicted_labels, probs):
    """Detailed analysis of the model's errors."""
    results = pd.DataFrame({
        'text': texts,
        'true_label': true_labels,
        'pred_label': predicted_labels,
        'confidence': [max(p) for p in probs],
        'correct': [t == p for t, p in zip(true_labels, predicted_labels)]
    })

    # False positives: the model says POSITIVE but the label is NEGATIVE
    fp = results[(results['true_label'] == 0) & (results['pred_label'] == 1)]
    print(f"False positives ({len(fp)}):")
    for _, row in fp.head(5).iterrows():
        print(f"  Conf={row['confidence']:.2f}: {row['text'][:80]}")

    # False negatives: the model says NEGATIVE but the label is POSITIVE
    fn = results[(results['true_label'] == 1) & (results['pred_label'] == 0)]
    print(f"\nFalse negatives ({len(fn)}):")
    for _, row in fn.head(5).iterrows():
        print(f"  Conf={row['confidence']:.2f}: {row['text'][:80]}")

    # Confusion matrix (rows = true labels, columns = predictions)
    cm = confusion_matrix(true_labels, predicted_labels)
    print(f"\nConfusion matrix:\n{cm}")
    print(f"\nClassification Report:\n")
    print(classification_report(true_labels, predicted_labels,
                                target_names=['NEGATIVE', 'POSITIVE']))

    return results
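
The numbers in the classification report all derive from the counts of true positives, false positives, and false negatives. The sketch below computes them by hand and shows why accuracy alone can mislead: a classifier that always predicts the majority class on 90/10 data scores 90% accuracy while never finding a single positive. `binary_metrics` is a hypothetical helper written for this illustration.

```python
def binary_metrics(true_labels, predicted_labels, positive=1):
    """Accuracy, precision, recall and F1 for the `positive` class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# A classifier that always predicts the majority class (0) on 90/10 data
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100
majority_metrics = binary_metrics(y_true, y_pred)
print(majority_metrics)
```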

6. Deploying to Production with FastAPI

A sentiment analysis model is only valuable if it is accessible in production. Here is how to build a fast, scalable REST endpoint with FastAPI.

# sentiment_api.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, field_validator  # Pydantic v2; on v1 use `validator`
from transformers import pipeline
from typing import List
import time

app = FastAPI(title="Sentiment Analysis API", version="1.0")

# Load the model once at startup
MODEL_PATH = "./models/distilbert-sst2"
sentiment_pipeline = pipeline(
    "text-classification",
    model=MODEL_PATH,
    device=-1,              # -1 = CPU, 0 = first GPU
    batch_size=32,          # batched inference for efficiency
    truncation=True,
    max_length=128
)

class SentimentRequest(BaseModel):
    texts: List[str]

    @field_validator('texts')
    @classmethod
    def validate_texts(cls, texts):
        if not texts:
            raise ValueError("The list of texts cannot be empty")
        if len(texts) > 100:
            raise ValueError("At most 100 texts per request")
        for text in texts:
            if len(text) > 5000:
                raise ValueError("Text too long (max 5000 characters)")
        return texts

class SentimentResult(BaseModel):
    text: str
    label: str
    score: float
    processing_time_ms: float

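# Plain `def` (not `async def`) so FastAPI runs the blocking inference call
# in its thread pool instead of stalling the event loop
@app.post("/predict", response_model=List[SentimentResult])
def predict_sentiment(request: SentimentRequest):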
    start = time.time()
    try:
        results = sentiment_pipeline(request.texts)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

    elapsed = (time.time() - start) * 1000
    per_text = elapsed / len(request.texts)

    return [
        SentimentResult(
            text=text,
            label=r['label'],
            score=r['score'],
            processing_time_ms=per_text
        )
        for text, r in zip(request.texts, results)
    ]

@app.get("/health")
def health_check():
    return {"status": "ok", "model": MODEL_PATH}

# Run with: uvicorn sentiment_api:app --host 0.0.0.0 --port 8000
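
Because the endpoint rejects requests with more than 100 texts, a client holding a larger list has to split it first. The sketch below is one possible client-side helper; `chunk_texts` and `build_payloads` are hypothetical names, and each resulting payload would be sent as the JSON body of a separate POST to `/predict`.

```python
from typing import Iterator, List


def chunk_texts(texts: List[str], max_per_request: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices that respect the API's per-request limit."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    for start in range(0, len(texts), max_per_request):
        yield texts[start:start + max_per_request]


def build_payloads(texts: List[str]) -> List[dict]:
    """One JSON-serializable body per request, dropping empty strings first."""
    cleaned = [t for t in texts if t and t.strip()]
    return [{"texts": chunk} for chunk in chunk_texts(cleaned)]


payloads = build_payloads([f"review {i}" for i in range(250)] + ["", "   "])
print([len(p["texts"]) for p in payloads])
```

Filtering empty strings on the client avoids spending a request slot on inputs the model cannot classify meaningfully.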

7. Latency Optimization

In production, latency is often critical. Here are the main techniques for reducing inference time without losing too much quality.

7.1 Dynamic Quantization

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("./models/distilbert-sst2")
tokenizer = AutoTokenizer.from_pretrained("./models/distilbert-sst2")

# Dynamic quantization (INT8): shrinks the model and speeds up CPU inference
quantized_model = torch.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear},  # quantize only the Linear layers
    dtype=torch.qint8
)

# Size comparison
import os
def model_size(m):
    torch.save(m.state_dict(), "tmp.pt")
    size = os.path.getsize("tmp.pt") / (1024 * 1024)
    os.remove("tmp.pt")
    return size

print(f"Original model: {model_size(model):.1f} MB")
print(f"Quantized model: {model_size(quantized_model):.1f} MB")
# Only nn.Linear weights become INT8; the embedding matrices stay in FP32,
# so the file shrinks by less than the 4x that INT8 alone would suggest

# Speed benchmark
import time

def benchmark(m, tokenizer, texts, n_runs=50):
    inputs = tokenizer(texts, return_tensors='pt',
                      padding=True, truncation=True, max_length=128)
    with torch.no_grad():
        # Warm-up
        for _ in range(5):
            _ = m(**inputs)
        # Benchmark
        start = time.time()
        for _ in range(n_runs):
            _ = m(**inputs)
        elapsed = (time.time() - start) / n_runs * 1000
    return elapsed

texts = ["This product is amazing!"] * 8  # batch of 8
t_orig = benchmark(model, tokenizer, texts)
t_quant = benchmark(quantized_model, tokenizer, texts)
print(f"Original: {t_orig:.1f}ms, Quantized: {t_quant:.1f}ms")
print(f"Speedup: {t_orig/t_quant:.2f}x")
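
The arithmetic behind INT8 quantization is simple: each FP32 weight is divided by a scale, rounded to an integer between -127 and 127, and multiplied back by the scale at inference time. The sketch below shows a simplified symmetric, per-tensor version on made-up weights. `quantize_int8` and `dequantize` are hypothetical helpers, and PyTorch's actual implementation differs in its details (observers, packed kernels, dynamic activation scaling).

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: float -> int8 codes plus one scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate float weights."""
    return [c * scale for c in codes]


weights = [0.42, -1.27, 0.003, 0.9, -0.35]   # made-up FP32 weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

print("int8 codes:", codes)
print("max abs error:", max(abs(w - r) for w, r in zip(weights, restored)))
```

Each weight now needs one byte instead of four, and the rounding error is bounded by half the scale. That bound is why the quality loss is usually small for weight matrices whose values lie in a narrow range.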

7.2 ONNX Export for Deployment

from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
import time

# Convert to ONNX with HuggingFace Optimum
# pip install optimum[onnxruntime]
model_onnx = ORTModelForSequenceClassification.from_pretrained(
    "./models/distilbert-sst2",
    export=True,          # export to ONNX when loading
    provider="CPUExecutionProvider"
)
tokenizer = AutoTokenizer.from_pretrained("./models/distilbert-sst2")

# Inference with ONNX Runtime
text = "This product exceeded all my expectations!"
inputs = tokenizer(text, return_tensors='pt', truncation=True, max_length=128)

start = time.time()
outputs = model_onnx(**inputs)
latency = (time.time() - start) * 1000

import torch
probs = torch.softmax(outputs.logits, dim=-1)
label = model_onnx.config.id2label[probs.argmax().item()]
confidence = probs.max().item()

print(f"Label: {label}")
print(f"Confidence: {confidence:.3f}")
print(f"Latency: {latency:.1f}ms")
# ONNX Runtime is often 2-4x faster than PyTorch on CPU; measure on your own
# hardware, and note that this single timed call includes warm-up overhead

8. Complete Evaluation and Reporting

from sklearn.metrics import (
    classification_report,
    roc_auc_score,
    average_precision_score,
    confusion_matrix
)
import numpy as np
import torch

def evaluate_sentiment_model(model, tokenizer, test_texts, test_labels,
                              batch_size=64):
    """Full evaluation of a binary sentiment model."""
    model.eval()
    all_probs = []
    all_preds = []

    for i in range(0, len(test_texts), batch_size):
        batch = test_texts[i:i+batch_size]
        inputs = tokenizer(
            batch, return_tensors='pt', padding=True,
            truncation=True, max_length=128
        )
        with torch.no_grad():
            outputs = model(**inputs)
        probs = torch.softmax(outputs.logits, dim=-1).numpy()
        preds = np.argmax(probs, axis=1)
        all_probs.extend(probs[:, 1])  # probability of the positive class
        all_preds.extend(preds)

    # Main report
    print("=== Classification Report ===")
    print(classification_report(
        test_labels, all_preds,
        target_names=['NEGATIVE', 'POSITIVE'],
        digits=4
    ))

    # Additional metrics
    auc = roc_auc_score(test_labels, all_probs)
    ap = average_precision_score(test_labels, all_probs)
    print(f"AUC-ROC:            {auc:.4f}")
    print(f"Average Precision:  {ap:.4f}")

    # Accuracy by confidence band
    all_probs = np.array(all_probs)
    all_preds = np.array(all_preds)
    test_labels = np.array(test_labels)
    # Confidence is the probability of the predicted class,
    # not the probability of the positive class
    confidence = np.maximum(all_probs, 1 - all_probs)

    for threshold in [0.5, 0.7, 0.9]:
        high_conf = confidence >= threshold
        if high_conf.sum() > 0:
            acc_high = (all_preds[high_conf] == test_labels[high_conf]).mean()
            print(f"Accuracy (conf >= {threshold}): {acc_high:.4f} "
                  f"({high_conf.sum()} examples)")

    return all_probs, all_preds
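
AUC-ROC has a convenient interpretation: it is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. `roc_auc` is a hypothetical helper, and its pairwise loop is quadratic, so it is meant for understanding rather than for large test sets.

```python
def roc_auc(true_labels, positive_scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(true_labels, positive_scores) if y == 1]
    negatives = [s for y, s in zip(true_labels, positive_scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))


y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]   # predicted probability of the positive class
auc_value = roc_auc(y_true, y_score)
print(auc_value)
```

Unlike accuracy, this value does not depend on the 0.5 decision threshold, which makes it useful when the threshold will be tuned later.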

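9. Optimizing for Production: ONNX and Quantization

BERT models require substantial compute. For applications with low-latency requirements or limited hardware, several optimization strategies can cut inference time sharply with little loss of quality. The figures below are indicative and depend on the model and hardware.

Comparison of Optimization Strategies

Strategy	Latency reduction	Model size reduction	Quality loss	Complexity
ONNX export	2-4x	~10%	<0.1%	Low
Dynamic quantization (INT8)	2-3x	75%	0.5-1%	Low
Static quantization (INT8)	3-5x	75%	0.3-0.8%	Medium
DistilBERT (KD)	2x	40%	3%	Medium
TorchScript	1.5-2x	None	<0.1%	Low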

from transformers import AutoTokenizer, AutoModelForSequenceClassification
from optimum.onnxruntime import ORTModelForSequenceClassification
import torch
import time

# ---- ONNX export with Optimum ----
model_path = "./models/distilbert-sentiment"  # directory of your fine-tuned model

# Export to ONNX while loading
ort_model = ORTModelForSequenceClassification.from_pretrained(
    model_path,
    export=True,       # export to ONNX automatically
    provider="CPUExecutionProvider"
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Save the ONNX model
ort_model.save_pretrained("./models/distilbert-sentiment-onnx")

# ---- Benchmark: PyTorch vs ONNX ----
def benchmark_model(predict_fn, texts, n_runs=100):
    """Measure mean and percentile latency over up to n_runs single-text inferences."""
    # Warm-up
    for _ in range(10):
        predict_fn(texts[0])

    times = []
    for text in texts[:n_runs]:
        start = time.perf_counter()
        predict_fn(text)
        times.append((time.perf_counter() - start) * 1000)

    import numpy as np
    return {
        "mean_ms": round(np.mean(times), 2),
        "p50_ms":  round(np.percentile(times, 50), 2),
        "p95_ms":  round(np.percentile(times, 95), 2),
        "p99_ms":  round(np.percentile(times, 99), 2),
    }

# Load the original PyTorch model for comparison
pt_model = AutoModelForSequenceClassification.from_pretrained(model_path)
pt_model.eval()

def pt_predict(text):
    inputs = tokenizer(text, return_tensors='pt', truncation=True, max_length=128)
    with torch.no_grad():
        return pt_model(**inputs).logits

def onnx_predict(text):
    inputs = tokenizer(text, return_tensors='pt', truncation=True, max_length=128)
    return ort_model(**inputs).logits

test_texts = ["Prodotto ottimo, lo consiglio!"] * 100

pt_stats = benchmark_model(pt_predict, test_texts)
onnx_stats = benchmark_model(onnx_predict, test_texts)

print("PyTorch:  ", pt_stats)
print("ONNX:     ", onnx_stats)
print(f"Speedup: {pt_stats['p95_ms'] / onnx_stats['p95_ms']:.1f}x")

# Dynamic quantization with PyTorch (no calibration data needed)
import os
import torch

def quantize_bert_dynamic(model_path: str, output_path: str):
    """Dynamic INT8 quantization for CPU inference."""
    from transformers import AutoModelForSequenceClassification

    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    model.eval()

    # Quantize only the nn.Linear layers dynamically
    quantized = torch.quantization.quantize_dynamic(
        model,
        {torch.nn.Linear},
        dtype=torch.qint8
    )

    # Save the quantized weights (create the output directory if needed)
    os.makedirs(output_path, exist_ok=True)
    quantized_file = os.path.join(output_path, "quantized_model.pt")
    torch.save(quantized.state_dict(), quantized_file)

    # Size comparison: measure both files instead of assuming a fixed ratio
    original_size = sum(
        os.path.getsize(os.path.join(model_path, f))
        for f in os.listdir(model_path)
        if f.endswith(('.bin', '.safetensors'))
    ) / 1024 / 1024
    quantized_size = os.path.getsize(quantized_file) / 1024 / 1024

    print(f"Original model: ~{original_size:.0f} MB")
    print(f"Quantized model: ~{quantized_size:.0f} MB")

    return quantized

# Usage example
# quantized_model = quantize_bert_dynamic(
#     "./models/distilbert-sentiment",
#     "./models/quantized"
# )

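10. Production Best Practices

Anti-Pattern: Do Not Use a Raw Model Without Validation

A model trained on SST-2 (movie reviews) may perform poorly on technical support tickets or social media posts. Always validate the model on your specific domain before deploying.

Checklist for Production Deployment

Evaluate the model on target-domain data, not only on public benchmarks
Set a confidence threshold: return "uncertain" below the threshold (e.g. 0.6)
Monitor the distribution of confidence scores over time
Implement a feedback mechanism to collect incorrect labels
Version the model and tokenizer together
Test behavior on anomalous inputs (empty text, special characters, extreme lengths)
Implement rate limiting and timeouts for the API
Log all predictions for post-hoc analysis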

from transformers import pipeline

class ProductionSentimentClassifier:
    """Production-ready sentiment classifier."""

    def __init__(self, model_path: str, confidence_threshold: float = 0.7):
        self.pipeline = pipeline(
            "text-classification",
            model=model_path,
            truncation=True,
            max_length=128
        )
        self.threshold = confidence_threshold

    def _to_response(self, result: dict) -> dict:
        # Return UNCERTAIN instead of forcing a low-confidence label
        if result['score'] < self.threshold:
            return {
                "label": "UNCERTAIN",
                "score": result['score'],
                "raw_label": result['label'],
                "reason": "below_confidence_threshold"
            }

        return {
            "label": result['label'],
            "score": result['score'],
            "reason": "ok"
        }

    def predict(self, text: str) -> dict:
        # Input validation
        if not text or not text.strip():
            return {"label": "UNKNOWN", "score": 0.0, "reason": "empty_input"}

        text = text.strip()[:5000]  # cut overly long texts

        return self._to_response(self.pipeline(text)[0])

    def predict_batch(self, texts: list) -> list:
        # Mark empty texts with "" so every input keeps its position
        cleaned = [t.strip()[:5000] if t and t.strip() else "" for t in texts]
        # Run the model once, on the non-empty texts only
        non_empty = [t for t in cleaned if t]
        results = iter(self.pipeline(non_empty)) if non_empty else iter(())
        return [
            self._to_response(next(results)) if t
            else {"label": "UNKNOWN", "score": 0.0, "reason": "empty_input"}
            for t in cleaned
        ]
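
Choosing `confidence_threshold` is a trade-off between coverage (how many inputs get a definite answer) and accuracy on the answered inputs. One way to pick it is to sweep candidate thresholds over labeled validation predictions, as in the sketch below. `threshold_report` is a hypothetical helper and the confidence values are made up for illustration.

```python
def threshold_report(confidences, correct, thresholds=(0.5, 0.6, 0.7, 0.8, 0.9)):
    """For each threshold: share of answered inputs and accuracy on those."""
    rows = []
    for threshold in thresholds:
        kept = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
        coverage = len(kept) / len(confidences)
        accuracy = sum(kept) / len(kept) if kept else None
        rows.append({"threshold": threshold, "coverage": coverage,
                     "accuracy": accuracy})
    return rows


# Made-up validation results: model confidence and whether the label was right
confidences = [0.98, 0.95, 0.91, 0.85, 0.72, 0.66, 0.61, 0.55, 0.53, 0.51]
correct = [True, True, True, True, True, False, True, False, False, True]

report = threshold_report(confidences, correct)
for row in report:
    print(row)
```

In this toy data, raising the threshold from 0.5 to 0.7 lifts accuracy on answered inputs from 70% to 100%, at the cost of answering only half of them. The right point on that curve depends on what happens to the "UNCERTAIN" cases downstream, for example routing them to a human reviewer.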

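Conclusions and Next Steps

This article covered the full lifecycle of a sentiment analysis system: from classical approaches (VADER, TF-IDF) to fine-tuning Transformer models, and from handling imbalanced data to deploying with FastAPI and optimizing latency.

Key Takeaways

Choose the approach based on your requirements: VADER for speed, BERT for quality
Always evaluate on your specific domain, not only on benchmarks
Handle imbalanced classes with a weighted loss or oversampling
Use a confidence threshold in production instead of forcing a prediction
DistilBERT offers an excellent speed/quality trade-off for production
Monitor predictions over time to detect data drift

The Series Continues

Next: NLP for Italian (Feel-It, AlBERTo, and the challenges specific to Italian)
Article 5: Named Entity Recognition (extracting entities from text)
Article 6: Multi-Label Text Classification (when a text belongs to several categories)
Article 7: HuggingFace Transformers: A Complete Guide (Trainer API, Datasets, Hub)
Article 10: NLP Monitoring in Production (drift detection and automatic retraining)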