Federico Calò

Knowledge Graphs and AI: Integrating Structured Knowledge into LLMs

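Large language models are remarkably good at generating fluent text. But they face a fundamental limitation: the knowledge they contain is implicit, distributed across the model parameters, hard to update, and impossible to query in a structured way. A request such as "give me all the people who work at AI companies founded after 2020" is trivial for a knowledge graph, but an LLM cannot guarantee a correct answer to it.

Knowledge graphs (KGs) represent knowledge as a graph of entities and relationships: an explicit, queryable, updatable, and verifiable structure. Integrating KGs with LLMs, the GraphRAG paradigm, produces systems capable of structured reasoning that traditional RAG cannot offer. In this article we build a GraphRAG system with Neo4j, explore automatic graph extraction from text using LLMs, and look at how querying a knowledge graph can enhance a RAG system.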

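What You Will Learn

Knowledge graph fundamentals: nodes, relationships, properties, RDF, and property graphs
Neo4j: the data model, the Cypher query language, and the LangChain integration
Automatic KG extraction from unstructured text using LLMs
GraphRAG: combining graph traversal with vector search
KGs for RAG: enriching chunks with entity relationships
Multi-hop reasoning over knowledge graphs
Wikidata and public knowledge graphs for enrichment
Best practices for building KGs that stay maintainable in production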

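1. Knowledge Graph Fundamentals

A knowledge graph is a representation of knowledge as a graph in which nodes represent entities (people, organizations, concepts, events) and edges represent the relationships between them. Each triple (subject, predicate, object) encodes one fact.

Structure of a Knowledge Graph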


KNOWLEDGE GRAPH: example for a business domain

ENTITIES (Nodes):
  Person: "Luca Rossi" (name, role="CEO", birthYear=1975)
  Company: "TechCorp" (name, founded=2010, sector="AI")
  Product: "AIAnalytics" (name, version="2.0", category="software")
  Technology: "Python" (name, type="language")

RELATIONSHIPS (Edges):
  (Luca Rossi) --[WORKS_AT]--> (TechCorp)
  (Luca Rossi) --[FOUNDED]--> (TechCorp)
  (TechCorp) --[DEVELOPS]--> (AIAnalytics)
  (AIAnalytics) --[USES_TECHNOLOGY]--> (Python)
  (TechCorp) --[COMPETES_WITH]--> (AICorp)

RDF TRIPLES:
  ("TechCorp", "rdf:type", "Company")
  ("TechCorp", "schema:foundingDate", "2010")
  ("Luca Rossi", "schema:worksFor", "TechCorp")

PROPERTY GRAPH (Neo4j):
  (:Person {name: "Luca Rossi", role: "CEO"})
    -[:WORKS_AT {since: 2010, equity: true}]->
  (:Company {name: "TechCorp", sector: "AI"})

ADVANTAGES of knowledge graphs over relational tables:
1. Flexibility: add relationships without modifying the schema
2. Navigability: natural multi-hop traversal
3. Reasoning: inference of new relationships
4. Semantics: relationships carry explicit meaning
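
To make the triple model concrete, here is a minimal sketch in plain Python (no database involved) that stores the example relationships above as (subject, predicate, object) tuples and answers a two-hop question. The `TRIPLES` list and both helper functions are illustrative names, not part of any library.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# The example relationships above, as (subject, predicate, object) triples
TRIPLES: List[Triple] = [
    ("Luca Rossi", "WORKS_AT", "TechCorp"),
    ("Luca Rossi", "FOUNDED", "TechCorp"),
    ("TechCorp", "DEVELOPS", "AIAnalytics"),
    ("AIAnalytics", "USES_TECHNOLOGY", "Python"),
    ("TechCorp", "COMPETES_WITH", "AICorp"),
]


def objects_of(subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` through `predicate`."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def products_built_where_person_works(person: str) -> List[str]:
    """Two-hop traversal: person -WORKS_AT-> company -DEVELOPS-> product."""
    products: List[str] = []
    for company in objects_of(person, "WORKS_AT"):
        products.extend(objects_of(company, "DEVELOPS"))
    return products


print(products_built_where_person_works("Luca Rossi"))  # ['AIAnalytics']
```

A graph database does the same kind of hop-by-hop navigation, but with indexes and a query language instead of list scans.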

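1.1 RDF vs. Property Graph

There are two main knowledge graph models, each with different trade-offs.

RDF vs. Property Graph

Dimension	RDF/SPARQL	Property Graph (Neo4j)
Model	Standardized triples (S, P, O)	Nodes and edges with arbitrary properties
Standard	W3C standard, interoperable	Proprietary, but more flexible
Query language	SPARQL (complex)	Cypher (more readable)
Properties on relationships	Complex (reification)	Native and simple
AI ecosystem	Wikidata, DBpedia, Schema.org	Neo4j (LangChain integration)
When to use	Open data, interoperability	AI applications, GraphRAG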
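
The difference in how the two models attach properties to a relationship is easiest to see side by side. The sketch below models the same fact, that Luca Rossi has worked at TechCorp since 2010, first as a property-graph edge and then as reified RDF-style triples. It uses plain Python data structures only; `ex:stmt1` and `ex:since` are made-up identifiers for illustration, while `rdf:Statement`, `rdf:subject`, `rdf:predicate`, and `rdf:object` are the standard RDF reification vocabulary.

```python
from typing import Dict, List, Tuple

# Property graph: the edge itself carries the property
edge = {
    "from": "Luca Rossi",
    "type": "WORKS_AT",
    "to": "TechCorp",
    "properties": {"since": 2010},
}

# RDF reification: the statement becomes a node (ex:stmt1, hypothetical)
# so that further triples can describe it
reified: List[Tuple[str, str, str]] = [
    ("ex:stmt1", "rdf:type", "rdf:Statement"),
    ("ex:stmt1", "rdf:subject", "Luca Rossi"),
    ("ex:stmt1", "rdf:predicate", "schema:worksFor"),
    ("ex:stmt1", "rdf:object", "TechCorp"),
    ("ex:stmt1", "ex:since", "2010"),
]


def since_from_edge(e: dict) -> int:
    """Property graph: one dictionary lookup on the edge."""
    return e["properties"]["since"]


def since_from_reified(triples: List[Tuple[str, str, str]], subject: str, obj: str) -> int:
    """RDF: find the statement node linking subject and obj, then read ex:since."""
    by_statement: Dict[str, Dict[str, str]] = {}
    for s, p, o in triples:
        by_statement.setdefault(s, {})[p] = o
    for stmt in by_statement.values():
        if stmt.get("rdf:subject") == subject and stmt.get("rdf:object") == obj:
            return int(stmt["ex:since"])
    raise KeyError("no matching statement")
```

Five triples and a grouping step are needed where the property graph needs a single edge, which is the extra complexity the comparison refers to.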

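2. Neo4j: Setup and the Cypher Query Language

Neo4j is one of the most widely used graph databases for AI applications, thanks in part to its strong LangChain integration. Its query language, Cypher, uses an intuitive ASCII-art syntax to express graph patterns.

Neo4j Setup and Basic Cypher Queries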


from neo4j import GraphDatabase
from typing import List, Dict, Any, Optional
import os

class Neo4jKnowledgeGraph:
    """Python interface to a Neo4j knowledge graph"""

    def __init__(
        self,
        uri: str = "bolt://localhost:7687",
        user: str = "neo4j",
        password: str = "password"
    ):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def close(self):
        self.driver.close()

    def execute_query(self, query: str, parameters: Optional[dict] = None) -> List[Dict]:
        """Run a Cypher query and return the results as a list of dicts"""
        with self.driver.session() as session:
            result = session.run(query, parameters or {})
            return [record.data() for record in result]

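    def create_entity(self, label: str, properties: Dict) -> Optional[int]:
        """Create a node with the given label and properties"""
        # CREATE always adds a new node; use upsert_entity (MERGE) to avoid duplicates.
        # Labels cannot be passed as Cypher parameters, so `label` is interpolated
        # into the query text: only pass labels from a trusted set.
        props_str = ", ".join(f"{k}: ${k}" for k in properties.keys())
        # Note: id() is deprecated in Neo4j 5 in favor of elementId()
        query = f"CREATE (n:{label} {{{props_str}}}) RETURN id(n) as id"
        result = self.execute_query(query, properties)
        return result[0]["id"] if result else None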

    def create_relationship(
        self,
        from_label: str, from_props: Dict,
        rel_type: str, rel_props: Dict,
        to_label: str, to_props: Dict
    ):
        """Create a relationship between two existing nodes"""
        from_match = " AND ".join(f"a.{k} = $from_{k}" for k in from_props)
        to_match = " AND ".join(f"b.{k} = $to_{k}" for k in to_props)
        rel_props_str = ", ".join(f"{k}: $rel_{k}" for k in rel_props) if rel_props else ""

        params = {
            **{f"from_{k}": v for k, v in from_props.items()},
            **{f"to_{k}": v for k, v in to_props.items()},
            **{f"rel_{k}": v for k, v in rel_props.items()}
        }

        query = f"""
MATCH (a:{from_label}) WHERE {from_match}
MATCH (b:{to_label}) WHERE {to_match}
MERGE (a)-[r:{rel_type} {{{rel_props_str}}}]->(b)
RETURN type(r) as rel_type"""

        return self.execute_query(query, params)

    def upsert_entity(self, label: str, match_props: Dict, set_props: Optional[Dict] = None):
        """Upsert: create the node if it does not exist, update it if it does"""
        match_str = ", ".join(f"{k}: ${k}" for k in match_props)
        query = f"MERGE (n:{label} {{{match_str}}})"

        params = dict(match_props)
        if set_props:
            set_str = ", ".join(f"n.{k} = $set_{k}" for k in set_props)
            query += f" ON CREATE SET {set_str} ON MATCH SET {set_str}"
            params.update({f"set_{k}": v for k, v in set_props.items()})

        query += " RETURN n"
        return self.execute_query(query, params)


# Examples of more advanced Cypher queries
CYPHER_EXAMPLES = {
    # Find all AI companies founded after 2020
    "aziende_recenti": """
MATCH (c:Company {sector: 'AI'})
WHERE c.founded > 2020
RETURN c.name, c.founded
ORDER BY c.founded DESC""",

    # Find the shortest path between two people (degrees of separation)
    "percorso_sociale": """
MATCH path = shortestPath(
  (p1:Person {name: $person1})-[*..6]-(p2:Person {name: $person2})
)
RETURN path, length(path) as degrees""",

    # Find the entities directly connected to a company (its one-hop neighborhood)
    "entita_correlate": """
MATCH (n:Company)-[r]-(related)
WHERE n.name = $company_name
RETURN related, type(r), n
LIMIT 50""",

    # Multi-hop: products developed by companies that compete with X
    "prodotti_competitor": """
MATCH (c1:Company)-[:COMPETES_WITH]->(c2:Company)
WHERE c1.name = $company_name
MATCH (c2)-[:DEVELOPS]->(p:Product)
RETURN DISTINCT p.name, p.category, c2.name as developed_by"""
}
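
One caveat about the class above: Cypher parameters can carry property values, but not node labels or relationship types, which is why `create_entity` and `create_relationship` interpolate those names into the query string. If the names ever come from an LLM or from user input, it is worth validating them first. The following is a minimal sketch; the `ALLOWED_LABELS` set and the `safe_identifier` helper are assumptions made for illustration, not part of the Neo4j driver.

```python
import re
from typing import Optional, Set

# Hypothetical whitelist: the node labels used in this article's ontology
ALLOWED_LABELS: Set[str] = {
    "Person", "Company", "Product", "Technology",
    "Location", "Event", "Concept",
}

# Letters, digits and underscores only, not starting with a digit
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_identifier(name: str, allowed: Optional[Set[str]] = None) -> str:
    """Return `name` if it is safe to interpolate into Cypher, else raise ValueError."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"Invalid Cypher identifier: {name!r}")
    if allowed is not None and name not in allowed:
        raise ValueError(f"Identifier not in the ontology: {name!r}")
    return name
```

A call such as `safe_identifier(label, ALLOWED_LABELS)` could then wrap every label before it reaches an f-string query.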

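3. Automatic Knowledge Graph Extraction from Text

Building a knowledge graph by hand is expensive. With modern LLMs, we can automatically extract entities and relationships from unstructured text and populate the graph semi-automatically.

KG Extraction with an LLM and LangChain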


from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field
from typing import Dict, List, Optional

# Neo4jKnowledgeGraph is the class defined in the previous listing


# Schema for structured extraction
class Entity(BaseModel):
    """An entity extracted from the text"""
    name: str = Field(description="Name of the entity")
    entity_type: str = Field(description="Type: Person, Company, Product, Technology, Location, Event, Concept")
    description: Optional[str] = Field(description="Short description of the entity", default=None)
    properties: dict = Field(description="Additional properties (e.g. founded_year, role)", default_factory=dict)


class Relationship(BaseModel):
    """A relationship between two entities"""
    source: str = Field(description="Name of the source entity")
    target: str = Field(description="Name of the target entity")
    relationship_type: str = Field(description="Relationship type (e.g. WORKS_AT, FOUNDED, COMPETES_WITH)")
    properties: dict = Field(description="Properties of the relationship (e.g. since_year)", default_factory=dict)


class KnowledgeGraphExtraction(BaseModel):
    """Result of extracting a knowledge graph from text"""
    entities: List[Entity] = Field(description="Entities extracted from the text")
    relationships: List[Relationship] = Field(description="Relationships between entities")


class LLMKnowledgeGraphExtractor:
    """Extracts a knowledge graph from text using an LLM"""

    def __init__(self, model: str = "gpt-4o-mini"):
        llm = ChatOpenAI(model=model, temperature=0)
        self.structured_llm = llm.with_structured_output(KnowledgeGraphExtraction)

        self.extraction_prompt = ChatPromptTemplate.from_template("""
Extract the entities and relationships from the following text to build a knowledge graph.

Entity types to extract: Person, Company, Product, Technology, Location, Event, Concept
Common relationship types: WORKS_AT, FOUNDED, DEVELOPS, USES, COMPETES_WITH, PART_OF,
  LOCATED_IN, ACQUIRED_BY, INVESTED_IN, AUTHORED_BY

Text to analyze:
{text}

Extract ALL entities and relationships mentioned, including implicit ones.
For properties, extract only those explicitly mentioned in the text.""")

    def extract(self, text: str) -> KnowledgeGraphExtraction:
        """Extract entities and relationships from a text"""
        return self.structured_llm.invoke(
            self.extraction_prompt.format_messages(text=text)
        )

    def extract_and_store(
        self,
        text: str,
        neo4j_kg: Neo4jKnowledgeGraph,
        source_metadata: Optional[Dict] = None
    ) -> dict:
        """Extract from the text and store the result directly in Neo4j"""
        extraction = self.extract(text)

        stored_entities = 0
        stored_relationships = 0

        # Store entities
        for entity in extraction.entities:
            props = {
                "name": entity.name,
                **(entity.properties or {}),
            }
            if entity.description:
                props["description"] = entity.description
            if source_metadata:
                props["source"] = source_metadata.get("source", "")

            neo4j_kg.upsert_entity(
                label=entity.entity_type,
                match_props={"name": entity.name},
                set_props=props
            )
            stored_entities += 1

        # Store relationships
        for rel in extraction.relationships:
            # Check that both endpoints were extracted before creating the relationship
            source_entity = next(
                (e for e in extraction.entities if e.name == rel.source), None
            )
            target_entity = next(
                (e for e in extraction.entities if e.name == rel.target), None
            )

            if source_entity and target_entity:
                neo4j_kg.create_relationship(
                    from_label=source_entity.entity_type,
                    from_props={"name": rel.source},
                    rel_type=rel.relationship_type,
                    rel_props=rel.properties or {},
                    to_label=target_entity.entity_type,
                    to_props={"name": rel.target}
                )
                stored_relationships += 1

        return {
            "entities_found": len(extraction.entities),
            "relationships_found": len(extraction.relationships),
            "entities_stored": stored_entities,
            "relationships_stored": stored_relationships
        }


# Usage example
extractor = LLMKnowledgeGraphExtractor()
kg = Neo4jKnowledgeGraph()

text = """
OpenAI, founded by Sam Altman and Elon Musk in 2015, developed GPT-4 and ChatGPT.
The company received a 10-billion-dollar investment from Microsoft in 2023.
Anthropic, founded by former OpenAI employees including Dario Amodei, develops Claude,
a model that competes directly with ChatGPT.
"""

result = extractor.extract_and_store(text, kg, {"source": "news_article.txt"})
print(f"Extracted: {result['entities_found']} entities, {result['relationships_found']} relationships")
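
Because LLM extraction is imperfect, a cheap first line of defense is to check the output against the ontology before anything is written to Neo4j. The sketch below works on plain dicts shaped like the `Entity` and `Relationship` models above, so it runs without an LLM or a database; `ALLOWED_ENTITY_TYPES`, `ALLOWED_RELATIONSHIPS`, and `validate_extraction` are hypothetical names, and the allowed values simply mirror the lists in the extraction prompt.

```python
from typing import Dict, List, Tuple

ALLOWED_ENTITY_TYPES = {
    "Person", "Company", "Product", "Technology", "Location", "Event", "Concept",
}
ALLOWED_RELATIONSHIPS = {
    "WORKS_AT", "FOUNDED", "DEVELOPS", "USES", "COMPETES_WITH", "PART_OF",
    "LOCATED_IN", "ACQUIRED_BY", "INVESTED_IN", "AUTHORED_BY",
}


def validate_extraction(
    entities: List[Dict], relationships: List[Dict]
) -> Tuple[List[Dict], List[Dict], List[str]]:
    """Split an extraction into valid entities, valid relationships and a list of problems."""
    problems: List[str] = []

    valid_entities = []
    for e in entities:
        if e["entity_type"] in ALLOWED_ENTITY_TYPES:
            valid_entities.append(e)
        else:
            problems.append(f"unknown entity type {e['entity_type']!r} for {e['name']!r}")

    known_names = {e["name"] for e in valid_entities}
    valid_relationships = []
    for r in relationships:
        if r["relationship_type"] not in ALLOWED_RELATIONSHIPS:
            problems.append(f"unknown relationship type {r['relationship_type']!r}")
        elif r["source"] not in known_names or r["target"] not in known_names:
            problems.append(f"dangling relationship {r['source']!r} -> {r['target']!r}")
        else:
            valid_relationships.append(r)

    return valid_entities, valid_relationships, problems
```

The `problems` list is a natural input for a human review queue: anything rejected here can be inspected instead of being silently dropped or stored.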

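4. GraphRAG: Combining Graph and Vector Search

GraphRAG is a paradigm that combines traditional semantic search (vector search) with knowledge graph traversal. For questions that require reasoning over the relationships between entities, GraphRAG can significantly outperform traditional RAG.

GraphRAG with LangChain and Neo4j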


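# Import paths depend on the LangChain version: recent releases provide
# Neo4jGraph and GraphCypherQAChain in the separate langchain-neo4j package
from langchain_community.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain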
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate


class GraphRAGSystem:
    """
    GraphRAG system that combines:
    1. Vector retrieval for semantic questions
    2. Cypher queries on Neo4j for structured questions
    3. An LLM to synthesize both sources
    """

    def __init__(self, neo4j_url: str, username: str, password: str, vector_retriever):
        self.llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
        self.vector_retriever = vector_retriever

        # Neo4j connection for LangChain
        self.graph = Neo4jGraph(
            url=neo4j_url,
            username=username,
            password=password
        )

        # Chain that generates and runs Cypher queries automatically
        self.cypher_chain = GraphCypherQAChain.from_llm(
            cypher_llm=ChatOpenAI(model="gpt-4o-mini", temperature=0),
            qa_llm=self.llm,
            graph=self.graph,
            verbose=True,
            return_intermediate_steps=True,
            allow_dangerous_requests=True  # Explicit opt-in: LLM-generated Cypher runs against the database
        )

        # Router that decides which source to use
        self.router_chain = (
            ChatPromptTemplate.from_template("""
Analyze this question and decide on the best retrieval strategy.

Question: {question}

Choose ONE strategy:
- "graph": the question requires relationships between entities, counts, paths, or specific attributes
- "vector": the question requires explanations, concepts, procedures, or narrative text
- "hybrid": the question benefits from both sources

Reply ONLY with: graph, vector, or hybrid""")
            | self.llm
        )

    def _classify_query(self, question: str) -> str:
        """Classify the query type"""
        result = self.router_chain.invoke({"question": question})
        strategy = result.content.strip().lower()
        return strategy if strategy in ["graph", "vector", "hybrid"] else "vector"

    def query(self, question: str) -> dict:
        """Answer the question using the selected strategy"""
        strategy = self._classify_query(question)
        print(f"Selected strategy: {strategy}")

        graph_context = ""
        vector_context = ""

        if strategy in ["graph", "hybrid"]:
            try:
                # Generate and run a Cypher query automatically
                graph_result = self.cypher_chain.invoke({"query": question})
                graph_context = str(graph_result.get("result", ""))
            except Exception as e:
                graph_context = f"Graph query error: {e}"

        if strategy in ["vector", "hybrid"]:
            docs = self.vector_retriever.invoke(question)
            vector_context = "\n".join(d.page_content for d in docs[:3])

        # Final synthesis. The context sections are built outside the f-string because
        # backslashes inside f-string expressions are a SyntaxError before Python 3.12.
        sections = []
        if graph_context:
            sections.append(f"Data from the knowledge graph:\n{graph_context}\n")
        if vector_context:
            sections.append(f"Relevant documents:\n{vector_context}\n")
        context_block = "\n".join(sections)

        synthesis_prompt = f"""Question: {question}

{context_block}

Answer thoroughly, based on the information available."""

        final_answer = self.llm.invoke(synthesis_prompt).content

        return {
            "answer": final_answer,
            "strategy": strategy,
            "graph_context": graph_context,
            "vector_context": vector_context[:200] if vector_context else ""
        }


# Examples that show where GraphRAG helps
graph_rag_examples = [
    # Structural question: better answered by the graph
    "How many AI companies were founded after 2020?",

    # Relational question: better answered by the graph
    "Who are the people working for companies that compete with OpenAI?",

    # Semantic question: better answered by vector search
    "How does the attention mechanism in transformers work?",

    # Hybrid question: benefits from both sources
    "What is Anthropic's product development strategy?"
]
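
The router relies on the model replying with exactly one word. In practice a reply may carry punctuation, quotes, or a short sentence, in which case `_classify_query` quietly falls back to `vector`. A slightly more tolerant parser could look like the sketch below; `parse_strategy` is a hypothetical helper, not part of LangChain.

```python
import re

STRATEGIES = ("graph", "vector", "hybrid")


def parse_strategy(raw: str, default: str = "vector") -> str:
    """Return the first known strategy name found in a free-form LLM reply."""
    # Split the reply into lowercase alphabetic words, dropping punctuation and quotes
    for word in re.findall(r"[a-z]+", raw.lower()):
        if word in STRATEGIES:
            return word
    return default
```

This keeps the same safe default while accepting replies such as `Graph.` or `"hybrid"`.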

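5. Knowledge Graphs for RAG Enrichment

Even without a full GraphRAG setup, a knowledge graph can significantly enrich a traditional RAG system: by expanding queries with related entities, filtering documents by relevant relationships, or adding context to the retrieved chunks.

KG-Enhanced RAG: Query Expansion via the Graph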


class KGEnhancedRetriever:
    """
    Retriever that uses the knowledge graph to expand queries
    with related entities before the vector search.
    Relies on List (typing) and Neo4jKnowledgeGraph from the earlier listings.
    """

    def __init__(self, kg: Neo4jKnowledgeGraph, vector_retriever, llm):
        self.kg = kg
        self.retriever = vector_retriever
        self.llm = llm

    def extract_entities_from_query(self, query: str) -> List[str]:
        """Extract entity names from the query by prompting the LLM (a lightweight form of NER)"""
        prompt = f"""Extract the names of entities (people, organizations, products, technologies)
from the following query. Return only the names, one per line.

Query: {query}"""

        result = self.llm.invoke(prompt).content
        entities = [e.strip() for e in result.split('\n') if e.strip()]
        return entities

    def get_related_entities(self, entity_name: str, max_hops: int = 2) -> List[str]:
        """Get the entities related to entity_name in the graph"""
        query = f"""
MATCH (n)-[*1..{max_hops}]-(related)
WHERE n.name CONTAINS $entity_name
RETURN DISTINCT related.name as name
LIMIT 20"""

        results = self.kg.execute_query(query, {"entity_name": entity_name})
        return [r["name"] for r in results if r["name"]]

    def enhanced_retrieve(self, query: str, top_k: int = 5) -> list:
        """
        Retrieve documents with KG-based query expansion.
        1. Extract entities from the query
        2. Find related entities in the graph
        3. Expand the query with the related entities
        4. Run the vector search on the expanded query
        """
        # Step 1: Extract entities from the query
        entities = self.extract_entities_from_query(query)
        print(f"Entities found: {entities}")

        # Step 2: Find related entities
        all_related = set()
        for entity in entities[:3]:  # Limit to 3 entities
            related = self.get_related_entities(entity)
            all_related.update(related[:5])  # At most 5 related entities each

        # Step 3: Expand the query
        if all_related:
            expansion = ", ".join(list(all_related)[:10])
            expanded_query = f"{query} [Related entities: {expansion}]"
            print(f"Query expanded with: {expansion}")
        else:
            expanded_query = query

        # Step 4: Vector search on the expanded query
        docs = self.retriever.invoke(expanded_query)
        return docs[:top_k]

    def get_entity_context(self, entity_name: str) -> str:
        """Get the structured context of an entity from the graph"""
        query = """
MATCH (n {name: $name})
OPTIONAL MATCH (n)-[r]->(related)
RETURN n, type(r) as rel_type, related.name as related_name
LIMIT 20"""

        results = self.kg.execute_query(query, {"name": entity_name})
        if not results:
            return ""

        lines = [f"Entity: {entity_name}"]
        for r in results:
            if r.get("rel_type") and r.get("related_name"):
                lines.append(f"  -> {r['rel_type']}: {r['related_name']}")

        return "\n".join(lines)
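
To see what the query-expansion step does without a running Neo4j instance, the sketch below reproduces the idea on an in-memory adjacency map: collect the neighbors within `max_hops` and append them to the query. The `GRAPH` data and both function names are illustrative assumptions, and a breadth-first search stands in for the variable-length Cypher pattern.

```python
from collections import deque
from typing import Dict, List, Set

# Undirected toy graph built from the entities in the extraction example
GRAPH: Dict[str, Set[str]] = {
    "OpenAI": {"ChatGPT", "GPT-4", "Microsoft"},
    "ChatGPT": {"OpenAI", "Claude"},
    "GPT-4": {"OpenAI"},
    "Microsoft": {"OpenAI"},
    "Claude": {"ChatGPT", "Anthropic"},
    "Anthropic": {"Claude"},
}


def related_entities(start: str, max_hops: int = 2) -> List[str]:
    """Breadth-first search: entities reachable within max_hops of start."""
    seen = {start}
    queue = deque([(start, 0)])
    found: List[str] = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in sorted(GRAPH.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, depth + 1))
    return found


def expand_query(query: str, entity: str, max_hops: int = 2) -> str:
    """Append the related entities to the query text, if there are any."""
    related = related_entities(entity, max_hops)
    if not related:
        return query
    return f"{query} [Related entities: {', '.join(related)}]"
```

With `max_hops=2`, a query about OpenAI picks up Claude (through ChatGPT) but not Anthropic, which sits three hops away; the hop limit is the main knob for trading recall against noise.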

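6. Best Practices and Anti-Patterns

Best Practices for Knowledge Graphs in AI

Start small and iterate: Don't try to build the perfect KG from the start. Begin with the entities and relationships that matter most for your use case, then expand.
Define a clear ontology: Define node and relationship types before you start. A poor ontology is hard to change once the graph is populated.
Validate automatic extraction: LLMs make extraction errors. Put a human validation process in place for critical data, especially early on.
Use MERGE, not CREATE: In Neo4j, always use MERGE for entities to avoid duplicates. CREATE always creates a new node, even if one already exists.
Index the properties you search on: Create Neo4j indexes on properties used in WHERE clauses (e.g. name, date). Without indexes, queries on large graphs become slow.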
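
For the last two points, uniqueness constraints and indexes are usually created once, at deployment time. The sketch below only builds the Cypher statements as strings; the `IF NOT EXISTS` and `REQUIRE` forms assume a recent Neo4j version (5.x), and the helper names and the choice of labels are illustrative.

```python
from typing import List


def unique_name_constraint(label: str) -> str:
    """Cypher for a uniqueness constraint on `name` (this also creates a backing index)."""
    name = f"{label.lower()}_name_unique"
    return (
        f"CREATE CONSTRAINT {name} IF NOT EXISTS "
        f"FOR (n:{label}) REQUIRE n.name IS UNIQUE"
    )


def property_index(label: str, prop: str) -> str:
    """Cypher for an index on a property that appears in WHERE clauses."""
    name = f"{label.lower()}_{prop}_idx"
    return (
        f"CREATE INDEX {name} IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.{prop})"
    )


SCHEMA_STATEMENTS: List[str] = [
    unique_name_constraint("Person"),
    unique_name_constraint("Company"),
    property_index("Company", "founded"),
]

for statement in SCHEMA_STATEMENTS:
    print(statement)
```

Each statement could be passed to an `execute_query`-style helper. A uniqueness constraint on `name` also guards against duplicates created by concurrent MERGE operations.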

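Anti-Patterns to Avoid

KG as a complete replacement for RAG: GraphRAG is powerful but has a high setup cost. For purely semantic questions, traditional RAG is often better and cheaper.
Overly generic relationships: A "RELATED_TO" relationship carries no information. Relationships should be semantically precise (FOUNDED, WORKS_AT, COMPETES_WITH).
No update strategy: A static KG quickly becomes obsolete. Define from the start how, and how often, the graph is updated.
Cypher queries generated without validation: LLM-generated Cypher queries can be dangerous (accidental deletes, performance problems). Use parameterized template queries where possible.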
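
As a complement to the last point, a simple guard can reject generated Cypher that contains write clauses before it reaches the database. This is a keyword-based sketch, not a parser: it errs on the side of rejecting (a keyword inside a string literal also triggers it) and it does not inspect procedures invoked with CALL, so a database user with read-only permissions remains the more robust safeguard. `is_read_only` is a hypothetical helper.

```python
import re

# Clauses that modify data or schema; matched as whole words, case-insensitively
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|FOREACH|LOAD\s+CSV)\b",
    re.IGNORECASE,
)


def is_read_only(cypher: str) -> bool:
    """Return True if the query contains none of the known write clauses."""
    return WRITE_CLAUSES.search(cypher) is None


generated = "MATCH (c:Company {sector: 'AI'}) WHERE c.founded > 2020 RETURN c.name"
if is_read_only(generated):
    print("query accepted")
```

Queries that fail the check can be logged and answered from the vector store instead of being executed.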

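Conclusion

Knowledge graphs bring capabilities that traditional RAG systems cannot offer: structured, relational, and verifiable knowledge. GraphRAG combines the best of both worlds: the flexibility of semantic search with the precision of structured reasoning over a graph. We covered Neo4j, automatic KG extraction with LLMs, GraphRAG with LangChain, and enriching traditional RAG with a graph.

Key takeaways:

KGs represent knowledge as entities and relationships: an explicit, queryable structure.
LLMs make it possible to extract KGs automatically from unstructured text.
GraphRAG outperforms traditional RAG on relational and multi-hop queries.
Query expansion with a KG improves the recall of traditional RAG.
Start with a simple ontology and expand it iteratively.

This concludes the series AI Engineering and Advanced RAG. We covered the full stack: from RAG fundamentals to embeddings, from vector databases to hybrid search, up to advanced systems such as multi-agent setups and knowledge graphs. The field is evolving quickly, so keep following the blog for updates.

Continue with the related series: pgVector for RAG on PostgreSQL and Modern NLP with BERT.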
