Knowledge Base

Knowledge Graph Completion

Predicting missing links in knowledge graphs.


Knowledge graph completion predicts missing links in knowledge graphs — for example, inferring (Einstein, field_of_work, Physics) from known facts. Embedding methods (TransE, RotatE, ComplEx) and GNN-based approaches dominate, with LLMs increasingly used for text-enhanced KG completion on open-domain knowledge graphs.

History

2013

TransE models relations as translations in embedding space: h + r ≈ t

2016

ComplEx uses complex-valued embeddings to handle symmetric and antisymmetric relations

2017

ConvE applies 2D convolutions over reshaped entity-relation embeddings

2019

RotatE models relations as rotations in complex space, with strong theoretical properties; KG-BERT applies pretrained language models to text-enhanced KG completion

2020

CompGCN applies GNNs to knowledge graphs, jointly embedding entities and relations

2021

NodePiece reduces entity embedding memory by composing entities from relation anchors; StAR combines structural and textual triple encodings

2022

SimKGC uses contrastive learning over text encoders, bridging embedding and text-based approaches

2023

ChatGPT-based studies suggest LLMs have implicit knowledge graph capabilities

2025

LLM-KG hybrid systems combine parametric knowledge with structured graph reasoning

How Knowledge Graph Completion Works

Knowledge Graph Completion Pipeline
1

Entity and Relation Embedding

Each entity and relation in the KG is assigned a learned vector in a continuous embedding space.

2

Scoring Function

A scoring function evaluates the plausibility of a triple (h, r, t): TransE scores by the distance ‖h + r − t‖ (lower means more plausible), while RotatE uses ‖h ∘ r − t‖, where ∘ is element-wise rotation in complex space.

3

Training

The model is trained to score true triples higher than corrupted (negative-sampled) triples using margin-based or cross-entropy losses.

4

Prediction

For a query (Einstein, field_of_work, ?), all candidate tail entities are scored, and top-K predictions are returned.

5

Ranking Evaluation

Performance is measured by Mean Reciprocal Rank (MRR) and Hits@K — how well the model ranks correct completions.
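The five steps above can be sketched end to end with TransE scoring in NumPy. The entity and relation names, the toy triples, and all dimensions and values below are illustrative assumptions, not a real KG or a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
entities = ["Einstein", "Physics", "Curie", "Chemistry"]
relations = ["field_of_work"]
dim = 8

# 1. Entity and relation embedding: one learned vector per entity / relation.
E = {e: rng.normal(size=dim) for e in entities}
R = {r: rng.normal(size=dim) for r in relations}

def score(h, r, t):
    """TransE plausibility: negative distance ||h + r - t|| (higher = better)."""
    return -np.linalg.norm(E[h] + R[r] - E[t])

# 2./3. Margin-based training pushes true triples above corrupted ones:
#   loss = max(0, margin - score(true) + score(corrupted))
margin = 1.0
true_triple = ("Einstein", "field_of_work", "Physics")
corrupt_triple = ("Einstein", "field_of_work", "Chemistry")
loss = max(0.0, margin - score(*true_triple) + score(*corrupt_triple))

# 4. Prediction: score every candidate tail for (Einstein, field_of_work, ?).
scores = {t: score("Einstein", "field_of_work", t) for t in entities}
ranked = sorted(scores, key=scores.get, reverse=True)

# 5. Ranking evaluation: reciprocal rank of the correct tail, and Hits@1.
rank = ranked.index("Physics") + 1
mrr = 1.0 / rank
hits_at_1 = int(rank == 1)
print(ranked, round(mrr, 3), hits_at_1)
```

Because the embeddings here are random rather than trained, the ranking is arbitrary; in practice the loss above would be minimized over many negative-sampled batches before evaluation.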

Current Landscape

Knowledge graph completion in 2025 has two active paradigms: (1) geometric embedding methods (TransE, RotatE, ComplEx) that scale well but ignore entity text, and (2) text-enhanced methods (KG-BERT, SimKGC) that leverage pretrained language models for richer entity representations. LLMs are disrupting the field — they contain implicit knowledge graphs in their parameters and can predict missing links zero-shot. But structured KG reasoning still has advantages for multi-hop inference and verifiable reasoning chains. The practical impact is in search, recommendations, and biomedical knowledge bases (drug-gene interactions, disease-symptom relations).

Key Challenges

Scalability — real-world KGs have millions of entities, making full scoring expensive

Relation types — different relation patterns (symmetric, antisymmetric, transitive, compositional) require different modeling

Long-tail entities — entities with few connections have poor embeddings due to data sparsity

Temporal dynamics — KGs change over time; facts have validity periods that most methods ignore

Open-world assumption — real KGs are incomplete, and the absence of a link doesn't mean it's false
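The relation-types challenge above can be made concrete with RotatE, where each relation is an element-wise rotation in the complex plane: a relation is symmetric exactly when applying it twice is the identity, i.e. every phase is 0 or π. The entity vector and relation names here are toy assumptions for illustration.

```python
import numpy as np

dim = 4
rng = np.random.default_rng(1)

# Entity embedding: a complex vector of unit-modulus components.
h = np.exp(1j * rng.uniform(0, 2 * np.pi, dim))

# A symmetric relation (think "married_to"): every phase is pi, so r_i = -1.
r_sym = np.exp(1j * np.full(dim, np.pi))

t = h * r_sym          # RotatE: tail = head rotated by the relation
back = t * r_sym       # rotating again by pi returns to the head

sym_holds = bool(np.allclose(back, h))   # r ∘ r = identity → symmetry

# A phase of pi/2 composed with itself gives pi, not identity → asymmetric.
r_asym = np.exp(1j * np.full(dim, np.pi / 2))
asym_holds = bool(np.allclose(h * r_asym * r_asym, h))

print(sym_holds, asym_holds)
```

This is why RotatE can represent symmetric, antisymmetric, and compositional patterns, while a pure translation model like TransE cannot encode a non-trivial symmetric relation (h + r ≈ t and t + r ≈ h force r ≈ 0).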

Quick Recommendations

Standard KG completion

RotatE / ComplEx

Best balance of expressiveness, scalability, and reproducibility

Text-enhanced completion

SimKGC / KG-BERT

Leverages entity descriptions and relation text for richer embeddings

Large-scale KGs

NodePiece + RotatE

Memory-efficient entity representation for million-entity KGs

Open-domain KG completion

LLM + structured KG query

LLMs can reason about missing links using world knowledge

What's Next

The frontier is merging knowledge graphs with LLMs — using KGs to ground LLM reasoning in verified facts while using LLMs to fill KG gaps. Temporal knowledge graph completion (handling time-varying facts) and few-shot relation learning (completing new relation types with few examples) are key research directions.

Benchmarks & SOTA

No datasets indexed for this task yet.
