TGraphX vs PyKEEN: Choosing a Knowledge Graph Embedding Library
PyKEEN is the standard PyTorch library for knowledge graph (KG) embedding research. It is mature, feature-rich, and used in many published KG completion benchmarks. TGraphX ships a smaller KG subsystem as one part of a broader graph learning library.
This article compares the two honestly. The short version: for KG benchmark research, PyKEEN is usually the right choice. For projects that combine KG embedding with other graph workflows or that need tensor-valued entity features, TGraphX may fit better.
What PyKEEN does well
PyKEEN has:
- Many KG embedding models implemented and benchmarked
- Sophisticated hyperparameter optimization (HPO) with Optuna integration
- Multiple negative samplers (basic, Bernoulli, pseudo-typed), with optional filtering of known positives
- Standardized rank-based evaluation (MRR, Hits@k) that follows the protocols used in the literature
- Active community, frequent releases, comprehensive documentation
- Direct integration with standard benchmark datasets (FB15k-237, WN18RR, Wikidata5M, etc.)
If your project is a KG completion benchmark publication, you should almost certainly use PyKEEN. It is the framework the field uses; using it makes your results directly comparable.
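To make the sampler names above concrete, here is a minimal standard-library sketch of triple corruption with filtering. It is not PyKEEN's or TGraphX's implementation; the function names `bernoulli_head_probs` and `corrupt` are hypothetical, and the Bernoulli rule shown is the usual one from the TransH paper (corrupt the head with probability tph / (tph + hpt)). Passing `head_prob=0.5` gives the basic uniform sampler.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[int, int, int]  # (head, relation, tail) as integer ids


def bernoulli_head_probs(triples: List[Triple]) -> Dict[int, float]:
    """Per-relation probability of corrupting the head instead of the tail."""
    tails_of = defaultdict(set)  # (relation, head) -> set of tails
    heads_of = defaultdict(set)  # (relation, tail) -> set of heads
    for h, r, t in triples:
        tails_of[(r, h)].add(t)
        heads_of[(r, t)].add(h)

    probs = {}
    for r in {rel for _, rel, _ in triples}:
        tail_counts = [len(v) for (rel, _), v in tails_of.items() if rel == r]
        head_counts = [len(v) for (rel, _), v in heads_of.items() if rel == r]
        tph = sum(tail_counts) / len(tail_counts)  # avg tails per head
        hpt = sum(head_counts) / len(head_counts)  # avg heads per tail
        probs[r] = tph / (tph + hpt)
    return probs


def corrupt(
    triple: Triple,
    num_entities: int,
    head_prob: float,
    known: Set[Triple],
    rng: random.Random,
    max_tries: int = 100,
) -> Optional[Triple]:
    """Replace head or tail with a random entity; reject known-true triples."""
    h, r, t = triple
    for _ in range(max_tries):
        e = rng.randrange(num_entities)
        cand = (e, r, t) if rng.random() < head_prob else (h, r, e)
        if cand not in known:  # the "filtered" part
            return cand
    return None  # gave up: every sampled candidate was a known positive


# Toy data: one head with three tails under relation 0.
toy_triples = [(0, 0, 1), (0, 0, 2), (0, 0, 3)]
toy_probs = bernoulli_head_probs(toy_triples)
toy_negative = corrupt((0, 0, 1), 5, toy_probs[0], set(toy_triples), random.Random(0))
print(toy_probs, toy_negative)
```

For a one-to-many relation like the toy one, the Bernoulli rule corrupts the head more often, which reduces the chance of generating a false negative on the tail side.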
What TGraphX KG offers
TGraphX's tgraphx.kg subsystem has:
- Six core embedding models: TransE, DistMult, ComplEx, RotatE, RESCAL, SimplE
- Filtered ranking evaluation
- Negative sampling utilities
- A one-call API (tgx.kg_completion(...))
- Native support for multimodal entity features (rank-3+ tensors per entity)
- The same reproducibility tooling as the rest of TGraphX
Notably, the multimodal entity feature support is a TGraphX-specific extension. PyKEEN's entity representations are vector-valued by default. If you have rich entity features (movie posters, paper abstracts, product images) and want to incorporate them into the embedding, TGraphX has dedicated infrastructure.
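For readers unfamiliar with the filtered ranking evaluation listed above, the following standard-library sketch shows the idea for tail prediction. It is an illustration, not the code either library runs: `filtered_tail_rank`, `mrr`, and the lookup-table scoring function are all assumptions made for this example.

```python
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[int, int, int]  # (head, relation, tail) as integer ids


def filtered_tail_rank(
    triple: Triple,
    score: Callable[[int, int, int], float],
    num_entities: int,
    known: Set[Triple],
) -> int:
    """Rank of the true tail among candidate tails (higher score is better).

    Candidates that form another known-true triple are skipped, so other
    correct answers cannot push the true tail down the ranking. Ties are
    counted pessimistically: a tied candidate ranks ahead of the true tail.
    """
    h, r, t = triple
    true_score = score(h, r, t)
    rank = 1
    for cand in range(num_entities):
        if cand == t or (h, r, cand) in known:
            continue
        if score(h, r, cand) >= true_score:
            rank += 1
    return rank


def mrr(ranks: Iterable[int]) -> float:
    """Mean reciprocal rank."""
    ranks = list(ranks)
    return sum(1.0 / rk for rk in ranks) / len(ranks)


# Toy scorer: the score depends only on the tail id.
tail_scores = [0.1, 0.9, 0.8, 0.3, 0.2]


def toy_score(h: int, r: int, t: int) -> float:
    return tail_scores[t]


known_triples = {(0, 0, 1), (0, 0, 2)}
raw_rank = filtered_tail_rank((0, 0, 2), toy_score, 5, set())
filt_rank = filtered_tail_rank((0, 0, 2), toy_score, 5, known_triples)
print(raw_rank, filt_rank)
```

In the toy data, entity 1 outscores the true tail 2, but (0, 0, 1) is itself a known fact, so the filtered rank ignores it while the unfiltered rank does not.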
Side-by-side: TransE training
PyKEEN:
from pykeen.pipeline import pipeline

# train_triples and test_triples are PyKEEN TriplesFactory objects.
result = pipeline(
    training=train_triples,
    testing=test_triples,
    model="TransE",
    model_kwargs={"embedding_dim": 200},
    training_kwargs={"num_epochs": 50, "batch_size": 256},
    optimizer_kwargs={"lr": 1e-3},
    random_seed=42,
)
TGraphX:
import tgraphx as tgx

# N and R are the numbers of distinct entities and relations in train_triples.
result = tgx.kg_completion(
    triples=train_triples,
    num_entities=N,
    num_relations=R,
    model="transe",
    embedding_dim=64,
    epochs=50,
    batch_size=256,
    seed=42,
)
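The TGraphX version is more compact; PyKEEN is more explicit and exposes more knobs. Note that the two snippets are not configured identically: the PyKEEN call uses a 200-dimensional embedding and sets the learning rate, while the TGraphX call uses 64 dimensions and does not set a learning rate. Align these settings before comparing results.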
Feature comparison
| Feature | PyKEEN | TGraphX |
|---|---|---|
| Number of models | many | 6 |
| Hyperparameter optimization | ✅ Optuna integration | ⚪ external |
| Negative samplers | basic, Bernoulli, pseudo-typed; optional filtering | 3 types |
| Evaluation protocols | Multiple, standardized | Filtered ranking |
| Multimodal entity features | ⚪ external | ✅ native |
| Reproducibility tooling | basic | ✅ comprehensive |
| One-call API | ✅ pipeline() | ✅ kg_completion() |
| Benchmark dataset integration | ✅ many | ⚪ external |
| Temporal KGs | ⚪ external | ⚠️ Experimental |
| Maturity | Stable | Beta |
PyKEEN has the broader feature set. TGraphX's specific contributions are the multimodal extension and the integration with the broader TGraphX ecosystem.
When TGraphX KG is the right pick
- You are using TGraphX for graph learning anyway and want a single library.
- Your entities have rich tensor features and you want them in the embedding.
- You want the reproducibility tooling (seeded runs, audit artifacts, dashboard) for KG experiments.
- You are doing graduate-level research and value prototyping speed over feature breadth.
When PyKEEN is the right pick
- You are benchmarking against published KG completion numbers.
- You need a specific model not in TGraphX's six.
- You need sophisticated HPO with Optuna integration.
- You need pseudo-typed or filtered negative sampling with specific protocols.
- You are submitting a KG paper and the field expects PyKEEN-comparable results.
Honest limitations
TGraphX KG limitations:
- Smaller model set means some research questions are not addressable without writing the model yourself.
- No published comparisons against PyKEEN on standard benchmarks. Numbers should not be assumed to match.
- Multimodal extension is research-grade; no published evidence that it consistently beats vector-only embeddings on standard benchmarks.
- HPO requires external tooling.
PyKEEN limitations (acknowledged in its docs):
- Vector-only entity representations by default.
- Larger surface area means more to learn.
- Less integrated with broader graph learning frameworks.
A note on results comparability
If you train the same model with the same hyperparameters in both frameworks, you should get similar but not identical results. Differences arise from:
- Default initialization schemes
- Default negative sampling distributions
- Floating-point reduction order
- Random number generator state at first sampler call
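These add up. Expect gaps on the order of 1-3 percentage points on filtered MRR between PyKEEN and TGraphX; because no published head-to-head comparison exists, treat that range as a rule of thumb, not a measurement. For publication, run every experiment in one framework, cite it, and report only those numbers.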
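One rough way to judge whether a cross-framework gap matters is to compare it with the spread across seeds inside a single framework. The sketch below uses only the standard library. The MRR values are made-up placeholders, not measurements from PyKEEN or TGraphX, and `gap_exceeds_seed_noise` is a hypothetical helper implementing a crude threshold, not a formal significance test.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def summarize(scores: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of per-seed scores."""
    return mean(scores), stdev(scores)


def gap_exceeds_seed_noise(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    n_sigma: float = 2.0,
) -> bool:
    """True if the mean gap is larger than n_sigma times the larger seed spread."""
    mean_a, sd_a = summarize(scores_a)
    mean_b, sd_b = summarize(scores_b)
    return abs(mean_a - mean_b) > n_sigma * max(sd_a, sd_b)


# Placeholder filtered-MRR values for five seeds each (illustrative only).
framework_a = [0.300, 0.310, 0.305, 0.295, 0.290]
framework_b = [0.320, 0.330, 0.325, 0.315, 0.310]
framework_c = [0.305, 0.315, 0.310, 0.300, 0.295]

large_gap = gap_exceeds_seed_noise(framework_a, framework_b)
small_gap = gap_exceeds_seed_noise(framework_a, framework_c)
print(summarize(framework_a), large_gap, small_gap)
```

If the gap sits inside the seed spread, it says little about either framework; running several seeds per configuration is what makes the comparison interpretable at all.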
Combined use is possible
Nothing prevents using both. A common pattern: use PyKEEN for the standard embedding benchmark, then export the trained embeddings and use TGraphX to integrate them into a downstream graph workflow. The two frameworks coexist comfortably.
Bottom line
PyKEEN for KG benchmarks. TGraphX KG for projects that span KG and broader graph learning, or that need multimodal entity features. Neither replaces the other.
FAQ
Q: Can I import PyKEEN models into TGraphX?
A: Not directly. The model implementations have different internal interfaces. The trained embeddings (entity matrix, relation matrix) are portable as raw PyTorch tensors.
Q: Does TGraphX have an HPO equivalent?
A: Not as a built-in feature. Use Optuna or Ray Tune externally with tgx.kg_completion as the trial function.
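As a rough illustration of the trial-function pattern, here is a standard-library random search. It is a stand-in for Optuna or Ray Tune, which would replace the loop with their own samplers and pruning. In a real run, `train_fn` would wrap a call to tgx.kg_completion and return a validation metric; how that metric is read from the result object is not covered in this article, so the sketch uses a toy function instead. All names here are hypothetical.

```python
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, int]


def sample_config(rng: random.Random) -> Config:
    """Draw one configuration from a small discrete search space."""
    return {
        "embedding_dim": rng.choice([32, 64, 128, 256]),
        "batch_size": rng.choice([128, 256, 512]),
        "epochs": rng.choice([25, 50, 100]),
    }


def random_search(
    train_fn: Callable[[Config], float],
    n_trials: int,
    seed: int = 0,
) -> Tuple[Config, float]:
    """Run n_trials random configurations and keep the highest-scoring one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = sample_config(rng)
        score = train_fn(cfg)  # higher is better, e.g. validation MRR
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_train_fn(cfg: Config) -> float:
    """Toy objective standing in for a real training run (assumption)."""
    return (
        -abs(cfg["embedding_dim"] - 128) / 128
        - abs(cfg["batch_size"] - 256) / 256
    )


best_cfg, best_score = random_search(toy_train_fn, n_trials=20, seed=0)
print(best_cfg, best_score)
```

Tune against a validation split, not the test split, and fix the training seed inside `train_fn` so that score differences reflect the configuration and not sampling noise.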
Q: How does TGraphX handle inverse relations?
A: The KG models in TGraphX treat relations as directed. To model inverse relations explicitly, add both (h, r, t) and (t, r_inverse, h) to the triple set.
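A minimal sketch of that augmentation, assuming integer relation ids and the convention that relation `r + num_relations` denotes the inverse of `r`. Both the convention and the helper name `add_inverse_triples` are assumptions for illustration, not TGraphX API.

```python
from typing import List, Tuple

Triple = Tuple[int, int, int]  # (head, relation, tail) as integer ids


def add_inverse_triples(
    triples: List[Triple],
    num_relations: int,
) -> Tuple[List[Triple], int]:
    """Append (t, r + num_relations, h) for every (h, r, t).

    Returns the augmented triple list and the new relation count,
    which is twice the original because each relation gets an inverse.
    """
    inverses = [(t, r + num_relations, h) for h, r, t in triples]
    return triples + inverses, 2 * num_relations


toy_triples = [(0, 0, 1), (1, 1, 2)]
augmented, new_num_relations = add_inverse_triples(toy_triples, num_relations=2)
print(augmented, new_num_relations)
```

The doubled count is what you would then pass as num_relations. Build the inverses from the training split only, so that inverted test triples do not leak into training.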
Q: What about KG completion at very large scale (e.g., Wikidata)?
A: TGraphX has not been benchmarked at Wikidata scale. PyKEEN has documented procedures for handling such datasets. For very large KGs, PyKEEN is the more proven choice.
Q: Are there code examples comparing both?
A: There are no official side-by-side examples. The TGraphX examples directory includes kg_benchmark_quickstart.py; PyKEEN has analogous examples in its documentation.