Tutorial · June 12, 2026 · 6 min read

MovieLens Recommendation with Knowledge Graph Embeddings in TGraphX

MovieLens is a classic recommendation dataset: users, movies, and ratings. It is small enough to run quickly, well-known enough that everyone has a reference point, and rich enough to demonstrate non-trivial techniques. This tutorial uses MovieLens as the substrate for a knowledge-graph-based recommendation workflow in TGraphX.

The KG framing of recommendation is well-established: model users and items as entities, model "rated" relations between them, and learn embeddings such that valid (user, rated, movie) triples score higher than random ones. Recommendation becomes a ranking task: for a user, score all movies and recommend the top-K.

This is not the highest-performing approach to recommendation (modern collaborative filtering models often outperform plain KG embeddings on MovieLens) but it is a clean, instructive way to use the framework's KG subsystem.

Setup

bash

pip install tgraphx

For data, MovieLens 100k is small enough for an example. Download from grouplens.org/datasets/movielens or use a cached version if you have one.
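
One practical detail: the original ml-100k archive stores ratings in u.data, a tab-separated file with no header row, not in the ratings.csv layout assumed in Step 1. A minimal sketch of the conversion, assuming pandas is installed. The in-memory sample stands in for the real file, and the column names are chosen only to match Step 1:

```python
import io
import pandas as pd

# Sample rows in the u.data layout: user id, item id, rating, timestamp,
# tab-separated, no header. Swap the StringIO for "ml-100k/u.data" on real data.
sample = io.StringIO(
    "196\t242\t3\t881250949\n"
    "186\t302\t3\t891717742\n"
    "22\t377\t1\t878887116\n"
)

ratings = pd.read_csv(
    sample,
    sep="\t",
    header=None,
    names=["userId", "movieId", "rating", "timestamp"],  # names assumed by Step 1
)

# Writing the frame out once lets the rest of the tutorial read ratings.csv as-is:
# ratings.to_csv("ml-100k/ratings.csv", index=False)
print(ratings)
```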

Step 1: Load and prepare the data

python

import pandas as pd
import torch
import tgraphx as tgx

# Assume ratings.csv has columns: userId, movieId, rating, timestamp.
# Note: the original ml-100k archive ships ratings as u.data (tab-separated,
# no header), so convert it first or adjust the read_csv call.
ratings = pd.read_csv("ml-100k/ratings.csv")

# Filter to positive interactions
positives = ratings[ratings["rating"] >= 4.0]
print(f"Positive interactions: {len(positives)}")

# Create entity index spaces
user_ids = positives["userId"].unique()
movie_ids = positives["movieId"].unique()

# Users take indices 0..num_users-1; movies follow immediately after
user_to_idx = {u: i for i, u in enumerate(sorted(user_ids))}
movie_to_idx = {m: i + len(user_to_idx) for i, m in enumerate(sorted(movie_ids))}

num_users = len(user_to_idx)
num_movies = len(movie_to_idx)
num_entities = num_users + num_movies
num_relations = 1   # "liked"

We use a single relation liked and offset movie indices by num_users so the global entity space is contiguous.
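
To make the offset concrete, here is a toy version of the same mapping. The raw IDs are made up, and idx_to_movie is an extra helper introduced here for decoding results, not something from the steps above:

```python
# Hypothetical raw IDs; real MovieLens IDs are not contiguous either.
user_ids = [7, 3, 12]
movie_ids = [50, 10]

user_to_idx = {u: i for i, u in enumerate(sorted(user_ids))}
movie_to_idx = {m: i + len(user_to_idx) for i, m in enumerate(sorted(movie_ids))}

# Inverse map: global entity index -> raw movie ID.
idx_to_movie = {idx: m for m, idx in movie_to_idx.items()}

print(user_to_idx)   # {3: 0, 7: 1, 12: 2}
print(movie_to_idx)  # {10: 3, 50: 4}
print(idx_to_movie)  # {3: 10, 4: 50}
```

Users occupy indices 0 to 2 and movies occupy 3 to 4, so a single embedding table of size 5 covers both entity types.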

Step 2: Build triples

python

liked_relation = 0
triples = torch.tensor([
    [user_to_idx[row.userId], liked_relation, movie_to_idx[row.movieId]]
    for row in positives.itertuples()
], dtype=torch.long)

print(f"Triples: {triples.shape}")  # [num_positives, 3]

Each row is (head=user, relation=liked, tail=movie).
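
On larger datasets the row-by-row loop can get slow. A possible vectorized alternative, sketched on toy data and assuming NumPy is available. The frame and dictionaries below are illustrative stand-ins for the tutorial's variables:

```python
import numpy as np
import pandas as pd
import torch

# Toy stand-ins for positives, user_to_idx and movie_to_idx.
positives = pd.DataFrame({"userId": [3, 3, 7], "movieId": [10, 50, 10]})
user_to_idx = {3: 0, 7: 1}
movie_to_idx = {10: 2, 50: 3}
liked_relation = 0

# Map whole columns at once instead of iterating over rows.
heads = positives["userId"].map(user_to_idx).to_numpy()
tails = positives["movieId"].map(movie_to_idx).to_numpy()
rels = np.full_like(heads, liked_relation)

triples = torch.from_numpy(np.column_stack([heads, rels, tails])).long()
print(triples.tolist())  # [[0, 0, 2], [0, 0, 3], [1, 0, 2]]
```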

Step 3: Train/test split

For recommendation, the standard protocol is to hold out a fraction of each user's interactions for testing. We use a simpler global random split here for clarity:

python

torch.manual_seed(42)
perm = torch.randperm(triples.shape[0])
split = int(0.85 * triples.shape[0])
train_triples = triples[perm[:split]]
test_triples = triples[perm[split:]]

print(f"Train: {train_triples.shape[0]}, Test: {test_triples.shape[0]}")

For a research-grade evaluation, use leave-one-out per user instead. The framework's evaluation works with either format.
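
A leave-one-out split can be sketched in plain Python as follows. The helper name leave_one_out and the choice to keep single-interaction users entirely in train are assumptions of this sketch, not a protocol prescribed by the framework:

```python
import random
from collections import defaultdict

def leave_one_out(triples, seed=42):
    """Hold out one randomly chosen triple per user. Users with a single
    interaction stay entirely in train so every test user has been seen."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for h, r, t in triples:
        by_user[h].append((h, r, t))

    train, test = [], []
    for user in sorted(by_user):
        items = by_user[user]
        if len(items) < 2:
            train.extend(items)
            continue
        held_out = rng.randrange(len(items))
        for i, triple in enumerate(items):
            (test if i == held_out else train).append(triple)
    return train, test

# Toy triples: user 0 has two interactions, user 1 has one, user 2 has three.
toy = [(0, 0, 3), (0, 0, 4), (1, 0, 3), (2, 0, 4), (2, 0, 5), (2, 0, 6)]
train, test = leave_one_out(toy)
print(len(train), len(test))  # 4 2
```

With the tensors from Step 2, one option is to pass triples.tolist() in and wrap each returned list with torch.tensor(..., dtype=torch.long).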

Step 4: Train a TransE model

python

with tgx.reproducible(seed=42, deterministic=False):
    result = tgx.kg_completion(
        triples=train_triples,
        num_entities=num_entities,
        num_relations=num_relations,
        model="transe",
        embedding_dim=64,
        epochs=20,
        batch_size=512,
        lr=1e-3,
        seed=42,
    )

print(result.metrics)

The framework's KG completion API trains the model and evaluates filtered ranking metrics.
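
For intuition about what such a training run optimizes, here is a minimal plain-PyTorch sketch of a typical TransE update: a margin ranking loss with uniformly corrupted tails. It illustrates the general technique and is not a description of TGraphX internals. The toy graph, margin, and learning rate are all assumptions:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_entities, num_relations, dim = 6, 1, 8   # toy sizes: entities 0-2 users, 3-5 movies
entity = nn.Embedding(num_entities, dim)
relation = nn.Embedding(num_relations, dim)
params = list(entity.parameters()) + list(relation.parameters())
optimizer = torch.optim.Adam(params, lr=1e-2)

def distance(triples):
    # TransE distance ||h + r - t|| for a batch of (head, relation, tail) rows.
    h, r, t = triples[:, 0], triples[:, 1], triples[:, 2]
    return torch.norm(entity(h) + relation(r) - entity(t), p=2, dim=1)

positives = torch.tensor([[0, 0, 3], [1, 0, 4], [2, 0, 5]])
margin = 1.0
losses = []

for step in range(50):
    # Corrupt each tail with a uniformly sampled movie index. This may
    # occasionally reproduce the true tail, which filtered sampling avoids.
    negatives = positives.clone()
    negatives[:, 2] = torch.randint(3, 6, (positives.shape[0],))
    # Hinge loss: push each positive at least `margin` closer than its negative.
    loss = torch.relu(margin + distance(positives) - distance(negatives)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```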

Step 5: Generate recommendations

After training, the embedding matrix is result.model.entity_embeddings.weight. For a given user, score every movie:

python

model = result.model
relation_emb = model.relation_embeddings.weight[liked_relation]

# Inverse lookup built once: local movie position -> raw movieId
movie_offset = num_users
idx_to_movie = {idx - movie_offset: m for m, idx in movie_to_idx.items()}

@torch.no_grad()
def recommend(user_id, top_k=10, exclude=None):
    user_idx = user_to_idx[user_id]
    user_emb = model.entity_embeddings.weight[user_idx]
    # Movies occupy entity indices num_users .. num_users + num_movies - 1
    movie_emb = model.entity_embeddings.weight[movie_offset:movie_offset + num_movies]
    # TransE score: -||h + r - t|| (higher is better)
    diff = (user_emb + relation_emb).unsqueeze(0) - movie_emb
    scores = -torch.norm(diff, dim=1)
    # Mask movies the user has already liked
    for m in exclude or []:
        scores[movie_to_idx[m] - movie_offset] = -float("inf")
    top = scores.topk(top_k)
    return [(idx_to_movie[i], s)
            for i, s in zip(top.indices.tolist(), top.values.tolist())]

# Already-liked movies to exclude
user_id = next(iter(user_to_idx))
already_liked = positives[positives["userId"] == user_id]["movieId"].tolist()
recs = recommend(user_id, top_k=10, exclude=already_liked)
for m, s in recs:
    print(f"  Movie {m}  score={s:.3f}")

The score function uses TransE's distance interpretation: lower distance means more compatible. We negate so that higher is better for the top-K computation.
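
A tiny worked example of that scoring rule, using hand-picked 2-D vectors rather than trained embeddings:

```python
import torch

user_plus_relation = torch.tensor([1.0, 0.0])   # h + r for one user
movie_emb = torch.tensor([[1.0, 0.0],           # movie 0: exactly at h + r
                          [0.0, 0.0],           # movie 1: distance 1
                          [4.0, 4.0]])          # movie 2: distance 5

distances = torch.norm(user_plus_relation.unsqueeze(0) - movie_emb, dim=1)
scores = -distances                             # negate so higher is better

print(distances.tolist())                # [0.0, 1.0, 5.0]
print(scores.topk(2).indices.tolist())   # [0, 1]
```

Movie 0 sits exactly where the user embedding plus the relation vector lands, so it ranks first. Movie 2 is farthest away and is left out of the top 2.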

Step 6: Evaluate on the test set

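For a quick evaluation, read MRR and Hits@K from the metrics that kg_completion reported. Note that test_triples from Step 3 was not passed to that call in Step 4, so check which split these numbers refer to before comparing them with other results: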

python

# Framework provides filtered ranking utilities
# (Detailed evaluation depends on protocol; see docs/knowledge_graphs.md)
print(f"Filtered MRR:     {result.metrics.get('mrr', 'n/a')}")
print(f"Filtered Hits@10: {result.metrics.get('hits_at_10', 'n/a')}")
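
To score the held-out triples from Step 3 directly, filtered tail ranks can also be computed by hand from the embedding tables. The sketch below is one way to do it in plain PyTorch. The function name filtered_tail_ranks and the toy embeddings are assumptions. With a trained model you would pass model.entity_embeddings.weight.detach() and model.relation_embeddings.weight.detach(), the attributes used in Step 5:

```python
import torch

def filtered_tail_ranks(entity_emb, relation_emb, test_triples, known_triples, candidate_tails):
    """Rank each test tail among candidate_tails under the TransE score,
    ignoring other candidates that are known positives for the same (head, relation)."""
    known = {}
    for h, r, t in known_triples.tolist():
        known.setdefault((h, r), set()).add(t)

    ranks = []
    for h, r, t in test_triples.tolist():
        query = entity_emb[h] + relation_emb[r]
        dist = torch.norm(query.unsqueeze(0) - entity_emb[candidate_tails], dim=1)
        target_pos = (candidate_tails == t).nonzero(as_tuple=True)[0].item()
        target_dist = dist[target_pos]
        # Filter step: other true tails must not count against the target.
        positives_for_query = known.get((h, r), set())
        mask = torch.tensor([c in positives_for_query and c != t
                             for c in candidate_tails.tolist()])
        dist = dist.masked_fill(mask, float("inf"))
        ranks.append(int((dist < target_dist).sum().item()) + 1)
    return torch.tensor(ranks, dtype=torch.float)

# Toy setup: entity 0 is a user, entities 1..3 are movies.
entity_emb = torch.tensor([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
relation_emb = torch.tensor([[1.0, 0.0]])
known = torch.tensor([[0, 0, 1], [0, 0, 2]])   # train and test positives together
test = torch.tensor([[0, 0, 2]])
ranks = filtered_tail_ranks(entity_emb, relation_emb, test, known, torch.tensor([1, 2, 3]))

print("MRR:", (1.0 / ranks).mean().item())             # 1.0
print("Hits@1:", (ranks <= 1).float().mean().item())   # 1.0
```

In the toy case movie 1 is closer to the query than the held-out movie 2, but it is a known positive for the same user, so the filter removes it and the target ranks first. Without filtering the rank would be 2.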

Honest assessment of recommendation quality

This setup is a research demo, not a production recommendation system. Real recommendation systems use:

Multi-relation graphs (rated 5, rated 4, viewed but not rated, abandoned mid-watch).
Side information about users and movies (demographics, genres, release year).
Temporal dynamics (recent preferences weighted more).
Implicit-feedback-aware loss functions (BPR, WARP).
Specialized libraries (LightFM, Implicit, RecBole).

For MovieLens specifically, plain TransE typically lands well below specialized CF models like NGCF or LightGCN on standard metrics. The KG embedding framing is interesting research, but it is not the strongest single-model approach.

Where the KG framing earns its keep: when you have rich side information that is naturally graph-structured. Movies linked to their directors, actors, genres; users linked to demographic groups; movies linked to each other via "similar viewers" edges. KG embeddings can incorporate all of this in a uniform way. Plain CF cannot.

Adding side information

python

# Suppose we also have movie genres
genres = pd.read_csv("ml-100k/movies.csv")  # movieId, genre

# Add a new relation "has_genre"
has_genre_relation = 1
num_genre_entities = genres["genre"].nunique()
# ... build additional triples for movie→genre links into genre_triples

# Train on the augmented triple set
augmented_triples = torch.cat([train_triples, genre_triples], dim=0)
result = tgx.kg_completion(
    triples=augmented_triples,
    num_entities=num_entities + num_genre_entities,
    num_relations=2,    # liked + has_genre
    model="transe",
    embedding_dim=64,
    seed=42,
)

This is where the KG approach can start to pay off relative to pure CF: every additional relation type is information the model can use, which may improve generalization, especially for users with few ratings.
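
One way the movie→genre triples could be built, sketched on toy data. The frame layout (one row per movie and genre pair), the placement of genre entities after all users and movies, and the decision to drop movies without an entity index are assumptions of this sketch:

```python
import pandas as pd
import torch

# Toy stand-ins for the tutorial's variables (values are illustrative).
movie_to_idx = {10: 2, 50: 3}       # raw movieId -> global entity index
num_entities = 4                    # 2 users + 2 movies
has_genre_relation = 1
genres = pd.DataFrame({
    "movieId": [10, 10, 50, 99],
    "genre": ["Comedy", "Drama", "Drama", "Horror"],
})

# Keep only movies that already have an entity index (movie 99 has none).
genres = genres[genres["movieId"].isin(list(movie_to_idx))]

# Genre entities get indices after all users and movies.
genre_to_idx = {g: i + num_entities
                for i, g in enumerate(sorted(genres["genre"].unique()))}

genre_triples = torch.tensor(
    [[movie_to_idx[row.movieId], has_genre_relation, genre_to_idx[row.genre]]
     for row in genres.itertuples()],
    dtype=torch.long,
)
print(genre_triples.tolist())  # [[2, 1, 4], [2, 1, 5], [3, 1, 5]]
```

If rows are filtered like this, the number of genre entities should be taken from len(genre_to_idx) so that the entity count matches the indices actually used.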

When to use this approach

You have multi-relational side information you want to incorporate alongside user-item interactions.
You want explainable recommendations (KG embedding scores can be traced to specific triples).
You are doing research on recommendation, not deploying production.

When NOT to use this approach

You need highest accuracy on a standard CF benchmark — use a dedicated CF library.
You have only the interaction matrix and no side information — plain CF will outperform.
You need real-time inference at scale — KG embedding inference can be expensive at very large entity counts.

Reproducibility

The complete experiment runs deterministically with:

python

with tgx.reproducible(seed=42, deterministic=True):
    result = tgx.kg_completion(...)

The run directory at result.run_dir contains the configuration and metrics for audit and reproduction.

FAQ

Q: What about negative sampling for recommendation?
A: The KG framework's default uniform negative sampler works for this setup. For better recommendation accuracy, consider filtered negative sampling (avoids sampling true positives as negatives).
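
The idea behind filtered negative sampling fits in a few lines of plain Python. The helper sample_filtered_negative is hypothetical and only illustrates the rejection step. It is not a TGraphX function:

```python
import random

def sample_filtered_negative(head, relation, candidate_tails, known, rng):
    """Draw a tail uniformly from candidate_tails, skipping known positives."""
    allowed = [t for t in candidate_tails if (head, relation, t) not in known]
    if not allowed:
        raise ValueError("every candidate tail is a known positive for this head")
    return rng.choice(allowed)

# User 0 already likes movies 3 and 4, so only 5 and 6 are valid negatives.
known = {(0, 0, 3), (0, 0, 4)}
rng = random.Random(0)
negatives = [sample_filtered_negative(0, 0, [3, 4, 5, 6], known, rng) for _ in range(5)]
print(negatives)  # every entry is 5 or 6
```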

Q: Can I use ComplEx or RotatE instead of TransE?
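A: Yes, just change model="transe" to model="complex" or model="rotate". Each has different inductive biases. Note that "liked" is a directional relation (user to movie), not a symmetric one, so TransE, ComplEx, and RotatE can all represent it. DistMult scores a triple the same in both directions, which is tolerable here only because heads are always users and tails are always movies.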
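
One concrete difference in inductive bias is how a model treats swapping head and tail. The sketch below uses the standard TransE and DistMult scoring formulas with hand-picked vectors. It does not call the framework:

```python
import torch

h = torch.tensor([1.0, 2.0])
r = torch.tensor([1.0, 0.5])
t = torch.tensor([2.0, 1.0])

def transe(h, r, t):
    # Negative translation distance: direction matters.
    return -torch.norm(h + r - t)

def distmult(h, r, t):
    # Trilinear product: identical when head and tail are swapped.
    return (h * r * t).sum()

print(distmult(h, r, t).item(), distmult(t, r, h).item())  # 3.0 3.0
print(transe(h, r, t).item(), transe(t, r, h).item())      # -1.5 and roughly -2.06
```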

Q: How do I handle cold-start (new users)?
A: KG embeddings struggle with cold start because new users have no embedded representation. Hybrid approaches that combine content features with KG embeddings are the standard solution.

Q: What is the runtime on MovieLens 100k?
A: On CPU with 20 epochs, expect a few minutes. With GPU, under a minute. For MovieLens 1M, expect 5-10x longer.

Q: Where is the complete example script?
A: The TGraphX examples/knowledge_graph_demo.py shows a simpler version. The MovieLens-specific setup is not in the examples directory; this tutorial provides the recipe.