Building LLM-Friendly Graph APIs with TGraphX
If you have used Claude Code, Cursor, or similar tools to generate PyTorch Geometric code, you have probably noticed a pattern: the assistant produces something that looks right, runs once, and then fails subtly when you adapt it. The data loader has the wrong split. The model is missing a ReLU between blocks. The mask is one-hot instead of boolean.
The root cause is rarely the assistant. It is the API. Graph learning libraries grew up around researcher conventions: short variable names, implicit shape contracts, and dozens of layer variants that look similar but behave differently. An LLM trained on code written against such an API makes reasonable guesses from the examples it has seen, but reasonable guesses are not always right.
TGraphX makes a deliberate set of design choices that work well with AI-assisted coding. This article walks through them.
Choice 1: One canonical name per concept
PyG has Data, Batch, HeteroData, Dataset, InMemoryDataset, DataLoader, NeighborSampler, NeighborLoader, LinkNeighborLoader, etc. Each has its place, but the surface is wide and easy to confuse.
TGraphX uses:
- tgx.Graph for a single graph
- tgx.GraphBatch for a batched graph
- tgx.NeighborLoader for mini-batch sampling
A short list, each with one clear purpose. An LLM asked to "build a graph" reaches for tgx.Graph and gets it right.
Choice 2: Safe aliases for common mistakes
Researchers and AI assistants both type x, y, and edge_attr from PyG muscle memory. TGraphX accepts these as aliases:
import tgraphx as tgx
import torch
# Canonical
g = tgx.Graph(node_features=x, edge_index=ei, node_labels=y)
# PyG-style aliases also work
g = tgx.Graph(x=x, edge_index=ei, y=y)
g.x # alias for node_features
g.edge_attr # alias for edge_features
The canonical form is recorded in the resulting object's metadata so artifacts are unambiguous, but code that uses the aliases still works.
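To make the alias idea concrete, here is a minimal sketch of how keyword aliases could be mapped onto canonical names before an object is built. It is plain Python and does not reflect TGraphX's actual implementation; the `ALIASES` table and the `resolve_aliases` helper are hypothetical names introduced for illustration.

```python
# Hypothetical alias table: PyG-style name -> canonical name.
ALIASES = {
    "x": "node_features",
    "y": "node_labels",
    "edge_attr": "edge_features",
}


def resolve_aliases(**kwargs):
    """Return kwargs with every alias replaced by its canonical name."""
    resolved = {}
    for name, value in kwargs.items():
        canonical = ALIASES.get(name, name)
        if canonical in resolved:
            # An alias and its canonical name were both passed: refuse to guess.
            raise TypeError(
                f"got multiple values for '{canonical}' "
                "(an alias was passed alongside the canonical name)"
            )
        resolved[canonical] = value
    return resolved


# Both spellings end up as the same canonical keyword set.
canonical_call = resolve_aliases(node_features=[[0.1], [0.2]], edge_index=[[0], [1]])
pyg_style_call = resolve_aliases(x=[[0.1], [0.2]], edge_index=[[0], [1]])
```

Rejecting a call that passes both `x` and `node_features` is a design choice in this sketch: silently preferring one of them would reintroduce the kind of subtle failure the aliases are meant to prevent.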
Choice 3: One-call entry points
The most common workflows have a single function:
# Node classification
result = tgx.classify_nodes(x=x, edge_index=ei, labels=y, model="tensor_gcn", seed=42)
# Knowledge graph completion
result = tgx.kg_completion(triples=t, num_entities=N, num_relations=R, model="transe", seed=42)
# Synthetic graph generation
g = tgx.generate_graph("ba", num_nodes=100, m=3, seed=42)
# Evolutionary optimization
res = tgx.optimize_graph(objective="connectivity", algorithm="ga", num_nodes=30, seed=42)
# Graph reinforcement learning
res = tgx.train_graph_rl(env="maxcut", algorithm="dqn", episodes=50, seed=42)
Each one-call function is well-typed, validates its arguments, returns a structured result, and is documented in the README. An LLM asked to "train a node classifier on this graph" can produce a one-line invocation that runs.
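The pattern behind a one-call entry point (validate everything up front, then return a structured result) can be sketched without any graph library. The names below (`RunResult`, `classify_nodes_stub`, `KNOWN_MODELS`) are hypothetical and no training takes place; the sketch only shows the shape of the contract, assuming edges are given as two equal-length lists of node indices.

```python
from dataclasses import dataclass, field


@dataclass
class RunResult:
    """Hypothetical structured result: named fields instead of a bare tuple."""
    model: str
    seed: int
    metrics: dict = field(default_factory=dict)


KNOWN_MODELS = {"tensor_gcn"}  # illustrative only, not the library's registry


def classify_nodes_stub(num_nodes, edge_index, labels, model, seed):
    """Validate the arguments, then return a structured result."""
    if model not in KNOWN_MODELS:
        raise ValueError(
            f"unknown model {model!r}; expected one of {sorted(KNOWN_MODELS)}"
        )
    if len(labels) != num_nodes:
        raise ValueError(f"got {len(labels)} labels for {num_nodes} nodes")
    sources, targets = edge_index
    if len(sources) != len(targets):
        raise ValueError("edge_index rows must have equal length")
    for node in (*sources, *targets):
        if not 0 <= node < num_nodes:
            raise ValueError(f"edge endpoint {node} is outside [0, {num_nodes})")
    # No training happens in this sketch; only a trivially computed count is reported.
    return RunResult(model=model, seed=seed, metrics={"num_edges": len(sources)})


result = classify_nodes_stub(
    num_nodes=3, edge_index=[[0, 1], [1, 2]], labels=[0, 1, 0],
    model="tensor_gcn", seed=42,
)
```

Because every check runs before any work starts, a wrong argument fails immediately with a message that names the argument, which is the property that makes a one-line invocation safe to generate.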
Choice 4: Actionable error messages
When a TGraphX function fails, it raises an error that says what to do, not just what happened:
g = tgx.Graph(x=torch.randn(100, 3, 8, 8), edge_index=bad_edges, y=y)
try:
    tgx.validate_graph(g, strict=True)
except ValueError as e:
    print(tgx.explain_error(e))
tgx.explain_error() translates the raw exception into a human-readable explanation with suggested fixes. This helps both human debuggers and AI assistants that read tracebacks and try to repair them.
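One way to picture this kind of translation is a lookup over (exception type, message pattern) pairs that returns advice. The sketch below is a self-contained illustration under that assumption; the `ERROR_HINTS` table, its patterns, its advice strings, and the `explain` function are all hypothetical and are not taken from the library.

```python
import re

# Hypothetical table: (exception type, message pattern, advice template).
ERROR_HINTS = [
    (
        ValueError,
        re.compile(r"edge endpoint (\d+) is outside \[0, (\d+)\)"),
        "Edge index {0} refers to a node that does not exist "
        "(the graph has {1} nodes). "
        "Check for 1-based indexing or a stale edge list.",
    ),
    (
        ValueError,
        re.compile(r"got (\d+) labels for (\d+) nodes"),
        "There are {0} labels but {1} nodes. Pass exactly one label per node.",
    ),
]


def explain(error):
    """Return advice for a recognized error, or the raw message as a fallback."""
    message = str(error)
    for exc_type, pattern, advice in ERROR_HINTS:
        match = pattern.search(message)
        if isinstance(error, exc_type) and match:
            # Captured groups fill the numbered slots in the advice template.
            return advice.format(*match.groups())
    return f"No specific guidance available. Original error: {message}"


hint = explain(ValueError("edge endpoint 7 is outside [0, 5)"))
print(hint)
```

The fallback branch matters as much as the table: an unrecognized error should pass through unchanged rather than be given a misleading explanation.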
Choice 5: A readiness check anyone can run
python -m tgraphx readiness
This prints a checklist of what is installed, what is detectable in the environment, and what would be required for common workflows. It is the kind of self-introspection that AI tooling can parse and act on.
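A readiness check of this kind can be approximated with the standard library alone. The following is a rough sketch, not the code behind the command above: `check_readiness` and `format_report` are hypothetical names, and the module lists passed in are examples. It relies on `importlib.util.find_spec`, which reports whether a top-level module is importable without importing it.

```python
import importlib.util
import sys


def check_readiness(required, optional):
    """Report which top-level modules are importable, without importing them."""
    report = {"python": ".".join(map(str, sys.version_info[:3]))}
    for group, names in (("required", required), ("optional", optional)):
        report[group] = {
            name: importlib.util.find_spec(name) is not None for name in names
        }
    # The environment counts as ready only if every required module was found.
    report["ready"] = all(report["required"].values())
    return report


def format_report(report):
    """Render the report as a plain checklist, one line per module."""
    lines = [f"python {report['python']}"]
    for group in ("required", "optional"):
        for name, found in report[group].items():
            status = "ok" if found else "missing"
            lines.append(f"[{status}] {name} ({group})")
    lines.append("ready" if report["ready"] else "not ready")
    return "\n".join(lines)


# Example module names; a real check would list the packages it depends on.
report = check_readiness(required=["json"], optional=["torch"])
print(format_report(report))
```

Keeping the data (`report`) separate from its rendering means a tool can consume the dictionary directly while a person reads the checklist.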
What this means in practice for AI-assisted coding
When you ask Claude Code to write a graph experiment in TGraphX, the output is usually:
import tgraphx as tgx
import torch
# Generate or load data
x = torch.randn(1000, 3, 8, 8)
edge_index = tgx.knn_graph(x.view(1000, -1), k=8)
labels = torch.randint(0, 10, (1000,))
# One-call training
with tgx.reproducible(seed=42):
    result = tgx.classify_nodes(
        x=x, edge_index=edge_index, labels=labels,
        model="tensor_gcn",
    )
print(result.metrics)
This runs. There is little surface for the assistant to get wrong. The defaults are sensible.
For comparison, equivalent PyG code requires the assistant to set up Data, choose between several layer types, configure a NeighborLoader, write a training loop, and remember to seed everything. Each step is a chance for a subtle mistake.
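The "remember to seed everything" step is one that a context manager can absorb. As a minimal sketch of the idea, using only Python's standard library: the name `seeded` is hypothetical, and this is not a description of how `tgx.reproducible` is implemented. A fuller version would presumably also seed NumPy and PyTorch.

```python
import contextlib
import random


@contextlib.contextmanager
def seeded(seed):
    """Seed Python's RNG inside the block and restore the prior state on exit."""
    saved_state = random.getstate()
    random.seed(seed)
    try:
        yield
    finally:
        # Restore the caller's RNG state so the block has no lasting side effect.
        random.setstate(saved_state)


with seeded(42):
    first = [random.random() for _ in range(3)]

with seeded(42):
    second = [random.random() for _ in range(3)]

# Same seed, same draws.
print(first == second)
```

Scoping the seed to a block makes the reproducible region visible in the code, so an omitted seed shows up as a missing `with` statement rather than as a silent difference between runs.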
What this does NOT mean
LLM-friendly is not the same as good for everyone. Researchers who want fine-grained control over message passing, loss formulation, or sampling will hit the limits of the one-call API quickly. The explicit tgx.build_model() and fit() API is there for those cases, and PyG is genuinely better for some research problems.
The point is: when AI assistance is part of your workflow, a deliberately small, canonical, well-typed API is more cooperative than a broad surface with many overlapping ways to do similar things.
Trade-offs
- The one-call API hides what is happening. Beginners need to read the source to understand the defaults.
- The alias system makes some code slightly slower to parse for humans who know only one convention.
- A small API means features that don't fit the canonical pattern get implemented as separate utilities, which can be confusing.
These are real costs. They are accepted in exchange for code that AI-assisted workflows can produce reliably.
FAQ
Q: Can I disable the alias system if I prefer strict canonical names?
A: There is no global switch. You can simply use only the canonical names in your own code; the aliases have no effect unless you use them.
Q: Does this design help only AI-generated code?
A: No. The same properties — small surface, sensible defaults, actionable errors — help human developers, especially those new to the framework.
Q: How does tgx.explain_error() work?
A: It pattern-matches the exception type and message against a table of known error patterns and returns a paragraph with diagnostic guidance and a suggested fix. The table is in tgraphx/ux/errors.py in the source repository.
Q: Is there a documented "best practices" guide for LLM-assisted code?
A: Yes, see docs/llm_usage_guide.md in the TGraphX repository.