
Seeded Operations and Deterministic Flags in TGraphX Experiments


Two researchers run "the same" experiment and get different numbers. This is the default state of PyTorch GNN research and the source of many reproducibility complaints. This article is about how to set up an experiment that runs deterministically and what you lose by doing so.

The summary: it is mostly a one-liner with tgx.reproducible(seed=42, deterministic=True), but understanding what that line does, and what it cannot do, matters when results still differ across runs.

What "deterministic" actually means

"Deterministic" in PyTorch means:

  1. Every PRNG source is seeded.
  2. CUDA non-deterministic kernels are replaced with deterministic alternatives where they exist.
  3. cuDNN is set to deterministic mode (no benchmark-based algorithm selection).

It does NOT mean:

  • Results are identical across different GPU models or driver versions.
  • Floating-point reductions that have no deterministic GPU kernel produce the same value across runs (the accumulation order of atomic adds varies).
  • A different PyTorch version will produce identical numbers.

For a single machine running the same code with the same data and the same versions, deterministic runs are reproducible. For different machines or different software versions, "reproducible within a small bounded tolerance" is the realistic goal.
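The "bounded tolerance" goal for cross-machine comparisons can be made concrete with a small check. The sketch below is an illustration only: the helper name metrics_match and the tolerance values are assumptions, not part of TGraphX, and a sensible tolerance depends on the metric and the model.

```python
import math


def metrics_match(run_a: dict, run_b: dict, rel_tol: float = 0.0, abs_tol: float = 1e-3) -> bool:
    """Return True when both runs report the same metrics within tolerance.

    Hypothetical helper; the default tolerances are illustrative only.
    """
    if run_a.keys() != run_b.keys():
        return False
    return all(
        math.isclose(run_a[name], run_b[name], rel_tol=rel_tol, abs_tol=abs_tol)
        for name in run_a
    )


# Same machine, deterministic mode: expect exact equality.
same_machine = metrics_match({"val_acc": 0.8125}, {"val_acc": 0.8125}, abs_tol=0.0)

# Different GPU or driver: compare within a stated tolerance instead.
cross_machine = metrics_match({"val_acc": 0.8125}, {"val_acc": 0.8131})
```

Stating the tolerance alongside the result makes the reproducibility claim checkable by someone on different hardware.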

What set_seed actually does

python
import tgraphx as tgx

tgx.set_seed(42, deterministic=False)

This:

  • Seeds Python's random module with 42
  • Seeds NumPy with 42
  • Seeds PyTorch CPU PRNG with 42
  • Seeds PyTorch CUDA PRNG (all devices) with 42

With deterministic=False, that is all it does. cuDNN benchmark mode is left untouched, and some CUDA kernels still run their faster non-deterministic variant.

python
tgx.set_seed(42, deterministic=True)
        

This additionally:

  • Sets torch.backends.cudnn.deterministic = True
  • Sets torch.backends.cudnn.benchmark = False
  • Calls torch.use_deterministic_algorithms(True, warn_only=True)

The warn_only=True is important: with warn_only=False, PyTorch raises a RuntimeError if you call an operation that has no deterministic implementation. With warn_only=True, it warns but does not fail. The TGraphX default is warn_only=True because some legitimate operations have no deterministic CUDA kernel and crashing the experiment is rarely the right answer.
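For readers who want to see the equivalent calls without TGraphX, here is a minimal plain-PyTorch sketch of the behavior described above. The function name seed_everything is hypothetical, and this is not the TGraphX source; it only strings together the standard library, NumPy, and PyTorch calls that the two lists name.

```python
import random

import numpy as np
import torch


def seed_everything(seed: int, deterministic: bool = False) -> None:
    """Hypothetical stand-in for the behavior described for tgx.set_seed."""
    random.seed(seed)                 # Python's random module
    np.random.seed(seed)              # NumPy global RNG
    torch.manual_seed(seed)           # PyTorch CPU generator
    torch.cuda.manual_seed_all(seed)  # every CUDA device; ignored when CUDA is absent

    if deterministic:
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
        # warn_only=True: warn instead of raising on ops without a deterministic kernel
        torch.use_deterministic_algorithms(True, warn_only=True)


seed_everything(42, deterministic=True)
first = torch.rand(3)
seed_everything(42, deterministic=True)
second = torch.rand(3)  # same values as `first`, because the generator was reseeded
```

Reseeding and drawing twice is a quick sanity check that the seeding path works on your machine before you spend time on a full training run.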

The reproducible() context manager

For most use cases, the context manager is the right interface:

python
import tgraphx as tgx

with tgx.reproducible(seed=42, deterministic=True):
    result = tgx.classify_nodes(
        x=x, edge_index=edge_index, labels=y,
        model="tensor_gcn", seed=42,
    )

This is equivalent to set_seed(42, deterministic=True) plus:

  • Writes a reproducibility_report.json to the run directory.
  • Restores the previous RNG state on exit.

The state restoration is useful if you want to wrap individual experiments in with blocks without polluting the global PRNG state.
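The save-and-restore behavior can be sketched with standard calls. The context manager below, scoped_seed, is a hypothetical illustration of the idea and not the TGraphX implementation: it snapshots the Python, NumPy, and PyTorch RNG states, seeds inside the block, and puts the snapshots back on exit.

```python
import contextlib
import random

import numpy as np
import torch


@contextlib.contextmanager
def scoped_seed(seed: int):
    """Hypothetical helper: seed inside the block, restore prior RNG state on exit."""
    py_state = random.getstate()
    np_state = np.random.get_state()
    torch_state = torch.get_rng_state()
    cuda_states = torch.cuda.get_rng_state_all() if torch.cuda.is_available() else None
    try:
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        yield
    finally:
        random.setstate(py_state)
        np.random.set_state(np_state)
        torch.set_rng_state(torch_state)
        if cuda_states is not None:
            torch.cuda.set_rng_state_all(cuda_states)


torch.manual_seed(0)
expected = torch.rand(2)  # what the outer stream yields first

torch.manual_seed(0)
with scoped_seed(42):
    torch.rand(5)         # consumes only the inner stream
after = torch.rand(2)     # the outer stream continues as if the block never ran
```

Because each block restores whatever state it found, nesting such blocks behaves like a stack: the inner seed applies inside, and the outer stream resumes afterwards.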

What the reproducibility report contains

json
{
  "seed": 42,
  "deterministic": true,
  "python_version": "3.11.5",
  "pytorch_version": "2.1.0",
  "cuda_available": true,
  "cuda_version": "12.1",
  "cudnn_version": "8.9.2",
  "device_count": 1,
  "device_name": "NVIDIA RTX 3090",
  "platform": "Linux-5.15.0-x86_64",
  "package_hash": "a3b9...",
  "timestamp_utc": "2026-05-23T14:35:22Z"
}

This is the fingerprint needed to check whether two "identical" runs were actually run on similar hardware with similar software.
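Comparing two reports by eye is error-prone, so a small diff helps. The sketch below assumes the field names from the example report; diff_reports is a hypothetical helper, and the two inline reports are made-up sample data standing in for real files.

```python
import json


def diff_reports(report_a: dict, report_b: dict, ignore=("timestamp_utc",)) -> dict:
    """Return {field: (value_a, value_b)} for every field that differs."""
    fields = (set(report_a) | set(report_b)) - set(ignore)
    return {
        field: (report_a.get(field), report_b.get(field))
        for field in sorted(fields)
        if report_a.get(field) != report_b.get(field)
    }


# Inline stand-ins for two reproducibility_report.json files (sample data).
run_a = json.loads('{"seed": 42, "pytorch_version": "2.1.0", "device_name": "NVIDIA RTX 3090"}')
run_b = json.loads('{"seed": 42, "pytorch_version": "2.1.0", "device_name": "NVIDIA A100"}')

mismatches = diff_reports(run_a, run_b)
```

An empty result means the two runs share the recorded environment; any remaining entry is a candidate explanation for differing numbers.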

Common pitfalls

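Forgetting to seed the data loader. A DataLoader with num_workers > 0 seeds each worker's PyTorch generator from a base seed, but depending on the PyTorch version it may not reseed NumPy or other third-party RNGs, so workers can produce duplicated random streams. An explicit worker_init_fn makes the per-worker seeding explicit: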

python
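import random

import numpy as np
import torch
from torch.utils.data import DataLoader

def worker_init_fn(worker_id):
    # In a worker, torch.initial_seed() is base_seed + worker_id; NumPy needs a 32-bit seed.
    seed = torch.initial_seed() % (2**32)
    np.random.seed(seed)
    random.seed(seed)

# `dataset` is any torch.utils.data.Dataset defined elsewhere.
loader = DataLoader(dataset, num_workers=4, worker_init_fn=worker_init_fn)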

TGraphX's NeighborLoader handles this automatically when given a seed.
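Shuffle order is a second source of variation in a plain PyTorch DataLoader. The sketch below passes a seeded torch.Generator through the loader's generator argument so that the epoch order is repeatable. The toy dataset and the helper name epoch_order are illustrative assumptions, and num_workers is left at 0 to keep the example small.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(10))  # toy dataset: the integers 0..9


def epoch_order(seed: int) -> list:
    """Return the order in which one shuffled epoch visits the samples."""
    g = torch.Generator()
    g.manual_seed(seed)
    loader = DataLoader(dataset, batch_size=2, shuffle=True, generator=g)
    return [int(v) for (batch,) in loader for v in batch]


order_a = epoch_order(0)
order_b = epoch_order(0)  # same seed, same visiting order
```

A dedicated generator also keeps the loader's shuffling independent of how many random numbers the rest of the program has already drawn from the global PyTorch RNG.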

Sampling with a separately seeded RNG. If you have custom sampling code that uses numpy.random.RandomState(some_seed), that seed is independent of PyTorch's, so seeding PyTorch does not affect it. Either derive a numpy.random.default_rng(seed) from your single seed source, or seed every PRNG explicitly.
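One way to keep a single seed source is NumPy's SeedSequence, which can spawn independent child seeds for each component. This is a sketch under the assumption that your sampling code accepts a numpy.random.Generator; the names sampler_rng and augment_rng are illustrative.

```python
import numpy as np

SEED = 42  # the experiment's single seed

# Spawn independent child seeds so each component gets its own stream.
sampler_seq, augment_seq = np.random.SeedSequence(SEED).spawn(2)
sampler_rng = np.random.default_rng(sampler_seq)
augment_rng = np.random.default_rng(augment_seq)

# Example draw: pick 5 distinct neighbor ids out of 100 candidates.
neighbors = sampler_rng.choice(100, size=5, replace=False)
```

Rebuilding the generators from the same SEED reproduces the same draws, and adding a new component later does not shift the streams of the existing ones.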

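Hidden randomness in third-party libraries. Some scientific libraries (scipy, scikit-learn) draw random numbers internally, usually from NumPy's global RNG unless you pass an explicit random_state. Seed NumPy, or better, pass the seed explicitly to every call that accepts one.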

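Augmentation that uses Python random. Some augmentation code, including older torchvision transforms and many custom pipelines, uses Python's random module under the hood. Calling torch.manual_seed alone is not enough; seed random as well.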

What you lose with deterministic=True

  • Speed. Deterministic CUDA kernels are typically slower than their non-deterministic counterparts. The slowdown depends on the operations; for typical GNN workloads, expect 1.1x to 1.5x slower.
  • Some operations unavailable. With warn_only=False, you get RuntimeErrors. With warn_only=True, you get warnings and the non-deterministic kernel is used; results are not deterministic for those operations.
  • cuDNN benchmark gains. With cudnn.benchmark = False, cuDNN does not auto-tune algorithm selection. For workloads with fixed input shapes, where auto-tuning normally pays off, this can be noticeably slower.

For research and ablation studies, the speed cost is usually worth it. For production training where you need throughput, leave deterministic=False and accept slight variance.

A complete checklist

Before claiming a GNN experiment is reproducible:

  1. tgx.set_seed(SEED, deterministic=True) or with tgx.reproducible(SEED, deterministic=True):
  2. worker_init_fn set on any custom DataLoader
  3. Sampler seed set explicitly
  4. torch.use_deterministic_algorithms(True) either inside set_seed or called separately
  5. Environment variables: CUBLAS_WORKSPACE_CONFIG=:4096:8 for CUDA 10.2+ (PyTorch requires this for full determinism)
  6. Package versions recorded
  7. System info recorded (the reproducibility_report.json covers this)

The TGraphX context manager handles items 1, 3, 4, and 6 automatically and writes the report referenced in item 7. Items 2 and 5 are external to the framework.
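The environment variable is the easiest item to forget, because it must be in place before cuBLAS is initialized. Exporting it in the shell that launches the job is the most robust option. The sketch below shows an in-process fallback; the helper name ensure_cublas_workspace_config is hypothetical, and it should run before any CUDA work starts.

```python
import os


def ensure_cublas_workspace_config(value: str = ":4096:8") -> str:
    """Set CUBLAS_WORKSPACE_CONFIG if unset and return the effective value.

    Hypothetical helper; call it before any CUDA work starts.
    """
    return os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", value)


effective = ensure_cublas_workspace_config()
```

Using setdefault means a value already exported by the launcher is respected and not overwritten.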

When determinism is impossible

Some operations (atomic adds in scatter, certain reductions) have no deterministic CUDA implementation. If you need full determinism and these operations are in your graph, the options are:

  1. Run on CPU. Slower but deterministic.
  2. Accept partial determinism. Document that operation X is non-deterministic in your methodology.
  3. Replace the operation. Sometimes a manual implementation in PyTorch is deterministic where the fused operation is not.

For most GNN research, option 2 is fine. The methodology section says "trained with cuDNN deterministic mode; minor variance possible from non-deterministic scatter ops."
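As an illustration of option 3, a scatter-style sum can be rewritten as a dense matrix product, which avoids atomic adds. This is a sketch, not a recommendation for large graphs: the helper name dense_scatter_sum is hypothetical, and the one-hot matrix costs memory proportional to the number of messages times the number of nodes. Whether the matmul itself is deterministic on your GPU still depends on the flags and environment variable discussed above.

```python
import torch
import torch.nn.functional as F


def dense_scatter_sum(src: torch.Tensor, index: torch.Tensor, num_nodes: int) -> torch.Tensor:
    """Sum rows of `src` into `num_nodes` buckets via a one-hot matmul.

    Avoids atomic adds, but builds an [E, num_nodes] matrix: small graphs only.
    """
    assign = F.one_hot(index, num_classes=num_nodes).to(src.dtype)  # [E, num_nodes]
    return assign.t() @ src                                          # [num_nodes, features]


src = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # one message per edge
index = torch.tensor([0, 2, 0])                            # target node of each message
out = dense_scatter_sum(src, index, num_nodes=3)
```

Here messages 0 and 2 are summed into node 0, message 1 goes to node 2, and node 1 receives nothing, so its row stays zero.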


FAQ

Q: Why is warn_only=True the default?
A: Because raising on every non-deterministic op breaks legitimate code. Many useful operations (e.g., torch.scatter_reduce on some backends) do not have deterministic kernels in older PyTorch versions. The warning lets you know it happened without crashing.

Q: Does tgx.reproducible() work in Jupyter?
A: Yes. It is a normal Python context manager.

Q: Can I nest tgx.reproducible() calls?
A: Yes. The inner call's seed takes effect for its block; on exit, the outer call's state is restored.

Q: How do I check if the current run is in deterministic mode?
A: torch.are_deterministic_algorithms_enabled() returns True when set.

Q: What about distributed training?
A: Each rank needs its own seed (typically base_seed + rank). Distributed deterministic training is more complex; consult docs/distributed_training.md. The TGraphX distributed subsystem is Experimental.
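The base_seed + rank convention can be sketched as follows. The helper name rank_seed is hypothetical, and the sketch assumes the launcher exports a RANK environment variable, as torchrun does; it does not use the TGraphX distributed subsystem.

```python
import os


def rank_seed(base_seed: int, rank: int) -> int:
    """Derive a distinct per-rank seed from one base seed."""
    return base_seed + rank


# torchrun exports RANK for each process; default to 0 for single-process runs.
rank = int(os.environ.get("RANK", "0"))
seed = rank_seed(42, rank)
```

Per-rank seeds keep data sampling and dropout from being identical on every rank. Anything that must match across ranks, such as initial weights, needs to be synchronized separately.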