Tutorial · June 06, 2026 · 4 min read

Evolutionary Optimization over Graph Structures with TGraphX

Target keyword: evolutionary optimization graph neural networks

Evolutionary Optimization over Graph Structures with TGraphX

When the search space is over graph structures — adding edges to maximize a property, finding a sparse subgraph with high connectivity, designing a graph that optimizes multiple competing objectives — gradient-based methods do not apply directly. Evolutionary algorithms are a natural fit: they handle discrete combinatorial spaces, do not require differentiable objectives, and can explore multiple modes.

TGraphX includes an evolutionary optimization subsystem with GA, simulated annealing, NSGA-II, and hill climbing. This article walks through using it, with the caveat that the entire subsystem is Experimental.

What the subsystem includes

Algorithms:

GeneticAlgorithmOptimizer — population-based GA with mutation and crossover
SimulatedAnnealingOptimizer — single-state SA with temperature schedule
NSGAIIOptimizer — multi-objective optimization with Pareto front
HillClimbingOptimizer — local search baseline

Fitness functions (built-in):

connectivity_fitness — maximize connected component coverage
sparsity_fitness — minimize edge count
density_fitness — maximize edge density
composite_fitness — weighted sum of multiple objectives

Genome operations:

GraphGenome — encoding of a graph as a mutable genotype
mutate_add_node, mutate_add_edge, mutate_node_feature — mutation operators

A first experiment

The one-call API runs a complete optimization:

python

import tgraphx as tgx
        
        result = tgx.optimize_graph(
            objective="connectivity",
            algorithm="ga",
            num_nodes=30,
            num_generations=100,
            seed=42,
        )
        
        print(f"Best fitness: {result.best_fitness:.4f}")
        print(f"Best genome: {result.best_genome}")

This evolves a population of 30-node graphs over 100 generations to maximize connectivity. The result includes the best graph found, its fitness, and the optimization history.

Switching objectives

Each objective has a different shape and produces different optimal graphs:

python

for obj in ["connectivity", "sparsity", "density"]:
            res = tgx.optimize_graph(objective=obj, algorithm="ga", num_nodes=20, seed=42)
            print(f"{obj:15s}: best fitness = {res.best_fitness:.4f}")

Connectivity rewards graphs that are connected — penalizes isolated nodes.

Sparsity rewards graphs with few edges.

Density rewards graphs with many edges relative to the maximum possible.

These can conflict. A maximally-dense graph is also maximally-connected but not sparse.

Multi-objective optimization with NSGA-II

For competing objectives, use NSGA-II to find the Pareto front:

python

result = tgx.optimize_graph(
            objective={
                "connectivity": 1.0,    # weight
                "sparsity": -1.0,       # negative — we want few edges
            },
            algorithm="nsga2",
            num_nodes=20,
            num_generations=200,
            seed=42,
        )
        
        # Pareto front: graphs that trade off connectivity and sparsity
        for genome in result.pareto_front[:5]:
            print(f"  conn={genome.connectivity:.3f}  sparse={genome.sparsity:.3f}")

NSGA-II returns a Pareto front rather than a single best solution. Each point represents a trade-off: more connectivity at the cost of more edges, or vice versa.

This is the right tool when "best" is not a single objective.

Custom fitness functions

For objectives not built into the framework:

python

from tgraphx.evolutionary import GeneticAlgorithmOptimizer
        
        def my_fitness(genome):
            g = genome.to_graph()
            # Compute whatever you want
            return some_custom_score(g)
        
        opt = GeneticAlgorithmOptimizer(
            fitness_fn=my_fitness,
            num_nodes=30,
            population_size=50,
            mutation_rate=0.1,
        )
        
        best = opt.run(num_generations=100)

The optimizer calls my_fitness(genome) for every genome in every generation. For expensive fitness functions, this can be slow.

A real-world pattern: NAS-like architecture search

The framework's evolutionary subsystem is designed for graph structures, but the same pattern applies to neural architecture search where the search space is a computational graph:

python

def gnn_fitness(genome):
            # Construct a GNN architecture from the genome
            model = build_gnn_from_genome(genome)
            # Train briefly and return validation accuracy
            score = quick_train(model, data, epochs=5)
            return score

This is conceptually simple but computationally expensive. For practical NAS, weight-sharing methods (DARTS, ProxylessNAS) are usually faster than pure evolutionary search.

Honest limitations

The Experimental label is real. Some genome operators have edge cases. Test on small problems first.
The hill climbing and SA optimizers are simpler than GA; for hard problems, GA usually wins.
Population sizes default to 50; for hard problems consider 100-200.
The framework does not handle parallel fitness evaluation by default. For expensive fitness, you will want to parallelize manually.
There is no published comparison to dedicated NAS frameworks. Use TGraphX's evolutionary search for exploratory research, not for production NAS.

When this subsystem is the right fit

Combinatorial graph design problems with a clear objective.
Research where you want multi-objective trade-off curves.
Educational use — the algorithms are clean and easy to understand.

When it is not

Production-grade NAS for deep models.
Large-scale combinatorial optimization (use specialized tools like Gurobi or OR-Tools).
Multi-task optimization across many parallel problems.

Reproducibility

All optimizers accept a seed argument. Wrap experiments in the standard reproducibility context:

python

with tgx.reproducible(seed=42, deterministic=True):
            result = tgx.optimize_graph(objective="connectivity", algorithm="ga", ...)

Note that evolutionary algorithms have many sources of randomness (initial population, mutation, crossover, selection). Seeded runs will be reproducible; the same population evolves identically.

FAQ

Q: How large can the graphs be?
A: The framework has been tested on graphs up to a few hundred nodes. For thousands of nodes, the genome representation and operators become expensive. Consider problem-specific encodings.

Q: Can I run multiple optimizations in parallel?
A: Yes, with Python's multiprocessing. Each run is independent. Aggregate results yourself.

Q: Does the framework support population diversity tracking?
A: NSGA-II tracks Pareto diversity. The GA reports basic statistics. For detailed diversity analysis, you will need to instrument the fitness function manually.

Q: What about hybrid GA + local search?
A: Not built in. You can wrap any optimizer's output with a hill-climbing phase manually.

Q: Is there a documented example of NAS using this?
A: The examples/evolutionary_graph_optimization_demo.py in the TGraphX repository shows the basic pattern. NAS-specific examples are not shipped.