Ising models are fundamental in statistical mechanics and machine learning, particularly as a foundational type of Energy-Based Model (EBM). They represent systems of interacting binary variables (spins) and are widely used for modeling phenomena from magnetism to neural networks. THRML provides specialized tools for defining, sampling, and training Ising models efficiently.
This page will guide you through:

- Defining an Ising model with the `IsingEBM` class
- Sampling configurations with block Gibbs sampling
- Training by estimating KL-divergence gradients

An Ising model in THRML is represented by the `thrml.models.IsingEBM` class. This class encapsulates the model's structure (nodes and edges) and its parameters (biases, weights, and the inverse temperature `beta`).
Each variable in an Ising model is a binary spin, represented by a thrml.pgm.SpinNode, which takes values of -1 or +1 (internally False for -1 and True for +1 in JAX).
Let's define a simple 1D Ising chain:
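The exact `IsingEBM` constructor signature isn't reproduced here, so as a library-free sketch of the same ingredients, here is the energy of a 3-spin chain written in plain numpy (THRML itself works with `jax.numpy`; the array layout mirrors the `nodes`/`edges`/`biases`/`weights`/`beta` description that follows):

```python
import numpy as np

# A 3-spin chain: nodes 0-1-2, with edges (0,1) and (1,2).
# In THRML these would be SpinNode objects; plain indices suffice here.
edges = [(0, 1), (1, 2)]
biases = np.array([0.1, -0.2, 0.3])   # biases[i] acts on spin i
weights = np.array([0.5, 0.5])        # weights[k] acts on edges[k]
beta = 1.0                            # inverse temperature

def energy(s):
    """Ising energy E(s) = -(sum_i b_i s_i + sum_(i,j) J_ij s_i s_j)."""
    pair = sum(w * s[i] * s[j] for (i, j), w in zip(edges, weights))
    return -(biases @ s + pair)

s_up = np.array([1.0, 1.0, 1.0])      # all spins +1
print(energy(s_up))                   # -> -1.2, i.e. -(0.1 - 0.2 + 0.3 + 0.5 + 0.5)
```

Lower-energy configurations are more probable under the Boltzmann distribution $p(s) \propto e^{-\beta E(s)}$, which is what sampling and training below operate on.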
In this example:
- `nodes` is a list of `SpinNode` objects.
- `edges` defines the pairs of nodes that interact.
- `biases` is a 1D array, where `biases[i]` corresponds to `nodes[i]`.
- `weights` is a 1D array, where `weights[i]` corresponds to `edges[i]`.
- `beta` is a scalar `jnp.array`.

The `IsingEBM` internally constructs the necessary `SpinEBMFactor` objects to represent these biases and interactions, which are then used by the sampling and training programs.
Sampling an Ising model involves repeatedly updating the states of its spins to generate configurations that are distributed according to its Boltzmann distribution. THRML uses block Gibbs sampling for efficiency.
The core components for sampling are:
- `thrml.block_management.Block`: Groups of nodes that can be sampled in parallel.
- `thrml.models.IsingSamplingProgram`: Combines the `IsingEBM` with a specific partitioning of nodes into free and clamped blocks.
- `thrml.block_sampling.SamplingSchedule`: Defines the number of warm-up steps, samples to collect, and steps between samples.
- `thrml.models.hinton_init`: A heuristic for initializing the spin states.
- `thrml.block_sampling.sample_states`: Executes the sampling process.

This example builds upon the previous model definition.
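`sample_states` performs these updates inside THRML. To make the mechanics concrete, here is a minimal, library-free numpy sketch of block Gibbs sampling on a chain: spins are split into even and odd "checkerboard" blocks (spins within a block are conditionally independent given the rest, so a whole block can be updated at once), and each spin is set to +1 with probability sigmoid(2·beta·local field):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
biases = np.full(n, 0.2)
J = 0.5                                # uniform nearest-neighbour coupling
beta = 1.0
blocks = [np.arange(0, n, 2), np.arange(1, n, 2)]  # even / odd checkerboard

def local_field(s, i):
    # Bias plus couplings from the (at most two) chain neighbours of spin i.
    h = biases[i]
    if i > 0:
        h += J * s[i - 1]
    if i < n - 1:
        h += J * s[i + 1]
    return h

def gibbs_sweep(s):
    # Update one block at a time; within a block all spins resample in parallel.
    for block in blocks:
        fields = np.array([local_field(s, i) for i in block])
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * fields))
        s[block] = np.where(rng.random(len(block)) < p_up, 1, -1)
    return s

s = rng.choice([-1, 1], size=n)        # random init; THRML offers hinton_init
for _ in range(200):                   # warm-up sweeps
    s = gibbs_sweep(s)
print(s)                               # one sample; positive bias favours +1
```

The warm-up/collection/spacing structure of this loop is exactly what a `SamplingSchedule` describes in THRML.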
For a more detailed explanation of the sampling process, including the roles of Block, SamplingSchedule, and sample_states, refer to the Getting Started guide.
Training an EBM often involves minimizing the Kullback-Leibler (KL) divergence between the model's distribution and a target data distribution. For many EBMs, this gradient cannot be computed analytically and requires Monte Carlo estimation. THRML provides estimate_kl_grad for this purpose.
The KL-divergence gradient for an Ising model's parameters (weights J and biases b) is given by:
$$\Delta J_{ij} = -\beta (\langle s_i s_j \rangle_{+} - \langle s_i s_j \rangle_{-})$$ $$\Delta b_i = -\beta (\langle s_i \rangle_{+} - \langle s_i \rangle_{-})$$
Where:

- $\langle \cdot \rangle_{+}$ is an expectation under the positive (data-clamped) phase, in which the data nodes are fixed to observed values.
- $\langle \cdot \rangle_{-}$ is an expectation under the negative (free-running) phase, in which all nodes are sampled from the model.
- $\beta$ is the inverse temperature.
Key components for training:
- `thrml.models.IsingTrainingSpec`: Defines the two sampling programs (positive and negative phase) and their schedules.
- `thrml.models.estimate_moments`: Estimates the first and second moments (averages of $s_i$ and $s_i s_j$) from samples.
- `thrml.models.estimate_kl_grad`: Computes the KL-gradients based on moment estimations from both phases.

Let's set up a small Ising model and estimate its KL-gradients. For this illustrative example, we will compare the Monte Carlo estimate to an analytically derived exact gradient for a tiny model.
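It can help to see the arithmetic these functions perform before using them. The following library-free numpy sketch computes first and second moments from two made-up sample sets (standing in for the positive- and negative-phase samples) and applies the gradient formulas above:

```python
import numpy as np

beta = 1.0
edges = [(0, 1)]                       # a single coupled pair of spins

# Made-up sample sets: each row is one configuration of two spins.
pos_samples = np.array([[ 1,  1], [ 1, 1], [-1,  1], [1, 1]])  # data-clamped phase
neg_samples = np.array([[ 1, -1], [-1, 1], [-1, -1], [1, 1]])  # free-running phase

def moments(samples):
    first = samples.mean(axis=0)                              # <s_i>
    second = np.array([(samples[:, i] * samples[:, j]).mean()  # <s_i s_j>
                       for i, j in edges])
    return first, second

m1_pos, m2_pos = moments(pos_samples)
m1_neg, m2_neg = moments(neg_samples)

# KL gradients from the formulas above.
grad_b = -beta * (m1_pos - m1_neg)     # -> [-0.5, -1.0]
grad_J = -beta * (m2_pos - m2_neg)     # -> [-0.5]
print(grad_b, grad_J)
```

In THRML, `estimate_moments` plays the role of `moments` here, and `estimate_kl_grad` combines the two phases into the gradients.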
In this example, we:

1. Define an `IsingEBM` with 4 nodes.
2. Designate `data_nodes` (observed) and `latent_nodes` (hidden) for the training task.
3. Create a data point where both `data_nodes` are in the -1 state (represented as `False`).
4. Construct an `IsingTrainingSpec` which configures two distinct `IsingSamplingProgram`s and `SamplingSchedule`s:
   - The positive phase samples the `latent_nodes` while the `data_nodes` are clamped to the provided data.
   - The negative phase samples all nodes (`latent_nodes` and `data_nodes`) freely, using the model's current parameters.
5. Initialize the spin states with `hinton_init`.
6. Call `estimate_kl_grad` to get the Monte Carlo estimates of the weight and bias gradients.

This framework allows you to compute gradients for updating your `IsingEBM` parameters using an optimizer (e.g., `optax` or JAX's built-in optimizers) in a typical training loop.
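Such a training loop can be sketched as plain gradient descent on the KL divergence. In this sketch, `estimate_grads` is a hypothetical stand-in for the full sample-then-`estimate_kl_grad` pipeline, returning fixed fake gradients purely to show the update step:

```python
import numpy as np

def estimate_grads(biases, weights):
    # Hypothetical stand-in for THRML's sampling + estimate_kl_grad pipeline.
    # Returns fake constant (grad_b, grad_J) values purely for illustration.
    return np.full_like(biases, -0.1), np.full_like(weights, -0.2)

biases = np.zeros(4)
weights = np.zeros(3)
lr = 0.05                              # learning rate

for step in range(10):
    grad_b, grad_J = estimate_grads(biases, weights)
    biases -= lr * grad_b              # descend the KL-divergence gradient
    weights -= lr * grad_J

print(biases[0], weights[0])
```

An optimizer such as `optax` would replace the two manual subtraction lines with an `update`/`apply_updates` pair, but the data flow is the same.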
By understanding these examples, you are well-equipped to define, sample, and train Ising models within the THRML framework, leveraging JAX's power for efficient computation.