Block sampling

The sampling engine: schedules, programs, and the entry points that run block Gibbs and read states back.

SamplingSchedule class

SamplingSchedule(n_warmup: int, n_samples: int, steps_per_sample: int)

Represents a sampling schedule for a process.

Attributes:

n_warmup: The number of warmup steps to run before collecting samples.
n_samples: The number of samples to collect.
steps_per_sample: The number of steps to run between each sample.
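The three fields can be read as a loop structure. The sketch below is an illustration only: `ToySchedule` and `run_schedule` are hypothetical stand-ins, not thrml objects, and the exact step accounting (for example, whether steps run before the first recorded sample) is an assumption that may differ from the library.

```python
from typing import NamedTuple


class ToySchedule(NamedTuple):
    """Hypothetical stand-in mirroring the three SamplingSchedule fields."""
    n_warmup: int
    n_samples: int
    steps_per_sample: int


def run_schedule(schedule, step, state):
    """Apply `step` under one plausible reading of the schedule.

    Assumption: warmup steps are discarded, then every recorded sample is
    preceded by `steps_per_sample` steps.
    """
    for _ in range(schedule.n_warmup):
        state = step(state)          # warmup: advance, record nothing
    samples = []
    for _ in range(schedule.n_samples):
        for _ in range(schedule.steps_per_sample):
            state = step(state)      # thinning steps between samples
        samples.append(state)        # record one sample
    return samples


# A counter stands in for the chain state so the step counts are visible.
schedule = ToySchedule(n_warmup=10, n_samples=4, steps_per_sample=3)
samples = run_schedule(schedule, step=lambda s: s + 1, state=0)
```

Under this reading the chain runs `n_warmup + n_samples * steps_per_sample` steps in total, and only `n_samples` states are kept.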

n_warmup attribute

n_warmup: int

n_samples attribute

n_samples: int

steps_per_sample attribute

steps_per_sample: int

BlockGibbsSpec class

BlockGibbsSpec(free_super_blocks: Sequence[tuple[Block, ...] | Block], clamped_blocks: list[Block], node_shape_dtypes: Mapping[Type[AbstractNode], PyTree[jax.ShapeDtypeStruct]] = {SpinNode: ShapeDtypeStruct(shape=(), dtype=bool), CategoricalNode: ShapeDtypeStruct(shape=(), dtype=uint8)})

A BlockGibbsSpec is a type of BlockSpec which contains additional information on free and clamped blocks.

BlockGibbsSpec also supports superblocks: groups of blocks that are sampled at the same algorithmic time but not in the same program call. For example, superblock = (block1, block2) means that block1 and block2 receive the same input states, but their updates are not executed together. This is useful when blocks share a color on the graph yet need sampling methods so different that they cannot be parallelized with JAX SIMD-style vectorization.

A recurring theme in thrml is the importance of implicit indexing. One such example can be seen here. Because global states are created by concatenating lists of free and clamped blocks, providing the inputs in the same order as the blocks are defined is essential. This is almost always taken care of internally, but when writing custom functions or interfaces this is important to keep in mind.
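The ordering requirement can be made concrete with a toy. In the sketch below, blocks are plain strings and states are plain lists; these names and the per-block layout are assumptions for illustration, and the library's real global state may be organized differently.

```python
# Hypothetical block labels; in thrml these would be Block objects.
free_blocks = ["f0", "f1"]
clamped_blocks = ["c0"]

# One state entry per block, given in the same order as the blocks.
state_free = [[1, 0, 1], [0, 0]]
state_clamp = [[1, 1, 1, 1]]

# The global view is free blocks first, then clamped blocks.
global_blocks = free_blocks + clamped_blocks
global_state = state_free + state_clamp

# Position is the only link between a block and its state, so a lookup
# is only correct when both lists share the same order.
lookup = dict(zip(global_blocks, global_state))
```

Swapping the two entries of `state_free` would not raise an error here; it would silently attach each state to the wrong block, which is why the order matters.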

Attributes:

free_blocks: the list of free blocks (in order)
sampling_order: a list of len(superblocks) lists, where sampling_order[i] holds the indices into free_blocks of the blocks in the i-th superblock. Sampling iterates over this order and samples each sublist of free blocks at the same algorithmic time.
clamped_blocks: the list of clamped blocks
superblocks: the list of superblocks
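The relationship between free_blocks, sampling_order, and superblocks can be sketched as a flattening step. The helper `flatten_superblocks` is hypothetical and blocks are represented by strings; this is one way the attributes could be derived from the free_super_blocks argument, not the library's code.

```python
def flatten_superblocks(free_super_blocks):
    """Flatten superblocks into an ordered block list plus index groups.

    Each entry is either a single block or a tuple of blocks (a superblock).
    Assumption: single blocks are treated as superblocks of size one.
    """
    free_blocks = []
    sampling_order = []
    for entry in free_super_blocks:
        members = entry if isinstance(entry, tuple) else (entry,)
        indices = []
        for block in members:
            indices.append(len(free_blocks))  # position in the flat list
            free_blocks.append(block)
        sampling_order.append(indices)
    return free_blocks, sampling_order


# "B" and "C" form one superblock: same algorithmic time, separate updates.
free_blocks, sampling_order = flatten_superblocks(["A", ("B", "C"), "D"])
```

Here `sampling_order` has one sublist per superblock, and each sublist indexes into `free_blocks`.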

free_blocks attribute

free_blocks: list[Block]

sampling_order attribute

sampling_order: list[list[int]]

clamped_blocks attribute

clamped_blocks: list[Block]

superblocks attribute

superblocks: list[tuple[Block, ...]]

blocks attribute

blocks: list[Block]

all_block_sds attribute

all_block_sds: list[tuple[PyTree, tuple[jax.ShapeDtypeStruct, ...]]]

global_sd_order attribute

global_sd_order: list[tuple[PyTree, tuple[jax.ShapeDtypeStruct, ...]]]

sd_index_map attribute

sd_index_map: dict[tuple[PyTree, tuple[jax.ShapeDtypeStruct, ...]], int]

node_global_location_map attribute

node_global_location_map: dict[AbstractNode, tuple[int, int]]

block_to_global_slice_spec attribute

block_to_global_slice_spec: list[list[int]]

node_shape_dtypes attribute

node_shape_dtypes: dict[Type[AbstractNode], tuple[PyTree, tuple[jax.ShapeDtypeStruct, ...]]]

node_shape_struct attribute

node_shape_struct: dict[Type[AbstractNode], PyTree[jax.ShapeDtypeStruct]]

BlockSamplingProgram class

BlockSamplingProgram(gibbs_spec: BlockGibbsSpec, samplers: list[AbstractConditionalSampler], interaction_groups: list[InteractionGroup])

A PGM block-sampling program.

This class encapsulates everything that is needed to run a PGM block sampling program in THRML. per_block_interactions and per_block_interaction_active are parallel to the free blocks in gibbs_spec, and their members are passed directly to a sampler when the state of the corresponding free block is being updated during a sampling program. per_block_interaction_global_inds and per_block_interaction_global_slices are also parallel to the free blocks, and are used to slice the global state of the program to produce the state information required to update the state of each block alongside the static information contained in the interactions.

Attributes:

gibbs_spec: A division of some PGM into free and clamped blocks.
samplers: The samplers used to update the free blocks in gibbs_spec, one per free block.
per_block_interactions: All the interactions that touch each free block in gibbs_spec.
per_block_interaction_active: indicates which interactions are real and which interactions are not part of the model and have been added to pad data structures so that they can be rectangular.
per_block_interaction_global_inds: how to find the information required to update each block within the global state list
per_block_interaction_global_slices: how to slice each array in the global state list to find the information required to update each block
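The role of per_block_interaction_active can be illustrated with a small padding example. The helper `pad_interactions` and the neighbor-list layout are hypothetical; the sketch only shows the general idea of padding ragged data to a rectangle and masking out the padding, not the library's data structures.

```python
def pad_interactions(neighbors_per_node, pad_value=0):
    """Pad ragged neighbor lists to a rectangle and return an active mask."""
    width = max(len(row) for row in neighbors_per_node)
    padded, active = [], []
    for row in neighbors_per_node:
        missing = width - len(row)
        padded.append(list(row) + [pad_value] * missing)
        # True marks a real interaction, False marks padding.
        active.append([True] * len(row) + [False] * missing)
    return padded, active


# Three nodes on a chain: the end nodes have one neighbor, the middle has two.
padded, active = pad_interactions([[1], [0, 2], [1]])

# The mask keeps padded entries from contributing to the result.
state = [1, -1, 1]
fields = [
    sum(state[j] for j, on in zip(row, mask) if on)
    for row, mask in zip(padded, active)
]
```

Without the mask, the padded index 0 would be read as a real neighbor and the end nodes would receive a spurious contribution from `state[0]`.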

gibbs_spec attribute

gibbs_spec: BlockGibbsSpec

samplers attribute

samplers: list[AbstractConditionalSampler]

per_block_interactions attribute

per_block_interactions: list[list[PyTree]]

per_block_interaction_active attribute

per_block_interaction_active: list[list[Array]]

per_block_interaction_global_inds attribute

per_block_interaction_global_inds: list[list[list[int]]]

per_block_interaction_global_slices attribute

per_block_interaction_global_slices: list[list[list[Array]]]

sample_states function

sample_states(key: Key[Array, ''], program: BlockSamplingProgram, schedule: SamplingSchedule, init_state_free: list[PyTree[Shaped[Array, 'nodes ?*state']]], state_clamp: list[PyTree[Shaped[Array, 'nodes ?*state'], '_State']], nodes_to_sample: list[Block]) -> list[PyTree[Shaped[Array, 'n_samples nodes ?*state']]]

Convenience wrapper to collect state information for nodes_to_sample only.

Internally builds a thrml.StateObserver, runs thrml.sample_with_observation, and returns a stacked tensor of shape (schedule.n_samples, ...).

sample_blocks function

sample_blocks(key: Key[Array, ''], state_free: list[PyTree[Shaped[Array, 'nodes ?*state'], '_State']], clamp_state: list[PyTree[Shaped[Array, 'nodes ?*state'], '_State']], program: BlockSamplingProgram, sampler_state: list[~_SamplerState]) -> tuple[list[PyTree[Shaped[Array, 'nodes ?*state'], '_State']], list[~_SamplerState]]

Perform one iteration of sampling, visiting every block.

Arguments:

key: The JAX PRNG key.
state_free: The state of the free blocks.
clamp_state: The state of the clamped blocks.
program: The Gibbs program.
sampler_state: The state of the sampler.

Returns:

Updated free-block state list and sampler-state list.
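For intuition, one iteration of block Gibbs sampling that visits every block can be sketched on a toy Ising chain. This is a plain-Python illustration with hypothetical names (`sweep`, `neighbors`, `blocks`), unit couplings, and no clamped nodes; it is not how sample_blocks is implemented.

```python
import math
import random


def sweep(rng, state, blocks, neighbors, beta=1.0):
    """One iteration: visit every block once and resample its spins."""
    for block in blocks:
        # Compute every conditional from the state as it was before this
        # block's update, so the block's nodes are updated as a unit.
        new_values = {}
        for i in block:
            field = sum(state[j] for j in neighbors[i])
            # Ising conditional for energy E = -sum_{<i,j>} s_i s_j.
            p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
            new_values[i] = 1 if rng.random() < p_up else -1
        for i, value in new_values.items():
            state[i] = value
    return state


# 4-spin chain 0-1-2-3, two-colored so no block contains neighboring spins.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
blocks = [[0, 2], [1, 3]]

rng = random.Random(0)
state = [1, -1, 1, -1]
for _ in range(5):
    state = sweep(rng, state, blocks, neighbors)
```

Because the blocks follow a two-coloring, the spins inside a block are conditionally independent given the other block, which is what makes updating them together valid.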

sample_single_block function

sample_single_block(key: Key[Array, ''], state_free: list[PyTree[Shaped[Array, 'nodes ?*state'], '_State']], clamp_state: list[PyTree[Shaped[Array, 'nodes ?*state'], '_State']], program: BlockSamplingProgram, block: int, sampler_state: ~_SamplerState, global_state: list[PyTree] | None = None) -> tuple[PyTree[Shaped[Array, 'nodes ?*state'], '_State'], ~_SamplerState]

Sample a single block within a Gibbs sampling program, given the current states and the program configuration. The function extracts the neighboring states the block depends on from the global state, then applies the block's sampler to produce the block's new state.

Arguments:

key: Pseudo-random number generator key to ensure reproducibility of sampling.
state_free: Current states of free blocks, representing the values to be updated during sampling.
clamp_state: Clamped states that remain fixed during the sampling process.
program: The Gibbs sampling program containing specifications, samplers, neighborhood information, and parameters.
block: Index of the block to be sampled in the current iteration.
sampler_state: The current state of the sampler that will be used to perform the update.
global_state: Optional precomputed global state for the concatenated free and clamped blocks; when omitted, the function constructs it internally.

Returns:

Updated block state and sampler state for the specified block.

sample_with_observation function

sample_with_observation(key: Key[Array, ''], program: BlockSamplingProgram, schedule: SamplingSchedule, init_chain_state: list[PyTree[Shaped[Array, 'nodes ?*state']]], state_clamp: list[PyTree[Shaped[Array, 'nodes ?*state'], '_State']], observation_carry_init: ~ObserveCarry, f_observe: AbstractObserver) -> tuple[~ObserveCarry, list[PyTree[Shaped[Array, 'n_samples nodes ?*state']]]]

Run the full chain and call an Observer after every recorded sample.

Arguments:

key: RNG key.
program: The sampling program.
schedule: Warm-up length, number of samples, number of steps between samples.
init_chain_state: Initial free-block state.
state_clamp: Clamped-block state.
observation_carry_init: Initial carry handed to f_observe.
f_observe: Observer instance.

Returns:

Tuple (final_observer_carry, samples), where samples is a list of PyTrees whose leading axis has size schedule.n_samples.
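The carry-and-observe pattern can be sketched independently of the library. Everything below is hypothetical (`run_with_observer`, `running_sum_observer`, the integer state, and the observer's call signature); it assumes the observer takes the current state and a carry and returns an updated carry plus the value to record.

```python
def run_with_observer(step, observe, state, carry,
                      n_warmup, n_samples, steps_per_sample):
    """Run a chain and call `observe` once per recorded sample."""
    for _ in range(n_warmup):
        state = step(state)              # warmup: nothing is observed
    recorded = []
    for _ in range(n_samples):
        for _ in range(steps_per_sample):
            state = step(state)
        carry, out = observe(state, carry)  # thread the carry through
        recorded.append(out)
    return carry, recorded


def running_sum_observer(state, carry):
    """Toy observer: accumulate a running sum, emit the state itself."""
    return carry + state, state


final_carry, samples = run_with_observer(
    step=lambda s: s + 1,
    observe=running_sum_observer,
    state=0,
    carry=0,
    n_warmup=2,
    n_samples=3,
    steps_per_sample=2,
)
```

The carry lets an observer keep running statistics across samples without storing every state, while the per-sample outputs are stacked into the returned samples.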