Today, consensus clients cannot easily provide individual pieces of data from the BeaconState together with the proofs needed to verify them. Ethereum’s Light Client system defines some proof paths, but there is no universal or standard way for clients to generate or serve these proofs. Downloading the entire BeaconState is not realistic—the state for slot 12,145,344 is around 271 MB, which is too large to send over the network quickly and puts unnecessary load on both the node and the user. The spec even warns that the debug endpoints used for fetching full states are meant only for diagnostics, not real-world use.
A much better solution is to use Merkle proofs or multiproofs, which allow the provider to send only a very small, verifiable part of the state. This is especially useful because most of the state size comes from validators (~232 MB) and balances (~15 MB); the remaining fields account for about 24 MB. If a user needs only one small field, it’s wasteful to download the entire 271 MB state. Instead, a Merkle proof can deliver just the requested leaf plus its authentication path, usually only a few kilobytes.
Because of this, we need a general and standardized way for clients to request only the data they need, along with the proof required to verify it. This reduces bandwidth, reduces CPU load, and replaces today’s scattered and custom implementations (for example, Nimbus’s special handling of historical_summaries).
This work is also important for the future of Ethereum. SSZ is becoming more central to the protocol: Pureth (EIP-7919) proposes replacing RLP with SSZ, and the upcoming beam chain (also called the lean chain) will leverage SSZ as its only serialization format. So building a clean, efficient, and standard method for proof-based data access is a key step toward future protocol upgrades.
Proposed Solution: Introducing the SSZ Query Language (SSZ-QL)
The idea of SSZ-QL was originally proposed by Etan Kissling. His main question was straightforward but powerful:
“What if we had a standard way to request any SSZ field — together with a Merkle proof — directly from any consensus client?”
Today, consensus clients do not offer a general or standardized method to request specific SSZ data with proofs. Some ad-hoc solutions exist (for example, Nimbus’ basic queries used by the verifying web3signer), but there is no proper, universal SSZ query language available—and certainly nothing ready at the time this idea was written.
Etan’s proposal describes what an SSZ Query Language should allow:
- Requesting any subtree inside an SSZ object
- Choosing whether a field should be fully expanded or returned only as a hash_tree_root
- Filtering (for example, finding a transaction with a certain root)
- Using back-references (e.g., retrieving the receipt at the same index as a matching transaction)
- Specifying where the proof should be anchored
- Supporting forward compatibility so clients can safely ignore unknown future fields
This kind of API could be used by both consensus and execution clients. With forward-compatible SSZ types (like those from EIP-7495), request and response structures can even be generated automatically.
Building on this idea, Jun and Fernando, who are developing this as part of their EPF project in Prysm, propose adding a new Beacon API endpoint that supports SSZ Query Language (SSZ-QL). This endpoint lets users fetch exactly the SSZ data they need, no more and no less, together with a Merkle proof that verifies its correctness. The initial version will offer a minimal but practical feature set, which already covers most real use cases. (The draft API specification is available for review.)
Beyond this minimal version, they also plan to create a full SSZ-QL specification. This expanded version will support advanced features such as filtering, requesting data ranges, and choosing custom anchor points, all with Merkle proofs included. They intend to propose this richer specification for inclusion in the official consensus specifications, and an early draft is already available for review.
Understanding Generalized Indexes (GI) Before Diving Into SSZ-QL
In SSZ, every object — including the entire BeaconState — is represented as a binary Merkle tree.
A generalized index (GI) is simply a number that uniquely identifies any node inside this tree.
The rules are very simple:
- Root node has generalized index: GI = 1
- For any node with index i:
  left child = 2*i,
  right child = 2*i + 1
So the whole tree is numbered like:
           GI:1
          /    \
      GI:2      GI:3
      /  \      /  \
   GI:4  GI:5 GI:6  GI:7
   ...
This numbering makes Merkle proofs easy. If you know the generalized index of a leaf, you know exactly where it sits in the tree and which sibling hashes must be included to verify it.
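The numbering rules above can be sketched in a few lines of Go. The helper names (`parent`, `sibling`, `branchIndices`) are made up for this illustration and are not taken from Prysm or the spec; the sketch only shows how the sibling positions of a proof follow from the generalized index alone.

```go
package main

import "fmt"

// parent returns the generalized index of the node directly above gi.
func parent(gi uint64) uint64 { return gi / 2 }

// sibling returns the node that shares a parent with gi (flip the lowest bit).
func sibling(gi uint64) uint64 { return gi ^ 1 }

// branchIndices lists the generalized indices of the sibling nodes whose
// hashes a proof for gi must contain, ordered from the leaf level upward.
func branchIndices(gi uint64) []uint64 {
	var out []uint64
	for gi > 1 {
		out = append(out, sibling(gi))
		gi = parent(gi)
	}
	return out
}

func main() {
	// Proving node GI 5 in the small tree above needs the hashes at GI 4 and GI 3.
	fmt.Println(branchIndices(5))
}
```

Because the path is fully determined by the index, a verifier never needs to be told which side each sibling is on.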
Example with Beacon State:
0 GenesisTime string
1 GenesisValidatorsRoot string
2 Slot string
3 Fork *Fork
4 LatestBlockHeader *BeaconBlockHeader
5 BlockRoots []string
6 StateRoots []string
7 HistoricalRoots []string
8 Eth1Data *Eth1Data
9 Eth1DataVotes []*Eth1Data
10 Eth1DepositIndex string
11 Validators []*Validator ← (p = 11)
12 Balances []string
13 RandaoMixes []string
14 Slashings []string
15 PreviousEpochAttestations []*PendingAttestation
16 CurrentEpochAttestations []*PendingAttestation
17 JustificationBits string
18 PreviousJustifiedCheckpoint *Checkpoint
19 CurrentJustifiedCheckpoint *Checkpoint
20 FinalizedCheckpoint *Checkpoint
There are 21 top-level fields (indexed 0..20). To place these into a Merkle tree, SSZ pads them up to the next power of two (32).
32 leaves → depth = 5.
Top-level leaves occupy the GI range:
32 ... 63
We compute the GI for a top-level field using:
Formula:
GI_top = 2^depth + field_index
For .validators, field index = 11
So: GI_validators = 2^5 + 11 = 32 + 11 = 43.
This GI (43) is the leaf commitment of the entire validators subtree inside the global BeaconState tree.
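As a quick check of the formula, here is a minimal Go sketch. The function names are hypothetical, and it assumes the 21-field BeaconState listed above; a state with a different field count would give a different depth.

```go
package main

import (
	"fmt"
	"math/bits"
)

// nextPowerOfTwo returns the smallest power of two that is >= n.
func nextPowerOfTwo(n uint64) uint64 {
	if n <= 1 {
		return 1
	}
	return 1 << bits.Len64(n-1)
}

// containerFieldGI returns the generalized index of field fieldIndex in a
// container with fieldCount fields, relative to the container's own root.
func containerFieldGI(fieldCount, fieldIndex uint64) uint64 {
	return nextPowerOfTwo(fieldCount) + fieldIndex
}

func main() {
	// validators is field 11 of the 21-field BeaconState: 32 + 11.
	fmt.Println(containerFieldGI(21, 11))
}
```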
Multi-Level Proof: Example With validators[42].withdrawal_credentials
Now, suppose we want a proof for:
BeaconState.validators[42].withdrawal_credentials
This requires two levels of proof:
1. Prove that the entire validators subtree is included in the BeaconState root
We already know:
- Top-level GI for validators = 43
Using GI 43, the consensus client collects the sibling hashes on the path from leaf 43 up to the root (e.g., GI 43 → 21 → 10 → 5 → 2 → 1).
This gives the proof:
validators_root ---> BeaconState_root
2. Prove that validators[42].withdrawal_credentials is inside the validators subtree
Now treat the validators list as its own Merkle tree.
Inside this subtree:
- Validator 42 is the element at index 42 → it maps to some leaf index (e.g. chunk k) inside this subtree.
- Withdrawal credentials lives inside one of the 32-byte SSZ chunks of validator #42 (for example chunk k = 128; the number doesn’t matter, just the concept).
We now generate:
leaf (withdrawal_credentials chunk) ---> validators_root
by collecting sibling hashes inside the local validators subtree.
Final Combined Proof
You end up with:
1. Local-level proof: proves withdrawal_credentials --> validators_root
2. Top-level branch proof: proves validators_root --> BeaconState_root
A verifier can now reconstruct the BeaconState root, and check it against the known one, from only:
- the requested leaf
- the two lists of sibling nodes
- the known BeaconState root
No full state download needed.
┌───────────────────────────────┐
│       BeaconState Root        │
└───────────────────────────────┘
                ▲
                │ (Top-level Merkle Proof)
                │ Sibling hashes for GI = 43
                │
┌─────────────────────────────────────────┐
│        validators_root (GI = 43)        │
└─────────────────────────────────────────┘
                ▲
                │ (Local Subtree Proof)
                │ Proof inside validators list
                │ for index = 42
                │
┌─────────────────────────────────────────────────────────┐
│        Validator[42] Subtree (list element #42)         │
└─────────────────────────────────────────────────────────┘
                ▲
                │ (Field-level Merkle Proof)
                │ Sibling hashes inside the
                │ validator struct
                │
┌──────────────────────────────────────────┐
│   validator[42].withdrawal_credentials   │ ← requested field
└──────────────────────────────────────────┘
Understanding SSZ Serialization Before Computing Generalized Indices
To compute a correct generalized index, you must first understand how SSZ serializes and merklizes different data types.
Generalized indices don’t exist in isolation: they are derived from the shape of the Merkle tree, and the shape of the tree depends entirely on how SSZ interprets the underlying Go struct fields.
In SSZ, each field can only be one of two categories:
- Base Types (fixed-size values)
  uint64, Bytes32, Bytes20, uint256, etc. These are straightforward: they always serialize into a fixed number of bytes.
- Composite Types
  Container (like BeaconState), Vector[T, N] (fixed length), List[T, N] (variable length), Bitvector[N], Bitlist[N]
Each of them is serialized in a slightly different way.
To compute a generalized index (g-index) for any field inside a state, the SSZ tree must first know how that field is serialized. This is why the generated *.pb.go files include tags such as:
ssz-size:"8192,32" → Vector
ssz-max:"16" → List
ssz-size:"?,32" → List of Vector
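To make the tag conventions concrete, here is a small Go sketch that reads these tags with the standard reflect package. The `example` struct and the `classify` helper are hypothetical and far simpler than what Prysm's analyzer does; they only mirror the three tag shapes listed above.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// example is a made-up struct that reuses the tag shapes shown above.
type example struct {
	BlockRoots      [][]byte `ssz-size:"8192,32"`
	Votes           []uint64 `ssz-max:"16"`
	HistoricalRoots [][]byte `ssz-max:"16777216" ssz-size:"?,32"`
	Slot            uint64
}

// classify gives a rough SSZ category for a struct field from its tags alone.
func classify(f reflect.StructField) string {
	size, hasSize := f.Tag.Lookup("ssz-size")
	_, hasMax := f.Tag.Lookup("ssz-max")
	switch {
	case hasSize && strings.HasPrefix(size, "?"):
		return "List of Vector" // unknown outer length, fixed inner size
	case hasMax:
		return "List"
	case hasSize:
		return "Vector"
	default:
		return "no SSZ tag"
	}
}

// classifyField looks a field up by name on the example struct.
func classifyField(name string) string {
	f, ok := reflect.TypeOf(example{}).FieldByName(name)
	if !ok {
		return "unknown field"
	}
	return classify(f)
}

func main() {
	t := reflect.TypeOf(example{})
	for i := 0; i < t.NumField(); i++ {
		fmt.Printf("%-16s %s\n", t.Field(i).Name, classify(t.Field(i)))
	}
}
```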
To compute a generalized index for any field, we must first understand the SSZ structure of the object:
- which fields exist,
- whether each field is a List or Vector,
- how many chunks each field occupies,
- and how nested types should be traversed.
This is exactly what the AnalyzeObject function does in Prysm, located at encoding/ssz/query/analyzer.go:
// AnalyzeObject analyzes given object and returns its SSZ information.
func AnalyzeObject(obj SSZObject) (*SszInfo, error) {
	value := reflect.ValueOf(obj)
	info, err := analyzeType(value, nil)
	if err != nil {
		return nil, fmt.Errorf("could not analyze type %s: %w", value.Type().Name(), err)
	}
	// Populate variable-length information using the actual value.
	err = PopulateVariableLengthInfo(info, value)
	if err != nil {
		return nil, fmt.Errorf("could not populate variable length info for type %s: %w", value.Type().Name(), err)
	}
	return info, nil
}
What analyzeType Does
analyzeType is the function that examines a Go value using reflection and figures out what kind of SSZ type it is. It is a pure type-analysis step — it does not depend on the actual runtime values, only on the Go type and the struct tags.
When you give it a field or struct, it:
- Checks the Go kind (uint, struct, slice, pointer, etc.)
- Reads SSZ-related struct tags like ssz-size and ssz-max
- Decides whether this field is:
  - a basic SSZ type (uint64, uint32, bool)
  - a Vector (ssz-size:"N")
  - a List (ssz-max:"N")
  - a Bitvector / Bitlist
  - a Container (struct)
- Builds an SszInfo record that describes:
  - the SSZ type (List, Vector, Container...)
  - whether it is fixed-sized or variable-sized
  - offsets of fields (for Containers)
  - nested SSZ information for child fields
Think of analyzeType as the function that scans the type definition and produces a static SSZ layout blueprint for this type.
What PopulateVariableLengthInfo Does
While analyzeType studies the type, some SSZ objects cannot be fully described without the actual value.
Examples:
- Lists ([]T) need to know their current length
- Variable-sized container fields need their actual offset
- Nested lists need each element’s actual size
PopulateVariableLengthInfo fills in this missing runtime information.
It:
- Looks at the SszInfo blueprint created by analyzeType
- Looks at the actual value of the object passed
- Computes values that can only be known at runtime:
  - length of Lists
  - sizes of nested variable elements
  - offsets of variable-sized fields inside Containers
  - bitlist length from bytes
It processes everything recursively — for example, a Container with a List containing structs with Lists will all be filled in.
Think of PopulateVariableLengthInfo as the function that takes the blueprint from analyzeType and fills in the real measurements based on the actual value you pass.
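To illustrate why the actual value is needed, here is a hedged Go sketch of how field offsets in an SSZ container depend on the sizes of its variable-size fields. The `field` type and `layout` function are hypothetical and are not Prysm's implementation; they follow the general SSZ rule that the fixed part holds fixed-size data plus one 4-byte offset per variable-size field, with the variable data appended afterwards.

```go
package main

import "fmt"

const offsetSize = 4 // SSZ stores a 4-byte offset for each variable-size field

// field describes one container field. For fixed-size fields, size is static;
// for variable-size fields, size is only known once a value is available.
type field struct {
	name     string
	variable bool
	size     int
}

// layout returns the byte offset at which each field's data begins.
func layout(fields []field) map[string]int {
	// Pass 1: size of the fixed part.
	fixedPart := 0
	for _, f := range fields {
		if f.variable {
			fixedPart += offsetSize
		} else {
			fixedPart += f.size
		}
	}
	// Pass 2: fixed fields sit in the fixed part; variable data follows it.
	offsets := make(map[string]int, len(fields))
	fixedCursor, variableCursor := 0, fixedPart
	for _, f := range fields {
		if f.variable {
			offsets[f.name] = variableCursor
			variableCursor += f.size
			fixedCursor += offsetSize
		} else {
			offsets[f.name] = fixedCursor
			fixedCursor += f.size
		}
	}
	return offsets
}

func main() {
	fields := []field{
		{name: "a", size: 8},
		{name: "b", variable: true, size: 10},
		{name: "c", size: 8},
		{name: "d", variable: true, size: 6},
	}
	fmt.Println(layout(fields))
}
```

Changing the size of `b` moves where `d` starts but leaves `a` and `c` untouched, which is the part of the layout that only a concrete value can settle.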
Example:
Let's test this function by passing it a BeaconState struct:
type BeaconState struct {
state protoimpl.MessageState `protogen:"open.v1"`
GenesisTime uint64 `protobuf:"varint,1001,opt,name=genesis_time,json=genesisTime,proto3" json:"genesis_time,omitempty"`
GenesisValidatorsRoot []byte `protobuf:"bytes,1002,opt,name=genesis_validators_root,json=genesisValidatorsRoot,proto3" json:"genesis_validators_root,omitempty" ssz-size:"32"`
Slot github_com_OffchainLabs_prysm_v7_consensus_types_primitives.Slot `protobuf:"varint,1003,opt,name=slot,proto3" json:"slot,omitempty" cast-type:"github.com/OffchainLabs/prysm/v7/consensus-types/primitives.Slot"`
Fork *Fork `protobuf:"bytes,1004,opt,name=fork,proto3" json:"fork,omitempty"`
LatestBlockHeader *BeaconBlockHeader `protobuf:"bytes,2001,opt,name=latest_block_header,json=latestBlockHeader,proto3" json:"latest_block_header,omitempty"`
BlockRoots [][]byte `protobuf:"bytes,2002,rep,name=block_roots,json=blockRoots,proto3" json:"block_roots,omitempty" ssz-size:"8192,32"`
StateRoots [][]byte `protobuf:"bytes,2003,rep,name=state_roots,json=stateRoots,proto3" json:"state_roots,omitempty" ssz-size:"8192,32"`
HistoricalRoots [][]byte `protobuf:"bytes,2004,rep,name=historical_roots,json=historicalRoots,proto3" json:"historical_roots,omitempty" ssz-max:"16777216" ssz-size:"?,32"`
Eth1Data *Eth1Data `protobuf:"bytes,3001,opt,name=eth1_data,json=eth1Data,proto3" json:"eth1_data,omitempty"`
Eth1DataVotes []*Eth1Data `protobuf:"bytes,3002,rep,name=eth1_data_votes,json=eth1DataVotes,proto3" json:"eth1_data_votes,omitempty" ssz-max:"2048"`
Eth1DepositIndex uint64 `protobuf:"varint,3003,opt,name=eth1_deposit_index,json=eth1DepositIndex,proto3" json:"eth1_deposit_index,omitempty"`
Validators []*Validator `protobuf:"bytes,4001,rep,name=validators,proto3" json:"validators,omitempty" ssz-max:"1099511627776"`
Balances []uint64 `protobuf:"varint,4002,rep,packed,name=balances,proto3" json:"balances,omitempty" ssz-max:"1099511627776"`
RandaoMixes [][]byte `protobuf:"bytes,5001,rep,name=randao_mixes,json=randaoMixes,proto3" json:"randao_mixes,omitempty" ssz-size:"65536,32"`
Slashings []uint64 `protobuf:"varint,6001,rep,packed,name=slashings,proto3" json:"slashings,omitempty" ssz-size:"8192"`
PreviousEpochAttestations []*PendingAttestation `protobuf:"bytes,7001,rep,name=previous_epoch_attestations,json=previousEpochAttestations,proto3" json:"previous_epoch_attestations,omitempty" ssz-max:"4096"`
CurrentEpochAttestations []*PendingAttestation `protobuf:"bytes,7002,rep,name=current_epoch_attestations,json=currentEpochAttestations,proto3" json:"current_epoch_attestations,omitempty" ssz-max:"4096"`
JustificationBits github_com_OffchainLabs_go_bitfield.Bitvector4 `protobuf:"bytes,8001,opt,name=justification_bits,json=justificationBits,proto3" json:"justification_bits,omitempty" cast-type:"github.com/OffchainLabs/go-bitfield.Bitvector4" ssz-size:"1"`
PreviousJustifiedCheckpoint *Checkpoint `protobuf:"bytes,8002,opt,name=previous_justified_checkpoint,json=previousJustifiedCheckpoint,proto3" json:"previous_justified_checkpoint,omitempty"`
CurrentJustifiedCheckpoint *Checkpoint `protobuf:"bytes,8003,opt,name=current_justified_checkpoint,json=currentJustifiedCheckpoint,proto3" json:"current_justified_checkpoint,omitempty"`
FinalizedCheckpoint *Checkpoint `protobuf:"bytes,8004,opt,name=finalized_checkpoint,json=finalizedCheckpoint,proto3" json:"finalized_checkpoint,omitempty"`
unknownFields protoimpl.UnknownFields
sizeCache protoimpl.SizeCache
}
package main

import (
	"fmt"
	"log"

	"github.com/OffchainLabs/prysm/v7/encoding/ssz/query"
	eth "github.com/OffchainLabs/prysm/v7/proto/prysm/v1alpha1"
)

func main() {
	// An empty BeaconState is enough to inspect the layout.
	v := &eth.BeaconState{}

	// Analyze it with Prysm's existing SSZ analyzer.
	info, err := query.AnalyzeObject(v)
	if err != nil {
		log.Fatalf("could not analyze BeaconState: %v", err)
	}
	fmt.Println(info.Print())
}
Output:
BeaconState (Variable-size / size: 2687377)
├─ genesis_time (offset: 0) uint64 (Fixed-size / size: 8)
├─ genesis_validators_root (offset: 8) Bytes32 (Fixed-size / size: 32)
├─ slot (offset: 40) Slot (Fixed-size / size: 8)
├─ fork (offset: 48) Fork (Fixed-size / size: 16)
│ ├─ previous_version (offset: 0) Bytes4 (Fixed-size / size: 4)
│ ├─ current_version (offset: 4) Bytes4 (Fixed-size / size: 4)
│ └─ epoch (offset: 8) Epoch (Fixed-size / size: 8)
├─ latest_block_header (offset: 64) BeaconBlockHeader (Fixed-size / size: 112)
│ ├─ slot (offset: 0) Slot (Fixed-size / size: 8)
│ ├─ proposer_index (offset: 8) ValidatorIndex (Fixed-size / size: 8)
│ ├─ parent_root (offset: 16) Bytes32 (Fixed-size / size: 32)
│ ├─ state_root (offset: 48) Bytes32 (Fixed-size / size: 32)
│ └─ body_root (offset: 80) Bytes32 (Fixed-size / size: 32)
├─ block_roots (offset: 176) Vector[Bytes32, 8192] (Fixed-size / size: 262144)
├─ state_roots (offset: 262320) Vector[Bytes32, 8192] (Fixed-size / size: 262144)
├─ historical_roots (offset: 2687377) List[Bytes32, 16777216] (Variable-size / length: 0, size: 0)
├─ eth1_data (offset: 524468) Eth1Data (Fixed-size / size: 72)
│ ├─ deposit_root (offset: 0) Bytes32 (Fixed-size / size: 32)
│ ├─ deposit_count (offset: 32) uint64 (Fixed-size / size: 8)
│ └─ block_hash (offset: 40) Bytes32 (Fixed-size / size: 32)
├─ eth1_data_votes (offset: 2687377) List[Eth1Data, 2048] (Variable-size / length: 0, size: 0)
├─ eth1_deposit_index (offset: 524544) uint64 (Fixed-size / size: 8)
├─ validators (offset: 2687377) List[Validator, 1099511627776] (Variable-size / length: 0, size: 0)
├─ balances (offset: 2687377) List[uint64, 1099511627776] (Variable-size / length: 0, size: 0)
├─ randao_mixes (offset: 524560) Vector[Bytes32, 65536] (Fixed-size / size: 2097152)
├─ slashings (offset: 2621712) Vector[uint64, 8192] (Fixed-size / size: 65536)
├─ previous_epoch_attestations (offset: 2687377) List[PendingAttestation, 4096] (Variable-size / length: 0, size: 0)
├─ current_epoch_attestations (offset: 2687377) List[PendingAttestation, 4096] (Variable-size / length: 0, size: 0)
├─ justification_bits (offset: 2687256) Bitvector[8] (Fixed-size / size: 1)
├─ previous_justified_checkpoint (offset: 2687257) Checkpoint (Fixed-size / size: 40)
│ ├─ epoch (offset: 0) Epoch (Fixed-size / size: 8)
│ └─ root (offset: 8) Bytes32 (Fixed-size / size: 32)
├─ current_justified_checkpoint (offset: 2687297) Checkpoint (Fixed-size / size: 40)
│ ├─ epoch (offset: 0) Epoch (Fixed-size / size: 8)
│ └─ root (offset: 8) Bytes32 (Fixed-size / size: 32)
└─ finalized_checkpoint (offset: 2687337) Checkpoint (Fixed-size / size: 40)
├─ epoch (offset: 0) Epoch (Fixed-size / size: 8)
└─ root (offset: 8) Bytes32 (Fixed-size / size: 32)
In the SSZ analyzer output, the offset shown for each field is the byte position where that field's data begins when the struct is serialized according to SSZ rules, measured from the start of the enclosing container. SSZ serialization first writes a fixed part in field order: fixed-size fields are packed one after another, and each variable-size field contributes only a 4-byte offset pointer. The data of the variable-size fields follows after the fixed part. This is why eth1_data starts at 524468 rather than 524464 (the 4 bytes in between belong to historical_roots), and why every empty list in this output reports offset 2687377, the end of the fixed part.
For example, in the line root (offset: 8) Bytes32 (Fixed-size / size: 32), the field root is a 32-byte fixed-size value, and its serialized bytes begin at position 8 of the enclosing Checkpoint. The size indicates how many bytes the field contributes to the serialized output (32 bytes in this case). For fixed-size types, the size is predetermined, while for variable-size types, the analyzer computes the size based on the actual value. Together, the offset and size describe exactly how the serialized bytes are laid out.
Example: Locating a Field by Offset and Finding Its Merkle Leaf
Let’s take a real field from the SSZ analyzer output:
├─ fork (offset: 48) Fork (Fixed-size / size: 16)
│ ├─ previous_version (offset: 0) Bytes4 (Fixed-size / size: 4)
│ ├─ current_version (offset: 4) Bytes4 (Fixed-size / size: 4)
│ └─ epoch (offset: 8) Epoch (Fixed-size / size: 8)
We want to read and prove the field:
fork.epoch
The “fork” field in BeaconState starts at offset 48 in the serialized byte stream.
Inside fork, the epoch field starts at offset 8 (relative to the start of Fork).
So:
absolute_offset = base_offset_of_fork + offset_of_epoch_inside_fork
absolute_offset = 48 + 8 = 56 bytes
fork.epoch is an 8-byte integer, so it occupies bytes 56–63 of the full serialized BeaconState. That is all that is needed to read the value out of the serialized bytes.
The Merkle leaf is a different matter. SSZ does not merkleize a Container by cutting its serialized bytes into 32-byte chunks; each field gets its own leaf holding that field's hash_tree_root. (Only basic values inside a List or Vector of a basic type, such as balances, are packed together into shared chunks.) So the leaf for fork.epoch follows from field positions, not from byte offsets:
- fork is field 3 of BeaconState → GI = 32 + 3 = 35
- Fork has 3 fields, padded to 4 leaves → depth = 2
- epoch is field 2 of Fork → GI = 35 * 4 + 2 = 142
The leaf at GI 142 is a 32-byte chunk that looks like:
[ 0 … 7 ] → fork.epoch (8 bytes, little-endian)
[ 8 … 31 ] → zero padding
To prove this value, you:
- Take the leaf at GI 142.
- When hashing up the tree, at each level:
  - If the node is a left child (even GI) → record the right sibling hash.
  - If the node is a right child (odd GI) → record the left sibling hash.
- Continue until you reach the top Merkle root (GI 1).
The collected sibling hashes form your:
➡ SSZ Merkle proof branch for fork.epoch
Anyone can verify this by hashing the leaf together with each sibling in turn and checking:
computed_root == state_root
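The verification step can be sketched in Go as follows. This is a minimal single-branch verifier under the usual SSZ rule that a parent is sha256(left || right); the function names are made up for this example, and the four-leaf tree in `main` is a toy stand-in for a real state.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashPair returns sha256(left || right), the rule for combining two child nodes.
func hashPair(left, right [32]byte) [32]byte {
	return sha256.Sum256(append(left[:], right[:]...))
}

// verifyBranch recomputes the root from a leaf, its generalized index, and the
// sibling hashes ordered from the leaf level upward, then compares with root.
func verifyBranch(leaf [32]byte, gi uint64, branch [][32]byte, root [32]byte) bool {
	node := leaf
	for _, sib := range branch {
		if gi <= 1 {
			return false // more siblings than levels in the path
		}
		if gi%2 == 0 {
			node = hashPair(node, sib) // node is a left child
		} else {
			node = hashPair(sib, node) // node is a right child
		}
		gi /= 2
	}
	return gi == 1 && node == root
}

func main() {
	// Toy tree with four leaves at GI 4..7.
	l4, l5, l6, l7 := [32]byte{4}, [32]byte{5}, [32]byte{6}, [32]byte{7}
	n2, n3 := hashPair(l4, l5), hashPair(l6, l7)
	root := hashPair(n2, n3)

	// Proof for the leaf at GI 6: its sibling (GI 7), then its uncle (GI 2).
	ok := verifyBranch(l6, 6, [][32]byte{l7, n2}, root)
	fmt.Println("proof valid:", ok)
}
```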
The initial Prysm implementation introduces two new endpoints that expose the first version of SSZ Query Language (SSZ-QL):
/prysm/v1/beacon/states/{state_id}/query
/prysm/v1/beacon/blocks/{block_id}/query
Both endpoints follow the SSZ-QL endpoint specification and allow clients to request specific fields inside a BeaconState or BeaconBlock using a query string. The server returns the requested SSZ field encoded as raw SSZ bytes. At the time of writing, the feature supports only a single query per request, and the include_proof flag is ignored: the PR always returns responses without Merkle proofs.
The request structure is:
type SSZQueryRequest struct {
	Query        string `json:"query"`
	IncludeProof bool   `json:"include_proof,omitempty"`
}
And both endpoints return an SSZ-encoded response of this form:
type SSZQueryResponse struct {
	state         protoimpl.MessageState `protogen:"open.v1"`
	Root          []byte                 `protobuf:"bytes,1,opt,name=root,proto3" json:"root,omitempty" ssz-size:"32"`
	Result        []byte                 `protobuf:"bytes,2,opt,name=result,proto3" json:"result,omitempty" ssz-max:"1073741824"`
	unknownFields protoimpl.UnknownFields
	sizeCache     protoimpl.SizeCache
}
For the full specification and examples, you can refer to this link
For now, the implementation locates the requested field using the computed offset and size information from the SSZ analyzer, rather than using a generalized index.
