JAM Virtual Machine: Program Blob Structure

AI generated abstract image representing the program blob

Understanding the JAM Program Blob Structure

The JAM virtual machine uses a highly structured program blob to execute its programs, where each component of the blob plays a critical role in managing jumps, instructions, and their metadata. While the underlying formula defining this structure may appear complex at first glance, a closer look reveals a logical and efficient encoding system that allows the JAM virtual machine to interpret and execute bytecode programs. In this article, we will break down the JAM program blob structure step by step, demystifying its components and illustrating how they come together to enable dynamic execution within the virtual machine.

In Appendix A.1. of the JAM graypaper, the program blob structure is defined by a formula that may seem complex at first. However, when broken down, the structure becomes more logical and approachable. The definition looks something like this:

p=E(j)E1(z)E(c)Ez(j)E(c)E(k), k=cp = \mathcal{E}(|j|) \frown \mathcal{E}_1(z) \frown \mathcal{E}(|c|) \frown \mathcal{E}_z(j) \frown \mathcal{E}(c) \frown \mathcal{E}(k), \ |k| = |c|

Let’s break this down step by step:

1. The Program Structure pp


The entire program blob is denoted as pp. This represents the collection of encoded data that constitutes the program in the JAM virtual machine.

2. The Encodings E\mathcal{E}

The symbol E\mathcal{E} represents an encoding function that applies to different elements of the program. The encoding function varies depending on the element being encoded, as indicated by the subscripts or different arguments.

3. Component Encodings

The program consists of a sequence of encoded components, separated by the symbol \frown. This indicates a concatenation of encodings that make up the overall program blob.

- E(j)\mathcal{E}(|j|): This represents the variable integer encoding of the size of the dynamic jump table jj, as described in Appendix C.1.2. Here, j|j| refers to the length, i.e., number of elements inside the dynamic jump table jj.

- E1(z)\mathcal{E}_1(z): This is the one-byte little-endian encoding of zz as described in Appendix C.1.2. The value zz is later used in the term Ez\mathcal{E}_z, denoting the used encoding.

- E(c)\mathcal{E}(|c|): Similarly, this represents the variable integer encoding of the size of the instruction data cc, where cc, refers to the length of cc, i.e., how many bytes the encoded cc,consists of.

- Ez(j)\mathcal{E}_z(j): This indicates that the encoding Ez\mathcal{E}_z is applied to each value inside the dynamic jump table jj.

- E(c)\mathcal{E}(c) This is the actual encoding of the instruction data cc, which includes the opcodes and their argument values. This is the core part of the program that will be executed by the JAM virtual machine.

- E(k)\mathcal{E}(k): Here, kk represents the opcode bitmask. The bitmask is a crucial part of the program because each set bit in kk indicates the position within cc where an instruction starts, i.e., the location of each opcode. The condition k=c|k| = |c| ensures that the bitmask has the same size as the instruction data, allowing precise identification of where instructions begin within cc.

Example Breakdown of the JAM Program Blob

Let’s walk through an example to better understand how the components of the JAM program blob fit together. The structure of the blob, as outlined in the formula, includes the dynamic jump table, the instruction data, and the bit-mask, among other elements. Here's a visual representation:

Sections:

1. Jump Table Size (j|j|): 1 byte
2. Jump Table Entry Size (zz): 1 byte
3. Instruction Data Length (c|c|): 1 byte
4. Jump Table Entries (jj): 1 entry
5. Instruction Data (cc): 15 bytes
6. Opcode Bitmask (kk): 2 bytes

From this example, let's break down the components step by step:

1. Jump Table Size (j|j|) and Encoding (E(j)\mathcal{E}(|j|))


- In the first part of the blob, we encode the size of the dynamic jump table, denoted as |j|. This size is encoded using variable integer encoding, as described in Appendix C.1.2.
- In this example, the jump table has a size of 1 We encode this size, and it takes 1 byte of space.

2. Jump Table Entry Size with One-Byte Encoding of z(E1(z))z (\mathcal{E}_1(z))


- Next, we encounter the one-byte encoding of zz. This value represents the number of bytes used for encoding each value the dynamic jump table. In our example it has the value of 1.
- Here, zz is also 1 byte long, as shown by the notation.

3. Instruction Data Length (cc) and Encoding (E(c)\mathcal{E}(|c|))


- Moving forward, we encode the size of the instruction data (c|c|) using the same variable integer encoding method. This provides a compact representation of the instruction data's size.
- In this case, the instruction data has a length of 15 bytes, and the encoded length uses 1 byte.

4. Jump Table Entries (Ez(j)\mathcal{E}_z(j))


- The next component is the dynamic jump table itself, with each entry encoded using the encoding Ez\mathcal{E}_z.
- In the example, the jump table has only one entry: 49. Values are encoded according to the specified Ez\mathcal{E}_z encoding method, which in this case is E1\mathcal{E}_1 as the value for zz is 1.

5. Instruction Data (E(c)\mathcal{E}(c))


- The instruction data, c, contains the actual opcodes and argument values that make up the program's executable code. This is where the core logic of the program resides.
- In the example, the instruction data is simply represented, with each position in c being indexed for reference later by the bitmask.

6. Opcode Bitmask (kk) and Encoding (E(k)\mathcal{E}(k))


- The final component is the bitmask, denoted as k. This bitmask serves as a map to indicate where instructions start within the instruction data c.
- In the bitmask, each set bit (i.e., a bit with value 1) corresponds to a position in c where an opcode starts. In this example, the bitmask is represented as a series of bits 1000 0011 0000 1001 which we can

Here is also a representation of the opcode bitmask and its meaning:

Each bit corresponds to whether an instruction starts at that position in the instruction data (c):

- 1: Indicates the start of an instruction.
- 0: No instruction starts at this position, i.e., argument data.

Since we know that the instruction data is 15 bytes long, we only look at the first 15 bitmask bits (starting from right).

Putting It All Together

The jump table helps manage dynamic jumps, allowing for flexibility in the program's execution flow. The instruction data cc contains the program's instructions, while the opcode bitmask kk ensures that we know exactly where each instruction starts.

The size of kk is always equal to the size of cc , ensuring that each bit can map to a specific byte in the instruction data.

Summary

This example illustrates how the JAM program blob structure is constructed and how each component — whether it's the jump table, instruction data, or opcode bitmask — fits into the overall structure. By using compact encodings and a clear organization, the JAM virtual machine can efficiently interpret and execute the program.


More Posts

Visualization of a Patricia Trie structure showing compressed binary paths leading to encoded state components, with nodes representing state transitions and leaves containing encoded values such as α, β, and other state elements.

Patricia Trie: Key to Efficient State Management in JAM Protocol

In large-scale blockchain systems, the Patricia Trie efficiently organizes and stores state components in a compressed binary structure. This article explains how the trie leverages database storage for scalability, enabling efficient updates to the root hash and supporting Merkle proofs for light clients. By storing node hashes and updating only the necessary portions of the trie, implementers can maintain optimal performance while providing cryptographic proofs of past states.

An in-depth explanation of Merkle tree structures from Appendix E of the Gray Paper, covering well-balanced and constant-depth Merkle trees. The post also introduces the node operator as an abstraction to highlight similarities between these trees and explains the trace operator, which is used to generate cryptographic proofs. Visual examples are provided to illustrate how input values are divided in well-balanced trees and padded for constant-depth trees, ensuring efficient verification and data integrity in cryptographic systems.

Merkle Tree Structure: Appendix E of the Gray Paper

In this post, we break down the key Merkle tree structures from Appendix E of the Gray Paper, including well-balanced and constant-depth Merkle trees. We'll also explain the trace operator used for cryptographic proofs and explore how these structures ensure data integrity and efficient verification in cryptographic systems.

llustration of the Safrole algorithm’s process for block creation and fork prevention, showing how validators, tickets, and bandersnatch signatures work in decentralized blockchain systems.

Safrole Algorithm Demystified: JAM Block Creation

The Safrole algorithm plays a critical role in preventing forks in blockchain by managing block creation through a preselected set of validators. Learn how it functions, the key state components, ticket submission process, and how blocks are verified and imported in a decentralized system.