
Data Types in Substreams

In this chapter, we’ll explore the various data types available in Substreams and their practical applications. Understanding these data types will enable you to harness the full potential of Substreams for data extraction and transformation. We’ll cover:

  • Block
  • Clock
  • Transaction Trace
  • Call
  • Logs
  • Transaction Receipt

Each section will include a brief overview, an explanation of key fields, and practical examples. For a comprehensive overview of all the data available in Substreams, please refer to the following data reference guides:

Block

The Block object represents a block on the source blockchain. It contains information about the block itself and its transactions, plus balance and code changes when the block is an Extended Block.

All fields on the Block for Ethereum mainnet can be found in the sf.ethereum.type.v2 package.

Key Fields

  • number: The block number.
  • hash: The hash of the block.
  • header: A BlockHeader object containing the block’s header information, such as its parent hash, the Merkle root hash, and the other fields that make up a block header.
  • timestamp: The timestamp when the block was mined.
  • transaction_traces: A list of TransactionTrace objects (see Transaction Trace) included in the block, in the order they were executed.

Practical Example

Extracting a subset of basic information about each block:

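use substreams::Hex;
use substreams_ethereum::pb::eth::v2 as eth;

#[substreams::handlers::map]
fn map_block_info(blk: eth::Block) -> Result<BlockInfo, substreams::errors::Error> {
    // parent_hash lives on the (optional) block header, not on the block itself
    let parent_hash = blk.header.as_ref().map(|h| Hex::encode(&h.parent_hash)).unwrap_or_default();
    Ok(BlockInfo {
        number: blk.number,
        // Hashes are raw bytes, so hex-encode them for the output message
        hash: Hex::encode(&blk.hash),
        parent_hash,
        timestamp: blk.timestamp_seconds(),
    })
}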

Clock

The Clock object provides a simpler view of time, particularly helpful for timeframe aggregations. It contains a subset of fields from the Block object, and is far more lightweight.

All fields in a Clock for EVM networks can be found in the sf.substreams.v1 package.

Key Fields

  • id: ID of the block.
  • number: The block number.
  • timestamp: The timestamp of the current block.

Because Clock is much lighter than Block, prefer it whenever the fields above are all you need.

Practical Example

The Clock object is especially useful in scenarios where you need to perform time-based aggregations or calculations. For example, you might want to aggregate events on a daily basis, track metrics over specific timeframes, or trigger certain actions at regular intervals.

Let’s use Clock to store daily event counts by grouping events under a day ID derived from the block timestamp.

#[substreams::handlers::store]
pub fn store_daily_events_count(
    clock: Clock,
    events: Events,
    store: StoreAddInt64,
) {
    // Convert the timestamp to seconds and calculate the day ID
    let timestamp_seconds = clock.timestamp.unwrap().seconds;
    let day_id: i64 = timestamp_seconds / 86400;
    let prev_day_id = day_id - 1;
 
    // Delete the previous day's count to clear the store of stale data
    store.delete_prefix(0, &format!("DailyEventsCount:{prev_day_id}:"));
 
    // Increment the count for the current day for each event
    for _event in events.pool_events {
        store.add(0, &format!("DailyEventsCount:{day_id}:"), 1);
    }
}
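To make the day bucketing in the handler above concrete, here is a minimal standalone sketch of the same arithmetic using only the standard library. The helper names `day_id` and `daily_key` are hypothetical and exist only for this illustration; the key format mirrors the one used in the handler.

```rust
/// Number of seconds in one UTC day.
const SECONDS_PER_DAY: i64 = 86_400;

/// Maps a Unix timestamp in seconds to a day index (days since 1970-01-01 UTC).
/// Hypothetical helper, not part of the Substreams API.
fn day_id(timestamp_seconds: i64) -> i64 {
    // div_euclid rounds toward negative infinity, so a pre-1970 timestamp
    // would still land in the right bucket; chain timestamps are positive.
    timestamp_seconds.div_euclid(SECONDS_PER_DAY)
}

/// Builds the store key for a given day, mirroring the handler's key format.
fn daily_key(day_id: i64) -> String {
    format!("DailyEventsCount:{day_id}:")
}
```

Every block whose timestamp falls within the same UTC day maps to the same key, so repeated `add` calls accumulate into one counter per day.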

Transaction Trace

The TransactionTrace object provides a detailed execution trace of a transaction.

All fields in a TransactionTrace for EVM networks can be found in the sf.ethereum.type.v2 package.

Key Fields

  • from: The address initiating the trace represented in bytes format.
  • to: The address receiving the trace action represented in bytes format.
  • value: The value transferred in the trace, in the network’s native currency (ETH on Ethereum Mainnet).
  • input: The input data the transaction receives for execution.

Practical Example

Let’s create a module to track all transactions to the GRT token contract address. This example will help monitor interactions with the GRT token contract.

const GRT_TRACKED_CONTRACT: [u8; 20] = hex!("c944e90c64b2c07662a292be6244bdf05cda44a7");
 
#[substreams::handlers::map]
pub fn map_transactions_to_grt(blk: eth::Block) -> Result<GrtTransactions, substreams::errors::Error> {
    // Create a new GrtTransactions object to store the results
    let mut grt_txs = GrtTransactions::default();
 
    // Iterate through each transaction trace in the block
    for trace in blk.transactions() {
        // Check if the transaction trace is directed to the GRT contract address
        if trace.to == GRT_TRACKED_CONTRACT {
            // If true, create a new Transaction object with relevant details
            grt_txs.transactions.push(Transaction {
                hash: trace.hash.clone(),
                from: trace.from.clone(),
                to: trace.to.clone(),
                value: trace.value.clone(),
                input: trace.input.clone(),
                block_number: blk.number,
                timestamp: blk.timestamp_seconds(),
            });
        }
    }
    // Return the GrtTransactions object containing all matching transactions
    Ok(grt_txs)
}

Call

The Call object represents a function call made within a transaction trace. It includes details about the contract interaction: which function was executed, who executed it, and with what parameters.

All fields in a Call for EVM networks can be found in the sf.ethereum.type.v2 package.

Key Fields

  • call_type: The type of call (e.g., call, delegatecall).
  • caller: The address making the call.
  • value: The amount of the network’s native currency transferred with the call (ETH on Ethereum Mainnet).
  • input: The input data for the call, which often includes the function signature and parameters.
  • gas_limit: The maximum amount of gas that can be used for the call.
  • gas_consumed: The amount of gas consumed by the call.
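As a rough illustration of how the input field is laid out, the sketch below splits raw calldata into the 4-byte function selector and the 32-byte argument words that follow it, per the standard Solidity ABI encoding. It uses only the standard library, and `split_calldata` is a hypothetical helper rather than a Substreams function. Dynamic types (strings, arrays) need offset-aware decoding that this sketch does not attempt.

```rust
/// Splits EVM calldata into its 4-byte function selector and the
/// 32-byte argument words that follow. Hypothetical helper for illustration.
/// Returns None when the input is too short to hold a selector
/// (for example, a plain value transfer with empty input).
fn split_calldata(input: &[u8]) -> Option<([u8; 4], Vec<[u8; 32]>)> {
    if input.len() < 4 {
        return None;
    }
    let mut selector = [0u8; 4];
    selector.copy_from_slice(&input[..4]);

    // chunks_exact ignores a trailing partial word, if any
    let words = input[4..]
        .chunks_exact(32)
        .map(|chunk| {
            let mut word = [0u8; 32];
            word.copy_from_slice(chunk);
            word
        })
        .collect();

    Some((selector, words))
}
```

In a real module you would usually rely on generated ABI bindings to decode calls; the manual split is mainly useful for quick filtering by selector.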

Practical Example

Let’s create a module to track failed contract calls and capture the reasons for their failure. This example will help monitor interactions with smart contracts and understand why certain calls fail.

#[substreams::handlers::map]
pub fn map_reverted_contract_calls(blk: eth::Block) -> Result<FailedCalls, substreams::errors::Error> {
    // Create a vector to store the failed calls
    let failed_calls: Vec<FailedCall> = blk
        .calls()
        .filter_map(|call_view| {
            // Check if the call has failed
            if call_view.call.status_failed {
                // If true, create a new FailedCall object with relevant details
                return Some(FailedCall {
                    call_type: call_view.call.call_type().as_str_name().to_string(),
                    caller: Hex::encode(&call_view.call.caller),
                    reason: call_view.call.failure_reason.clone(),
                    gas_consumed: call_view.call.gas_consumed,
                });
            }
            None
        })
        .collect();
 
    // Return the FailedCalls object containing all failed calls
    Ok(FailedCalls { failed_calls })
}

Logs

Logs are among the most useful data types for a Substreams developer because they carry the full event logs emitted during execution, from which events can then be extracted. By leveraging logs, you can monitor specific events emitted by contracts and output the relevant data.

All fields in a Log for EVM networks can be found in the sf.ethereum.type.v2 package.

Key Fields

  • address: The address of the contract that generated the log in bytes format.
  • topics: The indexed arguments for the event.
  • data: The data contained in the log in bytes format.
  • ordinal: The log’s position within the block.
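To illustrate how topics are typically structured for a non-anonymous Solidity event, the sketch below treats the first topic as the event signature hash and reads an indexed address from a later topic, where it is left-padded to 32 bytes. Both helpers are hypothetical, standard-library-only stand-ins; in practice the generated ABI bindings do this decoding for you.

```rust
/// Returns the first topic of a log, which for non-anonymous events is
/// the hash of the event signature. Hypothetical helper for illustration.
fn event_signature(topics: &[Vec<u8>]) -> Option<&[u8]> {
    topics.first().map(|topic| topic.as_slice())
}

/// Extracts a 20-byte address from a 32-byte indexed topic.
/// Indexed addresses are left-padded with 12 zero bytes.
fn address_from_topic(topic: &[u8]) -> Option<[u8; 20]> {
    if topic.len() != 32 {
        return None;
    }
    let mut address = [0u8; 20];
    address.copy_from_slice(&topic[12..]);
    Some(address)
}
```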

Practical Example

Substreams provides a lot of helper functionality that enables us to extract useful information out of the data field of logs.

Let’s create a module to check if a log is coming from a specific contract (e.g., the Uniswap V3 Factory) and then extract a specific event, such as PoolCreated, from these logs.

const UNISWAP_V3_FACTORY: [u8; 20] = hex!("1f98431c8ad98523631ae4a59f267346ea31f984");
 
#[substreams::handlers::map]
pub fn map_pools_created(blk: eth::Block) -> Result<Pools, substreams::errors::Error> {
    // Create a new Pools object to store the results
    Ok(Pools {
        pools: blk
            .logs()
            .filter_map(|log| {
                // Check if the log is from the Uniswap V3 Factory contract
                if log.address() == UNISWAP_V3_FACTORY {
                    // Attempt to match and decode the PoolCreated event
                    if let Some(event) = PoolCreated::match_and_decode(&log) {
                        // If the event matches, create a new Pool object with relevant details
                        return Some(Pool {
                            address: Hex::encode(event.pool),
                            token0: Hex::encode(event.token0),
                            token1: Hex::encode(event.token1),
                            created_at_tx_hash: Hex(&log.receipt.transaction.hash).to_string(),
                            created_at_block_number: blk.number,
                            created_at_timestamp: blk.timestamp_seconds(),
                            log_ordinal: log.ordinal(),
                        });
                    }
                }
                None
            })
            .collect(),
    })
}

The above example can be simplified with the events helper function, which filters logs by contract address and decodes the matching events in one step. It does the same work as the previous snippet, more concisely.

#[substreams::handlers::map]
pub fn map_pools_created(blk: eth::Block) -> Result<Pools, substreams::errors::Error> {
    // Create a new Pools object to store the results
    Ok(Pools {
        pools: blk
            .events::<PoolCreated>(&[&UNISWAP_V3_FACTORY])
            .map(|(event, log)| Pool {
                address: Hex::encode(event.pool),
                token0: Hex::encode(event.token0),
                token1: Hex::encode(event.token1),
                created_at_tx_hash: Hex(&log.receipt.transaction.hash).to_string(),
                created_at_block_number: blk.number,
                created_at_timestamp: blk.timestamp_seconds(),
                log_ordinal: log.ordinal(),
            })
            .collect(),
    })
}

Transaction Receipt

The TransactionReceipt object contains the receipt data of a transaction. Receipts provide valuable information about the execution of a transaction, including the cumulative gas used and the logs generated.

All fields on the TransactionReceipt for Ethereum mainnet can be found in the sf.ethereum.type.v2 package.

Key Fields

  • cumulative_gas_used: The total gas used in the block up to and including this transaction.
  • logs: A list of Log objects associated with the transaction.
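Because cumulative_gas_used is a running total across the block, the gas attributable to a single transaction is the difference between its value and the previous receipt's value. The sketch below shows that arithmetic on plain integers; `gas_used_per_transaction` is a hypothetical helper, and it assumes the values are given in transaction order.

```rust
/// Derives per-transaction gas from a block's cumulative gas values,
/// given in transaction order. Hypothetical helper for illustration.
fn gas_used_per_transaction(cumulative: &[u64]) -> Vec<u64> {
    let mut previous = 0u64;
    cumulative
        .iter()
        .map(|&total| {
            // saturating_sub guards against malformed, non-increasing input
            let used = total.saturating_sub(previous);
            previous = total;
            used
        })
        .collect()
}
```

For example, cumulative values of 21000, 71000 and 92000 correspond to transactions that used 21000, 50000 and 21000 gas respectively.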

Practical Example

Let’s create a module to extract logs from transaction receipts within a block. This example will capture details about each receipt, such as the transaction hash, cumulative gas used, and the logs.

#[substreams::handlers::map]
pub fn map_transaction_receipts(blk: eth::Block) -> Result<TransactionReceipts, substreams::errors::Error> {
    // Build one Receipt message per transaction receipt in the block
    let receipts: Vec<Receipt> = blk.receipts().map(|view| {
        Receipt {
            transaction_hash: Hex::encode(&view.transaction.hash),
            // ReceiptView wraps the raw receipt, so its fields are reached through `.receipt`
            cumulative_gas_used: view.receipt.cumulative_gas_used,
            logs: view.receipt.logs.iter().map(|log| Log {
                address: Hex::encode(&log.address),
                data: Hex::encode(&log.data),
                topics: log.topics.iter().map(|topic| Hex::encode(topic)).collect(),
            }).collect(),
        }
    }).collect();
 
    // Return the TransactionReceipts object containing all transaction receipts
    Ok(TransactionReceipts { receipts })
}

A Note on Data Types

When you call helper functions like .calls(), .logs(), or .receipts() in Substreams, you get an iterator of CallView, LogView, or ReceiptView respectively. These Rust structs pair the call, log, or receipt with the transaction it belongs to, providing richer, more structured access to the data than the raw protobuf messages alone.

Here are the types:

#[derive(Copy, Clone)]
pub struct ReceiptView<'a> {
    pub transaction: &'a pb::TransactionTrace,
    pub receipt: &'a pb::TransactionReceipt,
}
 
#[derive(Copy, Clone)]
pub struct LogView<'a> {
    pub receipt: ReceiptView<'a>,
    pub log: &'a pb::Log,
}
 
#[derive(Copy, Clone, Debug, PartialEq)]
pub struct CallView<'a> {
    pub transaction: &'a pb::TransactionTrace,
    pub call: &'a pb::Call,
}

These views are helpful because they provide a more structured way to access and manipulate the data, making your code cleaner and easier to understand. They bridge the gap between raw protobuf messages and the higher-level abstractions used in Rust.
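The view pattern itself can be sketched with plain Rust. The example below uses simplified stand-in types (not the real generated protobuf structs, which have many more fields) to show how a view borrows both a call and its parent transaction, so a filter on the call can still reach transaction-level data. The functions `call_views` and `failed_call_tx_hashes` are hypothetical.

```rust
// Simplified stand-ins for the generated protobuf types (illustration only)
struct TransactionTrace {
    hash: Vec<u8>,
    calls: Vec<Call>,
}

struct Call {
    status_failed: bool,
}

/// A borrowed pairing of a call with the transaction that contains it.
#[derive(Copy, Clone)]
struct CallView<'a> {
    transaction: &'a TransactionTrace,
    call: &'a Call,
}

/// Yields one view per call across all transactions, without copying data.
fn call_views(transactions: &[TransactionTrace]) -> impl Iterator<Item = CallView<'_>> {
    transactions.iter().flat_map(|transaction| {
        transaction
            .calls
            .iter()
            .map(move |call| CallView { transaction, call })
    })
}

/// Collects the hash of the parent transaction for every failed call.
fn failed_call_tx_hashes(transactions: &[TransactionTrace]) -> Vec<&[u8]> {
    call_views(transactions)
        .filter(|view| view.call.status_failed)
        .map(|view| view.transaction.hash.as_slice())
        .collect()
}
```

Since a view holds only references, it is cheap to copy and pass around, which is why the view structs shown above can derive Copy.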