Data Types in Substreams
In this chapter, we’ll explore the various data types available in Substreams and their practical applications. Understanding these data types will enable you to harness the full potential of Substreams for data extraction and transformation. We’ll cover:
- Block
- Clock
- Transaction Trace
- Call
- Logs
- Transaction Receipt
Each section will include a brief overview, an explanation of key fields, and practical examples. For a comprehensive overview of all the data available in Substreams, please refer to the following data reference guides:
- EVM Data Reference Guide
- Cosmos Data Reference Guide
- Solana Data Reference Guide
- StarkNet Data Reference Guide
Block
The Block object represents a block on the source blockchain. It contains information about the block itself and its transactions, plus balance and code changes if it is an Extended Block.
All fields on the Block for Ethereum mainnet can be found in the sf.ethereum.type.v2 package.
Key Fields
- number: The block number.
- hash: The hash of the block.
- header: A BlockHeader object containing the block’s header information, such as its parent hash, the Merkle root hash, and all other information that forms a block.
- timestamp: The timestamp when the block was mined.
- transaction_traces: A list of TransactionTrace objects (see Transaction Trace) included in the block, ordered by their order of execution in the block.
Practical Example
Extracting a subset of basic information about each block:
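#[substreams::handlers::map]
fn map_block_info(blk: eth::Block) -> Result<BlockInfo, substreams::errors::Error> {
    // The parent hash lives on the block header, which is optional in the protobuf model
    let parent_hash = blk
        .header
        .as_ref()
        .map(|header| Hex::encode(&header.parent_hash))
        .unwrap_or_default();
    Ok(BlockInfo {
        number: blk.number,
        // Hashes are raw bytes, so hex-encode them to get a string
        hash: Hex::encode(&blk.hash),
        parent_hash,
        timestamp: blk.timestamp_seconds(),
    })
}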
Clock
The Clock object provides a simpler view of time, particularly helpful for timeframe aggregations. It contains a subset of fields from the Block object, and is far more lightweight.
All fields in a Clock for EVM networks can be found in the sf.substreams.v1 package.
Key Fields
- id: ID of the block.
- number: The block number.
- timestamp: The timestamp of the current block.
Because Clock is more lightweight than Block, it’s advisable to use it when you only need the fields above.
Practical Example
The Clock object is especially useful in scenarios where you need to perform time-based aggregations or calculations. For example, you might want to aggregate events on a daily basis, track metrics over specific timeframes, or trigger certain actions at regular intervals.
Let’s use Clock to store daily event counts, aggregating events by the day in which their block was produced.
#[substreams::handlers::store]
pub fn store_daily_events_count(
    clock: Clock,
    events: Events,
    store: StoreAddInt64,
) {
    // Read the block timestamp in seconds and derive the day ID (days since the Unix epoch)
    let timestamp_seconds = clock.timestamp.unwrap().seconds;
    let day_id: i64 = timestamp_seconds / 86400;
    let prev_day_id = day_id - 1;
    // Delete the previous day's count to clear the store of stale data
    store.delete_prefix(0, &format!("DailyEventsCount:{prev_day_id}:"));
    // Increment the count for the current day once per event
    for _event in events.pool_events {
        store.add(0, &format!("DailyEventsCount:{day_id}:"), 1);
    }
}
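The day bucketing in the module above is plain integer arithmetic on the Unix timestamp. The following std-only sketch isolates that step so the key layout is easy to check. The helper names day_id and daily_key are hypothetical and not part of the Substreams API, and div_euclid is used instead of plain division so that timestamps before 1970 would also land in a consistent bucket.

```rust
/// Number of seconds in one day.
const SECONDS_PER_DAY: i64 = 86_400;

/// Hypothetical helper: maps a Unix timestamp (in seconds) to a day index,
/// counted in whole days since the Unix epoch.
pub fn day_id(timestamp_seconds: i64) -> i64 {
    timestamp_seconds.div_euclid(SECONDS_PER_DAY)
}

/// Hypothetical helper: builds the store key for the day containing the timestamp,
/// using the same "DailyEventsCount:<day>:" layout as the module above.
pub fn daily_key(timestamp_seconds: i64) -> String {
    format!("DailyEventsCount:{}:", day_id(timestamp_seconds))
}
```

Every block produced within the same UTC day maps to the same key, which is what lets the store accumulate a single counter per day.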
Transaction Trace
The TransactionTrace object provides a detailed execution trace of a transaction.
All fields in a TransactionTrace for EVM networks can be found in the sf.ethereum.type.v2 package.
Key Fields
- from: The address initiating the trace, represented in bytes format.
- to: The address receiving the trace action, represented in bytes format.
- value: The value transferred in the trace, in the network’s native currency (ETH on Ethereum Mainnet).
- input: The input data the transaction receives for execution.
Practical Example
Let’s create a module to track all transactions to the GRT token contract address. This example will help monitor interactions with the GRT token contract.
const GRT_TRACKED_CONTRACT: [u8; 20] = hex!("c944e90c64b2c07662a292be6244bdf05cda44a7");
#[substreams::handlers::map]
pub fn map_transactions_to_grt(blk: eth::Block) -> Result<GrtTransactions, substreams::errors::Error> {
    // Create a new GrtTransactions object to store the results
    let mut grt_txs = GrtTransactions::default();
    // Iterate through the transaction traces in the block
    // (the transactions() helper yields successful transactions only)
    for trace in blk.transactions() {
        // Check if the transaction trace is directed to the GRT contract address
        if trace.to == GRT_TRACKED_CONTRACT {
            // If true, create a new Transaction object with relevant details
            grt_txs.transactions.push(Transaction {
                hash: trace.hash.clone(),
                from: trace.from.clone(),
                to: trace.to.clone(),
                value: trace.value.clone(),
                input: trace.input.clone(),
                block_number: blk.number,
                timestamp: blk.timestamp_seconds(),
            });
        }
    }
    // Return the GrtTransactions object containing all matching transactions
    Ok(grt_txs)
}
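Addresses in a trace are raw 20-byte values, so the comparison in the module above is a byte-for-byte match against a constant that the hex! macro decodes at compile time. As a rough illustration of what that decoding amounts to, here is a std-only sketch. The helpers decode_address and is_sent_to are hypothetical and are not part of Substreams or of the hex-literal crate.

```rust
/// Hypothetical helper: decodes a 40-character hex string (optionally prefixed
/// with "0x") into a 20-byte address. Returns `None` on any malformed input.
pub fn decode_address(hex: &str) -> Option<[u8; 20]> {
    let hex = hex.strip_prefix("0x").unwrap_or(hex);
    if hex.len() != 40 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 20];
    for (i, byte) in out.iter_mut().enumerate() {
        // Each output byte is encoded by two consecutive hex characters
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}

/// Hypothetical helper: true when a raw `to` field equals the tracked address.
pub fn is_sent_to(to: &[u8], tracked: &[u8; 20]) -> bool {
    to == tracked.as_slice()
}
```

Comparing raw bytes avoids any ambiguity about letter case or a leading 0x, which would matter if the addresses were compared as strings.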
Call
The Call object represents a function call made within a transaction trace. Call includes details about contract interactions, providing insight into which functions were executed, who executed them, and with what parameters.
All fields in a Call for EVM networks can be found in the sf.ethereum.type.v2 package.
Key Fields
- call_type: The type of call (e.g., call, delegatecall).
- caller: The address making the call.
- value: The amount of the network’s native currency transferred with the call (ETH on Ethereum Mainnet).
- input: The input data for the call, which often includes the function signature and parameters.
- gas_limit: The maximum amount of gas that can be used for the call.
- gas_consumed: The amount of gas consumed by the call.
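For calls that follow the standard Solidity ABI encoding, the first four bytes of input are the function selector and the remaining bytes are the encoded arguments. The sketch below shows that split using only the standard library. The helper split_input is hypothetical and is not a Substreams function.

```rust
/// Hypothetical helper: splits call input into its 4-byte function selector
/// and the ABI-encoded arguments that follow it.
/// Returns `None` when the input is too short to contain a selector,
/// for example a plain value transfer with empty input.
pub fn split_input(input: &[u8]) -> Option<([u8; 4], &[u8])> {
    if input.len() < 4 {
        return None;
    }
    let selector = [input[0], input[1], input[2], input[3]];
    Some((selector, &input[4..]))
}
```

As an example, an ERC-20 transfer(address,uint256) call starts with the selector a9059cbb, so matching on the selector is a cheap way to pick out specific functions before decoding any arguments.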
Practical Example
Let’s create a module to track failed contract calls and capture the reasons for their failure. This example will help monitor interactions with smart contracts and understand why certain calls fail.
#[substreams::handlers::map]
pub fn map_reverted_contract_calls(blk: eth::Block) -> Result<FailedCalls, substreams::errors::Error> {
    // Create a vector to store the failed calls
    let failed_calls: Vec<FailedCall> = blk
        .calls()
        .filter_map(|call_view| {
            // Check if the call has failed
            if call_view.call.status_failed {
                // If true, create a new FailedCall object with relevant details
                return Some(FailedCall {
                    call_type: call_view.call.call_type().as_str_name().to_string(),
                    caller: Hex::encode(&call_view.call.caller),
                    reason: call_view.call.failure_reason.clone(),
                    gas_consumed: call_view.call.gas_consumed,
                });
            }
            None
        })
        .collect();
    // Return the FailedCalls object containing all failed calls
    Ok(FailedCalls { failed_calls })
}
Logs
Logs are among the most useful data types for a Substreams developer because they carry full event logs, from which events can then be extracted. By leveraging logs, you can monitor specific events emitted by contracts and output relevant data.
All fields in a Log for EVM networks can be found in the sf.ethereum.type.v2 package.
Key Fields
- address: The address of the contract that generated the log, in bytes format.
- topics: The indexed arguments for the event.
- data: The data contained in the log, in bytes format.
- ordinal: The log’s position within the block.
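In EVM logs, each topic is a 32-byte word. For non-anonymous events the first topic is the hash of the event signature, and the following topics hold the indexed arguments. An indexed address argument is stored left-padded with 12 zero bytes. The std-only sketch below recovers the address from such a topic. The helper address_from_topic is hypothetical; in practice the generated ABI bindings shown in the example that follows perform this decoding.

```rust
/// Hypothetical helper: extracts a 20-byte address from a 32-byte topic.
/// Returns `None` if the topic has the wrong length or if the 12 padding
/// bytes are not all zero, since it then cannot be a padded address.
pub fn address_from_topic(topic: &[u8]) -> Option<[u8; 20]> {
    if topic.len() != 32 || topic[..12].iter().any(|b| *b != 0) {
        return None;
    }
    let mut address = [0u8; 20];
    address.copy_from_slice(&topic[12..]);
    Some(address)
}
```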
Practical Example
Substreams provides a lot of helper functionality that enables us to extract useful information out of the data field of logs.
Let’s create a module to check if a log is coming from a specific contract (e.g., the Uniswap V3 Factory) and then extract a specific event, such as PoolCreated, from these logs.
const UNISWAP_V3_FACTORY: [u8; 20] = hex!("1f98431c8ad98523631ae4a59f267346ea31f984");
#[substreams::handlers::map]
pub fn map_pools_created(blk: eth::Block) -> Result<Pools, substreams::errors::Error> {
    // Create a new Pools object to store the results
    Ok(Pools {
        pools: blk
            .logs()
            .filter_map(|log| {
                // Check if the log is from the Uniswap V3 Factory contract
                if log.address() == UNISWAP_V3_FACTORY {
                    // Attempt to match and decode the PoolCreated event
                    if let Some(event) = PoolCreated::match_and_decode(&log) {
                        // If the event matches, create a new Pool object with relevant details
                        return Some(Pool {
                            address: Hex::encode(event.pool),
                            token0: Hex::encode(event.token0),
                            token1: Hex::encode(event.token1),
                            created_at_tx_hash: Hex(&log.receipt.transaction.hash).to_string(),
                            created_at_block_number: blk.number,
                            created_at_timestamp: blk.timestamp_seconds(),
                            log_ordinal: log.ordinal(),
                        });
                    }
                }
                None
            })
            .collect(),
    })
}
The above example can be simplified with the events helper function, which filters logs by contract address and decodes the matching events in one step. It produces the same result as the previous snippet in a more concise form.
#[substreams::handlers::map]
pub fn map_pools_created(blk: eth::Block) -> Result<Pools, substreams::errors::Error> {
    // Create a new Pools object to store the results
    Ok(Pools {
        pools: blk
            .events::<PoolCreated>(&[&UNISWAP_V3_FACTORY])
            .map(|(event, log)| Pool {
                address: Hex::encode(event.pool),
                token0: Hex::encode(event.token0),
                token1: Hex::encode(event.token1),
                created_at_tx_hash: Hex(&log.receipt.transaction.hash).to_string(),
                created_at_block_number: blk.number,
                created_at_timestamp: blk.timestamp_seconds(),
                log_ordinal: log.ordinal(),
            })
            .collect(),
    })
}
Transaction Receipt
The TransactionReceipt object contains the receipt data of a transaction. Receipts provide valuable information about the execution of a transaction, including the cumulative gas used and the logs generated.
All fields on the TransactionReceipt for Ethereum mainnet can be found in the sf.ethereum.type.v2 package.
Key Fields
- cumulative_gas_used: The total gas used in the block up to and including this transaction.
- logs: A list of Log objects associated with the transaction.
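On EVM chains, cumulative_gas_used is a running total across the transactions of a block rather than the gas spent by a single transaction. The gas used by one transaction is therefore the difference between its cumulative value and that of the transaction before it. The std-only sketch below shows the arithmetic. The helper gas_per_transaction is hypothetical and assumes the values are given in execution order.

```rust
/// Hypothetical helper: derives per-transaction gas from cumulative values,
/// listed in the order the transactions were executed in the block.
pub fn gas_per_transaction(cumulative: &[u64]) -> Vec<u64> {
    let mut previous = 0u64;
    cumulative
        .iter()
        .map(|&total| {
            // saturating_sub guards against malformed, non-increasing input
            let used = total.saturating_sub(previous);
            previous = total;
            used
        })
        .collect()
}
```

For cumulative values of 21,000, 71,000 and 92,000, the three transactions used 21,000, 50,000 and 21,000 gas respectively.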
Practical Example
Let’s create a module to extract logs from transaction receipts within a block. This example will capture details about each receipt, such as the transaction hash, cumulative gas used, and the logs.
#[substreams::handlers::map]
pub fn map_transaction_receipts(blk: eth::Block) -> Result<TransactionReceipts, substreams::errors::Error> {
    // Build one Receipt per transaction receipt in the block.
    // Each item is a ReceiptView, which pairs the transaction with its receipt.
    let receipts: Vec<Receipt> = blk
        .receipts()
        .map(|view| Receipt {
            transaction_hash: Hex::encode(&view.transaction.hash),
            cumulative_gas_used: view.receipt.cumulative_gas_used,
            // Read the raw protobuf logs through the receipt held by the view
            logs: view
                .receipt
                .logs
                .iter()
                .map(|log| Log {
                    address: Hex::encode(&log.address),
                    data: Hex::encode(&log.data),
                    topics: log.topics.iter().map(|topic| Hex::encode(topic)).collect(),
                })
                .collect(),
        })
        .collect();
    // Return the TransactionReceipts object containing all transaction receipts
    Ok(TransactionReceipts { receipts })
}
A Note on Data Types
When you call helper functions like .calls() or .logs() in Substreams, you get an iterator of CallView or LogView respectively. These Rust structs encapsulate the transaction and the call/log, providing richer and more structured access to the data compared to raw protobuf messages.
Here are the types:
#[derive(Copy, Clone)]
pub struct ReceiptView<'a> {
    pub transaction: &'a pb::TransactionTrace,
    pub receipt: &'a pb::TransactionReceipt,
}
#[derive(Copy, Clone)]
pub struct LogView<'a> {
    pub receipt: ReceiptView<'a>,
    pub log: &'a pb::Log,
}
#[derive(Copy, Clone, Debug, PartialEq)]
pub struct CallView<'a> {
    pub transaction: &'a pb::TransactionTrace,
    pub call: &'a pb::Call,
}
These views are helpful because they provide a more structured way to access and manipulate the data, making your code cleaner and easier to understand. They bridge the gap between raw protobuf messages and the higher-level abstractions used in Rust.
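To make the view pattern concrete, here is a simplified, std-only sketch of how such a view can be produced. The TransactionTrace and Log types below are stand-ins reduced to a couple of fields, not the real protobuf messages, and this LogView is flattened to hold the transaction directly rather than nesting a ReceiptView. The logs function is likewise a hypothetical illustration of the idea, not the library’s implementation.

```rust
/// Stand-in for the protobuf transaction trace, reduced for illustration.
pub struct TransactionTrace {
    pub hash: Vec<u8>,
    pub logs: Vec<Log>,
}

/// Stand-in for the protobuf log message.
pub struct Log {
    pub address: Vec<u8>,
}

/// Simplified view pairing a log with the transaction that emitted it.
/// It only holds references, so copying a view never copies the underlying data.
#[derive(Copy, Clone)]
pub struct LogView<'a> {
    pub transaction: &'a TransactionTrace,
    pub log: &'a Log,
}

/// Yields one view per log, borrowing from the transactions instead of cloning them.
pub fn logs<'a>(transactions: &'a [TransactionTrace]) -> impl Iterator<Item = LogView<'a>> + 'a {
    transactions.iter().flat_map(|transaction| {
        transaction
            .logs
            .iter()
            .map(move |log| LogView { transaction, log })
    })
}
```

Because each view carries a reference to its parent transaction, code that processes a single log can still reach the transaction hash without a separate lookup, which is what the earlier examples rely on when they read log.receipt.transaction.hash.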