Channels
When building blockchain datasets, reducing the computational overhead of data retrieval is crucial for efficiency. One effective strategy is to categorize data into separate channels. This approach not only simplifies data access but also significantly cuts down on the number of read operations required.
Block streams are tailored to operate on a per-network basis, such as mainnet and devnet. This means that each network has a dedicated channel, streamlining access to its block data. On the other hand, more granular data types, such as transactions and accounts, are divided into multiple streams. These categorizations are designed with specificity in mind; for instance, separating oracle-related transactions from general transaction streams, or distinguishing token accounts from more general account streams. This level of organization facilitates targeted data retrieval, making it easier for developers to access the precise information they need without wading through irrelevant data.
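To make the channel model concrete, here is a minimal sketch of how a consumer might read a single stream. The client surface (`StreamClient`, `subscribe`) and the record fields are assumptions for illustration, not a documented API; only the channel ID comes from the table below.

```ts
// Hypothetical client surface; method and field names are assumptions.
interface TxRecord {
  slot: number;
  signature: string;
  raw: Uint8Array; // original, unparsed payload
}

interface StreamClient {
  subscribe(channelId: string): AsyncIterable<TxRecord>;
}

// Reading one narrow channel means unrelated records never reach the
// consumer in the first place.
async function tailOracles(client: StreamClient): Promise<void> {
  for await (const record of client.subscribe("sol.transactions.oracles")) {
    console.log(record.slot, record.signature);
  }
}
```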
A byproduct of creating these separate channels is some inevitable data duplication. Because a single transaction may interact with multiple programs, and therefore belong to more than one category, it can appear in several streams. At first glance, duplicating data across channels may seem like a costly strategy in terms of storage, but it's a trade-off that pays dividends.
For the ecosystems we're contributing to and the data providers integrating with our solutions, handling duplicated data isn't a notable concern. In fact, it's a small price to pay compared to the alternative: processing vast quantities of unrelated records places a significant strain on network resources and CPU usage. By comparison, the cost of additional storage is minimal, especially given the efficiency gains in data retrieval and the resulting reduction in processing overhead.
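As an illustration of how cheap the consumer side of this trade-off is, a reader subscribed to several overlapping channels can restore uniqueness with a single pass. This is a sketch under assumed record shapes, deduplicating by transaction signature:

```ts
// Minimal record shape for this sketch; only the signature matters here.
interface SignedRecord {
  signature: string;
}

// Merge several channel streams, emitting each transaction once even if
// it appeared in more than one channel. A long-running consumer would
// also evict old signatures (e.g. once their slot is finalized); that
// bookkeeping is omitted here.
function* dedupe<T extends SignedRecord>(streams: Iterable<T>[]): Generator<T> {
  const seen = new Set<string>();
  for (const stream of streams) {
    for (const record of stream) {
      if (seen.has(record.signature)) continue; // duplicate from another channel
      seen.add(record.signature);
      yield record;
    }
  }
}
```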
In an environment where the specifics of decoding and parsing binary data evolve daily, it's crucial for us to adapt and align our strategies accordingly. All data within our channels is stored in its original, unparsed form. This approach underscores a foundational principle of our operation: parse data on the fly.
Relying on real-time parsing allows us to ensure that data sets are reconstructed as swiftly as possible, utilizing the most up-to-date techniques and knowledge derived from the latest updates and contributions within the DH3 community. This method not only maximizes efficiency and speed but also ensures that our datasets are enriched using the freshest insights, keeping us and our developers at the leading edge of data handling practices.
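In code, this principle amounts to keeping decoders out of the storage path. The sketch below registers decoders per channel and applies them only at read time; the registry and function names are assumptions for illustration, not our actual API.

```ts
// A decoder turns a raw payload into a structured value.
type Decoder = (raw: Uint8Array) => unknown;

// Per-channel decoder registry. Because stored data is never parsed,
// swapping in an updated decoder immediately affects every subsequent
// read, with no re-materialization of the dataset.
const decoders = new Map<string, Decoder>();

function registerDecoder(channelId: string, decode: Decoder): void {
  decoders.set(channelId, decode);
}

function readRecord(channelId: string, raw: Uint8Array): unknown {
  const decode = decoders.get(channelId);
  // Fall back to the raw bytes when no decoder is registered yet.
  return decode ? decode(raw) : raw;
}
```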
| ID | Description | Providers |
|---|---|---|
| `sol.blocks.mainnet` | Full mainnet blocks | DexterLab |
| `sol.blocks.devnet` | Full devnet blocks | DexterLab |
| `sol.transactions.oracles` | Oracle transactions (e.g. Pyth, Chainlink) | DexterLab |
| `sol.transactions.failed` | Transactions that failed to execute; still relevant for some consumers, as they can still reduce lamports or fail only partially | DexterLab |
| `sol.transactions.other` | Transactions that do not fall into any other category; for now, this is the most relevant stream, as it contains the majority of transactions | DexterLab |
| `sol.accounts.token` | Accounts owned by the TokenProgram | DexterLab |
| `sol.accounts.system` | Accounts owned by the SystemProgram | DexterLab |
| `sol.accounts.other` | All other accounts that do not fall into a specific channel | DexterLab |
| `eth.blocks.mainnet` | Full mainnet blocks with enriched traces | Parsiq |
| `eth.transactions.oracles` | Oracle transactions | Parsiq |
| `eth.transactions.other` | Transactions that do not fall into any other category | Parsiq |
| `eth.transactions.mem` | Mempool transactions; available only with real-time streams | Parsiq |