How can we solve long-term storage and access for Ethereum’s historical data?

Author: 0xNatalie Source: ChainFeeds

Ethereum’s state data bloat problem and its solutions

As the Ethereum network has grown more popular and demand for applications has increased, its historical and state data has begun to grow rapidly. To cope with this, Ethereum has improved step by step, from the original full nodes to light clients and, most recently, to the Dencun upgrade, which introduced expiring data that nodes automatically prune once it is no longer needed.

One of Ethereum’s long-term goals is to reduce the load on a single chain by implementing sharding, which spreads data across different chains. EIP-4844, implemented in the Dencun upgrade, is an important step toward full sharding on Ethereum. EIP-4844 introduces “blobs”, a temporary data type that lets rollups submit more data to the Ethereum main chain at a lower cost. To keep data growth under control, consensus-layer nodes delete blob data after about 18 days.
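The “about 18 days” figure can be reproduced with simple arithmetic. The sketch below uses constants as they appear, to the best of my understanding, in EIP-4844 and the Deneb consensus specs (they are not given in this article, and later upgrades have changed the blob target). The storage estimate assumes every slot carries the target number of blobs, so treat it as a rough order of magnitude only.

```python
# Back-of-the-envelope blob numbers at the time of Dencun.
# Constants are taken from EIP-4844 / the Deneb consensus specs as an assumption;
# verify them against the current specs before relying on them.
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS = 4096  # retention window, in epochs

FIELD_ELEMENTS_PER_BLOB = 4096
BYTES_PER_FIELD_ELEMENT = 32
TARGET_BLOBS_PER_BLOCK = 3  # Dencun target

SECONDS_PER_DAY = 24 * 60 * 60

# How long consensus-layer nodes must keep blobs available.
slots_in_window = MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS * SLOTS_PER_EPOCH
retention_seconds = slots_in_window * SECONDS_PER_SLOT
retention_days = retention_seconds / SECONDS_PER_DAY

# Size of one blob, and the data held if every slot carries the target blob count.
blob_size_bytes = FIELD_ELEMENTS_PER_BLOB * BYTES_PER_FIELD_ELEMENT
window_bytes_at_target = slots_in_window * TARGET_BLOBS_PER_BLOCK * blob_size_bytes

print(f"retention: {retention_days:.1f} days")                               # retention: 18.2 days
print(f"blob size: {blob_size_bytes // 1024} KiB")                           # blob size: 128 KiB
print(f"blob data held at target: {window_bytes_at_target / 2**30:.0f} GiB")  # 48 GiB
```

Under these assumptions a node holds on the order of tens of gibibytes of blob data at any time, and everything older than the window has to come from somewhere else if it is needed again.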

In addition to Ethereum’s own improvements, projects such as Celestia, Avail and EigenDA are also building solutions to its data problems. They provide effective short-term data availability (DA) solutions that enhance the real-time operation and scalability of blockchains. However, these solutions do not serve applications that require long-term access to historical data, such as those that rely on long-term storage of user authentication data or dApps that need data for AI model training.

To address the challenge of long-term data storage in the Ethereum ecosystem, projects such as EthStorage, Pinax and Covalent have proposed solutions. EthStorage provides rollups with long-term DA, ensuring that data remains accessible and usable over time. Pinax, The Graph and StreamingFast have jointly developed a solution for long-term storage and retrieval of blob data. Covalent’s Ethereum Wayback Machine (EWM) is not only a long-term data storage solution but also a complete system for querying and analyzing data.

As artificial intelligence becomes a mainstream trend in global technology, its combination with blockchain is widely seen as a future direction of development. This trend has led to growing demand for access to and analysis of historical data. In this context, EWM shows distinctive advantages. It archives and processes Ethereum’s historical data, allowing users to retrieve complex data structures and to query and analyze in depth the internal state of smart contracts, transaction results, event logs and more.

Introduction to Ethereum Wayback Machine (EWM)

The Ethereum Wayback Machine (EWM) borrows the concept of the Wayback Machine to preserve historical data on Ethereum and keep it accessible and verifiable. The Wayback Machine is a digital archive created by the Internet Archive to record and preserve the history of the Internet. It lets users view archived versions of a website at different points in the past, helping people understand how the site’s content has changed over time.

Historical data is the fundamental reason blockchains exist: it underpins not only their technical architecture but also their economic model. From the outset, blockchains were designed to provide a public, immutable historical record. Bitcoin, for example, set out to create an immutable, decentralized ledger that records the history of every transaction to ensure transparency and security. Demand for historical data is widespread, yet there is currently no efficient and verifiable way to store it. As a long-term DA solution, EWM can store data permanently, including blob data, and can address the accessibility problems for historical data caused by state expiry and data sharding. EWM focuses on archiving Ethereum’s historical data and keeping it accessible over the long term, with support for queries over complex data structures. Next, we look in detail at how EWM achieves this through its data processing flow.

EWM’s data processing flow: extraction, refining and indexing

Covalent is a platform that provides access and query services for blockchain data. It captures and indexes blockchain data and stores it on multiple nodes across its network, which enables reliable storage and fast access. Covalent processes data through the Ethereum Wayback Machine (EWM), ensuring that historical blockchain data stays accessible. The EWM data processing flow has three key steps: extraction and export, refinement, and indexing and querying.

  1. Extraction and export: This is the first step in the flow and involves extracting historical transaction data directly from the blockchain network. It is performed by specialized operators called Block Specimen Producers (BSPs). A BSP’s main task is to create and save “block specimens”, that is, raw snapshots of blockchain data. As the canonical representation of the blockchain’s historical state, these block specimens are key to maintaining the integrity and accuracy of the data. Once created, block specimens are uploaded to distributed storage (built on IPFS) and then published and verified through the ProofChain contract. This not only secures the data but also signals to others that the data has been safely stored.

  2. Refinement: After extraction, the data is refined by Block Results Producers (BRPs), which convert the raw data into more useful forms. Traditional ways of accessing blockchain data usually expose only limited information and make it hard to query complex data structures. By re-executing and transforming the data, a BRP can provide more detail, such as internal contract state and transaction execution paths. In addition, because BRPs preprocess the data and store the results, there is far less need to re-run a full node for each query or analysis, which speeds up queries and lowers storage and compute costs. At this point, the original “block specimen” has been converted into a “block result” that is easier to query and analyze. This not only improves the performance of the Covalent network but also opens up more possibilities for further querying and analysis.

  3. Indexing and querying: Finally, query operators organize the processed data and store it where it is easy to reach. They pull data from the distributed storage according to the needs of API users, so that both historical and real-time data are available to answer API queries. This lets users efficiently access and use the blockchain data stored in the Covalent network.
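The three steps above can be sketched as a small in-memory pipeline. This is only an illustration of the flow described here, not Covalent’s implementation: `ProofChainStub`, the `storage` dictionary (standing in for IPFS-style content-addressed storage), the toy block format and the “refinement” logic are all hypothetical names and simplifications introduced for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class ProofChainStub:
    """Hypothetical stand-in for the ProofChain contract: block height -> specimen hash."""
    proofs: dict = field(default_factory=dict)

    def publish(self, height: int, specimen_hash: str) -> None:
        self.proofs[height] = specimen_hash

    def verify(self, height: int, specimen_bytes: bytes) -> bool:
        return self.proofs.get(height) == hashlib.sha256(specimen_bytes).hexdigest()


def extract(block: dict, storage: dict, proofchain: ProofChainStub) -> str:
    """Step 1 (BSP role): snapshot a raw block, store it by content hash, publish the hash."""
    specimen = json.dumps(block, sort_keys=True).encode()
    digest = hashlib.sha256(specimen).hexdigest()
    storage[digest] = specimen  # stand-in for content-addressed storage such as IPFS
    proofchain.publish(block["height"], digest)
    return digest


def refine(digest: str, storage: dict, proofchain: ProofChainStub) -> dict:
    """Step 2 (BRP role): check the specimen against its proof, then derive a block result."""
    specimen = storage[digest]
    block = json.loads(specimen)
    if not proofchain.verify(block["height"], specimen):
        raise ValueError("specimen does not match the published proof")
    # Real refinement re-executes the block; this toy version only sums value per sender.
    totals = {}
    for tx in block["txs"]:
        totals[tx["from"]] = totals.get(tx["from"], 0) + tx["value"]
    return {"height": block["height"], "sent_by": totals}


def index(results: list) -> dict:
    """Step 3 (query operator role): organize block results so lookups by sender are cheap."""
    by_sender = {}
    for result in results:
        for sender, value in result["sent_by"].items():
            by_sender.setdefault(sender, []).append((result["height"], value))
    return by_sender


storage, proofchain = {}, ProofChainStub()
block = {"height": 100, "txs": [{"from": "0xabc", "value": 5}, {"from": "0xabc", "value": 7}]}
digest = extract(block, storage, proofchain)
result = refine(digest, storage, proofchain)
sender_index = index([result])
print(sender_index["0xabc"])  # [(100, 12)]
```

The point of the split is that the expensive work (re-execution) happens once in the refinement step, while the published hash lets anyone downstream check that the specimen they fetched is the one that was attested.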

Covalent provides a unified GoldRush API that supports retrieving historical data from multiple blockchains (such as Ethereum, Polygon and Solana). The GoldRush API gives developers a one-stop data solution: a single call returns an account’s ERC20 token balances and NFT data, which makes it easy to build crypto and NFT wallets (such as Rainbow and Zerion) and greatly simplifies development. In addition, accessing DA data through the API consumes credits. Requests are divided into categories (Class A, Class B, Class C and so on), and each category has its own credit cost. This revenue is used to support the operator network.
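As a rough sketch of what a “single call” for token balances might look like, the snippet below builds, but does not send, an HTTP request. The base URL, the `balances_v2` path, the `eth-mainnet` chain name and the Bearer-token header reflect my understanding of Covalent’s public documentation and are assumptions here; the article does not specify them, so check the current API reference before use. The wallet address and API key are placeholders.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed base URL and path layout; verify against Covalent's current API reference.
BASE_URL = "https://api.covalenthq.com/v1"


def build_balances_request(chain_name: str, wallet_address: str, api_key: str) -> Request:
    """Build (but do not send) a GET request for a wallet's token balances."""
    url = f"{BASE_URL}/{quote(chain_name)}/address/{quote(wallet_address)}/balances_v2/"
    # Assumption: the API accepts the key as a Bearer token.
    return Request(url, headers={"Authorization": f"Bearer {api_key}"}, method="GET")


req = build_balances_request(
    "eth-mainnet",
    "0x0000000000000000000000000000000000000000",  # placeholder address
    "YOUR_API_KEY",                                  # placeholder key
)
print(req.full_url)
```

To actually send it, the request can be passed to `urllib.request.urlopen` and the JSON body parsed with `json.load`. The response fields are deliberately not shown, since their exact names would be guesses, and each such call would consume credits according to its request class.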

Future Outlook

With the rapid development of AI, the trend toward combining AI and blockchain is becoming increasingly clear. Blockchain provides AI with an immutable, distributed and verifiable source of data, which improves transparency and trust and can make AI models more accurate and reliable in data analysis and decision-making. By analyzing on-chain data, AI can optimize algorithms and predict trends, and can then carry out complex tasks and transactions directly, improving dApp efficiency and reducing costs. With EWM, AI models can access a wide range of structured on-chain Web3 datasets that are complete and verifiable. As a bridge between AI models and blockchains, EWM makes it much easier for AI developers to retrieve and use data.

Several AI projects already integrate Covalent:

  • SmartWhales: A platform that uses AI to optimize copy-trading investment strategies. Copy trading relies on analyzing historical data to identify successful trading patterns and strategies. Covalent provides a comprehensive and detailed blockchain dataset, which SmartWhales uses to analyze past trading behavior and outcomes, identify which strategies perform well under specific market conditions, and recommend them to users.

  • BotFi: A DeFi trading bot. It integrates Covalent’s data to analyze market trends, automate trading strategies, and buy and sell automatically as the market changes.

  • Laika AI: Uses AI for comprehensive on-chain analysis. The Laika AI platform integrates structured blockchain data from Covalent to drive its AI models and help users carry out complex on-chain data analysis.

  • Entendre Finance: Automates DeFi asset management and provides real-time insights and predictive analytics. Its AI uses Covalent’s structured data to simplify and automate asset management, for example by monitoring and managing digital asset holdings and automating specific trading strategies.

EWM also continues to be improved and upgraded as demand changes. Covalent engineer Pranay Valson said that, in the future, EWM will extend its protocol specification to support other blockchains such as Polygon and Arbitrum, and will integrate the BSP fork into other Ethereum clients such as Nethermind and Besu for wider compatibility and adoption. In addition, when EWM processes blob transactions on the Beacon Chain, it will use KZG commitments to improve the efficiency of data storage and retrieval and to reduce storage costs.
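The article does not say how EWM will use KZG commitments, so the following is not a description of its design. It only illustrates one rule from EIP-4844 that any blob archive can rely on: a blob transaction references each blob by a 32-byte “versioned hash” derived from the blob’s 48-byte KZG commitment. Checking that the blob itself matches the commitment requires a KZG library and is out of scope here; the commitment bytes below are a placeholder, not a valid commitment.

```python
import hashlib

# Constants as defined in EIP-4844.
VERSIONED_HASH_VERSION_KZG = b"\x01"
BYTES_PER_COMMITMENT = 48


def kzg_to_versioned_hash(commitment: bytes) -> bytes:
    """Derive the 32-byte versioned hash that a blob transaction carries on chain."""
    if len(commitment) != BYTES_PER_COMMITMENT:
        raise ValueError("a KZG commitment is 48 bytes")
    # Version byte followed by the last 31 bytes of SHA-256(commitment).
    return VERSIONED_HASH_VERSION_KZG + hashlib.sha256(commitment).digest()[1:]


def matches_on_chain_reference(commitment: bytes, versioned_hash: bytes) -> bool:
    """Check that an archived commitment is the one referenced by a transaction."""
    return kzg_to_versioned_hash(commitment) == versioned_hash


placeholder_commitment = bytes(48)  # all zeros: a placeholder, NOT a real commitment
versioned_hash = kzg_to_versioned_hash(placeholder_commitment)
print(len(versioned_hash), versioned_hash[:1].hex())  # 32 01
```

Because the on-chain reference is only 32 bytes and survives after the blob itself is pruned, an archive can store blobs off chain and still let readers tie what they retrieve back to the transaction that originally carried it.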
