# Disaster recovery

#### Introduction

This disaster recovery plan describes how [DH3.io](http://dh3.io) responds to chain reorganizations (reorgs) and other disruptions on the Solana blockchain network. Its aim is a systematic, repeatable way to detect and correct inconsistencies so that both the real-time streams and the historical data in the [DH3.io](http://dh3.io) ecosystem keep their integrity.

#### Regular Monitoring Procedures

* Validator Consistency Check: Every 5 seconds, we verify that our validators agree on slot positions and block hashes.
* External RPC Verification: Every 30 seconds, we run an additional check against an RPC endpoint chosen at random from our list of trusted sources (e.g., QuickNode) to confirm that our data matches what external providers report.
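
The two checks above could be sketched roughly as follows. This is a minimal illustration, not the actual DH3.io monitor: the function names, the strict-majority rule, and the input shape (one block hash per validator for a shared slot) are assumptions. Only the 5-second and 30-second intervals and the random choice of a trusted RPC come from the procedures listed above.

```python
import random
from collections import Counter
from typing import Dict, List, Optional

# Intervals stated in the monitoring procedures.
VALIDATOR_CHECK_INTERVAL_S = 5
EXTERNAL_CHECK_INTERVAL_S = 30


def find_outliers(hashes_by_validator: Dict[str, str]) -> List[str]:
    """Return validators whose block hash for one shared slot differs from the majority.

    Assumption: if no hash has a strict majority, every validator is
    treated as suspect and the full list is returned.
    """
    if not hashes_by_validator:
        return []
    majority_hash, votes = Counter(hashes_by_validator.values()).most_common(1)[0]
    if votes * 2 <= len(hashes_by_validator):
        return sorted(hashes_by_validator)
    return sorted(
        name for name, h in hashes_by_validator.items() if h != majority_hash
    )


def pick_external_rpc(trusted: List[str], rng: Optional[random.Random] = None) -> str:
    """Choose one trusted external RPC at random for the 30-second check."""
    return (rng or random).choice(trusted)
```

For example, `find_outliers({"validator-a": "hash1", "validator-b": "hash1", "validator-c": "hash2"})` returns `["validator-c"]`, which would be the trigger for the recovery steps described in the next section.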

#### Disaster Recovery Plan for Real-Time Streams and Historical Database

* Initial Response:
  * When an inconsistency is detected, the Kubernetes operator halts all related instances to prevent erroneous data from spreading.
* Issue Assessment:
  * Determine whether the inconsistency is due to a global blockchain event or localized to our validators.
* Localized Event Handling:
  * If the issue is found to be local, the affected validator is removed from the pool.
* Identifying Last Correct Block ID:
  * Establish the most recent valid block ID before the discrepancy was detected.
* Data Correction:
  * Delete the incorrect records from HBase (RPC) and from the HDFS archives. Our LiteRPC makes it straightforward to roll back and delete inaccurate data.
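
The "last correct block" and "data correction" steps can be illustrated with a small sketch. It assumes that stored blocks and a trusted reference are both available as mappings from slot number to block hash. That representation and the function names are hypothetical, and the actual deletion from HBase and HDFS is deliberately left out.

```python
from typing import Dict, List, Optional


def last_consistent_slot(
    stored: Dict[int, str], reference: Dict[int, str]
) -> Optional[int]:
    """Return the highest shared slot before the first hash mismatch.

    Returns None when the earliest shared slot already disagrees
    (or when the two mappings share no slots at all).
    """
    last_good = None
    for slot in sorted(stored.keys() & reference.keys()):
        if stored[slot] != reference[slot]:
            break
        last_good = slot
    return last_good


def slots_to_delete(stored: Dict[int, str], last_good: Optional[int]) -> List[int]:
    """List every stored slot after the last consistent one (rollback candidates)."""
    return sorted(s for s in stored if last_good is None or s > last_good)


# Illustrative data only: slots 102 and 103 diverge from the reference.
stored = {100: "a", 101: "b", 102: "X", 103: "Y"}
reference = {100: "a", 101: "b", 102: "c", 103: "d"}
good = last_consistent_slot(stored, reference)
print(good, slots_to_delete(stored, good))  # prints: 101 [102, 103]
```

Everything after the first mismatch is treated as suspect, even if a later hash happens to match, because blocks built on a divergent parent cannot be trusted.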

#### Disaster Recovery Plan for User Generated Data Sets

* Data Set Generation Delay:
  * In most scenarios, we recommend delaying data set generation by 5-10 minutes. Together with our Kubernetes operator stopping all related downstream processes and jobs, this delay is usually enough to resolve any inconsistencies.
* Real-Time Data Set Handling:
  * Real-time data sets receive the same corrective measures as the historical database: the erroneous blocks of data are deleted.
* Data Integrity and Duplication Prevention:
  * Each record in a data set is assigned a unique ID derived from the block timestamp, the transaction's position within the block, and several other characteristics. This identifier makes it quick to remove unwanted data and guards against duplication.
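
One way such a record ID could work is sketched below. The text does not specify the derivation, so hashing with SHA-256, the `extra` field standing in for the unnamed "other characteristics", and the record layout are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Dict, Iterable, List, Sequence


def record_id(block_time: int, tx_index: int, extra: Sequence[str] = ()) -> str:
    """Derive a deterministic ID from the block timestamp, the transaction
    position within the block, and any further characteristics (hypothetical)."""
    # JSON encoding keeps field boundaries unambiguous before hashing.
    payload = json.dumps([block_time, tx_index, list(extra)], separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def dedupe(records: Iterable[dict]) -> List[dict]:
    """Keep the first record seen for each ID and drop later duplicates."""
    unique: Dict[str, dict] = {}
    for rec in records:
        rid = record_id(rec["block_time"], rec["tx_index"], rec.get("extra", ()))
        unique.setdefault(rid, rec)
    return list(unique.values())
```

Because the ID depends only on the record's own fields, regenerating a data set after a rollback produces the same IDs for unchanged records, so re-inserted rows overwrite or are skipped rather than duplicated.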

#### Conclusion

This disaster recovery plan keeps [DH3.io](http://dh3.io) operationally reliable in the face of Solana blockchain reorganizations and related disruptions. By following these procedures, [DH3.io](http://dh3.io) maintains the consistency, accuracy, and security of both the real-time and historical data within its Web3 data hub.
