Data Journey

Every blob submitted to Hyve H3 passes through four distinct stages:

  1. Client Encoding - Client prepares data using erasure coding
  2. Data Ingestion - Data Nodes receive and verify shares
  3. Canonical Inclusion - The Liveliness Layer finalizes inclusion via consensus
  4. Retention Enforcement - Continuous auditing ensures availability

This separation of stages reflects Hyve’s three-layer architecture, in which storage, consensus, and economics operate independently.
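As a rough mental model, the three layers can be pictured as independent interfaces. The following Python sketch is purely illustrative; the class and method names are assumptions, not H3's actual APIs.

```python
# Illustrative only: hypothetical interfaces for Hyve's three layers.
# None of these names come from the H3 codebase.
from abc import ABC, abstractmethod


class DataLayer(ABC):
    """Storage: Data Nodes ingest and retain erasure-coded shares."""

    @abstractmethod
    def ingest_share(self, blob_header: bytes, share: bytes) -> None: ...


class LivelinessLayer(ABC):
    """Consensus: finalizes canonical inclusion and audits availability."""

    @abstractmethod
    def finalize_batch(self, blob_headers: list[bytes]) -> bytes: ...


class EconomicsLayer(ABC):
    """Economics: settles rewards and slashing (on Ethereum, per Stage 4)."""

    @abstractmethod
    def settle(self, availability_certificate: bytes) -> None: ...
```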

Stage 1: Client Encoding

Hyve H3 uses Distributed Edge Encoding to streamline data ingestion. This differs notably from systems that rely on a central encoder to erasure-code a batch of blobs, and from systems where each blob is individually encoded at the edge and individually audited.

  1. The client reads the system parameters to determine the erasure coding scheme (n, k) and Data Node set.
  2. The client, via the H3 SDK or an H3 Gateway RPC, encodes the blob using Reed-Solomon coding, generating n primary shares and k parity shares.
  3. The client generates a BlobHeader containing a commitment to the data, payment info, expiration epoch, encoding proof, and optionally arbitrary metadata tags.
  4. The client derives the Data Node assignment for each share based on the system parameters.
  5. The client broadcasts the shares to the network, with each Data Node receiving its assigned primary share, parity share, and the BlobHeader.

The network traffic scales linearly with blob size rather than quadratically as in replicated systems.
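Neither the exact coding construction nor the assignment rule is specified on this page. The Python sketch below illustrates the idea with a minimal, non-systematic Reed-Solomon encoder over GF(256) (every share is a polynomial evaluation, rather than H3's primary/parity split) and a hypothetical hash-based share-to-node assignment; both are stand-ins, not H3's actual algorithms.

```python
# Sketch of Stage 1: erasure-code a blob into n shares, then derive a
# Data Node for each share. Hypothetical stand-ins for H3's real scheme.
import hashlib

# GF(256) exp/log tables (primitive polynomial 0x11D).
EXP, LOG = [0] * 512, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]

def gf_mul(a: int, b: int) -> int:
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def rs_encode(data: bytes, k: int, n: int) -> list[bytes]:
    """Encode data into n shares so that any k of them can recover it."""
    assert k <= n <= 255 and len(data) % k == 0
    cols = len(data) // k
    shares = [bytearray(cols) for _ in range(n)]
    for c in range(cols):
        coeffs = data[c * k:(c + 1) * k]      # one degree-(k-1) polynomial
        for j in range(n):
            acc = 0
            for coef in reversed(coeffs):     # Horner evaluation at x = j + 1
                acc = gf_mul(acc, j + 1) ^ coef
            shares[j][c] = acc
    return [bytes(s) for s in shares]

def derive_assignments(commitment: bytes, node_ids: list[str], n: int) -> list[str]:
    """Hypothetical rule: hash the blob commitment with each share index
    to pick a Data Node deterministically from the current node set."""
    picks = []
    for idx in range(n):
        digest = hashlib.sha256(commitment + idx.to_bytes(4, "big")).digest()
        picks.append(node_ids[int.from_bytes(digest[:8], "big") % len(node_ids)])
    return picks
```

Any k of the n shares suffice to reconstruct the blob, which is why each Data Node only ever needs to receive its own small share rather than the full blob.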

Stage 2: Data Ingestion

Each Data Node receives shares directly from clients through the P2P network. The ingestion process:

  1. Data Node receives shares in parallel from multiple clients over a streaming P2P protocol
  2. Node verifies the BlobHeader and its share assignment
  3. Node streams the share data to an HDD
  4. Node verifies that the share matches the declared blob commitment and encoding proof (see the sketch below)
  5. Node acknowledges receipt to the client
  6. Node periodically informs the Liveliness Layer of newly ingested blobs with their headers

At this point, data exists in the Data Layer but hasn’t achieved canonical inclusion. The Liveliness Layer coordinates the next stage.
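The commitment scheme is not specified on this page; assuming, for illustration, a Merkle-tree commitment over the encoded shares, the step 4 check might look like this sketch.

```python
# Hypothetical step 4 check: verify a received share against the blob
# commitment declared in the BlobHeader. Assumes a Merkle commitment;
# H3's actual commitment and encoding proof may differ.
import hashlib

def sha256(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def verify_share(share: bytes, index: int, branch: list[bytes], commitment: bytes) -> bool:
    """Walk the Merkle branch from the share's leaf up to the root and
    compare the result against the declared commitment."""
    node = sha256(share)
    for sibling in branch:
        if index % 2 == 0:
            node = sha256(node + sibling)   # this node is a left child
        else:
            node = sha256(sibling + node)   # this node is a right child
        index //= 2
    return node == commitment
```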

Stage 3: Canonical Inclusion

The Liveliness Layer coordinates the inclusion of ingested data into the canonical record.

  1. A new round starts
  2. All Liveliness Nodes select newly ingested BlobHeaders
  3. Unexpired and paid blobs are batched into a proposal
  4. Liveliness Nodes send the proposal to all Data Nodes and Liveliness Nodes
  5. Data Nodes seal the batch and derive the final encoding proof for the batch
  6. Liveliness Nodes sample Data Nodes to verify correct availability (see the sketch after this list)
  7. The Liveliness Layer runs DAG BFT consensus to agree on availability
  8. Upon finality, a certificate is issued and communicated back to Data Nodes
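Step 6 is a probabilistic check: probing a random subset of assigned shares bounds the chance that a large fraction is missing while every probe still succeeds. A minimal sketch, with hypothetical names:

```python
# Hypothetical sampling check for step 6. `probe` stands in for asking
# a Data Node to return (or prove possession of) one assigned share.
import random
from typing import Callable

def sample_availability(assignments: list[tuple[str, int]],
                        probe: Callable[[str, int], bool],
                        samples: int = 30) -> bool:
    """Probe random (node, share_index) pairs; reject on any failure.
    If a fraction f of shares were missing, all probes would pass with
    probability (1 - f) ** samples, e.g. roughly 0.1% for f = 0.2."""
    for node, share_index in random.sample(assignments, min(samples, len(assignments))):
        if not probe(node, share_index):
            return False
    return True
```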

Stage 4: Retention Enforcement

After finalization, Data Nodes must retain their assigned shares until the blob’s expiration epoch. Hyve enforces this through a continuous, trust-minimized auditing process whose results can be verified trustlessly.

Retention Structure

Blobs are organized into Buckets by expiration epoch:

Epoch 150 Bucket
├── Tile 1 (blobs 1-1000)
│   ├── Share → Data Node 1
│   ├── Share → Data Node 2
│   └── Share → Data Node N
├── Tile 2 (blobs 1001-2000)
└── Tile 3 (blobs 2001-3000)

Each Bucket forms an append-only sequence of Tiles. Tile headers contain commitments and metadata, accumulated into a compact Bucket root proving which Tiles belong to each epoch.
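The accumulator construction is not detailed here; one simple way to fold an append-only sequence of Tile headers into a compact root is a hash chain, sketched below (the real Bucket root may use a different structure, such as a Merkle accumulator).

```python
# Hypothetical append-only accumulator for Tile headers within a Bucket.
import hashlib

def extend_bucket_root(root: bytes, tile_header: bytes) -> bytes:
    """Fold one more Tile header into the running Bucket root; the result
    commits to the entire ordered sequence of Tiles appended so far."""
    return hashlib.sha256(root + hashlib.sha256(tile_header).digest()).digest()

# Usage: start from a zero root and append Tile headers as Tiles are sealed.
root = bytes(32)
for header in [b"tile-1-header", b"tile-2-header", b"tile-3-header"]:
    root = extend_bucket_root(root, header)
```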

Continuous Auditing

The Liveliness Layer continuously and randomly challenges Data Nodes to verify share retention. The auditing process:

  1. Liveliness Nodes pseudo-randomly select Data Nodes and shares to challenge (sketched below)
  2. Nodes prove they hold the requested data
  3. Proofs are aggregated into availability certificates
  4. Certificates are submitted to an Ethereum smart contract for reward distribution

Nodes that fail audits lose reputation and eventually risk slashing.
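For these audits to be trust-minimized, challenge selection must be deterministic given a shared random seed, so any observer can recompute exactly which shares a node was obliged to prove. A sketch under that assumption, with hypothetical names:

```python
# Hypothetical challenge derivation for step 1: given a per-round seed
# (e.g. agreed via consensus), every Liveliness Node derives the same
# (node, share_index) challenges, making audit results independently
# checkable.
import hashlib

def derive_challenges(seed: bytes, node_ids: list[str],
                      shares_per_node: int, count: int) -> list[tuple[str, int]]:
    challenges = []
    for i in range(count):
        digest = hashlib.sha256(seed + i.to_bytes(4, "big")).digest()
        node = node_ids[int.from_bytes(digest[:8], "big") % len(node_ids)]
        share = int.from_bytes(digest[8:16], "big") % shares_per_node
        challenges.append((node, share))
    return challenges
```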
