High Availability (HA)¶
HeliosDB-Lite provides multi-tier high availability through WAL streaming, multi-primary replication, and sharding.
HA Tiers Overview¶
| Tier | Name | Architecture | Use Case |
|---|---|---|---|
| Tier 1 | Warm Standby | Active-Passive | Basic HA, disaster recovery |
| Tier 2 | Multi-Primary | Active-Active | Geographic distribution |
| Tier 3 | Sharding | Distributed | Horizontal scaling |
Feature Flags¶
Enable HA features via Cargo feature flags:
| Feature | Description |
|---|---|
| `ha-tier1` | Warm standby replication |
| `ha-tier2` | Multi-primary replication |
| `ha-tier3` | Sharding support |
| `ha-dedup` | Content-addressed deduplication |
| `ha-branch-replication` | Branch-to-server replication |
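For example, a Cargo.toml dependency entry enabling warm standby and sharding might look like the following (the version string is illustrative, not a published release number):
[dependencies]
heliosdb-lite = { version = "0.1", features = ["ha-tier1", "ha-tier3"] }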
Tier 1: Warm Standby¶
Active-passive replication with automatic failover.
Architecture¶
┌─────────────┐ WAL Stream ┌─────────────┐
│ Primary │ ───────────────→ │ Standby │
│ (Active) │ │ (Passive) │
└─────────────┘ └─────────────┘
↓ ↓
Read/Write Read-Only
Components¶
| Component | Description |
|---|---|
| `WalReplicator` | Streams WAL from primary |
| `WalApplicator` | Applies WAL on standby |
| `FailoverWatcher` | Monitors primary health |
| `LsnManager` | Tracks replication position |
| `SplitBrainProtector` | Prevents dual-primary scenarios |
Configuration¶
use heliosdb_lite::replication::{ReplicationConfig, SyncMode};
let config = ReplicationConfig::builder()
    .primary_endpoint("primary.example.com:5432")
    .sync_mode(SyncMode::Synchronous) // or Asynchronous
    .build();
Sync Modes¶
| Mode | Description | Durability | Latency |
|---|---|---|---|
| Synchronous | Wait for standby ACK | Strong | Higher |
| Asynchronous | Fire-and-forget | Eventual | Lower |
| Quorum | Wait for N/2+1 ACKs | Configurable | Medium |
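Quorum mode is selected through the same builder as the other modes. A minimal sketch, assuming the SyncMode enum exposes a Quorum variant (the variant's exact shape is an assumption, not confirmed by the API shown above):
use heliosdb_lite::replication::{ReplicationConfig, SyncMode};
// Assumed variant: the primary waits for a majority (N/2+1) of standby ACKs
// before acknowledging the commit to the client.
let config = ReplicationConfig::builder()
    .primary_endpoint("primary.example.com:5432")
    .sync_mode(SyncMode::Quorum)
    .build();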
Failover¶
use heliosdb_lite::replication::FailoverWatcher;
let watcher = FailoverWatcher::new(config);
watcher.on_failover(|event| {
    println!("Failover triggered: {:?}", event);
    // Promote standby to primary
});
Split-Brain Protection¶
use heliosdb_lite::replication::{SplitBrainProtector, ObserverConfig};
let protector = SplitBrainProtector::new(ObserverConfig {
    observers: vec!["observer1.example.com", "observer2.example.com"],
    quorum_size: 2,
});
protector.start();
Tier 2: Multi-Primary¶
Active-active replication with conflict resolution.
Architecture¶
┌─────────────┐ Branch Sync ┌─────────────┐
│ Region A │ ←─────────────→ │ Region B │
│ (Primary) │ │ (Primary) │
└─────────────┘ └─────────────┘
↓ ↓ ↓ ↓
Writes Reads Writes Reads
Components¶
| Component | Description |
|---|---|
| `MultiPrimarySyncManager` | Coordinates multi-region sync |
| `ConflictMergeEngine` | Resolves write conflicts |
| `RegionCoordinator` | Manages region topology |
Conflict Resolution Strategies¶
| Strategy | Description | Use Case |
|---|---|---|
| Last-Write-Wins | Timestamp-based | Simple, no conflicts visible |
| Branch-Wins | Prefer local changes | Low-latency local writes |
| Merge | Combine changes | Collaborative editing |
| Custom | User-defined logic | Complex business rules |
Configuration¶
use heliosdb_lite::replication::{
    MultiPrimarySyncManager,
    ConflictResolution,
};
let sync = MultiPrimarySyncManager::new()
    .add_region("us-east", "us-east.example.com:5432")
    .add_region("eu-west", "eu-west.example.com:5432")
    .conflict_resolution(ConflictResolution::LastWriteWins)
    .build();
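The Custom strategy from the strategies table takes user-defined logic. A minimal sketch, assuming ConflictResolution exposes a Custom variant wrapping a closure over the two conflicting row versions (the variant shape, the row-version type, and its get method are illustrative assumptions):
use heliosdb_lite::replication::{ConflictResolution, MultiPrimarySyncManager};
// Hypothetical resolver: when the same row was changed in two regions,
// keep the version with the larger `total`, preferring the local copy on a tie.
let sync = MultiPrimarySyncManager::new()
    .add_region("us-east", "us-east.example.com:5432")
    .add_region("eu-west", "eu-west.example.com:5432")
    .conflict_resolution(ConflictResolution::Custom(Box::new(|local, remote| {
        if remote.get::<i64>("total") > local.get::<i64>("total") {
            remote
        } else {
            local
        }
    })))
    .build();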
Branch-Based Replication¶
Multi-primary uses HeliosDB-Lite's branching for conflict-free merges:
-- Each region maintains its own branch
-- Sync merges branches across regions
-- Region A writes
INSERT INTO orders (id, total) VALUES (1, 100);
-- Region B writes (concurrent)
INSERT INTO orders (id, total) VALUES (2, 200);
-- After sync: both rows present in all regions
Tier 3: Sharding¶
Horizontal scaling with consistent hashing.
Architecture¶
┌─────────────┐
│ Router │
└──────┬──────┘
┌───────────────┼───────────────┐
↓ ↓ ↓
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Shard 1 │ │ Shard 2 │ │ Shard 3 │
│ (0-33%) │ │ (34-66%) │ │ (67-100%)│
└──────────┘ └──────────┘ └──────────┘
Components¶
| Component | Description |
|---|---|
| `HashRing` | Consistent hashing for key distribution |
| `ShardRouter` | Routes queries to the correct shard |
| `ReshardManager` | Online resharding with minimal downtime |
| `VectorPartitioner` | Special partitioning for vector data |
Sharding Strategies¶
| Strategy | Description | Best For |
|---|---|---|
| Hash | Consistent hash of shard key | Even distribution |
| Range | Key ranges per shard | Time-series data |
| Geographic | Location-based routing | Multi-region |
| Vector | Centroid-based partitioning | Vector search |
Configuration¶
use heliosdb_lite::replication::{HashRing, ShardRouter};
let ring = HashRing::new()
    .add_node("shard1.example.com:5432", 100) // weight: 100
    .add_node("shard2.example.com:5432", 100)
    .add_node("shard3.example.com:5432", 100)
    .build();
let router = ShardRouter::new(ring)
    .shard_key("tenant_id") // Shard by tenant
    .build();
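To check where a given tenant lands, the router can be queried directly. A sketch continuing the example above, assuming the router exposes a lookup method keyed on the shard-key value (the method name is an assumption):
// Hypothetical lookup: resolve the node responsible for tenant "42".
// With consistent hashing, the mapping only changes when ring membership changes.
let node = router.route("42");
println!("tenant 42 -> {}", node);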
Vector Partitioning¶
Special support for vector workloads:
use heliosdb_lite::replication::{VectorPartitioner, CentroidManager};
let partitioner = VectorPartitioner::new()
    .dimensions(768)
    .num_centroids(16) // 16 partitions based on vector similarity
    .build();
// Vectors routed to shard containing nearest centroid
Resharding¶
Online resharding with minimal downtime:
use heliosdb_lite::replication::ReshardManager;
let reshard = ReshardManager::new(ring)
    .target_shards(6) // Scale from 3 to 6 shards
    .parallel_streams(4)
    .build();
reshard.execute().await?; // Non-blocking migration
Logical Replication¶
For selective table replication:
use heliosdb_lite::replication::{
    LogicalReplicationPipeline,
    TableFilter,
    ColumnMapping,
};
let pipeline = LogicalReplicationPipeline::new()
    .source("source.example.com:5432")
    .destination("dest.example.com:5432")
    .table_filter(TableFilter::include(&["users", "orders"]))
    .column_mapping(ColumnMapping::new()
        .rename("old_name", "new_name")
        .exclude("sensitive_column"))
    .build();
pipeline.start().await?;
CLI Options¶
Start HeliosDB-Lite in HA mode:
# Primary mode
heliosdb-lite server --ha-mode primary --ha-bind 0.0.0.0:5433
# Standby mode
heliosdb-lite server --ha-mode standby --ha-primary primary.example.com:5433
# Multi-primary mode
heliosdb-lite server --ha-mode multi-primary \
    --ha-region us-east \
    --ha-peers eu-west.example.com:5433
Docker Support¶
Example Docker Compose file for an HA cluster:
version: '3.8'
services:
  primary:
    image: heliosdb/heliosdb-lite:latest
    command: server --ha-mode primary
    ports:
      - "5432:5432"
      - "5433:5433"
    environment:
      - HA_SYNC_MODE=synchronous
  standby:
    image: heliosdb/heliosdb-lite:latest
    command: server --ha-mode standby --ha-primary primary:5433
    depends_on:
      - primary
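To bring the cluster up and confirm node roles, something like the following should work (this assumes the server speaks the PostgreSQL wire protocol, as the pg_* system views and the 5432 port mapping suggest; psql user and database options depend on your setup):
docker compose up -d
psql -h localhost -p 5432 -c "SELECT node_id, role, sync_mode FROM pg_replication_status;"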
Transparent Write Forwarding (TWF)¶
HeliosDB-Lite implements Transparent Write Forwarding: applications can connect to any node (primary or standby), and writes issued on a standby are automatically forwarded to the primary.
How It Works¶
When a standby receives a write statement (DML or DDL) in sync or semi-sync mode, it forwards the statement to the primary and returns the primary's result to the client. Read-only queries (SELECT) always execute locally on the node the client is connected to.
Behavior by Sync Mode¶
| Sync Mode | DQL (SELECT) | DML (INSERT/UPDATE/DELETE) |
|---|---|---|
| sync | Execute locally on standby | Forward to primary, return result |
| semi-sync | Execute locally on standby | Forward to primary, return result |
| async | Execute locally on standby | Reject (traditional read-only) |
Operations Subject to Routing¶
When connected to a standby in sync/semi-sync mode:
| Operation | Behavior |
|---|---|
| `SELECT` | Execute locally (DQL) |
| `INSERT` | Forward to primary (DML) |
| `UPDATE` | Forward to primary (DML) |
| `DELETE` | Forward to primary (DML) |
| `CREATE` | Forward to primary (DDL) |
| `DROP` | Forward to primary (DDL) |
| `ALTER` | Forward to primary (DDL) |
| `TRUNCATE` | Forward to primary (DDL) |
Example: Transparent Routing¶
-- Connect to STANDBY and execute INSERT (forwarded to primary)
INSERT INTO users VALUES (3, 'Charlie');
-- Result: INSERT 0 1 (success - executed on primary)
-- SELECT always executes locally on the connected standby
SELECT * FROM users;
Benefits¶
- Load Distribution: Applications can connect to any node; reads distributed, writes auto-routed
- Simplified Application Logic: No need for separate read/write connection strings
- High Availability: Applications keep working even when connected to a standby
- Transparent Failover: Combined with connection pooling, provides seamless failover
Monitoring¶
HA System Views¶
HeliosDB-Lite provides SQL system views for monitoring HA configuration and replication metrics.
pg_replication_status¶
View node configuration and role:
| Column | Description |
|---|---|
| `node_id` | Unique identifier for this node |
| `role` | `primary`, `standby`, `observer`, or `standalone` |
| `sync_mode` | `async`, `semi-sync`, or `sync` |
| `listen_address` | Host and port |
| `replication_port` | WAL streaming port |
| `current_lsn` | Current log sequence number |
| `is_read_only` | true/false |
| `standby_count` | Number of connected standbys (primary only) |
| `uptime_seconds` | Time since node started |
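For example, to confirm a node's role and whether it is accepting writes:
SELECT node_id, role, sync_mode, is_read_only
FROM pg_replication_status;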
pg_replication_standbys (Primary Only)¶
View connected standbys:
| Column | Description |
|---|---|
| `node_id` | Standby's unique identifier |
| `address` | Standby's connection address |
| `sync_mode` | Replication mode for this standby |
| `state` | `connecting`, `streaming`, `catching_up`, `synced`, or `disconnected` |
| `current_lsn` | Standby's current LSN position |
| `flush_lsn` | Flushed LSN |
| `apply_lsn` | Applied LSN |
| `lag_bytes` | Replication lag in bytes |
| `lag_ms` | Replication lag in milliseconds |
| `connected_at` | Connection timestamp |
| `last_heartbeat` | Last heartbeat received |
pg_replication_primary (Standby Only)¶
View primary connection status:
| Column | Description |
|---|---|
| `node_id` | Primary's unique identifier |
| `address` | Primary's address |
| `state` | `disconnected`, `connecting`, `connected`, `streaming`, or `error` |
| `primary_lsn` | Primary's current LSN |
| `local_lsn` | Local LSN position |
| `lag_bytes` | Replication lag in bytes |
| `lag_ms` | Replication lag in milliseconds |
| `fencing_token` | Split-brain protection token |
| `connected_at` | Connection timestamp |
| `last_heartbeat` | Last heartbeat received |
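On a standby, replication health relative to the primary can be checked with, for example:
-- How far this standby is behind its primary
SELECT state, lag_bytes, lag_ms, last_heartbeat
FROM pg_replication_primary;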
pg_replication_metrics¶
View performance metrics:
| Column | Description |
|---|---|
| `wal_writes` | Total WAL write operations |
| `wal_bytes_written` | Total WAL bytes written |
| `records_replicated` | Records sent to standbys |
| `bytes_replicated` | Bytes sent to standbys |
| `heartbeats_sent` | Health-check heartbeats sent |
| `heartbeats_received` | Health-check heartbeats received |
| `reconnect_count` | Number of reconnections |
| `last_wal_write` | Timestamp of last WAL write |
| `last_replication` | Timestamp of last replication |
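For example, to get a quick view of replication throughput and connection stability:
SELECT records_replicated, bytes_replicated, reconnect_count, last_replication
FROM pg_replication_metrics;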
Monitoring Examples¶
-- Check if standbys are in sync
SELECT
    node_id,
    CASE
        WHEN lag_ms < 1000 THEN 'IN_SYNC'
        WHEN lag_ms < 60000 THEN 'CATCHING_UP'
        ELSE 'LAGGING'
    END as status,
    lag_ms
FROM pg_replication_standbys;
-- View all nodes in cluster
SELECT node_id, role, current_lsn
FROM pg_replication_status;
Best Practices¶
- Network: Use dedicated replication network
- Monitoring: Alert on replication lag > threshold
- Testing: Regularly test failover procedures
- Backups: Continue point-in-time backups even with HA
- Quorum: Use odd number of nodes for consensus