High Availability (HA)

HeliosDB-Lite provides multi-tier high availability through WAL streaming, multi-primary replication, and sharding.

HA Tiers Overview

| Tier | Name | Architecture | Use Case |
|------|------|--------------|----------|
| Tier 1 | Warm Standby | Active-Passive | Basic HA, disaster recovery |
| Tier 2 | Multi-Primary | Active-Active | Geographic distribution |
| Tier 3 | Sharding | Distributed | Horizontal scaling |

Feature Flags

Enable HA features via Cargo feature flags:

[dependencies]
heliosdb-lite = { version = "3.4", features = ["ha-tier1"] }

| Feature | Description |
|---------|-------------|
| ha-tier1 | Warm standby replication |
| ha-tier2 | Multi-primary replication |
| ha-tier3 | Sharding support |
| ha-dedup | Content-addressed deduplication |
| ha-branch-replication | Branch-to-server replication |
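
Flags can be combined; for example, warm standby together with content-addressed deduplication (assuming the features compose independently, which the table above suggests):

[dependencies]
heliosdb-lite = { version = "3.4", features = ["ha-tier1", "ha-dedup"] }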

Tier 1: Warm Standby

Active-passive replication with automatic failover.

Architecture

┌─────────────┐     WAL Stream    ┌─────────────┐
│   Primary   │ ───────────────→ │   Standby   │
│   (Active)  │                   │  (Passive)  │
└─────────────┘                   └─────────────┘
       ↓                                 ↓
   Read/Write                       Read-Only

Components

| Component | Description |
|-----------|-------------|
| WalReplicator | Streams WAL from the primary |
| WalApplicator | Applies WAL on the standby |
| FailoverWatcher | Monitors primary health |
| LsnManager | Tracks replication position |
| SplitBrainProtector | Prevents dual-primary scenarios |

Configuration

use heliosdb_lite::replication::{ReplicationConfig, SyncMode};

let config = ReplicationConfig::builder()
    .primary_endpoint("primary.example.com:5432")
    .sync_mode(SyncMode::Synchronous)  // or Asynchronous
    .build();

Sync Modes

| Mode | Description | Durability | Latency |
|------|-------------|------------|---------|
| Synchronous | Wait for standby ACK | Strong | Higher |
| Asynchronous | Fire-and-forget | Eventual | Lower |
| Quorum | Wait for N/2+1 ACKs | Configurable | Medium |
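
A quorum setup would follow the same builder pattern as the configuration above. Note that `SyncMode::Quorum` is an assumption read off the table; the real variant may differ (for example, it may carry the quorum size), so verify against the crate docs:

use heliosdb_lite::replication::{ReplicationConfig, SyncMode};

// Hypothetical quorum mode: a commit is acknowledged once N/2+1 standbys ACK.
let config = ReplicationConfig::builder()
    .primary_endpoint("primary.example.com:5432")
    .sync_mode(SyncMode::Quorum)  // assumed variant; see the table above
    .build();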

Failover

use heliosdb_lite::replication::FailoverWatcher;

let watcher = FailoverWatcher::new(config);
watcher.on_failover(|event| {
    println!("Failover triggered: {:?}", event);
    // Promote standby to primary
});

Split-Brain Protection

use heliosdb_lite::replication::{SplitBrainProtector, ObserverConfig};

let protector = SplitBrainProtector::new(ObserverConfig {
    observers: vec!["observer1.example.com", "observer2.example.com"],
    quorum_size: 2,
});

protector.start();
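
The rule the observers enforce is simple majority fencing: a standby may promote itself only after a quorum of observers confirms the primary is unreachable, so the two sides of a network partition can never both promote. A minimal illustration of the rule (not the library's logic):

/// Conceptual fencing rule: promotion requires at least `quorum_size`
/// observer confirmations that the primary is down. In a partition, at
/// most one side can gather that many confirmations.
fn may_promote(confirmations: usize, quorum_size: usize) -> bool {
    confirmations >= quorum_size
}

fn main() {
    // Matches the ObserverConfig above: 2 observers, quorum_size 2.
    assert!(may_promote(2, 2));   // both observers agree -> safe to promote
    assert!(!may_promote(1, 2));  // an isolated standby cannot promote
    println!("fencing rule holds");
}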

Tier 2: Multi-Primary

Active-active replication with conflict resolution.

Architecture

┌─────────────┐   Branch Sync   ┌─────────────┐
│  Region A   │ ←─────────────→ │  Region B   │
│  (Primary)  │                 │  (Primary)  │
└─────────────┘                 └─────────────┘
    ↓     ↓                       ↓     ↓
  Writes Reads                  Writes Reads

Components

| Component | Description |
|-----------|-------------|
| MultiPrimarySyncManager | Coordinates multi-region sync |
| ConflictMergeEngine | Resolves write conflicts |
| RegionCoordinator | Manages region topology |

Conflict Resolution Strategies

| Strategy | Description | Use Case |
|----------|-------------|----------|
| Last-Write-Wins | Timestamp-based | Simple; no conflicts visible |
| Branch-Wins | Prefer local changes | Low-latency local writes |
| Merge | Combine changes | Collaborative editing |
| Custom | User-defined logic | Complex business rules |

Configuration

use heliosdb_lite::replication::{
    MultiPrimarySyncManager,
    ConflictResolution,
};

let sync = MultiPrimarySyncManager::new()
    .add_region("us-east", "us-east.example.com:5432")
    .add_region("eu-west", "eu-west.example.com:5432")
    .conflict_resolution(ConflictResolution::LastWriteWins)
    .build();
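
For the Custom strategy from the table above, a resolver hook might be wired in roughly as follows. The `ConflictResolution::Custom` shape and the row accessor are hypothetical, so treat this as a sketch rather than the crate's confirmed API:

use heliosdb_lite::replication::{ConflictResolution, MultiPrimarySyncManager};

// Hypothetical resolver: when both regions touched the same row, keep the
// version with the newer application-level `updated_at` value.
let sync = MultiPrimarySyncManager::new()
    .add_region("us-east", "us-east.example.com:5432")
    .add_region("eu-west", "eu-west.example.com:5432")
    .conflict_resolution(ConflictResolution::Custom(|local, remote| {
        // `updated_at` is assumed to exist on the conflicting rows.
        if local.get::<i64>("updated_at") >= remote.get::<i64>("updated_at") {
            local.clone()
        } else {
            remote.clone()
        }
    }))
    .build();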

Branch-Based Replication

Multi-primary uses HeliosDB-Lite's branching for conflict-free merges:

-- Each region maintains its own branch
-- Sync merges branches across regions

-- Region A writes
INSERT INTO orders (id, total) VALUES (1, 100);

-- Region B writes (concurrent)
INSERT INTO orders (id, total) VALUES (2, 200);

-- After sync: both rows present in all regions

Tier 3: Sharding

Horizontal scaling with consistent hashing.

Architecture

                    ┌─────────────┐
                    │   Router    │
                    └──────┬──────┘
           ┌───────────────┼───────────────┐
           ↓               ↓               ↓
    ┌──────────┐    ┌──────────┐    ┌──────────┐
    │ Shard 1  │    │ Shard 2  │    │ Shard 3  │
    │  (0-33%) │    │ (34-66%) │    │ (67-100%)│
    └──────────┘    └──────────┘    └──────────┘
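
The idea behind consistent hashing is that each shard owns many points ("virtual nodes") on a fixed hash ring, and a key routes to the first shard clockwise from the key's hash; adding or removing a shard therefore only remaps the keys adjacent to its points. A minimal self-contained sketch of the technique (illustrative, not HashRing's internals):

use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn hash_of(s: &str) -> u64 {
    let mut h = DefaultHasher::new();
    s.hash(&mut h);
    h.finish()
}

/// Minimal consistent-hash ring: nodes sit at many points on a u64 ring;
/// a key routes to the first node clockwise from its hash, wrapping around.
struct Ring {
    points: BTreeMap<u64, String>,
}

impl Ring {
    fn new(nodes: &[&str], vnodes: u32) -> Self {
        let mut points = BTreeMap::new();
        for node in nodes {
            for i in 0..vnodes {
                points.insert(hash_of(&format!("{node}#{i}")), node.to_string());
            }
        }
        Ring { points }
    }

    fn route(&self, key: &str) -> &str {
        let h = hash_of(key);
        self.points
            .range(h..)                            // first point at or after the hash…
            .next()
            .or_else(|| self.points.iter().next()) // …or wrap to the ring's start
            .map(|(_, node)| node.as_str())
            .expect("ring has at least one node")
    }
}

fn main() {
    let ring = Ring::new(&["shard1", "shard2", "shard3"], 100);
    // The same shard key always lands on the same shard.
    println!("tenant_42 -> {}", ring.route("tenant_42"));
}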

Components

| Component | Description |
|-----------|-------------|
| HashRing | Consistent hashing for key distribution |
| ShardRouter | Routes queries to the correct shard |
| ReshardManager | Online resharding with minimal downtime |
| VectorPartitioner | Special partitioning for vector data |

Sharding Strategies

| Strategy | Description | Best For |
|----------|-------------|----------|
| Hash | Consistent hash of shard key | Even distribution |
| Range | Key ranges per shard | Time-series data |
| Geographic | Location-based routing | Multi-region |
| Vector | Centroid-based partitioning | Vector search |

Configuration

use heliosdb_lite::replication::{HashRing, ShardRouter};

let ring = HashRing::new()
    .add_node("shard1.example.com:5432", 100)  // weight: 100
    .add_node("shard2.example.com:5432", 100)
    .add_node("shard3.example.com:5432", 100)
    .build();

let router = ShardRouter::new(ring)
    .shard_key("tenant_id")  // Shard by tenant
    .build();

Vector Partitioning

Special support for vector workloads:

use heliosdb_lite::replication::{VectorPartitioner, CentroidManager};

let partitioner = VectorPartitioner::new()
    .dimensions(768)
    .num_centroids(16)  // 16 partitions based on vector similarity
    .build();

// Vectors routed to shard containing nearest centroid
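
The routing rule itself is plain nearest-centroid assignment. A self-contained sketch of that rule (illustrative only, not the VectorPartitioner internals):

// Squared Euclidean distance (monotonic in true distance, so fine for argmin).
fn dist2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

// Index of the centroid nearest to `v`; that index selects the partition.
fn nearest_centroid(v: &[f32], centroids: &[Vec<f32>]) -> usize {
    centroids
        .iter()
        .enumerate()
        .min_by(|(_, a), (_, b)| dist2(v, a).total_cmp(&dist2(v, b)))
        .map(|(i, _)| i)
        .expect("at least one centroid")
}

fn main() {
    // Toy 2-dimensional example; the doc's config uses 768 dimensions.
    let centroids = vec![vec![0.0, 0.0], vec![10.0, 10.0]];
    let v = vec![9.0, 8.5];
    println!("partition {}", nearest_centroid(&v, &centroids)); // partition 1
}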

Resharding

Online resharding without downtime:

use heliosdb_lite::replication::ReshardManager;

let reshard = ReshardManager::new(ring)
    .target_shards(6)  // Scale from 3 to 6 shards
    .parallel_streams(4)
    .build();

reshard.execute().await?;  // Non-blocking migration

Logical Replication

For selective table replication:

use heliosdb_lite::replication::{
    LogicalReplicationPipeline,
    TableFilter,
    ColumnMapping,
};

let pipeline = LogicalReplicationPipeline::new()
    .source("source.example.com:5432")
    .destination("dest.example.com:5432")
    .table_filter(TableFilter::include(&["users", "orders"]))
    .column_mapping(ColumnMapping::new()
        .rename("old_name", "new_name")
        .exclude("sensitive_column"))
    .build();

pipeline.start().await?;

CLI Options

Start HeliosDB-Lite in HA mode:

# Primary mode
heliosdb-lite server --ha-mode primary --ha-bind 0.0.0.0:5433

# Standby mode
heliosdb-lite server --ha-mode standby --ha-primary primary.example.com:5433

# Multi-primary mode
heliosdb-lite server --ha-mode multi-primary \
    --ha-region us-east \
    --ha-peers eu-west.example.com:5433

Docker Support

Docker Compose for HA cluster:

version: '3.8'
services:
  primary:
    image: heliosdb/heliosdb-lite:latest
    command: server --ha-mode primary
    ports:
      - "5432:5432"
      - "5433:5433"
    environment:
      - HA_SYNC_MODE=synchronous

  standby:
    image: heliosdb/heliosdb-lite:latest
    command: server --ha-mode standby --ha-primary primary:5433
    depends_on:
      - primary

Transparent Write Forwarding (TWF)

HeliosDB-Lite implements transparent write forwarding: applications can connect to any node (primary or standby), and writes are automatically forwarded to the primary while reads are served locally.

How It Works

Application → Standby → (DML/DDL forwarded) → Primary
         (SELECT executed locally)

Behavior by Sync Mode

| Sync Mode | DQL (SELECT) | DML (INSERT/UPDATE/DELETE) |
|-----------|--------------|----------------------------|
| sync | Execute locally on standby | Forward to primary, return result |
| semi-sync | Execute locally on standby | Forward to primary, return result |
| async | Execute locally on standby | Reject (traditional read-only) |

Operations Subject to Routing

When connected to a standby in sync/semi-sync mode:

| Operation | Behavior |
|-----------|----------|
| SELECT | Execute locally (DQL) |
| INSERT | Forward to primary (DML) |
| UPDATE | Forward to primary (DML) |
| DELETE | Forward to primary (DML) |
| CREATE | Forward to primary (DDL) |
| DROP | Forward to primary (DDL) |
| ALTER | Forward to primary (DDL) |
| TRUNCATE | Forward to primary (DDL) |

Example: Transparent Routing

-- Connect to STANDBY and execute INSERT (forwarded to primary)
INSERT INTO users VALUES (3, 'Charlie');
-- Result: INSERT 0 1 (success - executed on primary)

-- SELECT always executes locally on the connected standby
SELECT * FROM users;

Benefits

  1. Load Distribution: Applications can connect to any node; reads are served locally while writes are auto-routed to the primary
  2. Simplified Application Logic: No separate read/write connection strings are needed
  3. High Availability: An application keeps working even when it is connected to a standby
  4. Transparent Failover: Combined with connection pooling, this provides seamless failover
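
Assuming HeliosDB-Lite speaks the PostgreSQL wire protocol (which the pg_* system views and default port 5432 suggest), a stock Postgres client can exercise TWF directly. A sketch with tokio-postgres; the host name is a placeholder:

use tokio_postgres::NoTls;

#[tokio::main]
async fn main() -> Result<(), tokio_postgres::Error> {
    // Connect to a STANDBY node; with TWF, DML issued here is routed
    // to the primary while SELECTs run locally.
    let (client, connection) =
        tokio_postgres::connect("host=standby.example.com user=app dbname=app", NoTls).await?;
    tokio::spawn(async move {
        if let Err(e) = connection.await {
            eprintln!("connection error: {e}");
        }
    });

    // Write: forwarded to the primary in sync/semi-sync mode.
    client
        .execute("INSERT INTO users VALUES ($1, $2)", &[&3i32, &"Charlie"])
        .await?;

    // Read: executed locally on the connected standby.
    let rows = client.query("SELECT * FROM users", &[]).await?;
    println!("{} users", rows.len());
    Ok(())
}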

Monitoring

HA System Views

HeliosDB-Lite provides SQL system views for monitoring HA configuration and replication metrics.

pg_replication_status

View node configuration and role:

SELECT * FROM pg_replication_status;

| Column | Description |
|--------|-------------|
| node_id | Unique identifier for this node |
| role | primary, standby, observer, or standalone |
| sync_mode | async, semi-sync, or sync |
| listen_address | Host and port |
| replication_port | WAL streaming port |
| current_lsn | Current log sequence number |
| is_read_only | true/false |
| standby_count | Number of connected standbys (primary only) |
| uptime_seconds | Time since the node started |

pg_replication_standbys (Primary Only)

View connected standbys:

SELECT * FROM pg_replication_standbys;

| Column | Description |
|--------|-------------|
| node_id | Standby's unique identifier |
| address | Standby's connection address |
| sync_mode | Replication mode for this standby |
| state | connecting, streaming, catching_up, synced, disconnected |
| current_lsn | Standby's current LSN position |
| flush_lsn | Flushed LSN |
| apply_lsn | Applied LSN |
| lag_bytes | Replication lag in bytes |
| lag_ms | Replication lag in milliseconds |
| connected_at | Connection timestamp |
| last_heartbeat | Last heartbeat received |

pg_replication_primary (Standby Only)

View primary connection status:

SELECT * FROM pg_replication_primary;

| Column | Description |
|--------|-------------|
| node_id | Primary's unique identifier |
| address | Primary's address |
| state | disconnected, connecting, connected, streaming, error |
| primary_lsn | Primary's current LSN |
| local_lsn | Local LSN position |
| lag_bytes | Replication lag in bytes |
| lag_ms | Replication lag in milliseconds |
| fencing_token | Split-brain protection token |
| connected_at | Connection timestamp |
| last_heartbeat | Last heartbeat received |

pg_replication_metrics

View performance metrics:

SELECT * FROM pg_replication_metrics;

| Column | Description |
|--------|-------------|
| wal_writes | Total WAL write operations |
| wal_bytes_written | Total WAL bytes written |
| records_replicated | Records sent to standbys |
| bytes_replicated | Bytes sent to standbys |
| heartbeats_sent | Heartbeats (health checks) sent |
| heartbeats_received | Heartbeats (health checks) received |
| reconnect_count | Number of reconnections |
| last_wal_write | Timestamp of the last WAL write |
| last_replication | Timestamp of the last replication |

Monitoring Examples

-- Check if standbys are in sync
SELECT
    node_id,
    CASE
        WHEN lag_ms < 1000 THEN 'IN_SYNC'
        WHEN lag_ms < 60000 THEN 'CATCHING_UP'
        ELSE 'LAGGING'
    END as status,
    lag_ms
FROM pg_replication_standbys;

-- View all nodes in cluster
SELECT node_id, role, current_lsn
FROM pg_replication_status;

Best Practices

  1. Network: Use a dedicated replication network to isolate WAL traffic
  2. Monitoring: Alert when replication lag exceeds a defined threshold
  3. Testing: Regularly test failover procedures
  4. Backups: Keep taking point-in-time backups even with HA in place
  5. Quorum: Use an odd number of nodes for consensus; three nodes tolerate one failure, while four nodes still tolerate only one
