Architecture Overview
HeliosDB-Lite is a high-performance embedded database with PostgreSQL compatibility. It is written in Rust for memory safety and uses RocksDB as its storage engine.
System Architecture
┌─────────────────────────────────────────────────────────────────┐
│                          Client Layer                           │
├─────────────────────────────────────────────────────────────────┤
│  PostgreSQL Wire   │   REST API    │  Embedded Rust API         │
│  Protocol          │   (HTTP)      │  (Direct Linking)          │
└────────┬───────────┴───────┬───────┴──────────┬─────────────────┘
         │                   │                  │
┌────────▼───────────────────▼──────────────────▼─────────────────┐
│                           Query Layer                           │
├─────────────────────────────────────────────────────────────────┤
│   SQL Parser   →    Planner    →    Optimizer   →   Executor    │
│        │               │                │               │       │
│        ▼               ▼                ▼               ▼       │
│   Parse Tree     Logical Plan     Physical Plan      Results    │
└────────────────────────────────────────────────────────────────┬┘
                                                                 │
┌────────────────────────────────────────────────────────────────▼┐
│                          Storage Layer                          │
├─────────────────────────────────────────────────────────────────┤
│     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐     │
│     │  Branching  │     │ Time-Travel │     │    MVCC     │     │
│     └─────────────┘     └─────────────┘     └─────────────┘     │
│     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐     │
│     │   Catalog   │     │     WAL     │     │ Compression │     │
│     └─────────────┘     └─────────────┘     └─────────────┘     │
└────────────────────────────────────────────────────────────────┬┘
                                                                 │
┌────────────────────────────────────────────────────────────────▼┐
│                         RocksDB Engine                          │
├─────────────────────────────────────────────────────────────────┤
│ LSM-Tree Storage  │  SST Files   │  Block Cache  │ Bloom Filter │
└─────────────────────────────────────────────────────────────────┘
Core Components
Query Engine
| Component | Location | Responsibility |
|---|---|---|
| Parser | src/sql/parser.rs | SQL parsing to AST |
| Planner | src/sql/planner.rs | Logical plan generation |
| Optimizer | src/sql/optimizer/ | Cost-based query optimization |
| Executor | src/sql/executor/ | Physical plan execution |
Storage Engine
| Component | Location | Responsibility |
|---|---|---|
| Engine | src/storage/engine.rs | RocksDB interface |
| Catalog | src/storage/catalog.rs | Schema management |
| MVCC | src/storage/mvcc.rs | Multi-version concurrency control |
| WAL | src/storage/wal.rs | Write-ahead logging |
| Branching | src/storage/branch.rs | Database branching |
| Time-Travel | src/storage/time_travel.rs | Historical queries |
Vector Search
| Component | Location | Responsibility |
|---|---|---|
| Index | src/vector/index.rs | HNSW/IVF-PQ indexing |
| Search | src/vector/search.rs | Similarity search |
| Embeddings | src/vector/embeddings.rs | Embedding generation |
Data Flow
Query Execution Flow
1. Client sends SQL query
       ↓
2. Parser tokenizes and builds AST
       ↓
3. Planner creates logical plan
       ↓
4. Optimizer applies transformations:
   - Predicate pushdown
   - Join reordering
   - Index selection
       ↓
5. Executor runs physical operations:
   - Table scans (with SMFI)
   - Index lookups
   - Joins (nested loop, hash, merge)
   - Aggregations
       ↓
6. Results returned to client
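As an illustration of the optimizer step, the sketch below shows predicate pushdown on a toy logical plan. The `Plan` enum and `push_down_predicates` function are hypothetical names invented for this example, not HeliosDB-Lite's actual planner types, and predicates are kept as plain strings to stay short. It assumes projections only select columns, so a filter above a projection can always move below it.

```rust
/// Toy logical plan. Hypothetical types for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Scan a table, optionally evaluating a predicate inside the scan.
    Scan { table: String, predicate: Option<String> },
    /// Keep only rows that satisfy `predicate`.
    Filter { predicate: String, input: Box<Plan> },
    /// Keep only the named columns (no computed expressions).
    Project { columns: Vec<String>, input: Box<Plan> },
}

/// Move every Filter as close to its Scan as possible.
fn push_down_predicates(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { predicate, input } => match push_down_predicates(*input) {
            // Filter directly above a scan: fold it into the scan.
            Plan::Scan { table, predicate: existing } => {
                let merged = match existing {
                    Some(p) => format!("({p}) AND ({predicate})"),
                    None => predicate,
                };
                Plan::Scan { table, predicate: Some(merged) }
            }
            // Filter above a column-only projection: swap them and keep pushing.
            Plan::Project { columns, input } => Plan::Project {
                columns,
                input: Box::new(push_down_predicates(Plan::Filter { predicate, input })),
            },
            // Anything else: leave the filter where it is.
            other => Plan::Filter { predicate, input: Box::new(other) },
        },
        Plan::Project { columns, input } => Plan::Project {
            columns,
            input: Box::new(push_down_predicates(*input)),
        },
        scan @ Plan::Scan { .. } => scan,
    }
}
```

Given `Filter(age > 30)` over `Project(name, age)` over `Scan(users)`, the function returns `Project(name, age)` over `Scan(users, age > 30)`, so the scan can discard rows before they reach later operators.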
Transaction Flow
1. BEGIN TRANSACTION
       ↓
2. Acquire snapshot (MVCC)
       ↓
3. Execute operations:
   - Read from snapshot
   - Write to transaction buffer
       ↓
4. COMMIT:
   - Write to WAL
   - Apply to storage
   - Update catalog
       ↓
5. Release resources
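The transaction steps above can be sketched with a small in-memory model. Everything here (`Store`, `Txn`, the WAL record format) is a hypothetical stand-in rather than HeliosDB-Lite's real API. The sketch assumes a transaction sees its own buffered writes, and it leaves out conflict detection, catalog updates, and durability.

```rust
use std::collections::{BTreeMap, HashMap};

/// Toy multi-version store. Hypothetical types for illustration only.
#[derive(Default)]
struct Store {
    /// key -> (commit timestamp -> value)
    versions: HashMap<String, BTreeMap<u64, String>>,
    /// Append-only log standing in for the write-ahead log.
    wal: Vec<String>,
    /// Timestamp of the most recent commit.
    last_commit: u64,
}

/// A transaction: a snapshot timestamp plus a private write buffer.
struct Txn {
    snapshot: u64,
    writes: BTreeMap<String, String>,
}

impl Txn {
    /// Step 3: writes go to the transaction buffer, not to storage.
    fn write(&mut self, key: &str, value: &str) {
        self.writes.insert(key.to_string(), value.to_string());
    }
}

impl Store {
    /// Steps 1-2: begin a transaction and take a snapshot.
    fn begin(&self) -> Txn {
        Txn { snapshot: self.last_commit, writes: BTreeMap::new() }
    }

    /// Step 3: read own writes first (an assumption), then the newest
    /// version committed at or before the snapshot.
    fn read(&self, txn: &Txn, key: &str) -> Option<String> {
        if let Some(v) = txn.writes.get(key) {
            return Some(v.clone());
        }
        self.versions
            .get(key)?
            .range(..=txn.snapshot)
            .next_back()
            .map(|(_, v)| v.clone())
    }

    /// Step 4: log to the WAL first, then apply to storage.
    fn commit(&mut self, txn: Txn) -> u64 {
        let ts = self.last_commit + 1;
        for (k, v) in &txn.writes {
            self.wal.push(format!("{ts} PUT {k}={v}"));
        }
        self.wal.push(format!("{ts} COMMIT"));
        for (k, v) in txn.writes {
            self.versions.entry(k).or_default().insert(ts, v);
        }
        self.last_commit = ts;
        ts
    }
}
```

Because readers resolve values against their snapshot timestamp, a transaction that began before a commit keeps seeing the older state, which is the property time-travel queries also rely on.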
Key Design Decisions
Why RocksDB?
- LSM-tree architecture: Optimized for write-heavy workloads
- Compression: Native support for multiple codecs
- Column families: Efficient separation of data types
- Proven reliability: Used in production at scale
Why PostgreSQL Wire Protocol?
- Ecosystem compatibility: Works with existing tools (psql, pgAdmin)
- Driver support: Use existing PostgreSQL drivers
- No migration cost: Drop-in replacement for simple use cases
Branching Implementation
Branches are implemented using RocksDB column families with copy-on-write semantics:
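A minimal sketch of the idea, using only the standard library: each `Branch` below stands in for one column family, and a new branch starts empty with a pointer to its parent, so creating it copies no rows. The `Db` and `Branch` types, the sequence-number scheme, and all method names are assumptions made for this example, not the actual implementation. Deletes (tombstones) are omitted.

```rust
use std::collections::HashMap;

/// One branch, standing in for a RocksDB column family (hypothetical layout).
struct Branch {
    parent: Option<String>,
    /// Ancestor writes with a sequence number above this are invisible here.
    fork_seq: u64,
    /// key -> versions in ascending sequence order
    data: HashMap<String, Vec<(u64, String)>>,
}

struct Db {
    branches: HashMap<String, Branch>,
    next_seq: u64,
}

impl Db {
    fn new() -> Self {
        let mut branches = HashMap::new();
        let root = Branch { parent: None, fork_seq: 0, data: HashMap::new() };
        branches.insert("main".to_string(), root);
        Db { branches, next_seq: 1 }
    }

    /// Create a branch without copying data: record the parent and fork point.
    fn create_branch(&mut self, name: &str, parent: &str) -> bool {
        if !self.branches.contains_key(parent) || self.branches.contains_key(name) {
            return false;
        }
        let branch = Branch {
            parent: Some(parent.to_string()),
            fork_seq: self.next_seq - 1,
            data: HashMap::new(),
        };
        self.branches.insert(name.to_string(), branch);
        true
    }

    /// Writes land only in the target branch (the "copy" in copy-on-write).
    fn put(&mut self, branch: &str, key: &str, value: &str) -> bool {
        let seq = self.next_seq;
        match self.branches.get_mut(branch) {
            Some(b) => {
                b.data.entry(key.to_string()).or_default().push((seq, value.to_string()));
                self.next_seq += 1;
                true
            }
            None => false,
        }
    }

    /// Reads fall through to ancestors, bounded by each fork point.
    fn get(&self, branch: &str, key: &str) -> Option<&str> {
        let mut current = self.branches.get(branch)?;
        let mut visible = u64::MAX;
        loop {
            if let Some(versions) = current.data.get(key) {
                if let Some((_, v)) = versions.iter().rev().find(|(seq, _)| *seq <= visible) {
                    return Some(v.as_str());
                }
            }
            visible = visible.min(current.fork_seq);
            current = self.branches.get(current.parent.as_ref()?)?;
        }
    }
}
```

In this model a branch sees the parent's data as of the fork point, its own writes shadow inherited values, and later writes to the parent stay invisible to the child.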
SMFI (Storage-Level Metadata Filtering)
SMFI applies Parquet-style metadata filtering at the storage level. Each block records per-column min/max values, and a scan skips blocks whose range cannot satisfy the predicate:
Query: WHERE timestamp > '2024-01-01'
       ↓
Check block metadata:
- Block A: min=2023-01-01, max=2023-12-31 → SKIP
- Block B: min=2024-01-01, max=2024-06-30 → SCAN
- Block C: min=2024-07-01, max=2024-12-31 → SCAN
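The block check above can be sketched in a few lines. `BlockMeta`, `Predicate`, and `blocks_to_scan` are hypothetical names for this example. Timestamps are kept as ISO-8601 date strings, which compare correctly as plain strings, so no date library is needed.

```rust
/// Min/max statistics for one column in one block (hypothetical layout).
struct BlockMeta {
    name: &'static str,
    min: &'static str,
    max: &'static str,
}

/// The comparison predicates this sketch understands.
enum Predicate<'a> {
    GreaterThan(&'a str),
    LessThan(&'a str),
}

impl BlockMeta {
    /// True if the block might contain a matching row. False means the
    /// block can be skipped without reading it.
    fn may_match(&self, pred: &Predicate<'_>) -> bool {
        match pred {
            // Some value exceeds v only if the maximum does.
            Predicate::GreaterThan(v) => self.max > *v,
            // Some value is below v only if the minimum is.
            Predicate::LessThan(v) => self.min < *v,
        }
    }
}

/// Names of the blocks that must be scanned for `pred`.
fn blocks_to_scan(blocks: &[BlockMeta], pred: &Predicate<'_>) -> Vec<&'static str> {
    blocks.iter().filter(|b| b.may_match(pred)).map(|b| b.name).collect()
}
```

With the three blocks from the example and `GreaterThan("2024-01-01")`, only blocks B and C are returned. The filter is conservative: a returned block may still contain no matching rows, but a skipped block never does.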
Module Dependencies
heliosdb_lite (lib.rs)
├── sql/
│   ├── parser (sqlparser)
│   ├── planner
│   ├── optimizer
│   └── executor
│       └── storage/
│           ├── engine (rocksdb)
│           ├── catalog
│           ├── mvcc
│           └── compression/
│               ├── fsst
│               └── alp
├── vector/
│   ├── index (hnsw, ivf)
│   └── embeddings
├── server/
│   ├── postgres (wire protocol)
│   └── http (REST API)
└── repl/ (CLI interface)
Performance Characteristics
| Operation | Complexity | Notes |
|---|---|---|
| Point lookup | O(log n) | B-tree index lookup |
| Range scan | O(log n + k) | k = result size |
| Full scan | O(n) | With SMFI optimization |
| Vector search | O(log n) | HNSW approximate |
| Branch creation | O(1) | Copy-on-write |
| Time-travel query | O(log n) | MVCC snapshot |
Configuration
Key configuration parameters affecting architecture:
| Parameter | Default | Impact |
|---|---|---|
| storage.block_size | 4KB | I/O granularity |
| storage.cache_size | 256MB | Memory usage |
| storage.compression | lz4 | CPU vs space |
| mvcc.snapshot_retention | 1h | Time-travel range |
| vector.index_type | hnsw | Search performance |
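To show how these defaults relate to each other, the sketch below mirrors the table in a plain Rust struct. The `Config` type and its field names are hypothetical and do not describe the real configuration API or file format. It also assumes binary units (4KB = 4 × 1024 bytes, 256MB = 256 × 1024 × 1024 bytes), which the table does not specify.

```rust
use std::time::Duration;

/// Hypothetical in-memory mirror of the documented defaults.
#[derive(Debug, Clone, PartialEq)]
struct Config {
    /// storage.block_size, in bytes (binary units assumed)
    storage_block_size: usize,
    /// storage.cache_size, in bytes (binary units assumed)
    storage_cache_size: usize,
    /// storage.compression codec name
    storage_compression: String,
    /// mvcc.snapshot_retention
    mvcc_snapshot_retention: Duration,
    /// vector.index_type
    vector_index_type: String,
}

impl Default for Config {
    fn default() -> Self {
        Config {
            storage_block_size: 4 * 1024,
            storage_cache_size: 256 * 1024 * 1024,
            storage_compression: "lz4".to_string(),
            mvcc_snapshot_retention: Duration::from_secs(60 * 60),
            vector_index_type: "hnsw".to_string(),
        }
    }
}

impl Config {
    /// Upper bound on how many blocks fit in the cache at once.
    fn cache_capacity_in_blocks(&self) -> usize {
        self.storage_cache_size / self.storage_block_size
    }
}
```

Under these assumptions the default cache holds at most 65,536 blocks, so doubling the block size halves the number of cached blocks for the same memory budget.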
See Configuration Reference for complete options.