Learned Semantic Cost Models for LLM-Native Relational Engines
Treating AI/LLM inference as a core database primitive
A paradigm shift in relational query optimization. SEMANTIX couples LLM inference with learned semantic cost estimation, reducing token waste and semantic misalignment. Reported on extended TPC-H: 3.2× token reduction, 1.8× speedup, 97.1% semantic accuracy.
Rust implementation • PostgreSQL 14+ • Production-ready v0.1.0
"Current systems decouple LLM retrieval from cost-aware query planning."
This architectural choice results in token waste, semantic misalignment, and unbounded latency. The database doesn't know how to optimize for AI. The AI doesn't know its cost. They operate in isolation. SEMANTIX unifies them.
SEMANTIX treats LLM inference as a first-class database primitive, coupled with learned semantic cost estimation through information-theoretic foundations.
SEMANTIX Query Optimizer

Phase 1: Semantic Parsing
- NL Query → Bidirectional Semantic Anchor
- Output: LogicalPlan + Initial Cost Estimates

Phase 2: Cost Refinement
- Learned Cost Model (GBDT)
- Output: Refined token cost estimates

Phase 3: Adaptive Token Scheduling
- Constrained Optimization (Lagrangian)
- Output: Token allocation schedule

Phase 4: Execution + Feedback Loop
- Execute with schedule
- Update cost model with actual execution data
Unified cost model embedding semantic entropy, relational context preservation, and execution schedule conditioning.
Learned projections mapping NL intent to cost-parametric logical plans with full provenance.
Dynamic token allocation under latency constraints using Lagrangian relaxation and iterative refinement.
Feedback loop integrating actual execution metrics to refine cost models in real-time.
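To make the feedback idea concrete, here is a minimal sketch of one way observed execution cost could correct a model's estimates online. It is not the project's implementation: the source describes a GBDT cost model, whereas this swaps in a much simpler multiplicative correction tracked with an exponential moving average. The `Calibrator` type, its methods, and the `rate` parameter are hypothetical names introduced for illustration.

```rust
/// Hypothetical online calibrator (an assumption, not the SEMANTIX API).
/// It keeps one multiplicative correction factor for a cost model's output.
pub struct Calibrator {
    scale: f64, // current correction factor, starts at 1.0 (no correction)
    rate: f64,  // smoothing rate in (0, 1]; higher reacts faster to new data
}

impl Calibrator {
    pub fn new(rate: f64) -> Self {
        Self { scale: 1.0, rate }
    }

    /// Apply the current correction to a raw model estimate.
    pub fn predict(&self, model_estimate: f64) -> f64 {
        self.scale * model_estimate
    }

    /// Move the correction toward the observed actual/estimate ratio.
    /// Non-positive estimates are ignored because the ratio is undefined.
    pub fn observe(&mut self, model_estimate: f64, actual: f64) {
        if model_estimate > 0.0 {
            let ratio = actual / model_estimate;
            self.scale += self.rate * (ratio - self.scale);
        }
    }
}
```

With `rate = 0.5`, a query estimated at 100 tokens that actually costs 200 moves the correction factor from 1.0 to 1.5, so the next estimate of 100 is reported as 150. A real deployment would presumably retrain or incrementally update the learned model itself instead of a single scalar.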
Evaluated on extended TPC-H with semantic annotations. SEMANTIX demonstrates:
Inference Token Cost: 3.2× reduction vs PostgreSQL
End-to-End Latency: 1.8× speedup (25.3 ms vs 45.3 ms)
Semantic Accuracy: 97.1% maintained
Energy Reduction: 65.6% vs classical systems
| System | Tokens (K) | Latency (ms) | Accuracy (%) | Energy (Wh) |
|---|---|---|---|---|
| SEMANTIX | 3.1 | 25.3 | 97.1 | 1.24 |
| Classical PostgreSQL | 9.9 | 45.3 | 89.4 | 3.61 |
| RAG-Optimized | 8.2 | 42.1 | 91.3 | 3.04 |
| Semantic Entropy | 5.4 | 33.7 | 94.8 | 1.89 |
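The headline figures can be rederived from the SEMANTIX and Classical PostgreSQL rows of the table. The short sketch below does that arithmetic; the constant and function names are illustrative and not part of the project.

```rust
/// (tokens in thousands, latency in ms, energy in Wh), copied from the table rows.
pub const SEMANTIX: (f64, f64, f64) = (3.1, 25.3, 1.24);
pub const POSTGRES: (f64, f64, f64) = (9.9, 45.3, 3.61);

/// How many times smaller `measured` is than `baseline` (e.g. 3.2x fewer tokens).
pub fn improvement_factor(baseline: f64, measured: f64) -> f64 {
    baseline / measured
}

/// Relative reduction of `measured` against `baseline`, in percent.
pub fn reduction_percent(baseline: f64, measured: f64) -> f64 {
    (1.0 - measured / baseline) * 100.0
}
```

Calling `improvement_factor(POSTGRES.0, SEMANTIX.0)` gives roughly 3.19 for tokens, `improvement_factor(POSTGRES.1, SEMANTIX.1)` gives roughly 1.79 for latency, and `reduction_percent(POSTGRES.2, SEMANTIX.2)` gives roughly 65.65 for energy, which agrees with the reported 3.2×, 1.8×, and 65.6% to within rounding.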
The core cost model combines information-theoretic entropy with execution schedule conditioning. Its terms are:
- H(i | Σ^ctx(i)): conditional semantic entropy of token i given its context
- γ: delay weight parameter (controls the latency penalty)
- β: staleness weight parameter (controls the stale-context penalty)
- σ: execution schedule (allocation of computational resources)

The bidirectional semantic anchor maps natural language queries to cost-parametric logical plans.
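For the cost model terms (the conditional entropy H and the weights γ and β), the exact closed form is not reproduced in this text. The following is a minimal sketch assuming a simple additive combination, with weights named after the `entropy_weight`, `delay_weight`, and `staleness_weight` configuration keys. The `CostWeights` type, the `token_cost` function, and the scalar `delay` and `staleness` inputs are hypothetical.

```rust
/// Weights for the three cost terms. Field names mirror the configuration keys;
/// the additive form used below is an assumption, not the published formula.
pub struct CostWeights {
    pub entropy: f64,   // entropy_weight
    pub delay: f64,     // delay_weight (gamma)
    pub staleness: f64, // staleness_weight (beta)
}

/// Shannon entropy in nats of a discrete distribution.
/// Zero-probability outcomes are skipped, since p * ln(p) tends to 0 as p -> 0.
pub fn entropy(probs: &[f64]) -> f64 {
    probs
        .iter()
        .filter(|&&p| p > 0.0)
        .map(|&p| -p * p.ln())
        .sum()
}

/// Hypothetical per-token cost: weighted entropy of the token's conditional
/// distribution plus weighted delay and staleness penalties.
pub fn token_cost(w: &CostWeights, cond_probs: &[f64], delay: f64, staleness: f64) -> f64 {
    w.entropy * entropy(cond_probs) + w.delay * delay + w.staleness * staleness
}
```

Under this assumed form, a token whose conditional distribution is concentrated on one outcome contributes no entropy cost, so only the delay and staleness penalties remain. How the execution schedule enters the delay term is left out here because the source does not specify it.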
The adaptive token scheduler solves the constrained optimization problem using Lagrangian relaxation.
Convergence: the iterative algorithm converges to an ε-optimal solution in O(log(1/ε)) iterations, where ε is the configurable convergence threshold (default 0.001).
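As an illustration of Lagrangian relaxation with logarithmic convergence, the sketch below allocates tokens to operators under a latency cap. It is not the project's Algorithm 1. It assumes a toy objective (minimize the sum of `weight / tokens` subject to total latency at most `max_latency_ms`) and finds the multiplier by bisection, which halves the search interval each step. The `Operator` type and all function names are hypothetical, and weights and per-token latencies are assumed positive.

```rust
/// Hypothetical operator: `weight` scales the benefit of extra tokens,
/// `ms_per_token` is the latency each allocated token adds.
pub struct Operator {
    pub weight: f64,
    pub ms_per_token: f64,
}

/// For a fixed multiplier, each operator minimizes weight/t + lambda * ms_per_token * t.
/// The minimizer is t = sqrt(weight / (lambda * ms_per_token)), clamped to the budget range.
fn allocate(ops: &[Operator], lambda: f64, min_t: f64, max_t: f64) -> Vec<f64> {
    ops.iter()
        .map(|o| (o.weight / (lambda * o.ms_per_token)).sqrt().clamp(min_t, max_t))
        .collect()
}

/// Total latency implied by a token allocation.
fn latency_ms(ops: &[Operator], tokens: &[f64]) -> f64 {
    ops.iter().zip(tokens).map(|(o, t)| o.ms_per_token * t).sum()
}

/// Bisect on the multiplier until its bracket is within relative width `eps`.
/// Returns None when even the minimum budgets exceed the latency cap.
pub fn schedule(
    ops: &[Operator],
    max_latency_ms: f64,
    min_t: f64,
    max_t: f64,
    eps: f64,
) -> Option<Vec<f64>> {
    if latency_ms(ops, &vec![min_t; ops.len()]) > max_latency_ms {
        return None; // infeasible
    }
    let all_max = vec![max_t; ops.len()];
    if latency_ms(ops, &all_max) <= max_latency_ms {
        return Some(all_max); // constraint is slack, multiplier is zero
    }
    // Latency decreases as lambda grows, so grow `hi` until it is feasible.
    let (mut lo, mut hi) = (0.0_f64, 1.0_f64);
    while latency_ms(ops, &allocate(ops, hi, min_t, max_t)) > max_latency_ms {
        hi *= 2.0;
    }
    // Invariant: `lo` is infeasible (or zero), `hi` is feasible.
    while hi - lo > eps * hi {
        let mid = 0.5 * (lo + hi);
        if latency_ms(ops, &allocate(ops, mid, min_t, max_t)) > max_latency_ms {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    Some(allocate(ops, hi, min_t, max_t))
}
```

As a usage example, take two operators with weights 400 and 100, both at 0.01 ms per token, and call `schedule(&ops, 50.0, 100.0, 10000.0, 0.001)`. The arguments reuse the `max_latency_ms`, `min_token_budget`, `max_token_budget`, and `convergence_threshold` defaults from the configuration. The result gives the first operator about twice the tokens of the second while using close to, and never more than, the 50 ms cap. Bisection is chosen here because a single constraint makes latency monotone in the multiplier; the real scheduler may use a different update rule.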
# Clone repository
git clone https://github.com/novas-workshop-2026/learned-semantic-costs.git
cd learned-semantic-costs
# Build project (release optimized)
cargo build --release
# Create PostgreSQL database
createdb semantix
# Initialize schema
psql -d semantix -f schema/tpch_schema.sql
# Generate TPC-H with semantic annotations
cargo run --release --bin data-generator
# Load data (\copy reads the CSV from the client's working directory)
psql -d semantix -c "\copy orders FROM 'tpch_orders_semantic.csv' CSV HEADER"
# Profile operator latencies
cargo run --release --bin cost-profiler
# Run benchmark
cargo run --release --bin benchmark

cargo run --release --bin semantix-daemon

use semantix::SemanticQueryOptimizer;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Initialize optimizer
    let mut optimizer = SemanticQueryOptimizer::new(
        "postgresql://localhost/semantix"
    ).await?;

    // Execute query with full semantic optimization
    let result = optimizer.optimize_and_execute(
        "SELECT * FROM orders WHERE custkey = 1"
    ).await?;

    // Check metrics
    let metrics = optimizer.get_metrics();
    println!("Tokens: {}, Latency: {}ms, Accuracy: {:.2}%",
        metrics.avg_token_cost,
        metrics.avg_latency_ms,
        metrics.avg_semantic_accuracy * 100.0
    );

    // Provide feedback for continuous learning
    optimizer.feedback(&result.context);

    Ok(())
}
# Profile specific query
cargo run --release --bin benchmark -- --query "SELECT * FROM orders LIMIT 100"
# Generate data with custom scale
cargo run --release --bin data-generator -- --scale-factor 10
# Profile specific operators
cargo run --release --bin cost-profiler -- --operators "Scan,Filter,Join"

[anchor_config]
encoder_model_path = "models/bert-encoder-semantic.bin"
decoder_model_path = "models/bert-decoder-semantic.bin"
max_sequence_length = 512
embedding_dim = 768
semantic_drift_threshold = 0.15
[cost_model_config]
model_type = "gbdt"
model_path = "models/cost_model.xgb"
entropy_weight = 1.0
delay_weight = 0.3
staleness_weight = 0.5
min_token_budget = 100
max_token_budget = 10000
[scheduler_config]
max_latency_ms = 50
latency_sigma = 0.1
alpha = 0.01
convergence_threshold = 0.001
max_iterations = 1000
[database]
url = "postgresql://localhost/semantix"
log_level = "info"

export DATABASE_URL="postgresql://user:password@localhost/semantix"
export LOG_LEVEL="debug"
export SEMANTIX_CONFIG="path/to/semantix.toml"

# Run all tests
cargo test
cargo test --doc
cargo test --all-features
# Integration tests (requires PostgreSQL)
cargo test --test integration_tests -- --test-threads=1
# Benchmark tests
cargo bench

# CPU profiling with flamegraph
cargo install flamegraph
cargo flamegraph --bin benchmark
# Open flamegraph.svg in browser
# Memory profiling
valgrind --tool=massif ./target/release/benchmark
ms_print massif.out.
# Latency profiling
cargo run --release --bin cost-profiler -- --detailed-report

semantix/
├── Cargo.toml                  # Rust dependencies
├── src/
│   ├── lib.rs                  # Main library exports
│   ├── semantic_anchors.rs     # NL → LogicalPlan translation
│   ├── cost_model.rs           # Learned cost estimation
│   ├── scheduler.rs            # Adaptive token scheduling (Alg 1)
│   ├── database.rs             # PostgreSQL integration
│   ├── executor.rs             # Query execution engine
│   ├── metrics.rs              # Performance tracking
│   ├── config.rs               # Configuration management
│   ├── errors.rs               # Error types
│   └── bin/
│       ├── daemon.rs           # Main optimizer service
│       ├── profiler.rs         # Latency profiler
│       ├── data_gen.rs         # TPC-H data generation
│       └── benchmark.rs        # Performance evaluation
├── schema/
│   └── tpch_schema.sql         # PostgreSQL schema
├── tests/
│   ├── integration_tests.rs    # End-to-end tests
│   └── unit_tests.rs           # Component tests
├── docker/
│   ├── Dockerfile              # Container image
│   └── docker-compose.yml      # Multi-container setup
└── README.md
We welcome contributions! The project uses:
Full type safety and memory safety guarantees.
Unit and integration tests for all components.
Inline docs, rustdoc, and comprehensive guides.
# Fork, create branch, and submit PR
git checkout -b feature/amazing-feature
git commit -m 'Add amazing feature'
git push origin feature/amazing-feature
# Ensure code quality
cargo test
cargo fmt
cargo clippy -- -D warnings

@inproceedings{semantix2026,
  title={Learned Semantic Cost Models for Adaptive Token-Efficient Query Optimization in LLM-Native Relational Engines},
  author={Prakul Sunil Hiremath},
  year={2026}
}
SEMANTIX: LLM-Native Relational Engines with Learned Semantic Costs
Apache License 2.0 · Rust 1.70+ · PostgreSQL 14+ · Open Source
"The database finally learned to talk to AI."