// the search engineering layer

Run search experiments. Not infra.

Your search team is doing too much.

Learn how hev layer simplifies your search team's concerns.

0.1-map v0.1

╔════════════╗      ╔════════════╗          ╔═══ vector store ════════════════════════╗
║   layer    ║░     ║   layer    ║░         ║                                         ║░
║   client   ║◀────▶║  gateway   ║◀──API───▶║                                         ║░
║            ║░     ║            ║░         ║                                         ║░
╚════════════╝░     ╚═════╤══════╝░         ║  ┏━━━━━━━━━━━━━━┓     ┏━━━━━━━━━━━━━━┓  ║░
 ░░░░░░░░░░░░░░      ░░░░░│░░░░░░░░         ║  ┃     BM25     ┃     ┃  KNN / ANN   ┃  ║░
                          │                 ║  ┃              ┃     ┃              ┃  ║░
╔════════════╗      ╔═════▼══════╗          ║  ┗━━━━━━━━━━━━━━┛     ┗━━━━━━━━━━━━━━┛  ║░
║   layer    ║░     ║   layer    ║░         ║                                         ║░
║ dashboard  ║◀────▶║  operator  ║◀──API───▶║                                         ║░
║            ║░     ║            ║░         ║                                         ║░
╚════════════╝░     ╚═════╤══════╝░         ╚═════════════════════════════════════════╝░
 ░░░░░░░░░░░░░░      ░░░░░│░░░░░░░░          ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
                          ▼
                   ┏━━━━━━━━━━━━━━┓
                   ┃   document   ┃
                   ┃    cache     ┃
                   ┗━━━━━━┯━━━━━━━┛
                          │
                          ▼
                   ┏━━━━━━━━━━━━━━┓
                   ┃ Object Store ┃
                   ┃ Bucket (S3)  ┃
                   ┗━━━━━━━━━━━━━━┛

# jobs to be done

Your search team's jobs to be done.

Search teams own some of the most complex and expensive jobs in your org's data platform. Most of those jobs have nothing to do with making search better for users.

// ship embeddings

Ship Python. Layer runs the GPU pool.

Building CUDA images, writing Kubernetes autoscalers, managing Spark — the time sink every search team underestimates, and managed services trade one kind of pain for another. Layer collapses it: declare a Python UDF and layer runs the work on CPU or GPU, scaling pods and nodes between bursts.
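As a rough illustration of what "declare a Python UDF" could look like, here is a minimal sketch. The `udf` name, the rows-in/rows-out batch shape, and the `text` and `vector` fields are assumptions made for this example, not layer's documented interface, and the hashing embedder is a stand-in for a real model.

```python
import hashlib
import math

DIM = 8  # tiny dimension so the example stays readable


def embed_text(text: str, dim: int = DIM) -> list[float]:
    """Toy embedding: hash each token into a bucket, then L2-normalize."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # An empty string has no tokens, so return the zero vector unchanged.
    return [v / norm for v in vec] if norm else vec


def udf(batch: list[dict]) -> list[dict]:
    """Hypothetical UDF shape: rows in, same rows out with a vector attached."""
    return [{**row, "vector": embed_text(row["text"])} for row in batch]
```

Keeping the function pure (no I/O, no global state) is what would let a scheduler fan batches out across CPU or GPU workers without coordination.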

Read the Docs

// stay consistent

Track every state change your index makes.

Keeping the index in sync with source data usually means hand-rolled watchers and event hooks glued together by the team that wrote them. Layer ships the operator: it scans the index for consistency, watermarks state changes, and rolls up facets your application can read directly.
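To make "scans for consistency" and "watermarks state changes" concrete, here is a minimal sketch of one common pattern: compare source changes above a stored watermark against what the index has applied. The `Record` type, the global `seq` number, and `diff_since` are hypothetical names for illustration; the page does not describe how the operator implements this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    doc_id: str
    seq: int  # assumed: global change sequence number assigned by the source


def diff_since(source: list[Record], index: dict[str, int], watermark: int):
    """Return (stale_ids, new_watermark) for source changes above the watermark.

    `index` maps doc_id to the last sequence number applied to the index.
    """
    stale = []
    new_watermark = watermark
    for rec in source:
        if rec.seq <= watermark:
            continue  # already reconciled by an earlier scan
        if index.get(rec.doc_id, -1) < rec.seq:
            stale.append(rec.doc_id)  # index is missing or behind this change
        new_watermark = max(new_watermark, rec.seq)
    return stale, new_watermark
```

In this sketch the caller should re-index the stale documents before persisting the new watermark; otherwise a crash between the two steps would skip them on the next scan.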

Read the Docs

// serve fetches

A doc cache deep enough to forget about.

Whether it's feeding your pipeline from a near-bottomless queue or serving full datasets from a pull-through cache, your search system needs O(1) reads and writes. Layer ships Aerospike as a production-hardened document cache with NVMe price:performance.
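For readers new to the term, a pull-through cache serves a key from the cache when it can and fetches from the origin only on a miss, storing the result for next time. The sketch below shows that read path with a plain dict standing in for the document cache and a callback standing in for the object store; `PullThroughCache` and `fetch_from_origin` are illustrative names, not layer or Aerospike APIs.

```python
from typing import Callable, Optional


class PullThroughCache:
    """Read path only: check the cache, fall back to the origin, then fill."""

    def __init__(self, fetch_from_origin: Callable[[str], Optional[bytes]]):
        self._fetch = fetch_from_origin
        self._store: dict[str, bytes] = {}  # stand-in for the document cache
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[bytes]:
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._fetch(key)  # e.g. a read from the object store bucket
        if value is not None:
            self._store[key] = value  # fill so the next read is a hit
        return value
```

Both the dict lookup and the fill are average-case O(1), which is the property the paragraph above asks of the cache.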

Read the Docs

// see search

Metrics, traces, clickstream, alerts — without the plumbing.

Observability in 2026 has plenty of options and still demands plumbing. Layer bundles clickstream from the doc cache and operational metrics from the gateway into an opinionated dashboard, backed by a PromQL-compatible time series.

Read the Docs

// scope access — coming soon

Scoped access without writing the auth proxy yourself.

Today every search team inside a multi-tenant product writes the auth proxy themselves: scope API keys to namespaces, gate the write paths, ship audit events somewhere security will accept. Layer ships scoped keys, per-namespace RBAC, and an audit feed — the pattern your security team always asks for, as a primitive.
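Since this feature is marked coming soon, the following is only a sketch of the pattern the paragraph describes: a key scoped to namespaces, a gated write path, and an audit event for every decision. `ScopedKey`, `authorize`, and the audit record fields are assumptions for illustration, not layer's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedKey:
    key_id: str
    namespaces: frozenset  # namespaces this key may touch
    can_write: bool = False  # write paths are closed unless granted


def authorize(key: ScopedKey, namespace: str, action: str, audit: list) -> bool:
    """Decide a request and append an audit event whether or not it is allowed."""
    if action not in ("read", "write"):
        raise ValueError(f"unknown action: {action!r}")
    in_scope = namespace in key.namespaces
    allowed = in_scope and (action == "read" or key.can_write)
    audit.append(
        {
            "key_id": key.key_id,
            "namespace": namespace,
            "action": action,
            "allowed": allowed,
        }
    )
    return allowed
```

Recording denials as well as grants is the part security reviews tend to ask for, so the audit append sits before the return rather than inside an `if allowed` branch.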

// track cost — coming soon

Know exactly how much you're spending on search.

Today "what does search cost us per million docs" is a question nobody can answer in under a week. AWS line items live in one bill, Turbopuffer in another, GPU pool minutes nowhere obvious. Layer pulls every line item into one invoice and derives the unit metrics (cost per million docs, cost per TiB indexed, cost per query), which update as you scrub the timeframe.
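The unit metrics named above are simple ratios once the line items are in one place. A minimal sketch of the arithmetic, where `unit_costs` and its argument names are assumptions for this example and not layer's billing schema:

```python
def unit_costs(
    line_items: dict[str, float],
    docs: int,
    tib_indexed: float,
    queries: int,
) -> dict[str, float]:
    """Derive unit metrics from one period's line items (all in one currency)."""
    if docs <= 0 or tib_indexed <= 0 or queries <= 0:
        raise ValueError("docs, tib_indexed, and queries must be positive")
    total = sum(line_items.values())
    return {
        "total": total,
        "per_million_docs": total / (docs / 1_000_000),
        "per_tib_indexed": total / tib_indexed,
        "per_query": total / queries,
    }
```

With made-up numbers, a period totalling 1,000 across cloud, vector store, and GPU pool line items, spread over 50 million docs, 4 TiB indexed, and 2 million queries, works out to 20 per million docs, 250 per TiB indexed, and 0.0005 per query.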

# layer run

Experiment faster in production.

Ever needed to backfill your production data? With layer, that's as easy as building a Docker container. Layer handles compute and can backfill as much or as little of your index as you specify.

$ layer run -f udf.yaml --index products

✓ submitted product-tags
→ watching   142 rows · 0 failed · 8 rows/s
→ watching 1,284 rows · 0 failed · 11 rows/s
→ watching 4,510 rows · 0 failed · 13 rows/s
✓ complete · 12,840 rows · 23s · 0 failed

You build and push your container to the configured registry. Layer handles queueing and scaling semantics for you, while you track progress. No Kubernetes experience necessary.

# what's in the box

hev layer is a BYOC product installed with Terraform and Helm. Read the Docs.

Gateway: Rust gateway, wire-compatible with your vector store. Adds the read-path machinery your client doesn't have.
Kube Operator: Kubernetes operator owning index consistency, snapshots, and per-workload autoscaling.
Dashboard: Operator console for click-ops and fin-ops — namespaces, snapshots, jobs, cost in one place.
Clients: Python, Go, and TypeScript SDKs generated from a single OpenAPI spec — each drops into your existing vector-store code with the same call shape, plus layer's extensions.

See the docs for the full SBOM.

# vector systems

One layer, many vector stores.

Layer puts one operator surface in front of whichever vector store the team already chose. Turbopuffer is the backend Layer runs against today; the rest are next.

turbopuffer ready
Pinecone coming soon
Chroma coming soon
Weaviate coming soon
Milvus coming soon
pgvector coming soon

# design preview / ready today

Bring a search workload your team is tired of operating.

The current program is for teams with a Turbopuffer-shaped retrieval path and a search team small enough to feel every concern on this page.

// fit criteria

1–3 person search team carrying a real retrieval workload
using, evaluating, or seriously considering Turbopuffer
3–5 TB of managed source data — the goldilocks range for a first engagement
no CMEK requirement right now