Database Operations

Infrastructure | Remote (India) | Full-time

Own the data layer across our client platforms. You'll design, deploy, and operate database infrastructure spanning ChromaDB for vector workloads and ScyllaDB for high-throughput transactional systems.

About the Role

We're looking for someone who thinks in data models, lives in terminal sessions, and has strong opinions about consistency vs. availability trade-offs. This role sits at the intersection of infrastructure and application engineering: you'll work directly with product teams to design schemas, optimize queries, tune clusters, and keep our data layer fast and reliable across multiple client deployments.

ChromaDB and ScyllaDB are core to how we build. ChromaDB powers our vector search and embedding pipelines for AI-augmented features across client products. ScyllaDB handles the high-throughput, low-latency workloads that our fintech and food-tech clients depend on: think millions of writes per second with single-digit millisecond p99 latencies. You'll own both.

What You'll Do

Design and maintain ScyllaDB clusters for high-throughput transactional workloads across client platforms

Deploy and operate ChromaDB instances for vector search, RAG pipelines, and embedding storage

Define data models, partition strategies, and compaction policies for ScyllaDB tables

Build and maintain collection schemas, indexing strategies, and query optimization for ChromaDB

Set up monitoring, alerting, and capacity planning for all database infrastructure

Collaborate with application engineers on query patterns, caching strategies, and data access layers

Write runbooks, conduct incident post-mortems, and improve operational reliability

Evaluate and benchmark new database tooling as workload requirements evolve

Requirements

3+ years operating distributed databases in production (ScyllaDB, Cassandra, or DynamoDB)

Hands-on experience with vector databases — ChromaDB preferred, Pinecone/Weaviate/Milvus acceptable

Strong understanding of the CQL data model, partition design, and ScyllaDB internals (compaction, repair, streaming)

Familiarity with embedding models and vector similarity search concepts

Experience with infrastructure-as-code (Terraform, Ansible, or Pulumi)

Comfortable in Linux environments, shell scripting, and container orchestration (Docker, Kubernetes)

Solid understanding of distributed systems fundamentals — CAP theorem, eventual consistency, consensus protocols

Nice to Have

Experience with ScyllaDB Alternator (DynamoDB-compatible API)

Contributions to open-source database projects

Familiarity with LangChain, LlamaIndex, or similar frameworks that integrate with ChromaDB

Prior experience in fintech, food-tech, or healthtech data infrastructure

Performance tuning experience at scale (>1M ops/sec)

Tech Stack

ScyllaDB, ChromaDB, PostgreSQL, Redis, Kafka, Docker, Kubernetes, Terraform, Grafana, Python
