DbSchema | Cassandra - How to Create a Keyspace?
Table of Contents
- Introduction
- What a keyspace controls
- Prerequisites
- Replication strategies
- Replication factor and durable writes
- Create a keyspace in cqlsh
- Create a keyspace in DbSchema
- Common mistakes
- Conclusion
- References
Introduction
A keyspace is the top-level namespace in Apache Cassandra. It groups related tables and defines how their data is replicated across the cluster. In practice, creating a keyspace is not just naming a container; it is where you decide the durability and availability model for the data that will live inside it.
For many applications, a sensible starting point is one keyspace per application, then several tables inside that keyspace.
What a keyspace controls
A Cassandra keyspace controls:
- the logical namespace for tables
- the replication strategy
- the replication factor for each data center
- the durable_writes setting
This is why keyspace creation belongs early in the design process. If you choose the wrong replication layout, every table in that keyspace inherits the problem.
Prerequisites
Before creating a keyspace, make sure you have:
- a running Cassandra cluster and permission to create schema objects
- access through cqlsh or a GUI such as DbSchema
- the correct data center names if you plan to use NetworkTopologyStrategy
- a rough idea of the replication factor you want in each environment
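If you are unsure of the data center names, one way to look them up is to query the system tables from cqlsh. This is only a sketch: it assumes your user can read the system keyspace, and the values returned are whatever your cluster reports, which may not match the DC1/DC2 placeholders used in this article.

```cql
-- Data center of the node cqlsh is connected to
SELECT data_center FROM system.local;

-- Data centers of the other nodes known to that node
SELECT peer, data_center FROM system.peers;
```

Data center names are case-sensitive in the replication map, so copy them exactly as they are returned.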
Replication strategies
Replication strategy determines where Cassandra places replicas.
SimpleStrategy
SimpleStrategy is easy to configure and useful for single-data-center development or test environments. It only needs one setting:
{'class': 'SimpleStrategy', 'replication_factor': 1}
For production workloads, the official Cassandra documentation recommends NetworkTopologyStrategy instead, because SimpleStrategy does not respect data center layouts.
NetworkTopologyStrategy
NetworkTopologyStrategy is the production-ready choice. It lets you set the replication factor independently for each data center:
{'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3}
This is the standard approach for production because it is aware of data center placement and makes availability planning much clearer.
Replication factor and durable writes
The replication factor (RF) is the number of copies Cassandra keeps for each token range.
Useful rules of thumb:
- RF = 1 is fine for disposable local testing, but not for important data
- RF = 3 per data center is a common production default
- each data center should have at least as many nodes as its replication factor
- leave durable_writes = true unless you deliberately accept the risk of bypassing the commit log
If you later change replication with ALTER KEYSPACE, plan that operation carefully. Changing the settings does not move existing data by itself; after raising a replication factor, for example, you typically need to run a repair so the new replicas receive their copies.
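A minimal sketch of such a change, run from cqlsh, might look like the following. The keyspace name ecommerce and the data center names DC1 and DC2 are placeholders reused from this article's examples, not names from a real cluster.

```cql
-- Add a second data center (DC2) to a keyspace that so far replicates only to DC1.
-- 'ecommerce', 'DC1' and 'DC2' are placeholders; substitute your own names.
ALTER KEYSPACE ecommerce
WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};

-- Confirm the new settings before doing any follow-up work
DESCRIBE KEYSPACE ecommerce;
```

ALTER KEYSPACE replaces the whole replication map, so list every data center that should keep replicas, including the ones whose factor is unchanged. The statement only updates the keyspace definition; getting existing data onto the new replicas is done outside CQL with tools such as nodetool repair (or nodetool rebuild for a newly added data center).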
Create a keyspace in cqlsh
Open cqlsh and create the keyspace with the strategy that matches the environment.
For a local development cluster:
CREATE KEYSPACE IF NOT EXISTS app_lab
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
For a production-style single-data-center deployment:
CREATE KEYSPACE IF NOT EXISTS ecommerce
WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3}
AND durable_writes = true;
For a multi-data-center deployment:
CREATE KEYSPACE IF NOT EXISTS ecommerce
WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3}
AND durable_writes = true;
Then verify the result:
DESCRIBE KEYSPACE ecommerce;
That DESCRIBE output is worth checking before you move on to table creation, especially in clusters with multiple data centers.
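As a cross-check, the same settings can be read from the schema tables. This sketch assumes Cassandra 3.0 or later, where keyspace definitions are stored in system_schema.keyspaces, and it reuses the ecommerce name from the examples above.

```cql
-- Show the stored replication map and durable_writes flag for one keyspace
SELECT keyspace_name, durable_writes, replication
FROM system_schema.keyspaces
WHERE keyspace_name = 'ecommerce';
```

The replication column should list each data center you intended, with the expected factor next to it.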
Create a keyspace in DbSchema
DbSchema lets you model and document a Cassandra schema visually, and review the generated CQL, before you deploy changes.
- Open DbSchema and create a connection to your Cassandra cluster.
- In the schema tree or design model, create a new keyspace.
- Choose the replication strategy and enter the replication factor for each data center.
- Review the generated CQL before deploying it.
- Save the model so the replication settings remain documented alongside the rest of your schema.
DbSchema is especially helpful when you want the keyspace definition to live together with the table design and documentation instead of only in migration scripts.
Common mistakes
The most common problems are:
- using SimpleStrategy in production
- choosing RF = 1 for data you cannot afford to lose
- misspelling data center names in NetworkTopologyStrategy
- creating too many keyspaces for a single application without a clear reason
- treating keyspace creation as a purely logical step instead of a replication decision
Conclusion
Creating a keyspace in Cassandra is simple syntactically, but the replication choices behind it matter a lot. Pick the right strategy for the environment, use a sensible replication factor, and verify the keyspace before you start creating tables.
Once the keyspace exists, the next step is to create tables with a primary key designed around your query patterns. For that, see How to Create a Table in Cassandra.