DbSchema | Cassandra - How to Create a Table?
Table of Contents
- Introduction
- Prerequisites
- Model for queries first
- Primary key basics
- Useful table options
- Create a table in cqlsh
- Create a table in DbSchema
- Common mistakes
- Conclusion
- References
Introduction
Creating a table in Cassandra looks simple, but the most important decision is not the column list - it is the primary key design. Cassandra stores data by partition and reads data best when the table is modeled for the exact queries the application needs.
That means table creation in Cassandra is less about translating a relational schema and more about modeling for access patterns, distribution, and predictable read performance.
Prerequisites
Before creating a table, make sure you have:
- a running Cassandra cluster
- a keyspace already created, such as the one from How to Create a Keyspace in Cassandra
- cqlsh access or DbSchema connected to the cluster
- a clear idea of the queries the table must support
Model for queries first
Cassandra table design is query-driven:
- start from the queries you need to run
- design the primary key so those queries are efficient
- duplicate or denormalize data when different query patterns need different table shapes
Trying to model Cassandra like a relational database usually leads to hot partitions, expensive reads, or unsupported query patterns.
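As a minimal sketch of the query-first approach, suppose an application must look up a customer both by ID and by email. The table and column names below are hypothetical and are not part of this article's example schema. Each query gets its own table, and the application writes every customer to both:

```cql
-- Query 1: "fetch a customer by ID"
CREATE TABLE IF NOT EXISTS customers_by_id (
    customer_id uuid,
    email       text,
    full_name   text,
    PRIMARY KEY ((customer_id))
);

-- Query 2: "fetch a customer by email"
-- Same data, different partition key, so the lookup needs no index or scan.
CREATE TABLE IF NOT EXISTS customers_by_email (
    email       text,
    customer_id uuid,
    full_name   text,
    PRIMARY KEY ((email))
);

SELECT email, full_name
FROM customers_by_id
WHERE customer_id = 5b6962dd-3f90-4c93-8f61-eabfa4a803e2;

SELECT customer_id, full_name
FROM customers_by_email
WHERE email = 'jane@example.com';
```

The duplication is deliberate: each read hits exactly one partition, at the cost of an extra write per customer.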
Primary key basics
The primary key in Cassandra has two parts: the partition key and the clustering columns.
Partition key
The partition key decides which replica set stores the row. Good partition keys spread traffic evenly across the cluster.
Good partition-key design usually means:
- high enough cardinality to avoid hot partitions
- partition sizes that stay manageable
- explicit bucketing for time-series workloads when needed
If one partition key value receives most of the traffic, that partition becomes a hotspot.
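The bucketing idea can be sketched as follows. The table name, columns, and the one-day bucket size are assumptions chosen for illustration; the right bucket depends on how much data each sensor produces.

```cql
-- Without a bucket, PRIMARY KEY ((sensor_id), reading_time) would put every
-- reading for a sensor into one partition that grows forever.

-- With a day bucket, each (sensor_id, reading_day) pair is its own partition.
CREATE TABLE IF NOT EXISTS readings_by_sensor_day (
    sensor_id    uuid,
    reading_day  date,
    reading_time timestamp,
    temperature  double,
    PRIMARY KEY ((sensor_id, reading_day), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC);

-- Reads must supply the full partition key, including the bucket.
SELECT reading_time, temperature
FROM readings_by_sensor_day
WHERE sensor_id = 7d4f0c1e-2b7a-4e0e-9c55-3a1f6d2b8e10
  AND reading_day = '2024-03-01';
```

The trade-off is that a query spanning several days has to issue one read per bucket.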
Clustering columns
Clustering columns define the order of rows inside a partition. They are what make range queries efficient after the partition has been identified.
Important points:
- clustering columns are ordered
- CLUSTERING ORDER BY is set when the table is created
- if you need a different clustering order later, you create a new table and migrate data
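To illustrate how clustering columns support range queries, here is a small sketch using a hypothetical events_by_device table (the names are illustrative, not from this article's schema):

```cql
CREATE TABLE IF NOT EXISTS events_by_device (
    device_id  uuid,
    event_time timestamp,
    event_type text,
    PRIMARY KEY ((device_id), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

-- Range scan inside a single partition: the partition key is fixed with "=",
-- and the clustering column is restricted to a time window.
SELECT event_time, event_type
FROM events_by_device
WHERE device_id = 0f8fad5b-d9cb-469f-a165-70867728950e
  AND event_time >= '2024-03-01 00:00:00+0000'
  AND event_time <  '2024-03-02 00:00:00+0000';

-- A single query can still read in the reverse of the stored order.
SELECT event_time, event_type
FROM events_by_device
WHERE device_id = 0f8fad5b-d9cb-469f-a165-70867728950e
ORDER BY event_time ASC
LIMIT 10;
```

The stored order is the one Cassandra reads most cheaply, so it is worth matching it to the most frequent query.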
Static columns
A STATIC column is shared by all rows in the same partition. It is useful when a partition-level value would otherwise be repeated in every row.
Static columns have two important restrictions:
- they are allowed only when the table has clustering columns
- they cannot be part of the primary key
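A short sketch of a static column in use, assuming a hypothetical shopping-cart table where the currency applies to the whole cart rather than to each item:

```cql
CREATE TABLE IF NOT EXISTS cart_items_by_cart (
    cart_id  uuid,
    item_id  uuid,
    quantity int,
    currency text STATIC,
    PRIMARY KEY ((cart_id), item_id)
);

-- The first insert sets the static value for the partition.
INSERT INTO cart_items_by_cart (cart_id, item_id, quantity, currency)
VALUES (c37d661d-7e61-49ea-96a5-68c34e83db3a, uuid(), 2, 'EUR');

-- The second insert does not mention currency at all.
INSERT INTO cart_items_by_cart (cart_id, item_id, quantity)
VALUES (c37d661d-7e61-49ea-96a5-68c34e83db3a, uuid(), 1);

-- Every row in the partition reports the same currency.
SELECT item_id, quantity, currency
FROM cart_items_by_cart
WHERE cart_id = c37d661d-7e61-49ea-96a5-68c34e83db3a;

-- Updating a static column needs only the partition key.
UPDATE cart_items_by_cart
SET currency = 'USD'
WHERE cart_id = c37d661d-7e61-49ea-96a5-68c34e83db3a;
```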
Useful table options
Here are some table options worth knowing at creation time:
| Option | Why it matters |
|---|---|
| comment | Documents the purpose of the table. |
| default_time_to_live | Sets a default TTL for rows when data should expire automatically. |
| gc_grace_seconds | Controls how long tombstones are kept before garbage collection. |
| compaction | Affects write amplification and read performance. |
| compression | Reduces SSTable size on disk. |
| caching | Controls key cache and row cache behavior. |
| cdc | Includes the table in CDC when CDC is enabled in Cassandra. |
You do not need to predict every future column on day one. Cassandra can add new non-key columns later with a lightweight schema change.
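The sketch below sets several of these options on a hypothetical session_tokens table and then adds a column afterwards. It assumes Cassandra 3.0 or later, where the compression map uses the 'class' key. The one-day TTL is an arbitrary illustration, while the gc_grace_seconds, compression, and caching values shown match the usual defaults and are spelled out only to make the syntax visible.

```cql
-- default_time_to_live is in seconds: 86400 = 1 day.
-- gc_grace_seconds is in seconds: 864000 = 10 days.
CREATE TABLE IF NOT EXISTS session_tokens (
    user_id   uuid,
    token_id  timeuuid,
    issued_to text,
    PRIMARY KEY ((user_id), token_id)
) WITH comment = 'Short-lived login tokens per user'
  AND default_time_to_live = 86400
  AND gc_grace_seconds = 864000
  AND compression = {'class': 'LZ4Compressor'}
  AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'};

-- Adding a non-key column later is a metadata-only change.
ALTER TABLE session_tokens ADD user_agent text;
```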
Create a table in cqlsh
Start by switching to the correct keyspace:
USE ecommerce;
Then create a table designed for a specific query pattern. The example below stores orders by customer and day, with newest rows first inside each partition:
CREATE TABLE IF NOT EXISTS orders_by_customer_day (
    customer_id uuid,
    order_day date,
    order_time timeuuid,
    order_id uuid,
    status text,
    total decimal,
    sales_rep text STATIC,
    PRIMARY KEY ((customer_id, order_day), order_time, order_id)
) WITH CLUSTERING ORDER BY (order_time DESC)
    AND comment = 'Orders grouped by customer and day'
    AND compaction = {'class': 'TimeWindowCompactionStrategy'};
Why this definition is useful:
- (customer_id, order_day) groups rows into partitions that are easier to keep bounded
- order_time sorts the rows inside the partition
- order_id preserves uniqueness
- sales_rep is static because it belongs to the partition rather than each row
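To see the table in use, here is a small sketch that writes one order and reads back a customer's orders for a day. The UUID, date, and values are made-up sample data.

```cql
-- now() generates the timeuuid clustering value; uuid() generates the order ID.
INSERT INTO ecommerce.orders_by_customer_day
    (customer_id, order_day, order_time, order_id, status, total, sales_rep)
VALUES
    (9a1b2c3d-4e5f-4a6b-8c7d-0e1f2a3b4c5d, '2024-03-01', now(), uuid(), 'NEW', 49.90, 'j.doe');

-- Both partition key columns are required; rows come back newest first
-- because of CLUSTERING ORDER BY (order_time DESC).
SELECT order_time, order_id, status, total, sales_rep
FROM ecommerce.orders_by_customer_day
WHERE customer_id = 9a1b2c3d-4e5f-4a6b-8c7d-0e1f2a3b4c5d
  AND order_day = '2024-03-01'
LIMIT 10;
```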
After creation, verify the schema:
DESCRIBE TABLE ecommerce.orders_by_customer_day;
Create a table in DbSchema
DbSchema lets you design and visualize Cassandra tables in a diagram before deploying them.
- Open DbSchema and connect to the Cassandra cluster.
- Open the keyspace where the table should be created.
- Create the table from the schema tree or diagram canvas.
- Define the partition key and clustering columns explicitly.
- Set clustering order and table options if needed.
- Review the generated CQL and deploy it to Cassandra.
DbSchema is useful here because it helps you keep the design model, documentation, and deployed schema aligned.
Common mistakes
Watch out for these common problems:
- designing tables like normalized SQL tables and expecting joins later
- picking a low-cardinality partition key that creates hotspots
- letting partitions grow without bounds in time-series data
- assuming clustering order can be changed later with ALTER TABLE
- pre-creating many speculative columns even though Cassandra can add non-key columns later
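Because clustering order is fixed at creation, changing it means creating a second table and copying the data across. The sketch below assumes the orders_by_customer_day table from this article, a hypothetical target table name, and a hypothetical export file. COPY is a cqlsh command rather than part of CQL itself, and it is only reasonable for small tables; it also does not carry over TTLs or write timestamps, so larger migrations usually need a dedicated tool.

```cql
-- Same columns and primary key, opposite clustering order.
CREATE TABLE IF NOT EXISTS ecommerce.orders_by_customer_day_asc (
    customer_id uuid,
    order_day   date,
    order_time  timeuuid,
    order_id    uuid,
    status      text,
    total       decimal,
    sales_rep   text STATIC,
    PRIMARY KEY ((customer_id, order_day), order_time, order_id)
) WITH CLUSTERING ORDER BY (order_time ASC);

-- Export from the old table, then import into the new one (run in cqlsh).
COPY ecommerce.orders_by_customer_day
    (customer_id, order_day, order_time, order_id, status, total, sales_rep)
TO 'orders_export.csv' WITH HEADER = true;

COPY ecommerce.orders_by_customer_day_asc
    (customer_id, order_day, order_time, order_id, status, total, sales_rep)
FROM 'orders_export.csv' WITH HEADER = true;
```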
Conclusion
Creating a Cassandra table is really an exercise in data modeling. Choose the partition key for even distribution, use clustering columns for the query order you need, and set table options intentionally instead of copying defaults blindly.
Once the table exists, the next schema task is usually controlled evolution with ALTER TABLE. For that, see How to Alter a Table in Cassandra.