DbSchema | Cassandra - How to Create a Table?

Cassandra • 18-May-2023

by Malik Umair Sajjad

Introduction
Prerequisites
Model for queries first
Primary key basics
Useful table options
Create a table in cqlsh
Create a table in DbSchema
Common mistakes
Conclusion
References

Introduction

Creating a table in Cassandra looks simple, but the most important decision is not the column list; it is the primary key design. Cassandra stores data by partition and reads it most efficiently when the table is modeled for the exact queries the application needs.

That means table creation in Cassandra is less about translating a relational schema and more about modeling for access patterns, distribution, and predictable read performance.

Prerequisites

Before creating a table, make sure you have:

a running Cassandra cluster
a keyspace already created, such as the one from How to Create a Keyspace in Cassandra
cqlsh access or DbSchema connected to the cluster
a clear idea of the queries the table must support

Model for queries first

Cassandra table design is query-driven:

start from the queries you need to run
design the primary key so those queries are efficient
duplicate or denormalize data when different query patterns need different table shapes

Trying to model Cassandra like a relational database usually leads to hot partitions, expensive reads, or unsupported query patterns.
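As a rough illustration of the query-first approach, the sketch below stores the same order data in two tables, one per query. The table and column names are hypothetical, and the statements assume a keyspace has already been selected with USE.

```cql
-- Query 1: "all orders for a customer, newest first"
CREATE TABLE IF NOT EXISTS orders_by_customer (
    customer_id uuid,
    order_time  timestamp,
    order_id    uuid,
    status      text,
    total       decimal,
    PRIMARY KEY ((customer_id), order_time, order_id)
) WITH CLUSTERING ORDER BY (order_time DESC, order_id ASC);

-- Query 2: "look up one order by its id"
CREATE TABLE IF NOT EXISTS orders_by_id (
    order_id    uuid,
    customer_id uuid,
    order_time  timestamp,
    status      text,
    total       decimal,
    PRIMARY KEY ((order_id))
);

-- Each query restricts the full partition key of its own table.
SELECT * FROM orders_by_customer
WHERE customer_id = 5b6962dd-3f90-4c93-8f61-eabfa4a803e2;

SELECT * FROM orders_by_id
WHERE order_id = 0f3c2a1e-7b4d-4c8e-9a6f-2d1b5e8c7a90;
```

The application writes every order to both tables. That duplication is the price of serving each query from a single partition instead of a join or a scan.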

Primary key basics

The primary key in Cassandra has two parts: a mandatory partition key and optional clustering columns. Together they identify each row uniquely.

Partition key

The partition key decides which replica set stores the row: Cassandra hashes its value to a token, and the token determines the replica nodes. Good partition keys spread traffic evenly across the cluster.

Good partition-key design usually means:

high enough cardinality to avoid hot partitions
partition sizes that stay manageable
explicit bucketing for time-series workloads when needed

If one partition key value receives most of the traffic, that partition becomes a hotspot.
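One way to picture explicit bucketing is a time-series table whose partition key includes a day column. This is a minimal sketch with hypothetical names (readings_by_sensor_day, sensor_id, reading_day); the right bucket size depends on the write rate of the real workload.

```cql
-- Without the day bucket, one busy sensor would accumulate every reading
-- in a single partition that grows without bound.
CREATE TABLE IF NOT EXISTS readings_by_sensor_day (
    sensor_id    uuid,
    reading_day  date,       -- bucket: one partition per sensor per day
    reading_time timestamp,
    temperature  double,
    PRIMARY KEY ((sensor_id, reading_day), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC);

-- Reads name the bucket explicitly, so they touch exactly one partition.
SELECT reading_time, temperature
FROM readings_by_sensor_day
WHERE sensor_id = 7c1f0a52-9d3e-4b6a-8f2d-1e5a9c4b7d10
  AND reading_day = '2023-05-18';
```

A query that spans several days has to issue one read per bucket. That is the trade-off for keeping each partition small.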

Clustering columns

Clustering columns define the order of rows inside a partition. They are what make range queries efficient after the partition has been identified.

Important points:

rows in a partition are sorted by the clustering columns, in the order the columns appear in the primary key
CLUSTERING ORDER BY is set when the table is created
if you need a different clustering order later, you create a new table and migrate data
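The points above can be sketched with a small hypothetical table (events_by_user) and a range query on its clustering column. The names and the sample UUID are illustrative only.

```cql
-- Rows inside each user_id partition are stored newest first.
CREATE TABLE IF NOT EXISTS events_by_user (
    user_id    uuid,
    event_time timestamp,
    event_type text,
    PRIMARY KEY ((user_id), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

-- Equality on the partition key, then a range on the clustering column.
-- Cassandra can serve this as one contiguous slice of the partition.
SELECT event_time, event_type
FROM events_by_user
WHERE user_id = 3e2d1c0b-9a8f-4e7d-8c6b-5a4f3e2d1c0b
  AND event_time >= '2023-05-01 00:00:00+0000'
  AND event_time <  '2023-06-01 00:00:00+0000';
```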

Static columns

A STATIC column is shared by all rows in the same partition. It is useful when a partition-level value would otherwise be repeated in every row.

Static columns have two important restrictions:

they are allowed only when the table has clustering columns
they cannot be part of the primary key
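A short sketch of a static column in use, with hypothetical names (members_by_team, team_name). The static value is written with the first row and is then visible on every row of the same partition.

```cql
CREATE TABLE IF NOT EXISTS members_by_team (
    team_id     uuid,
    member_id   uuid,
    team_name   text STATIC,   -- one value per team_id partition
    member_name text,
    PRIMARY KEY ((team_id), member_id)
);

-- The first insert sets the static value together with a regular row.
INSERT INTO members_by_team (team_id, member_id, team_name, member_name)
VALUES (9b1d4c7e-2a3f-4e5b-8c6d-7f8a9b0c1d2e,
        1a2b3c4d-5e6f-4a7b-8c9d-0e1f2a3b4c5d,
        'Platform', 'Ana');

-- The second insert does not mention team_name at all.
INSERT INTO members_by_team (team_id, member_id, member_name)
VALUES (9b1d4c7e-2a3f-4e5b-8c6d-7f8a9b0c1d2e,
        2b3c4d5e-6f7a-4b8c-9d0e-1f2a3b4c5d6e,
        'Ben');

-- Both rows come back with the same team_name.
SELECT member_name, team_name
FROM members_by_team
WHERE team_id = 9b1d4c7e-2a3f-4e5b-8c6d-7f8a9b0c1d2e;
```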

Useful table options

Here are some table options worth knowing at creation time:

Option	Why it matters
`comment`	Documents the purpose of the table.
`default_time_to_live`	Sets a default TTL, in seconds, for rows when data should expire automatically.
`gc_grace_seconds`	Controls how long tombstones are kept before compaction may purge them (default 864000 seconds, or 10 days).
`compaction`	Selects the compaction strategy, which affects write amplification and read performance.
`compression`	Reduces SSTable size on disk.
`caching`	Controls key cache and row cache behavior.
`cdc`	Includes the table in CDC when CDC is enabled in Cassandra.
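To show how several of these options fit together, here is a sketch of a hypothetical table for short-lived data. The table name, columns, and values are assumptions for illustration, not recommendations, and the map syntax assumes Cassandra 3.0 or later.

```cql
CREATE TABLE IF NOT EXISTS session_tokens (
    user_id   uuid,
    token_id  timeuuid,
    issued_to text,
    PRIMARY KEY ((user_id), token_id)
) WITH CLUSTERING ORDER BY (token_id DESC)
  AND comment = 'Short-lived login tokens'
  AND default_time_to_live = 86400      -- rows expire after one day
  AND gc_grace_seconds = 864000         -- 10 days, the Cassandra default
  AND compaction = {'class': 'TimeWindowCompactionStrategy',
                    'compaction_window_unit': 'HOURS',
                    'compaction_window_size': '1'}
  AND compression = {'class': 'LZ4Compressor'}
  AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'};
```

Options left out of the WITH clause keep their defaults, so it is usually enough to set only the ones the workload actually needs.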

You do not need to predict every future column on day one. Cassandra can add new non-key columns later with a lightweight schema change.

Create a table in cqlsh

Start by switching to the correct keyspace:

USE ecommerce;

Then create a table designed for a specific query pattern. The example below stores orders by customer and day, with newest rows first inside each partition:

CREATE TABLE IF NOT EXISTS orders_by_customer_day (
    customer_id uuid,
    order_day date,
    order_time timeuuid,
    order_id uuid,
    status text,
    total decimal,
    sales_rep text STATIC,
    PRIMARY KEY ((customer_id, order_day), order_time, order_id)
) WITH CLUSTERING ORDER BY (order_time DESC, order_id ASC)
  AND comment = 'Orders grouped by customer and day'
  AND compaction = {'class': 'TimeWindowCompactionStrategy'};

Why this definition is useful:

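(customer_id, order_day) is the composite partition key: each customer gets one partition per day, which keeps partition size bounded
order_time sorts the rows inside the partition, newest first
order_id is the last clustering column, so two orders can never collide on the same key
sales_rep is static because it belongs to the partition rather than to each row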
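To make these key choices concrete, here is how a write and the matching read might look once the table exists. The customer UUID, date, and column values are made-up sample data.

```cql
-- now() generates a timeuuid and uuid() a random uuid on the coordinator.
INSERT INTO ecommerce.orders_by_customer_day
    (customer_id, order_day, order_time, order_id, status, total, sales_rep)
VALUES
    (5b6962dd-3f90-4c93-8f61-eabfa4a803e2, '2023-05-18',
     now(), uuid(), 'paid', 49.90, 'alice');

-- The read restricts both partition key columns, so it hits one partition
-- and returns rows in the stored clustering order.
SELECT order_time, order_id, status, total, sales_rep
FROM ecommerce.orders_by_customer_day
WHERE customer_id = 5b6962dd-3f90-4c93-8f61-eabfa4a803e2
  AND order_day = '2023-05-18'
LIMIT 20;
```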

After creation, verify the schema:

DESCRIBE TABLE ecommerce.orders_by_customer_day;
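DESCRIBE is a cqlsh command. As an alternative sketch, the same information can be read with plain CQL from the system_schema keyspace, assuming Cassandra 3.0 or later where these tables exist.

```cql
-- One row per column; kind is partition_key, clustering, static, or regular.
SELECT column_name, kind, position, clustering_order, type
FROM system_schema.columns
WHERE keyspace_name = 'ecommerce'
  AND table_name = 'orders_by_customer_day';

-- Table-level options such as comment and compaction.
SELECT comment, compaction, default_time_to_live, gc_grace_seconds
FROM system_schema.tables
WHERE keyspace_name = 'ecommerce'
  AND table_name = 'orders_by_customer_day';
```

This is handy from a driver or script, where cqlsh-only commands are not available.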

Create a table in DbSchema

DbSchema makes Cassandra modeling much easier when you want to visualize tables before deployment.

Open DbSchema and connect to the Cassandra cluster.
Open the keyspace where the table should be created.
Create the table from the schema tree or diagram canvas.
Define the partition key and clustering columns explicitly.
Set clustering order and table options if needed.
Review the generated CQL and deploy it to Cassandra.

DbSchema is useful here because it helps you keep the design model, documentation, and deployed schema aligned.

Common mistakes

Watch out for these common problems:

designing tables like normalized SQL tables and expecting joins later
picking a low-cardinality partition key that creates hotspots
letting partitions grow without bounds in time-series data
assuming clustering order can be changed later with ALTER TABLE
pre-creating many speculative columns even though Cassandra can add non-key columns later
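The last two points can be sketched against the ecommerce.orders_by_customer_day table from the cqlsh section. The shipping_method column and the second table name are hypothetical and used only for illustration.

```cql
-- Adding a regular column later is a lightweight schema change,
-- so there is no need to create it speculatively up front.
ALTER TABLE ecommerce.orders_by_customer_day ADD shipping_method text;

-- Clustering order is fixed at creation time, so a different order means
-- a new table that the application then has to populate.
CREATE TABLE IF NOT EXISTS ecommerce.orders_by_customer_day_oldest_first (
    customer_id uuid,
    order_day date,
    order_time timeuuid,
    order_id uuid,
    status text,
    total decimal,
    sales_rep text STATIC,
    PRIMARY KEY ((customer_id, order_day), order_time, order_id)
) WITH CLUSTERING ORDER BY (order_time ASC, order_id ASC);
```

CQL has no INSERT ... SELECT, so copying existing rows into the new table is usually done by the application or a bulk loading tool.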

Conclusion

Creating a Cassandra table is really an exercise in data modeling. Choose the partition key for even distribution, use clustering columns for the query order you need, and set table options intentionally instead of copying defaults blindly.

Once the table exists, the next schema task is usually controlled evolution with ALTER TABLE. For that, see How to Alter a Table in Cassandra.