DbSchema | Cassandra - How to Create a Table?



Table of Contents

  1. Introduction
  2. Prerequisites
  3. Model for queries first
  4. Primary key basics
  5. Useful table options
  6. Create a table in cqlsh
  7. Create a table in DbSchema
  8. Common mistakes
  9. Conclusion
  10. References

Introduction

Creating a table in Cassandra looks simple, but the most important decision is not the column list - it is the primary key design. Cassandra stores data by partition and reads data best when the table is modeled for the exact queries the application needs.

That means table creation in Cassandra is less about translating a relational schema and more about modeling for access patterns, distribution, and predictable read performance.

Prerequisites

Before creating a table, make sure you have:

  • a running Cassandra cluster
  • a keyspace already created, such as the one from How to Create a Keyspace in Cassandra
  • cqlsh access or DbSchema connected to the cluster
  • a clear idea of the queries the table must support

Model for queries first

Cassandra table design is query-driven:

  • start from the queries you need to run
  • design the primary key so those queries are efficient
  • duplicate or denormalize data when different query patterns need different table shapes

Trying to model Cassandra like a relational database usually leads to hot partitions, expensive reads, or unsupported query patterns.

Primary key basics

The primary key in Cassandra has two parts: the partition key and the clustering columns.

Partition key

The partition key decides which replica set stores the row. Good partition keys spread traffic evenly across the cluster.

Good partition-key design usually means:

  • high enough cardinality to avoid hot partitions
  • partition sizes that stay manageable
  • explicit bucketing for time-series workloads when needed

If one partition key value receives most of the traffic, that partition becomes a hotspot.

Clustering columns

Clustering columns define the order of rows inside a partition. They are what make range queries efficient after the partition has been identified.

Important points:

  • clustering columns are ordered
  • CLUSTERING ORDER BY is set when the table is created
  • if you need a different clustering order later, you create a new table and migrate data

Static columns

A STATIC column is shared by all rows in the same partition. It is useful when a partition-level value would otherwise be repeated in every row.

Static columns have two important restrictions:

  • they are allowed only when the table has clustering columns
  • they cannot be part of the primary key

Useful table options

Here are some table options worth knowing at creation time:

OptionWhy it matters
commentDocuments the purpose of the table.
default_time_to_liveSets a default TTL for rows when data should expire automatically.
gc_grace_secondsControls how long tombstones are kept before garbage collection.
compactionAffects write amplification and read performance.
compressionReduces SSTable size on disk.
cachingControls key cache and row cache behavior.
cdcIncludes the table in CDC when CDC is enabled in Cassandra.

You do not need to predict every future column on day one. Cassandra can add new non-key columns later with a lightweight schema change.

Create a table in cqlsh

Start by switching to the correct keyspace:

USE ecommerce;

Then create a table designed for a specific query pattern. The example below stores orders by customer and day, with newest rows first inside each partition:

CREATE TABLE IF NOT EXISTS orders_by_customer_day (
    customer_id uuid,
    order_day date,
    order_time timeuuid,
    order_id uuid,
    status text,
    total decimal,
    sales_rep text STATIC,
    PRIMARY KEY ((customer_id, order_day), order_time, order_id)
) WITH CLUSTERING ORDER BY (order_time DESC)
  AND comment = 'Orders grouped by customer and day'
  AND compaction = {'class': 'TimeWindowCompactionStrategy'};

Why this definition is useful:

  • (customer_id, order_day) groups rows into partitions that are easier to keep bounded
  • order_time sorts the rows inside the partition
  • order_id preserves uniqueness
  • sales_rep is static because it belongs to the partition rather than each row

After creation, verify the schema:

DESCRIBE TABLE ecommerce.orders_by_customer_day;

Create a table in DbSchema

DbSchema makes Cassandra modeling much easier when you want to visualize tables before deployment.

  1. Open DbSchema and connect to the Cassandra cluster.
  2. Open the keyspace where the table should be created.
  3. Create the table from the schema tree or diagram canvas.
  4. Define the partition key and clustering columns explicitly.
  5. Set clustering order and table options if needed.
  6. Review the generated CQL and deploy it to Cassandra.

DbSchema is useful here because it helps you keep the design model, documentation, and deployed schema aligned.

Common mistakes

Watch out for these common problems:

  • designing tables like normalized SQL tables and expecting joins later
  • picking a low-cardinality partition key that creates hotspots
  • letting partitions grow without bounds in time-series data
  • assuming clustering order can be changed later with ALTER TABLE
  • pre-creating many speculative columns even though Cassandra can add non-key columns later

Conclusion

Creating a Cassandra table is really an exercise in data modeling. Choose the partition key for even distribution, use clustering columns for the query order you need, and set table options intentionally instead of copying defaults blindly.

Once the table exists, the next schema task is usually controlled evolution with ALTER TABLE. For that, see How to Alter a Table in Cassandra.

References

DbSchema ER Diagrams & Team Collaboration

Desktop App for All OS
DbSchema ER Diagram Free Download
Visual Design & Schema Diagram

✓ Create and manage your database schema visually through a user-friendly graphical interface.

✓ Easily arrange tables, columns, and foreign keys to simplify complex database structures, ensuring clarity and accessibility.

GIT & Collaboration
Version Control & Collaboration

✓ Manage schema changes through version control with built-in Git integration, ensuring every update is tracked and backed up.

✓ Collaborate efficiently with your team to maintain data integrity and streamline your workflow for accurate, consistent results.

Data Explorer & Query Builder
Relational Data & Query Builder

✓ Seamlessly navigate and visually explore your database, inspecting tables and their relationships.

✓ Build complex SQL queries using an intuitive drag-and-drop interface, providing instant results for quick, actionable insights.

Interactive Documentation & Reporting
HTML5 Documentation & Reporting

✓ Generate HTML5 documentation that provides an interactive view of your database schema.

✓ Include comments for columns, use tags for better organization, and create visually reports.