Database Design Mistakes That Will Destroy Your Application (And How to Fix Them)
Software Development 13 min read

Squalltec Team November 12, 2018

Why Database Design Matters More Than You Think

Your application can have perfect code. Beautiful UI. Clean architecture.

But if your database design is wrong, none of that matters.

We’ve rescued 50+ projects with database problems. The pattern is always the same:

  • Application works fine with 100 records
  • Starts slowing down at 10,000 records
  • Crashes at 100,000 records
  • Becomes unusable at 1 million records

The cause? Database design decisions made on Day 1 that seemed fine at the time.

This article covers the 10 most destructive database design mistakes we see repeatedly, real-world consequences, and how to fix them (or avoid them).

Mistake #1: No Indexes (or Too Many Indexes)

The Problem:

No Indexes:

-- Query takes 45 seconds on 1M records
SELECT * FROM users WHERE email = 'user@example.com';

Database scans ALL records sequentially. Slow.

Too Many Indexes:

-- 15 indexes on users table
CREATE INDEX idx_1 ON users(email);
CREATE INDEX idx_2 ON users(name);
CREATE INDEX idx_3 ON users(created_at);
... (12 more indexes)

Every INSERT/UPDATE/DELETE must update all indexes. Slow writes.

Real Impact:

Client’s e-commerce platform:

  • No index on products.category
  • Product search took 8-12 seconds
  • Lost 60% of users to timeout

After adding one index:

  • Search: 8 seconds → 50 milliseconds
  • 160x faster

The Fix:

Index Strategy:

  1. Index foreign keys (always)
  2. Index frequently searched columns
  3. Index columns used in WHERE, JOIN, ORDER BY
  4. Don’t index:
    • Low-cardinality columns (gender: 2 values)
    • Rarely queried columns
    • Columns that change frequently

Example:

-- Good indexes
CREATE INDEX idx_users_email ON users(email);  -- Login lookups
CREATE INDEX idx_orders_user_id ON orders(user_id);  -- Foreign key
CREATE INDEX idx_products_category_status ON products(category, status);  -- Compound

-- Bad indexes
CREATE INDEX idx_users_is_active ON users(is_active);  -- Only 2 values
CREATE INDEX idx_logs_description ON logs(description);  -- Text field, rarely searched
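
To confirm the database actually uses an index for a given query, EXPLAIN is enough. A quick check (PostgreSQL shown; plan node names vary slightly between databases):

EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';
-- Expect an Index Scan using idx_users_email; a Seq Scan means the index isn't being used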

Mistake #2: Using Wrong Data Types

The Problem:

Storing Dates as Strings:

CREATE TABLE events (
  event_date VARCHAR(50)  -- Storing "2018-11-12 14:30:00"
);

Problems:

  • Can’t do date math
  • Can’t sort chronologically (as strings, “2018-11-12” sorts before “2018-11-2”)
  • Uses more space
  • No date validation

Storing Numbers as Strings:

CREATE TABLE products (
  price VARCHAR(20)  -- Storing "49.99"
);

Problems:

  • Can’t do arithmetic
  • Can’t compare properly (“9” > “10” as strings)
  • Wastes space

Storing JSON as Text (When You Need to Query It):

CREATE TABLE orders (
  items TEXT  -- Storing JSON as text
);

You can’t query inside it: finding orders whose items include product_id 123 means parsing the text in application code.

Real Impact:

Booking platform stored prices as VARCHAR:

SELECT * FROM rooms WHERE price < 100;

Because of string comparison, rooms priced “90” and “95” were missing from the results even though both cost less than 100 (strings starting with “9” sort after “100”).

The Fix:

Use Proper Data Types:

CREATE TABLE better_design (
  -- Dates
  created_at TIMESTAMP,
  birth_date DATE,
  event_time TIME,
  
  -- Numbers
  price DECIMAL(10,2),  -- Money: exact decimal
  quantity INTEGER,
  rating FLOAT,  -- Approximate; fine for ratings, never for money
  
  -- JSON (PostgreSQL)
  metadata JSONB,  -- Queryable JSON
  
  -- Booleans
  is_active BOOLEAN,  -- Not TINYINT or VARCHAR
  
  -- Text
  email VARCHAR(255),  -- Limited length
  description TEXT,  -- Unlimited
  status VARCHAR(20) CHECK (status IN ('pending', 'active', 'cancelled'))  -- Enum-like
);
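
A few queries these types make possible, as a sketch against the better_design table above (the JSON filter assumes PostgreSQL JSONB):

-- Date math: rows created in the last 7 days
SELECT * FROM better_design
WHERE created_at > NOW() - INTERVAL '7 days';

-- Numeric comparison that compares numbers, not strings
SELECT * FROM better_design WHERE price < 100;

-- Queryable JSON: containment test on the JSONB column
SELECT * FROM better_design
WHERE metadata @> '{"product_id": 123}';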

Space Savings:

Data Type    Bad (VARCHAR)    Good       Savings
Date         19 bytes         8 bytes    58%
Integer      10 bytes         4 bytes    60%
Boolean      5 bytes          1 byte     80%

Across millions of rows, plus every index built on those columns, proper data types save significant space and I/O.

Mistake #3: Not Normalizing (or Over-Normalizing)

The Problem:

Under-Normalized (Duplication):

CREATE TABLE orders (
  id INT,
  customer_name VARCHAR(100),
  customer_email VARCHAR(100),
  customer_phone VARCHAR(20),
  customer_address TEXT,
  ... (repeated for every order)
);

Problems:

  • Customer data duplicated in every order
  • Update customer email = update 1000 orders
  • Data inconsistency (email changed in order #1 but not #2-1000)
  • Wasted space

Over-Normalized (Too Many Joins):

-- 8 tables for one simple query
SELECT *
FROM orders o
JOIN customers c ON o.customer_id = c.id
JOIN addresses a ON c.address_id = a.id
JOIN cities ci ON a.city_id = ci.id
JOIN states s ON ci.state_id = s.id
JOIN countries co ON s.country_id = co.id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id;

Seven joins (eight tables) for every order query = slow.

Real Impact:

Healthcare app over-normalized patient data:

  • Simple patient lookup required 12 joins
  • Query took 3-5 seconds
  • Doctor interface unusable

The Fix:

Find the Balance:

Normalize to 3rd Normal Form (Usually):

-- Customers (normalize)
CREATE TABLE customers (
  id INT PRIMARY KEY,
  name VARCHAR(100),
  email VARCHAR(100)
);

-- Orders (reference customers)
CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_id INT REFERENCES customers(id),
  order_date TIMESTAMP,
  total DECIMAL(10,2)
);

-- Order Items (separate for flexibility)
CREATE TABLE order_items (
  id INT PRIMARY KEY,
  order_id INT REFERENCES orders(id),
  product_id INT REFERENCES products(id),
  quantity INT,
  price DECIMAL(10,2)
);
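
With this schema, the earlier eight-table query shrinks to three joins. A sketch (assumes a products table with id and name columns):

SELECT o.id, o.order_date, c.name AS customer, p.name AS product, oi.quantity, oi.price
FROM orders o
JOIN customers c ON o.customer_id = c.id
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.id = oi.product_id
WHERE o.id = 42;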

Denormalize for Performance (When Needed):

-- Add calculated/cached fields
ALTER TABLE orders ADD COLUMN items_count INT;
ALTER TABLE orders ADD COLUMN customer_name VARCHAR(100);  -- Denormalized

-- Keep the cache in sync with a trigger (PostgreSQL requires a trigger function)
CREATE OR REPLACE FUNCTION refresh_order_cache() RETURNS trigger AS $$
BEGIN
  UPDATE orders SET items_count = (
    SELECT COUNT(*) FROM order_items WHERE order_id = NEW.order_id
  ) WHERE id = NEW.order_id;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER update_order_cache
AFTER INSERT OR UPDATE ON order_items
FOR EACH ROW EXECUTE PROCEDURE refresh_order_cache();

Rule: Normalize for data integrity. Denormalize strategically for performance.

Mistake #4: No Foreign Key Constraints

The Problem:

CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_id INT  -- No foreign key constraint
);

CREATE TABLE customers (
  id INT PRIMARY KEY,
  name VARCHAR(100)
);

What Can Go Wrong:

-- Delete customer
DELETE FROM customers WHERE id = 123;

-- Orders still reference customer 123
SELECT * FROM orders WHERE customer_id = 123;
-- Returns orders for non-existent customer!

Orphaned Records:

  • Orders without customers
  • Comments without posts
  • Payments without invoices

Real Impact:

Inventory system had no foreign keys:

  • Products deleted
  • 15,000 orphaned inventory records
  • Reports showed stock for non-existent products
  • Took 3 days to clean up data

The Fix:

Add Foreign Keys:

CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_id INT REFERENCES customers(id) ON DELETE CASCADE
);

Deletion Strategies:

CASCADE: Delete child records automatically

ON DELETE CASCADE  -- Delete orders when customer deleted

RESTRICT: Prevent deletion if children exist

ON DELETE RESTRICT  -- Can't delete customer with orders

SET NULL: Set child reference to NULL

ON DELETE SET NULL  -- Orders remain, customer_id becomes NULL

SET DEFAULT: Set child reference to default value

ON DELETE SET DEFAULT  -- Orders reference default "deleted customer"

Choose Based on Business Logic:

  • Financial records: RESTRICT (never delete with transactions)
  • Comments/reviews: CASCADE (delete with parent)
  • Soft-delete scenarios: SET NULL or custom trigger
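
For example, the rules might map onto tables like this (a hypothetical sketch; the invoices and posts tables are not defined in this article):

-- Financial: block deletion while payments exist
CREATE TABLE payments (
  id INT PRIMARY KEY,
  invoice_id INT REFERENCES invoices(id) ON DELETE RESTRICT
);

-- Comments: remove together with the parent post
CREATE TABLE comments (
  id INT PRIMARY KEY,
  post_id INT REFERENCES posts(id) ON DELETE CASCADE
);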

Mistake #5: Storing Files in Database

The Problem:

CREATE TABLE documents (
  id INT PRIMARY KEY,
  filename VARCHAR(255),
  file_content BYTEA  -- Storing entire PDF/image in database
);

Consequences:

  • Database size explodes (1000 PDFs × 2MB = 2GB)
  • Backups take forever
  • Queries slow down (scanning through binary data)
  • Memory issues (loading files into RAM)
  • Can’t use CDN for delivery

Real Impact:

Document management system:

  • Stored 50,000 PDFs in database
  • Database size: 120GB
  • Backup time: 6 hours
  • Full table scan queries: 45 seconds
  • Monthly hosting: $800

The Fix:

Store Files Externally, Paths in Database:

CREATE TABLE documents (
  id INT PRIMARY KEY,
  filename VARCHAR(255),
  file_path VARCHAR(500),  -- "s3://bucket/documents/2018/11/doc123.pdf"
  file_size INT,
  mime_type VARCHAR(50),
  uploaded_at TIMESTAMP
);
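
The application uploads the file to object storage first, then records only the reference. A sketch (all values, including the S3 path, are illustrative):

INSERT INTO documents (id, filename, file_path, file_size, mime_type, uploaded_at)
VALUES (1, 'contract.pdf', 's3://bucket/documents/2018/11/contract.pdf', 184320, 'application/pdf', NOW());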

Storage Options:

  • Amazon S3
  • Azure Blob Storage
  • Google Cloud Storage
  • Local filesystem (with backup)

After Migration:

  • Database size: 120GB → 500MB
  • Backup time: 6 hours → 5 minutes
  • Query speed: 45 seconds → 200ms
  • Monthly hosting: $800 → $150
  • Savings: $650/month

When to Store in Database:

  • Very small files (< 100KB)
  • Files that require transactions with data
  • Security requirements demand database storage

Mistake #6: Not Planning for Soft Deletes

The Problem:

-- Hard delete
DELETE FROM users WHERE id = 123;

Gone forever. No recovery. No audit trail.

Real Scenarios:

  • Customer wants to restore deleted account
  • Need to audit who deleted what
  • Legal requirement to retain data
  • “Oops, I deleted the wrong record”

Real Impact:

E-commerce platform:

  • Customer accidentally deleted account
  • 5 years of order history gone
  • No way to recover
  • Lost customer trust

The Fix:

Implement Soft Deletes:

CREATE TABLE users (
  id INT PRIMARY KEY,
  email VARCHAR(255),
  deleted_at TIMESTAMP NULL,  -- NULL = active, timestamp = deleted
  deleted_by INT REFERENCES users(id)
);

-- "Delete" user
UPDATE users 
SET deleted_at = NOW(), deleted_by = 456
WHERE id = 123;

-- Query active users
SELECT * FROM users WHERE deleted_at IS NULL;

-- Restore user
UPDATE users 
SET deleted_at = NULL, deleted_by = NULL
WHERE id = 123;

Create Views for Convenience:

CREATE VIEW active_users AS
SELECT * FROM users WHERE deleted_at IS NULL;

-- Now use: SELECT * FROM active_users
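
One side effect to plan for: if email carries a plain UNIQUE constraint, a soft-deleted user blocks re-registration with the same address. In PostgreSQL, a partial unique index keeps emails unique among active users only. A sketch:

CREATE UNIQUE INDEX uniq_active_user_email
ON users (email)
WHERE deleted_at IS NULL;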

Cascade Soft Deletes:

-- When an order is soft-deleted, soft-delete its items too (PostgreSQL trigger function)
CREATE OR REPLACE FUNCTION soft_delete_order_items() RETURNS trigger AS $$
BEGIN
  UPDATE order_items
  SET deleted_at = NEW.deleted_at
  WHERE order_id = NEW.id;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER soft_delete_cascade
AFTER UPDATE ON orders
FOR EACH ROW
WHEN (NEW.deleted_at IS NOT NULL AND OLD.deleted_at IS NULL)
EXECUTE PROCEDURE soft_delete_order_items();

Mistake #7: Poor Query Performance from Day One

The Problem:

SELECT * Everywhere:

-- Loading 50 columns when you need 3
SELECT * FROM users WHERE id = 123;

N+1 Query Problem:

// Load 100 orders
$orders = query("SELECT * FROM orders LIMIT 100");

// For each order, load customer (100 queries!)
foreach ($orders as $order) {
  $customer = query("SELECT * FROM customers WHERE id = ?", [$order->customer_id]);
}

Total queries: 101 (1 + 100)

Should be: 2 queries (orders + customers)

No LIMIT on Queries:

-- Accidentally loading 1,000,000 records
SELECT * FROM logs;  -- No LIMIT

Real Impact:

Admin dashboard loaded every user:

SELECT * FROM users;  -- No limit, 500,000 users

  • Page load: 45 seconds
  • Memory usage: 2GB
  • Server crash under load

The Fix:

Select Only What You Need:

-- Good
SELECT id, name, email FROM users WHERE id = 123;

-- Bad
SELECT * FROM users WHERE id = 123;

Fix N+1 with Joins or IN:

-- Approach 1: JOIN
SELECT o.*, c.name, c.email
FROM orders o
JOIN customers c ON o.customer_id = c.id
LIMIT 100;

-- Approach 2: IN
SELECT * FROM customers 
WHERE id IN (SELECT DISTINCT customer_id FROM orders LIMIT 100);

Always Use LIMIT:

-- Pagination
SELECT * FROM users 
ORDER BY created_at DESC 
LIMIT 50 OFFSET 0;
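
For deep pages, large OFFSET values get slow because the database still scans and discards all the skipped rows. Keyset pagination is an alternative worth considering; a sketch (assumes an index on created_at and that the application passes in the last row's timestamp from the previous page):

SELECT id, name, email FROM users
WHERE created_at < '2018-11-01 00:00:00'  -- created_at of the last row on the previous page
ORDER BY created_at DESC
LIMIT 50;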

Use EXPLAIN to Find Slow Queries:

EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 123;

Shows:

  • Which indexes were used
  • Number of rows scanned
  • Query execution time
  • Bottlenecks

Mistake #8: No Backup Strategy

The Problem:

“We don’t need backups. Our hosting provider handles it.”

What Actually Happens:

  • Hosting provider backup fails (silently)
  • Database corrupted
  • User accidentally DROP TABLE
  • Ransomware encrypts database

Recovery: Impossible

Real Impact:

SaaS startup, no backups:

  • Developer ran DROP DATABASE on production (thought it was dev)
  • 18,000 customer accounts gone
  • 3 months of data lost
  • Company shut down

The Fix:

3-2-1 Backup Rule:

  • 3 copies of data
  • 2 different storage types
  • 1 offsite backup

Backup Strategy:

Daily Full Backups:

#!/bin/bash
# Daily at 2 AM
pg_dump database_name | gzip > backup-$(date +%Y%m%d).sql.gz

# Upload to S3
aws s3 cp backup-$(date +%Y%m%d).sql.gz s3://backups/daily/

Point-in-Time Recovery: Enable WAL (Write-Ahead Log) archiving in PostgreSQL:

# postgresql.conf
wal_level = replica
archive_mode = on
archive_command = 'cp %p /archive/%f'

Can restore to any second in time.

Test Restores Monthly:

# Restore to a test server (the daily dump is plain SQL, so pipe it through psql)
gunzip -c backup-20181112.sql.gz | psql test_db

# Verify data
psql test_db -c "SELECT COUNT(*) FROM users;"

Backup Retention:

  • Daily: Keep 7 days
  • Weekly: Keep 4 weeks
  • Monthly: Keep 12 months
  • Yearly: Keep 7 years (compliance)

Automated Monitoring:

  • Alert if backup fails
  • Alert if backup size drastically changes
  • Alert if restore test fails

Mistake #9: Ignoring Database Security

The Problem:

Root Access Everywhere:

// Application using root database user
const connection = {
  host: 'localhost',
  user: 'root',  // Full database access
  password: 'password123',
  database: 'myapp'
};

If application compromised = entire database compromised.

No Encryption:

CREATE TABLE users (
  password VARCHAR(255)  -- Plain text passwords!
);

Public Database Access:

# Database open to internet
0.0.0.0:5432  # Any IP can connect

Real Impact:

Healthcare app security issues:

  • Database open to public
  • Application used root user
  • Passwords stored plain text
  • Hacked in 2 days
  • 50,000 patient records stolen
  • $2M HIPAA fine

The Fix:

Principle of Least Privilege:

-- Create app-specific user with limited permissions
CREATE USER myapp_user WITH PASSWORD 'strong_random_password';

-- Grant only necessary permissions
GRANT SELECT, INSERT, UPDATE, DELETE ON users TO myapp_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON orders TO myapp_user;

-- NOT: GRANT ALL PRIVILEGES

Encrypt Sensitive Data:

-- Use PostgreSQL pgcrypto
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Encrypt column
INSERT INTO users (email, ssn) 
VALUES ('user@example.com', pgp_sym_encrypt('123-45-6789', 'encryption_key'));

-- Decrypt when needed
SELECT email, pgp_sym_decrypt(ssn, 'encryption_key') 
FROM users;
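
Passwords are a different case: hash them, don't encrypt them. pgcrypto's crypt() and gen_salt() handle this in SQL (hashing in the application layer with bcrypt works just as well). A sketch:

-- Store a bcrypt hash, never the password itself
INSERT INTO users (email, password)
VALUES ('user@example.com', crypt('plain_password', gen_salt('bf')));

-- Verify at login: re-hashing with the stored hash as salt matches only when the password is correct
SELECT id FROM users
WHERE email = 'user@example.com'
  AND password = crypt('entered_password', password);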

Network Security:

# postgresql.conf
listen_addresses = 'localhost'  # Only local connections

# pg_hba.conf
# TYPE  DATABASE  USER        ADDRESS      METHOD
local   all       all                      peer
host    myapp     myapp_user  10.0.0.0/24  md5

Only specific IPs can connect.

Connection String Security:

// DON'T hardcode credentials
const connection = {
  connectionString: process.env.DATABASE_URL  // From environment variable
};

Audit Logging:

-- Enable audit logging (pgaudit must also be added to shared_preload_libraries)
CREATE EXTENSION pgaudit;

-- Log all data changes
ALTER SYSTEM SET pgaudit.log = 'write';

Track who changed what, when.

Mistake #10: No Performance Monitoring

The Problem:

You don’t know there’s a problem until users complain.

By then:

  • Database is crawling
  • Users are leaving
  • Revenue is lost

The Fix:

Query Performance Monitoring:

-- PostgreSQL: Enable pg_stat_statements
CREATE EXTENSION pg_stat_statements;

-- Find slow queries
SELECT 
  query,
  calls,
  total_time,
  mean_time,
  max_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 10;

Index Usage Monitoring:

-- Find unused indexes (waste of space)
SELECT 
  schemaname,
  tablename,
  indexname,
  idx_scan,  -- Number of times used
  idx_tup_read,
  idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0  -- Never used
ORDER BY pg_relation_size(indexrelid) DESC;

Table and Index Size Monitoring:

-- Find the largest tables and how much of their footprint is indexes
SELECT
  schemaname,
  tablename,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename) - pg_relation_size(schemaname||'.'||tablename)) AS index_size
FROM pg_tables
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
LIMIT 10;

Automated Alerts:

Set up alerts for:

  • Query time > 1 second
  • Connection pool exhaustion
  • Disk space < 20%
  • Replication lag > 1 minute
  • Lock wait time > 5 seconds
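
Some of these can be checked with plain SQL against PostgreSQL's statistics views. Sketches for the connection and lock alerts:

-- Connections in use vs. the configured limit
SELECT
  (SELECT COUNT(*) FROM pg_stat_activity) AS current_connections,
  current_setting('max_connections')::int AS max_connections;

-- Sessions currently waiting on locks
SELECT COUNT(*) AS waiting_on_locks
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';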

Tools:

  • pgAdmin (PostgreSQL)
  • MySQL Workbench (MySQL)
  • DataDog
  • New Relic
  • Grafana + Prometheus

Bonus: Migration Strategy for Existing Databases

If you have these problems already:

Step 1: Audit Current State

  • Run EXPLAIN on all queries
  • Find missing indexes
  • Check data types
  • Review normalization
  • Assess security
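
A quick way to spot tables that likely need indexes (PostgreSQL; heavy sequential scanning on large tables is the usual giveaway). A sketch:

SELECT relname, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_user_tables
ORDER BY seq_tup_read DESC
LIMIT 10;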

Step 2: Prioritize Fixes

  1. Security issues (immediate)
  2. Backup strategy (immediate)
  3. Performance killers (high impact)
  4. Data integrity issues
  5. Nice-to-have improvements

Step 3: Test on Copy

  • Clone production database
  • Apply fixes to clone
  • Test thoroughly
  • Measure improvements

Step 4: Staged Migration

  • Start with read-only fixes (indexes)
  • Then schema changes (during low traffic)
  • Finally data migrations
  • Always have rollback plan

Step 5: Monitor

  • Watch performance metrics
  • Track query times
  • Monitor errors
  • Adjust as needed

Conclusion: Database Design is Forever

Bad code can be refactored. Bad UI can be redesigned. Bad database design haunts you forever.

Every decision in database design has long-term consequences:

  • Today’s shortcut = tomorrow’s bottleneck
  • Missing index = permanent slow queries
  • Wrong data type = wasted space forever
  • No foreign keys = data integrity issues

The good news: Following best practices from day one prevents 90% of problems.

Time investment:

  • Poor design: 1 day initially, months fixing later
  • Good design: 3 days initially, minimal maintenance forever

Spend those extra 2 days upfront. Your future self will thank you.

Key Takeaways:

  1. Indexes: Add where needed, but not everywhere
  2. Data types: Use proper types for performance and correctness
  3. Normalization: Balance between 3NF and performance
  4. Foreign keys: Enforce data integrity at database level
  5. File storage: Keep files out of database (use external storage)
  6. Soft deletes: Preserve data, enable audit trails
  7. Query optimization: Select specific columns, avoid N+1, use LIMIT
  8. Backups: 3-2-1 rule, test monthly, automate everything
  9. Security: Least privilege, encryption, network restrictions
  10. Monitoring: Track performance, find issues before users do

Need database architecture review?

We’ve designed databases for 150+ applications. Free 30-minute database audit available.

[Schedule Database Review →]