How We Reduced Database Queries by 80% Without Touching the Schema

Last month, our production dashboard was showing 15,000 database queries per minute during peak hours. Response times were creeping up, and database CPU was constantly hovering around 75%. We knew we had a problem, but here's the thing: our schema was actually fine. We didn't need more indexes, we didn't need to restructure our tables, and we definitely didn't need to spin up a bigger database instance.

We needed to stop hitting the database so damn much.

Three weeks later, we were down to 3,000 queries per minute with the same traffic load. Response times dropped by 60%, and our database was practically yawning at 20% CPU. I want to share exactly how we did it, because chances are, you’re facing the same issues we were.

The Wake-Up Call

It started with a Slack message from our DevOps lead: “DB connection pool is maxing out again. Third time this week.”

I pulled up our APM tool and started tracing through a typical user session. What I found was embarrassing. A single page load was triggering 47 separate database queries. Not because we were doing anything particularly complex—just because we were lazy.

Every time we needed a user's name, we queried the database. Every time we needed to check a permission, another query. Loading a list of 20 items? That's one query for the list, plus one query per item to get the author details, plus another round to check user permissions. You see where this is going.

Strategy 1: Implement Smart Caching (The Biggest Win)

This was our first and most impactful move. We implemented a two-tier caching strategy using Redis.

Application-Level Caching

We started by identifying our most frequently accessed, rarely changing data:

  • User profiles and permissions
  • Configuration settings
  • Lookup tables (countries, categories, statuses)
  • Relatively static content

For these, we implemented a cache-aside pattern with generous TTLs:

import json

import redis

# Assumes a local Redis; adjust connection settings for your environment
redis_client = redis.Redis()

def get_user_profile(user_id):
    cache_key = f"user_profile:{user_id}"

    # Try cache first
    cached_data = redis_client.get(cache_key)
    if cached_data:
        return json.loads(cached_data)

    # Cache miss - hit the database
    user_data = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # Store in cache for 1 hour
    redis_client.setex(cache_key, 3600, json.dumps(user_data))

    return user_data

Result: This alone cut our user-related queries by 70%. User profiles were being fetched dozens of times per request—now it’s once per hour per user.

Query Result Caching

For more complex queries, especially those used in dashboards and reports, we cached the entire result set:

def get_monthly_statistics(month, year):
    cache_key = f"stats:{year}:{month}"
    
    cached_result = redis_client.get(cache_key)
    if cached_result:
        return json.loads(cached_result)
    
    # Expensive aggregation query
    results = db.query("""
        SELECT date, COUNT(*), SUM(revenue)
        FROM transactions
        WHERE YEAR(created_at) = %s AND MONTH(created_at) = %s
        GROUP BY date
    """, year, month)
    
    # Cache for 5 minutes (default=str keeps date values JSON-serializable)
    redis_client.setex(cache_key, 300, json.dumps(results, default=str))
    
    return results

The rule we followed: If a query takes more than 100ms or is called more than 10 times per minute, cache it.
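
Once that rule was in place, the cache-aside boilerplate started repeating itself, so it was worth factoring out. Here's a minimal sketch of a decorator version, assuming the same redis_client as above; the key scheme and ttl argument are illustrative, not our exact production code:

import functools
import json

def redis_cached(ttl):
    """Cache-aside as a decorator: key = function name + positional args."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            cache_key = f"{func.__name__}:" + ":".join(str(a) for a in args)
            cached = redis_client.get(cache_key)
            if cached is not None:
                return json.loads(cached)
            result = func(*args)
            # default=str keeps dates and Decimals JSON-serializable
            redis_client.setex(cache_key, ttl, json.dumps(result, default=str))
            return result
        return wrapper
    return decorator

@redis_cached(ttl=3600)
def get_country_list():
    return db.query("SELECT id, name FROM countries")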

Strategy 2: Batch Your Database Operations

This is where we found our second biggest win. We had code like this everywhere:

# Bad: N+1 query problem
posts = db.query("SELECT * FROM posts WHERE status = 'published' LIMIT 20")

for post in posts:
    author = db.query("SELECT * FROM users WHERE id = %s", post.author_id)
    post.author_name = author.name

That's 21 queries for work that a single JOIN, or at most two queries, could do. We refactored to use eager loading:

# Good: 2 queries total
posts = db.query("SELECT * FROM posts WHERE status = 'published' LIMIT 20")

author_ids = [post.author_id for post in posts]
authors = db.query("SELECT * FROM users WHERE id IN %s", tuple(author_ids))
author_map = {author.id: author for author in authors}

for post in posts:
    post.author_name = author_map[post.author_id].name

We created a simple batching utility that became our best friend:

class QueryBatcher:
    """Buffers writes and flushes them in one transaction per batch.

    Callers must call flush() once at the end for the final partial batch.
    """

    def __init__(self, batch_size=100):
        self.batch_size = batch_size
        self.queries = []
    
    def add(self, query, params):
        self.queries.append((query, params))
        
        if len(self.queries) >= self.batch_size:
            self.flush()
    
    def flush(self):
        if not self.queries:
            return
        
        # Execute all queries in a single transaction
        with db.transaction():
            for query, params in self.queries:
                db.execute(query, params)
        
        self.queries = []

We used this for bulk inserts, updates, and deletes. What used to be 1,000 individual INSERT statements became a single batch operation.
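
For instance, a nightly import that used to fire one INSERT per row now runs through the batcher. The table and column names below are made up for illustration:

batcher = QueryBatcher(batch_size=100)

for row in imported_rows:
    batcher.add(
        "INSERT INTO events (user_id, action) VALUES (%s, %s)",
        (row["user_id"], row["action"]),
    )

# Flush whatever is left in the final partial batch
batcher.flush()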

Strategy 3: Leverage Application-Level Joins

Sometimes you don't need the database to do all the work. We had several reports that joined 5-6 tables, and their execution times showed it.

We broke these apart:

# Instead of one massive JOIN,
# we do separate cached queries and join in application code

users = get_cached_users()         # dict keyed by user id, cached for 1 hour
orders = get_user_orders(user_id)  # smaller, faster query
products = get_cached_products()   # dict keyed by product id, cached for 24 hours

# Join in memory
report_data = []
for order in orders:
    user = users[order.user_id]
    product = products[order.product_id]
    report_data.append({
        'user_name': user.name,
        'product_name': product.name,
        'order_total': order.total
    })

Yes, this uses more application memory. But memory is cheap, and it scales horizontally. Database connections don’t.

Strategy 4: Implement Request-Level Caching

This one’s subtle but powerful. We added a request-scoped cache that lives only for the duration of a single HTTP request:

from werkzeug.local import LocalProxy
from flask import g

def get_request_cache():
    if 'request_cache' not in g:
        g.request_cache = {}
    return g.request_cache

request_cache = LocalProxy(get_request_cache)

def get_user(user_id):
    cache_key = f"user:{user_id}"
    
    # Check request cache first
    if cache_key in request_cache:
        return request_cache[cache_key]
    
    # Then check Redis
    # Then check database
    # ...
    
    # Store in request cache
    request_cache[cache_key] = user_data
    return user_data

This eliminated duplicate queries within the same request. If we call get_user(123) fifty times during a single page render, we hit the database once.
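
Spelled out, the full lookup chain might look like the sketch below. fetch_user_from_db is a stand-in for whatever your data layer provides, and the one-hour Redis TTL mirrors the profile cache from Strategy 1:

def get_user(user_id):
    cache_key = f"user:{user_id}"

    # Tier 1: request-scoped dict - free, and always fresh within a request
    if cache_key in request_cache:
        return request_cache[cache_key]

    # Tier 2: Redis, shared across processes and requests
    cached = redis_client.get(cache_key)
    if cached is not None:
        user_data = json.loads(cached)
    else:
        # Tier 3: the database, the path we're trying to avoid
        user_data = fetch_user_from_db(user_id)
        redis_client.setex(cache_key, 3600, json.dumps(user_data))

    request_cache[cache_key] = user_data
    return user_data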

Strategy 5: Lazy Loading Where Appropriate

Not everything needs to be loaded upfront. We implemented lazy loading for expensive relationships:

class Post:
    def __init__(self, data):
        self.id = data['id']
        self.title = data['title']
        self._comments = None  # Not loaded yet
    
    @property
    def comments(self):
        if self._comments is None:
            self._comments = get_comments_for_post(self.id)
        return self._comments

Now comments are only loaded if the code actually accesses them. For list views where we don’t show comments, we save those queries entirely.
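
In practice the difference looks like this, using the Post class above:

posts = [Post(row) for row in db.query("SELECT * FROM posts LIMIT 20")]

# List view: only titles are rendered, so zero comment queries run
titles = [post.title for post in posts]

# Detail view: the first access triggers exactly one query,
# and the result is memoized on the instance
post = posts[0]
post.comments  # calls get_comments_for_post(post.id) once
post.comments  # no query this time; returns the stored list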

Strategy 6: Denormalize Strategically (Without Changing Schema)

Wait, didn’t I say we didn’t touch the schema? We didn’t add columns or change relationships. But we did add a small denormalization layer using Redis.

We stored computed values that were expensive to calculate:

# Instead of counting on every page load
def get_user_post_count(user_id):
    cache_key = f"user_post_count:{user_id}"
    
    count = redis_client.get(cache_key)
    if count is not None:  # explicit None check, since Redis returns bytes
        return int(count)
    
    count = db.query("SELECT COUNT(*) FROM posts WHERE author_id = %s", user_id)
    redis_client.setex(cache_key, 86400, count)  # 24 hours
    
    return count

# Invalidate on new post
def create_post(author_id, title, content):
    post_id = db.insert("INSERT INTO posts ...", author_id, title, content)
    
    # Invalidate the count cache
    redis_client.delete(f"user_post_count:{author_id}")
    
    return post_id

The Monitoring That Saved Us

None of this would have worked without proper monitoring. We set up:

  1. Query frequency tracking: Which queries run most often?
  2. Slow query logging: Which queries take the longest?
  3. Cache hit rates: Are our caches actually working?
  4. N+1 detection: APM tools that flag potential N+1 problems

We used a combination of New Relic, our database’s slow query log, and custom metrics in Prometheus.
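
The custom Prometheus side doesn't need to be fancy. Here's a minimal sketch using the official prometheus_client library; the metric and label names are our own convention, and timed_query is a hypothetical wrapper around the db helper used throughout this post:

import time

from prometheus_client import Counter, Histogram

DB_QUERIES = Counter(
    "app_db_queries_total",
    "Database queries issued, by logical query name",
    ["query_name"],
)
DB_QUERY_SECONDS = Histogram(
    "app_db_query_duration_seconds",
    "Database query latency, by logical query name",
    ["query_name"],
)

def timed_query(query_name, sql, *params):
    DB_QUERIES.labels(query_name=query_name).inc()
    start = time.monotonic()
    try:
        return db.query(sql, *params)
    finally:
        DB_QUERY_SECONDS.labels(query_name=query_name).observe(
            time.monotonic() - start
        )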

The Results

After three weeks of implementation:

  • Database queries: 15,000/min → 3,000/min (80% reduction)
  • Average response time: 850ms → 340ms (60% improvement)
  • Database CPU: 75% → 20% (73% reduction)
  • Peak connection pool usage: 95% → 45%
  • 95th percentile response time: 2.1s → 680ms

But here’s what really mattered: we stopped getting paged at 3 AM. Our infrastructure costs dropped by $800/month because we could downsize our database instance. Our product team stopped complaining about the admin dashboard being slow.

Lessons Learned

1. Profile before optimizing: We wasted a week optimizing queries that ran 10 times per day. Focus on what actually matters.

2. Cache hit rates matter more than cache size: A 1 GB cache with a 95% hit rate beats a 10 GB cache with a 60% hit rate.

3. Invalidation is harder than caching: We spent more time on cache invalidation logic than on the caching itself. Plan for this.

4. Not everything should be cached: Real-time data, user-specific data that changes frequently, and small lookup queries sometimes aren’t worth caching.

5. Monitor, measure, iterate: We didn’t get it right the first time. Our initial caching strategy actually made some things slower.

Common Pitfalls to Avoid

Over-caching: We initially cached everything and ran into memory issues. Be selective.

Stale data: We forgot to invalidate caches on updates and showed users outdated information for hours. Always plan your invalidation strategy.

Cache stampede: When a popular cache entry expires, hundreds of requests hit the database simultaneously. Use cache warming or stale-while-revalidate patterns.
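
A simple version of that guard is a short-lived Redis lock, so exactly one request recomputes an expired entry. This is a sketch rather than battle-tested code; the lock timeout, key naming, and compute callback are illustrative:

import json
import time

def get_with_stampede_guard(cache_key, ttl, compute):
    cached = redis_client.get(cache_key)
    if cached is not None:
        return json.loads(cached)

    # SET NX EX is atomic: exactly one caller wins the lock and recomputes
    if redis_client.set(f"{cache_key}:lock", 1, nx=True, ex=10):
        result = compute()
        redis_client.setex(cache_key, ttl, json.dumps(result, default=str))
        redis_client.delete(f"{cache_key}:lock")
        return result

    # Everyone else waits briefly, then re-reads the repopulated cache
    time.sleep(0.05)
    cached = redis_client.get(cache_key)
    if cached is not None:
        return json.loads(cached)
    return compute()  # last resort: fall through to the database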

Ignoring cache misses: A cache with a 30% hit rate is worse than no cache at all—you’ve added latency without benefits.

Your Turn

You don’t need to redesign your entire database to see massive performance improvements. Start small:

  1. Add caching to your most frequently called queries
  2. Fix your N+1 problems (they’re everywhere, trust me)
  3. Implement request-level caching to eliminate duplicate queries
  4. Monitor and measure everything

The schema changes can wait. Your users will thank you for the faster app today, not for the perfectly normalized database next quarter.
