SaaS Product Performance Optimization

A vertical SaaS vendor recovered enterprise renewals by attacking tail latency, N+1 query patterns, and inefficient background jobs, delivering measurable UX wins tied to expansion revenue.

Client overview

Industry focus: Enterprise SaaS
Portfolio segment: SaaS / Enterprise
Organization profile: B2B vertical SaaS, ARR ~$140M, 1,900 customers

Product velocity remained high, but enterprise accounts were benchmarking the product against competitors in POCs. Customer success (CS) flagged "spinning dashboards" during quarter close. Engineering blamed the database but lacked unified tracing from browser to warehouse. Investors linked soft gross retention to qualitative performance complaints raised in diligence calls.

Problem

Tail latency and dashboard stalls threatened enterprise renewals; profiling culture was immature.

APM dashboards reported averages, which hid multi-second outliers on invoice list endpoints. ORM-generated queries multiplied during nested dashboard loads (the N+1 pattern). Nightly background exporters monopolized connection pools.
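To illustrate how an average can hide the outliers described above, here is a minimal Python sketch. The latency values are hypothetical, not measurements from this engagement, and the nearest-rank percentile is one common definition among several.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical request latencies in ms: 97 fast requests and 3 multi-second outliers.
latencies_ms = [120] * 97 + [4000, 5000, 6000]

mean_ms = statistics.mean(latencies_ms)  # pulled up only slightly by the outliers
p50_ms = percentile(latencies_ms, 50)    # the typical request
p99_ms = percentile(latencies_ms, 99)    # the request an unlucky user actually sees

print(f"mean={mean_ms:.1f}ms p50={p50_ms}ms p99={p99_ms}ms")
```

In this made-up sample the mean stays under 300 ms while the p99 is 5 seconds, which is why a dashboard of averages can look healthy during a quarter-close stall.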

The frontend shipped large JS bundles without route-based code splitting; LCP exceeded 4s for hospitality clients on low-bandwidth connections.

With no SLO definitions, teams optimized features rather than customer-impacting journeys.

Solution

An SLO-first program: end-to-end tracing, a query review board, a Redis edge cache for hot aggregates, worker autoscaling with backpressure, and a frontend performance budget enforced in CI.

OpenTelemetry connected browser spans to Postgres query plans; a weekly performance council prioritized fixes by dollar-at-risk estimates derived from CS tagging.
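One way such a dollar-at-risk ranking could be computed is sketched below in Python. The record fields (`account`, `arr_usd`, `endpoint_family`), the account names, and the ARR figures are all hypothetical; the source does not describe the council's actual formula.

```python
from collections import defaultdict


def rank_by_dollars_at_risk(complaints):
    """Sum the ARR of distinct accounts tagged against each endpoint family, highest first."""
    accounts_by_family = defaultdict(dict)
    for c in complaints:
        # Keying by account means repeated tags from one customer count once.
        accounts_by_family[c["endpoint_family"]][c["account"]] = c["arr_usd"]
    totals = {
        family: sum(accounts.values())
        for family, accounts in accounts_by_family.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical CS tags: each links a complaining account to a slow endpoint family.
complaints = [
    {"account": "acme", "arr_usd": 400_000, "endpoint_family": "dashboard"},
    {"account": "acme", "arr_usd": 400_000, "endpoint_family": "dashboard"},
    {"account": "globex", "arr_usd": 250_000, "endpoint_family": "billing"},
    {"account": "initech", "arr_usd": 300_000, "endpoint_family": "billing"},
]

ranking = rank_by_dollars_at_risk(complaints)
for family, dollars in ranking:
    print(f"{family}: ${dollars:,} ARR at risk")
```

Deduplicating by account keeps a single vocal customer from inflating a family's total.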

Critical endpoints gained explicit indexes, including covering indexes; Hibernate fetch graphs were rewritten where necessary. Redis cached tenant-scoped KPI tiles with TTL jitter.

Adopting Next.js for the marketing app improved LCP; the dashboard SPA used intersection observers to defer heavy charts until they scrolled into view.
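TTL jitter spreads cache expirations so that tiles written at the same moment do not all expire, and recompute, at once. A minimal Python sketch follows; the key format, the 300-second base TTL, and the 10% jitter fraction are assumptions for illustration, not values from the engagement.

```python
import random


def jittered_ttl(base_ttl_s, jitter_fraction=0.1, rng=random):
    """Return a TTL in seconds drawn uniformly from base_ttl_s +/- jitter_fraction."""
    if base_ttl_s <= 0 or not 0 <= jitter_fraction < 1:
        raise ValueError("base_ttl_s must be positive and jitter_fraction in [0, 1)")
    spread = base_ttl_s * jitter_fraction
    return max(1, round(base_ttl_s + rng.uniform(-spread, spread)))


def kpi_tile_key(tenant_id, tile):
    """Hypothetical tenant-scoped key so one tenant's tiles never collide with another's."""
    return f"kpi:{tenant_id}:{tile}"


key = kpi_tile_key("t-42", "open_invoices")
ttl = jittered_ttl(300)
print(key, ttl)

# With the redis-py client, the write might look like this (not executed here):
#   client.set(key, payload, ex=ttl)
```

A wider jitter fraction flattens the recompute spike further but makes freshness less predictable, which matters when finance teams expect tiles to update on a known cadence.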

Implementation

  1. Instrumentation & baseline

    Deployed tracing with sampling tuned for noisy tenants; captured Core Web Vitals in RUM pipelines. Established p95/p99 budgets per endpoint family.

  2. Hot path burn-down

    Two-sprint cycles per domain team with shared DBA office hours. Kill list of N+1 offenders published in wiki for pride/shame accountability.

  3. Renewal firewall

    CS playbooks referenced performance attestations before QBR decks; synthetic checks simulated top 20 tenant configurations hourly.
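A kill list like the one in step 2 can be seeded by looking for the same query shape repeated many times within a single request. The Python sketch below is one such heuristic, assuming the SQL statements for one request have already been collected from a trace; the table names and the threshold of 5 are hypothetical.

```python
import re
from collections import Counter


def normalize(sql):
    """Replace literals with placeholders so repeated queries share one shape."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()


def find_n_plus_one(queries, threshold=5):
    """Return query shapes issued at least `threshold` times within one request."""
    counts = Counter(normalize(q) for q in queries)
    return {shape: n for shape, n in counts.items() if n >= threshold}


# Hypothetical statements from one dashboard request: one parent query,
# then one child query per invoice (the N+1 pattern).
request_queries = ["SELECT * FROM invoices WHERE tenant_id = 7"] + [
    f"SELECT * FROM line_items WHERE invoice_id = {i}" for i in range(1, 21)
]

offenders = find_n_plus_one(request_queries)
for shape, n in offenders.items():
    print(f"{n}x {shape}")
```

The regex normalization is deliberately crude; in PostgreSQL, pg_stat_statements performs a more robust normalization, though it aggregates across requests rather than per request.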

Tools & platforms

  • OpenTelemetry
  • Datadog
  • pg_stat_statements
  • k6
  • Redis
  • Next.js

Engineering challenges addressed

  • Tenant hot spots skewing benchmarks: solved with weighted SLOs by revenue band.
  • Balancing cache freshness with finance close deadlines.
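As an illustration of weighting SLOs by revenue band, the Python sketch below scores p99 compliance so that a miss on a high-revenue tenant counts for more than a miss on a small one. The band names, weights, targets, and latencies are hypothetical; the source does not publish the actual weighting scheme.

```python
def weighted_slo_compliance(tenants, band_weights):
    """Weighted share of tenants whose p99 meets the target, weighted by revenue band."""
    total = 0.0
    met = 0.0
    for t in tenants:
        weight = band_weights[t["band"]]
        total += weight
        if t["p99_ms"] <= t["target_ms"]:
            met += weight
    if total == 0:
        raise ValueError("no tenants to score")
    return met / total


# Hypothetical weights and per-tenant measurements.
band_weights = {"enterprise": 5, "mid": 2, "smb": 1}
tenants = [
    {"band": "enterprise", "p99_ms": 900, "target_ms": 800},   # misses its target
    {"band": "mid", "p99_ms": 700, "target_ms": 1200},
    {"band": "smb", "p99_ms": 1000, "target_ms": 2000},
    {"band": "smb", "p99_ms": 1500, "target_ms": 2000},
]

unweighted = sum(t["p99_ms"] <= t["target_ms"] for t in tenants) / len(tenants)
weighted = weighted_slo_compliance(tenants, band_weights)
print(f"unweighted={unweighted:.2f} weighted={weighted:.2f}")
```

In this example three of four tenants meet their target, yet the weighted score falls below 50% because the single miss is an enterprise account.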

Tech stack

  • Next.js
  • React
  • Java
  • PostgreSQL
  • Redis
  • Kafka
  • Kubernetes
  • AWS
  • OpenTelemetry

Results

  • 63% reduction in API p99 for billing and dashboard families
  • Enterprise logo churn down 3.2 points YoY after performance program
  • Median LCP improved from 4.1s to 1.9s on hospitality profile

Quantified impact

  • 63% API p99 reduction on targeted routes

    Pre/post across 30-day steady state.

  • $6.7M expansion pipeline re-engaged

    Opportunities previously stalled on performance POCs.

Key takeaways

  • Performance is a product discipline, not a heroics sprint before renewal season.
  • SLOs should be revenue-aware; not all tenants deserve equal latency targets.
  • Frontend and backend tracing must stitch together; otherwise teams optimize the wrong layers.
