
Why Salesforce automations break at scale (and how to build ones that don't)

Most Flows and Apex work fine in a demo and fail in production. Here's what actually breaks at volume — bulk limits, recursion, nulls, permissions — and the QA discipline that prevents it.

A Salesforce automation almost never fails on the day you build it. It fails three months later, at 2am, when a data load fires 10,000 records through a Flow that was only ever tested with one.

That gap — between "works in a demo" and "survives production" — is where most CRM trust goes to die. Reports stop reconciling, deals stall in states nobody designed for, and the team quietly goes back to spreadsheets. The software didn't break dramatically. It broke silently, and at scale.

Here's what actually breaks, and how to build automations that hold up under real data, real volume, and real edge cases.

1. Bulk: the 200-record wall

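Salesforce processes records in batches: an Apex trigger receives up to 200 records per invocation, and larger operations are split into chunks of that size. A Flow or trigger that runs fine on a single record can blow past governor limits the moment a bulk operation (an import, a mass update, an integration sync) sends a full batch through it.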

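The classic failure is a query or DML statement inside a loop. One record: one query. Two hundred records: two hundred queries, and because a synchronous transaction is capped at 100 SOQL queries, you hit the limit partway through. That batch fails and rolls back while other batches in the same load may already have committed, so only part of the data updates, and now it is in a state no report can explain.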

The fix is a discipline, not a setting: bulkify everything, then prove it by running a real 200-record batch before go-live. This single test catches more production failures than any other check — and it's the one most builders skip because the demo already "worked."
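The two shapes are easy to compare outside Salesforce. What follows is a minimal Python sketch, not Apex: `FakeDatabase`, `QUERY_LIMIT`, `make_batch`, `make_db`, and both handler functions are hypothetical names invented for illustration, and the limit of 100 is chosen to mirror the per-transaction SOQL limit for synchronous Apex. It only models the counting, not the platform.

```python
from dataclasses import dataclass

QUERY_LIMIT = 100  # assumption: mirrors the synchronous SOQL query limit


class QueryLimitExceeded(Exception):
    """Raised when a simulated transaction issues too many queries."""


@dataclass
class FakeDatabase:
    """Hypothetical stand-in for org data; counts queries per transaction."""
    accounts: dict  # account_id -> industry
    queries_used: int = 0

    def _count_query(self):
        self.queries_used += 1
        if self.queries_used > QUERY_LIMIT:
            raise QueryLimitExceeded(f"more than {QUERY_LIMIT} queries")

    def query_one(self, account_id):
        self._count_query()
        return self.accounts.get(account_id)

    def query_many(self, account_ids):
        self._count_query()
        return {i: self.accounts[i] for i in account_ids if i in self.accounts}


def stamp_industry_in_loop(db, opportunities):
    """Anti-pattern: one query per record in the batch."""
    for opp in opportunities:
        opp["industry"] = db.query_one(opp["account_id"])


def stamp_industry_bulkified(db, opportunities):
    """Bulk-safe: collect ids, query once, then look values up from a map."""
    ids = {opp["account_id"] for opp in opportunities}
    industries = db.query_many(ids)
    for opp in opportunities:
        opp["industry"] = industries.get(opp["account_id"])


def make_batch(size):
    """Build a batch of fake opportunities, one per fake account."""
    return [{"account_id": f"A{i}", "industry": None} for i in range(size)]


def make_db(size):
    """Build a fake database holding `size` accounts."""
    return FakeDatabase(accounts={f"A{i}": "Fintech" for i in range(size)})
```

With `make_batch(1)`, both handlers succeed using one query, which is exactly why the demo "works." With `make_batch(200)`, the bulkified handler still uses one query, while the loop version raises `QueryLimitExceeded` on its 101st. In Apex the bulk-safe shape is usually a `Set<Id>` fed into a single SOQL query whose results land in a `Map`.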

2. Recursion: the automation that triggers itself

An update fires a Flow. The Flow updates the record. That update fires the Flow again. Without a guard, you get a loop that runs until Salesforce hits a limit and kills the transaction partway through: the save fails, and in a bulk load some batches end up written while others do not.

Recursion bugs are invisible in light testing because a single manual edit rarely loops. They surface under volume, or when two automations on the same object start updating each other. The defense is explicit recursion control and a clear map of what triggers what on every object you automate.
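One common form of explicit recursion control is to remember which records the automation has already handled in the current transaction. Here is a minimal Python sketch of that idea, not Apex; `RecursionGuard`, `on_update`, `run_transaction`, and the `touch_count` field are hypothetical names, and the `save` callback is a stand-in for "saving the record fires the automation again."

```python
class RecursionGuard:
    """Tracks record ids an automation has already handled in one transaction."""

    def __init__(self):
        self._seen = set()

    def first_visit(self, record_id):
        """Return True only the first time a given id is seen."""
        if record_id in self._seen:
            return False
        self._seen.add(record_id)
        return True


def on_update(record, guard, save):
    """Hypothetical update handler; `save` re-fires it, as a trigger would."""
    if not guard.first_visit(record["id"]):
        return  # already processed in this transaction: stop the loop here
    record["touch_count"] = record.get("touch_count", 0) + 1
    save(record)


def run_transaction(record):
    """Simulate one transaction with a fresh guard."""
    guard = RecursionGuard()

    def save(rec):
        on_update(rec, guard, save)  # saving fires the automation again

    on_update(record, guard, save)
    return record
```

Tracking ids per record, rather than flipping a single "already ran" flag, means later records in the same transaction still get processed. In Apex the equivalent is often a static `Set<Id>` on a handler class, since static variables last for the length of the transaction.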

3. Nulls and missing data

Demos use clean records. Production has half-filled ones: a missing email, a blank close date, an account with no owner because the rep left. Automations written against the happy path throw errors — or worse, make decisions on bad data — the first time they meet a real record.

Every branch that reads a field should answer one question: what happens when this is empty? Hunting those cases deliberately, before launch, is the difference between an automation that degrades gracefully and one that corrupts data quietly.
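To make "what happens when this is empty?" concrete, here is a small Python sketch in which every empty field has a named outcome instead of an exception. The function name, the field names, and the 30-day window are all assumptions made for the example.

```python
from datetime import date


def renewal_reminder(opportunity, today):
    """Decide what to do for one record; each empty field has an explicit outcome."""
    close_date = opportunity.get("close_date")
    email = opportunity.get("contact_email")

    # Missing data is routed to a human instead of raising or guessing.
    if close_date is None:
        return ("flag_for_review", "missing close date")
    if not email:  # covers both None and an empty string
        return ("flag_for_review", "missing contact email")

    days_left = (close_date - today).days
    if 0 <= days_left <= 30:  # assumed reminder window
        return ("send_reminder", email)
    return ("no_action", None)
```

A record with no close date comes back as `("flag_for_review", "missing close date")` rather than crashing on the date arithmetic, so the gap is visible and the rest of the batch keeps moving.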

4. Permissions and sharing

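The automation works perfectly when an admin runs it. Then a standard user triggers it and it fails, or skips records they can't see, because an automation enforces the running user's access in some contexts and not in others. Screen flows run in user context by default, and Apex enforces sharing rules only in classes declared "with sharing", so the running context has to be a design decision rather than an accident. This is one of the most common "it works for me" bugs, and it only shows up once real users are in the system.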
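A cheap way to catch this before launch is to run the same automation once per persona and compare the results. The Python sketch below models record visibility as a plain set of ids, which is a heavy simplification of real Salesforce sharing; `close_stale_cases`, `compare_personas`, and the persona names are hypothetical.

```python
def close_stale_cases(cases, visible_ids):
    """Hypothetical automation that can only act on cases the running user can see."""
    closed, skipped = [], []
    for case in cases:
        if case["id"] in visible_ids:
            closed.append(case["id"])
        else:
            skipped.append(case["id"])  # reported, not dropped silently
    return closed, skipped


def compare_personas(cases, personas):
    """Run the same automation once per persona and return the results side by side."""
    return {
        name: close_stale_cases(cases, visible)
        for name, visible in personas.items()
    }
```

If the admin persona closes three cases and the standard-user persona closes one and skips two, the gap is on the page before any real user hits it. In Apex tests, `System.runAs` is the usual way to execute code as a specific user.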

5. No alerting — so failures are silent

Here's the one that quietly costs the most: an automation fails, and nobody finds out. No error, no alert, no log. The renewal reminder didn't send. The lead didn't route. You discover it weeks later when a deal is lost, and by then there's no trail.

Reliability isn't just code that works — it's code that tells you when it doesn't. Error logging and alerting on every automation turns a silent revenue leak into a Slack message you can act on the same day.
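As a rough illustration, the wrapper below runs an automation, logs every failure, and hands a summary to a notifier. It is a Python sketch under assumptions: `run_with_alerting` and its parameters are hypothetical, and `notify` is any callable you supply (for example, one that posts to a chat channel); no real Slack or Salesforce API is called here.

```python
import logging
import traceback

logger = logging.getLogger("automation")


def run_with_alerting(name, automation, records, notify):
    """Run `automation` per record; log and report failures instead of swallowing them."""
    failures = []
    for record in records:
        try:
            automation(record)
        except Exception as exc:  # broad on purpose: nothing may fail silently
            logger.error("%s failed for %s: %s", name, record.get("id"), exc)
            failures.append({
                "record_id": record.get("id"),
                "error": repr(exc),
                "trace": traceback.format_exc(),
            })
    if failures:
        # One summary message per run, with details attached for the trail.
        notify(f"{name}: {len(failures)} of {len(records)} records failed", failures)
    return failures
```

The returned list doubles as the audit trail: each entry keeps the record id, the error, and the stack trace, so a failure found today can still be explained weeks later.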

What "tested like a product" actually means

None of the above is exotic. It's the difference between building a flow and building a system. At Morningscale, every build ships through the same checklist:

  • The 200-record stress test — bulk-safe, proven, not assumed.
  • An edge-case hunt — nulls, bulk loads, permission gaps, recursion.
  • Error logging and alerting on every automation, so nothing fails silently.
  • A bypass switch, so a future data load can't detonate the org.
  • Documentation and handover, so your team owns it — no tribal knowledge, no lock-in.
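The bypass switch in that checklist can be as small as one check at the top of every automation. Here is a minimal Python sketch of the idea; `should_run`, `handle_update`, and the shape of `bypass_settings` are assumptions for illustration, not a description of any particular org. In Salesforce the settings themselves are often kept in something like a custom setting or custom permission.

```python
def should_run(automation_name, user_id, bypass_settings):
    """Return False when the automation is switched off globally or for this user."""
    if bypass_settings.get("all_automations_off", False):
        return False
    disabled = bypass_settings.get("disabled_for_user", {}).get(user_id, set())
    return automation_name not in disabled


def handle_update(record, user_id, bypass_settings):
    """Hypothetical automation entry point that honours the bypass switch."""
    if not should_run("lead_routing", user_id, bypass_settings):
        return "bypassed"
    record["routed"] = True
    return "ran"
```

With `{"disabled_for_user": {"data_loader_user": {"lead_routing"}}}`, a load run by the integration user skips routing entirely while reps editing records by hand still get it, so a one-off migration never has to pass through logic that was built for day-to-day edits.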

In healthcare and fintech, the cost of a broken automation isn't an inconvenience — it's a compliance event or a revenue leak. That's why we treat the testing as the product, not an afterthought.

If your org is carrying automations nobody fully trusts anymore, a short, paid Automation Health Audit will tell you exactly which ones are quietly at risk — ranked, before you commit to fixing anything.
