[ SMS & Telephony ]

The Architecture Behind Collectimate's SMS Engine

Building a bidirectional SMS platform that handles thousands of concurrent conversations requires deliberate infrastructure design. Here's how Collectimate does it.

6 min read · May 4, 2026

SMS is deceptively simple on the surface. But when you're operating a platform that needs to hold thousands of simultaneous structured conversations, each with its own state, branching logic, and data extraction requirements, the engineering complexity compounds quickly.

This is a look inside how Collectimate's SMS engine is built.

The Core Problem: Stateful Conversations at Scale

A single SMS exchange is stateless by nature. The carrier delivers a message and forgets it ever happened. But a meaningful data collection conversation is not stateless. The agent needs to know what it already asked, what the respondent already answered, and where in the conversation tree it currently sits.

Managing that state across thousands of concurrent conversations is the central architectural challenge.

The Conversation State Machine

At the heart of Collectimate's SMS engine is a conversation state machine. Each active conversation is represented as a persistent object with:

  • Current node — which step in the conversation flow the respondent is at
  • Collected fields — the data gathered so far, keyed by field name
  • Retry count — how many times the agent has re-asked a question after an unclear response
  • Session metadata — timestamps, channel identifiers, campaign reference, and respondent identifiers

When an inbound SMS arrives, the engine retrieves the active session, feeds the message through the response parser, advances the state machine, generates the next outbound message, and persists the updated state.
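The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Collectimate's implementation: the `Session` and `Node` classes, the two-question `FLOW`, the in-memory `SESSIONS` dict standing in for a persistent store, and the re-ask wording are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Session:
    """Per-conversation state, mirroring the four items listed above."""
    current_node: str
    collected_fields: Dict[str, str] = field(default_factory=dict)
    retry_count: int = 0
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Node:
    prompt: str
    field_name: Optional[str] = None  # None marks a terminal node
    parse: Optional[Callable[[str], Optional[str]]] = None  # returns None if unclear
    next_node: Optional[str] = None


# Hypothetical two-question flow, purely for illustration.
FLOW: Dict[str, Node] = {
    "ask_name": Node("What is your name?", "name",
                     lambda t: t.strip() or None, "ask_age"),
    "ask_age": Node("How old are you?", "age",
                    lambda t: t.strip() if t.strip().isdigit() else None, "done"),
    "done": Node("Thanks, that's everything."),
}

# Stand-in for a persistent session store, keyed by phone number.
SESSIONS: Dict[str, Session] = {}


def advance(session: Session, text: str) -> str:
    """Apply one inbound message to the session and return the next outbound text."""
    node = FLOW[session.current_node]
    if node.parse is None:  # terminal node: nothing left to collect
        return node.prompt
    value = node.parse(text)
    if value is None:  # unclear answer: stay on this node and re-ask
        session.retry_count += 1
        return "Sorry, I didn't catch that. " + node.prompt
    session.collected_fields[node.field_name] = value
    session.retry_count = 0
    session.current_node = node.next_node
    return FLOW[session.current_node].prompt


def process_inbound(phone: str, text: str) -> str:
    """Retrieve the session, advance it, persist it, and return the reply."""
    # Assumes the first prompt has already been sent to this respondent.
    session = SESSIONS.setdefault(phone, Session(current_node="ask_name"))
    reply = advance(session, text)
    SESSIONS[phone] = session  # persist the updated state
    return reply
```

Calling `process_inbound("+15550001", "Ada")` would store the name and return the age question. An unparseable answer leaves the session on the same node and increments the retry count.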

Message Queue Architecture

Inbound messages arrive unpredictably and in bursts. A campaign that sends 10,000 outbound messages at 9am will receive a flood of responses within the first 30 minutes. The engine needs to absorb that burst without dropping messages or processing them out of order.

Collectimate uses a partitioned message queue, where inbound messages are partitioned by respondent phone number. Because every message from a given number lands in the same partition, and each partition is consumed in order, messages from the same respondent are always processed sequentially, while messages from different respondents are processed in parallel across worker nodes.

This design provides both per-respondent ordering guarantees and horizontal scalability.
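As a rough sketch of the partitioning idea, the snippet below maps a phone number to a partition with a stable hash. The function names, the choice of SHA-256, and the in-memory lists standing in for queue partitions are assumptions for illustration. The article does not say which queue or hash Collectimate uses.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def partition_for(phone_number: str, num_partitions: int) -> int:
    """Map a phone number to a partition index that is stable across processes."""
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used to get a repeatable value.
    digest = hashlib.sha256(phone_number.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def route(messages: Iterable[Tuple[str, str]],
          num_partitions: int) -> Dict[int, List[Tuple[str, str]]]:
    """Append each (phone, text) message to its partition, preserving arrival order."""
    partitions: Dict[int, List[Tuple[str, str]]] = defaultdict(list)
    for phone, text in messages:
        partitions[partition_for(phone, num_partitions)].append((phone, text))
    return partitions
```

If each partition is drained by a single worker, two messages from the same number can never be processed concurrently or out of order, while different partitions proceed independently.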

Response Parsing and Validation

Human responses to SMS prompts are unpredictable. Ask someone for a date and you'll receive it in a dozen different formats. Ask for a yes/no answer and you'll receive everything from "yep" to "not really."

Collectimate's response parser uses a combination of:

  • Pattern matching for structured formats (dates, phone numbers, numeric ranges)
  • Synonym normalization for categorical responses
  • Confidence scoring for ambiguous inputs, triggering a re-ask when confidence falls below a threshold
  • Language model fallback for open-ended inputs that need semantic interpretation
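To make the first three techniques concrete, here is a small sketch of a yes/no normalizer with a confidence score and a multi-format date matcher. The synonym sets, the 0.6 partial-match score, the 0.8 threshold, and the list of date formats are all illustrative assumptions, and the language model fallback is left out. Reading a slash-separated date as month/day/year is also an assumption that a real deployment would set per locale.

```python
import re
from datetime import date, datetime
from typing import Optional, Tuple

YES = {"yes", "y", "yep", "yeah", "sure", "ok"}
NO = {"no", "n", "nope", "nah"}
CONFIDENCE_THRESHOLD = 0.8  # assumed value; below this the agent re-asks


def parse_yes_no(text: str) -> Tuple[Optional[str], float]:
    """Return (normalized answer, confidence) for a yes/no prompt."""
    cleaned = re.sub(r"[^a-z\s]", "", text.lower()).strip()
    if cleaned in YES:
        return "yes", 1.0
    if cleaned in NO:
        return "no", 1.0
    tokens = set(cleaned.split())
    has_yes, has_no = bool(tokens & YES), bool(tokens & NO)
    if has_yes != has_no:  # exactly one polarity appears among other words
        return ("yes" if has_yes else "no"), 0.6
    return None, 0.0  # no signal, or contradictory signals


DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %B %Y")


def parse_date(text: str) -> Optional[date]:
    """Try each known format in turn and return the first successful parse."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def needs_reask(confidence: float) -> bool:
    """True when the parse is too uncertain to accept without confirmation."""
    return confidence < CONFIDENCE_THRESHOLD
```

With these rules, "Yep!" is accepted outright, "yes please" parses as yes but falls below the threshold, and "not really" yields no answer at all, which is the kind of input a semantic fallback would handle.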

Delivery and Retry Logic

SMS delivery is not guaranteed. Collectimate's outbound layer retries undelivered messages with exponential backoff, fails over to alternate carrier routes, and applies dead-letter handling to sessions where delivery has failed repeatedly.
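The three mechanisms fit together roughly as follows. This is a sketch under stated assumptions: each carrier is modelled as a hypothetical callable that returns True on successful delivery, the delay doubles from a 1-second base up to a 60-second cap, and the dead-letter store is a plain list. None of these values come from Collectimate.

```python
import time
from typing import Callable, List, Sequence


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay after failed attempt number `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def send_with_retry(message: str,
                    carriers: Sequence[Callable[[str], bool]],
                    dead_letters: List[str],
                    max_attempts: int = 4,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try every carrier on each attempt, backing off between attempts.

    Returns True on delivery. After max_attempts failed rounds the message
    is appended to dead_letters and False is returned.
    """
    for attempt in range(max_attempts):
        for send in carriers:  # failover: try carriers in priority order
            if send(message):
                return True
        if attempt < max_attempts - 1:  # no point sleeping after the last round
            sleep(backoff_delay(attempt))
    dead_letters.append(message)
    return False
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Production systems commonly add random jitter to the delay so that retries from many sessions do not synchronize.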

Compliance and Opt-Out Handling

Every SMS platform operating at scale must handle opt-outs correctly and immediately. Collectimate maintains a global opt-out registry that is checked before every outbound message. Any inbound message containing a standard opt-out keyword immediately terminates the session and triggers a confirmation message. This happens at the infrastructure layer, before any campaign logic runs.
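A minimal sketch of that gate might look like the following. The keyword set, the confirmation wording, the `OptOutRegistry` class, and the function names are assumptions for illustration. The sketch treats a message as an opt-out only when the whole message is a keyword, which is a deliberately conservative reading that avoids false positives such as "please don't stop".

```python
import string
from typing import Callable, Set

# Commonly recognized opt-out keywords; the exact set is an assumption here.
OPT_OUT_KEYWORDS = {"stop", "stopall", "unsubscribe", "cancel", "end", "quit"}
CONFIRMATION = "You have been unsubscribed and will receive no further messages."


class OptOutRegistry:
    """In-memory stand-in for a global opt-out registry."""

    def __init__(self) -> None:
        self._numbers: Set[str] = set()

    def add(self, phone: str) -> None:
        self._numbers.add(phone)

    def is_opted_out(self, phone: str) -> bool:
        return phone in self._numbers


def is_opt_out(text: str) -> bool:
    """True when the whole message, ignoring case and edge punctuation, is a keyword."""
    normalized = text.strip().strip(string.punctuation).strip().lower()
    return normalized in OPT_OUT_KEYWORDS


def gate_inbound(phone: str, text: str, registry: OptOutRegistry,
                 campaign_handler: Callable[[str, str], str]) -> str:
    """Handle opt-outs before any campaign logic sees the message."""
    if is_opt_out(text):
        registry.add(phone)
        return CONFIRMATION
    return campaign_handler(phone, text)


def can_send(phone: str, registry: OptOutRegistry) -> bool:
    """Check performed before every outbound message."""
    return not registry.is_opted_out(phone)
```

Because `gate_inbound` wraps the campaign handler rather than living inside it, a campaign author cannot accidentally bypass the opt-out check.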

The Result

The architecture described here allows Collectimate to run thousands of simultaneous structured conversations, maintain data quality under real-world response variability, and deliver reliable throughput even under bursty load — all while remaining auditable, compliant, and observable at every layer.