
Magic Chat to Unified Threads

How we merged three separate chat systems into one unified threading architecture over eight months.

By Alexey Suvorov · 6 min read

Three. That’s how many separate chat systems we were running in January 2024. Three different conversation backends, three different threading models, three different database schemas. A user could have a conversation in basic chat, a conversation in Assistants, and a conversation in chatbots, and none of them knew the others existed.

This is the story of how we got there, how we tried to fix it with a band-aid called Magic Chat, and how we ultimately merged everything into one system over eight painful months.

How three chat systems happen

Nobody plans to build three chat systems. You build one. Then circumstances force you to build a second. Then a third.

System one: basic chat. April 2023. We added a simple OpenAI chat feature to our dashboard. Send a message, get a response. No persistent memory. No threading. Just a stateless request/response loop. By June, we’d added chat memory and context history, so conversations could reference earlier messages. Simple, functional, limited.

System two: Assistants API. November 2023. OpenAI released the Assistants API, which offered built-in thread management, persistent memory, file handling, and a code interpreter. We integrated it because it solved problems our basic chat couldn’t: long-running conversations that needed to reference documents, multi-turn interactions with maintained state. The integration was bumpy – we attempted it, reverted it, then re-integrated once the API stabilized. By December, our Bulk Chat feature ran on it.

System three: custom chatbots. January 2024. We shipped our chatbot builder, which let users create GPT-like bots with custom instructions and knowledge bases. Each chatbot needed its own conversation threads, its own message history, its own state management. The chatbot system shared some infrastructure with the Assistants API but had its own thread lifecycle.

Three systems. Three schemas. Three sets of API endpoints. Three conversation UIs. And from the user’s perspective, three different places where their conversations lived.

The user didn’t care about our architecture

Here’s the conversation that made us realize we had a problem. A user reached out asking: “Where did my conversation about the marketing strategy go?”

We asked which chat feature they’d used. They didn’t understand the question. “The chat. Your chat feature. I had a conversation yesterday and now I can’t find it.”

It turned out they’d started the conversation in basic chat, then switched to a custom chatbot for the same topic because someone had recommended it, and now they couldn’t find either thread because the search only covered the system they were currently viewing.

Users don’t think in systems. They think in conversations. They had a conversation about marketing strategy. They didn’t care whether it happened in “basic chat,” “assistants,” or “chatbots.” They wanted one inbox.

We heard variations of this complaint for months. Where are my conversations? Why can’t I search across everything? Why does switching from chat to chatbot feel like switching apps?

Magic Chat: the ambitious band-aid

March 2024. We introduced Magic Chat – our attempt to unify the experience without ripping apart the backend.

Magic Chat was a new, unified entry point with threading support and OpenRouter integration. It gave users a consistent interface: one chat panel, one thread list, one search bar. Underneath, it routed conversations to the appropriate backend system based on the conversation type. If you were talking to a custom chatbot, it used the chatbot threading system. If you were in a general conversation, it used its own thread management.
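The routing layer can be sketched roughly like this. Everything here is illustrative – the type constants, the `Conversation` shape, and the backend callables are assumptions, not the actual implementation:

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical conversation types Magic Chat had to route between.
BASIC = "basic_chat"
ASSISTANT = "assistant"
CHATBOT = "chatbot"

@dataclass
class Conversation:
    thread_id: str
    kind: str  # which legacy backend owns this thread

def route(conv: Conversation, backends: dict[str, Callable[[str], str]]) -> str:
    """Dispatch the conversation to the backend that owns its type."""
    try:
        handler = backends[conv.kind]
    except KeyError:
        raise ValueError(f"no backend registered for {conv.kind!r}")
    return handler(conv.thread_id)

# Each backend keeps its own thread store; the router only hides the seam.
backends = {
    BASIC: lambda tid: f"basic:{tid}",
    ASSISTANT: lambda tid: f"assistant:{tid}",
    CHATBOT: lambda tid: f"chatbot:{tid}",
}
```

The point of the sketch is the shape of the problem: the router gives users one front door, but every branch still leads to a separate thread store that must be maintained independently.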

On the surface, Magic Chat was a significant improvement. Users had one place to start conversations. They could switch between models mid-conversation. Thread management was consistent. The OpenRouter integration meant they could use any of our 20+ models without worrying about which backend system supported which provider.

Underneath, Magic Chat was a fourth system layered on top of the original three. We hadn’t merged anything. We’d added an abstraction that hid the complexity from users while doubling it for engineers.

Maintaining four conversation systems meant four sets of bugs. Four schemas to migrate. Four message formats to normalize when we added features like follow-up question suggestions or continue-generation for long responses. Every chat improvement had to be implemented in multiple places, or users would get inconsistent behavior depending on which system was handling their conversation.

The maintenance burden compounds

By mid-2024, the chat infrastructure had become the most time-consuming part of our codebase to maintain. Some examples:

When we added follow-up question suggestions, we had to implement the feature in three different message processing pipelines. Same UI component, three different data flows feeding it.

When we added continue-generation for long responses, the implementation differed per system because each one handled message chunking differently.

When we wanted to add conversation export, we wrote three different exporters because each system structured conversation data differently. Same feature. Triple the code. Triple the bugs.

We had 31+ MongoDB collections at this point, and a significant portion of them existed because each chat system had its own data model for what was fundamentally the same thing: a sequence of messages between a user and an AI.

November 2024: the merge

The unified thread system shipped in November 2024. Marina and Oleg led the effort. It took eight months from Magic Chat’s introduction to get here, though the actual merge work was concentrated in the final stretch.

The approach was surgical. We defined a single thread schema that could represent any conversation type: basic chat, assistant-powered conversations, chatbot interactions, and Magic Chat threads. Every thread had a type field, but the storage format, API endpoints, and search index were shared.
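A minimal sketch of what such a schema might look like. The field names are invented for illustration – the post doesn’t publish the production schema:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Thread:
    """One document shape for every conversation type (hypothetical)."""
    thread_id: str
    # The type field records where the conversation came from, but
    # storage, API endpoints, and the search index treat every thread
    # identically.
    type: str  # "basic" | "assistant" | "chatbot" | "magic"
    messages: list[dict[str, Any]] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)

    def append(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

# Any conversation type fits the same shape.
t = Thread(thread_id="abc123", type="chatbot")
t.append("user", "Where did my marketing thread go?")
t.append("assistant", "Right here.")
```

The design choice worth noting: the type field is metadata, not a fork in the code path. Features that iterate over messages never need to know which system the thread originated in.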

The migration path was:

  1. Build the unified schema with enough flexibility to represent all conversation types
  2. Create adapters that could read old-format threads into the new schema
  3. Migrate existing conversations in batches, validating data integrity at each step
  4. Route all new conversations through the unified system
  5. Deprecate the old endpoints once migration was complete
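Steps 2 and 3 might look like this in miniature. The legacy field names (`history`, `msgs`, `who`) and adapter signatures are assumptions made up for the sketch:

```python
# One adapter per legacy format, normalizing into the unified shape.
def from_basic(doc: dict) -> dict:
    return {"thread_id": doc["id"], "type": "basic",
            "messages": [{"role": m["who"], "content": m["text"]}
                         for m in doc["history"]]}

def from_assistant(doc: dict) -> dict:
    return {"thread_id": doc["thread"], "type": "assistant",
            "messages": [{"role": m["role"], "content": m["content"]}
                         for m in doc["msgs"]]}

ADAPTERS = {"basic": from_basic, "assistant": from_assistant}
SOURCE_MESSAGES = {"basic": "history", "assistant": "msgs"}

def migrate_batch(docs: list[dict], source_kind: str) -> list[dict]:
    """Convert a batch of legacy threads, verifying nothing is dropped."""
    adapt = ADAPTERS[source_kind]
    key = SOURCE_MESSAGES[source_kind]
    out = []
    for doc in docs:
        new = adapt(doc)
        # Integrity check from step 3: the migration must be lossless.
        if len(new["messages"]) != len(doc[key]):
            raise ValueError(f"dropped messages in {new['thread_id']}")
        out.append(new)
    return out

legacy = [{"id": "t1", "history": [{"who": "user", "text": "hi"},
                                   {"who": "bot", "text": "hello"}]}]
migrated = migrate_batch(legacy, "basic")
```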

The hardest part wasn’t the schema design. It was preserving conversation history. Users had up to a year and a half of conversations in the old systems, and we couldn’t lose any of them. The migration had to be lossless, which meant handling edge cases like threads that spanned multiple schema versions, conversations with file attachments stored in different formats, and threads that referenced knowledge base documents that had been updated since the conversation.

What unified threads actually looked like

After the merge, a thread was a thread. Whether the user started a conversation with a custom chatbot or through the general chat interface, the data model was the same. One collection. One set of API endpoints. One search index. One place where all conversations lived.

Features that had required multiple implementations now required one. Follow-up suggestions, continue-generation, conversation export, thread search – each feature was implemented once and worked everywhere.
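As a toy illustration of the single-implementation payoff, here is what something like conversation export reduces to once every thread has the same shape (assuming a minimal thread dict with a `messages` list, as above this is invented, not the real exporter):

```python
# One exporter instead of three, because one schema instead of three.
def export_thread(thread: dict) -> str:
    """Render any thread as plain text, regardless of its origin type."""
    return "\n".join(f"{m['role']}: {m['content']}"
                     for m in thread["messages"])

t = {"thread_id": "t1", "type": "basic",
     "messages": [{"role": "user", "content": "hi"},
                  {"role": "assistant", "content": "hello"}]}
```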

The thread count per user actually decreased after the merge, which surprised us. Users had been creating duplicate conversations across systems – the same topic discussed in basic chat and then again in a chatbot – because they couldn’t find their earlier threads. With everything in one place, that duplication disappeared.

Dashboard v2 learned from all of it

When we started Dashboard v2 in November 2025, the first architectural decision was: one conversation system from day one.

Every interaction in v2 runs through the same thread architecture. Chat messages, agent tasks, code execution sessions, knowledge base queries – all of them are events in a unified thread. The schema was designed upfront to be extensible, with a message type field that distinguishes between user messages, AI responses, tool calls, file operations, and system events.
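A rough sketch of that event model, with hypothetical names throughout – the real v2 schema is not published in the post:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any

class EventType(Enum):
    """Message-type field distinguishing what each thread event is."""
    USER_MESSAGE = "user_message"
    AI_RESPONSE = "ai_response"
    TOOL_CALL = "tool_call"
    FILE_OPERATION = "file_operation"
    SYSTEM_EVENT = "system_event"

@dataclass
class Event:
    type: EventType
    payload: dict[str, Any]

@dataclass
class UnifiedThread:
    """Chat, agent tasks, and code execution all log into one thread."""
    thread_id: str
    events: list[Event] = field(default_factory=list)

    def log(self, type: EventType, **payload: Any) -> None:
        self.events.append(Event(type, payload))

th = UnifiedThread("v2-001")
th.log(EventType.USER_MESSAGE, text="run the report")
th.log(EventType.TOOL_CALL, tool="code_exec", args={"lang": "python"})
th.log(EventType.AI_RESPONSE, text="done")
```

Because every interaction is just another event type, adding a new capability means adding an enum value and a payload convention, not a new conversation system.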

We didn’t have to merge anything in v2 because we never split it in the first place. The eight months we spent unifying threads in v1 taught us exactly what the schema needed to look like. We knew which edge cases mattered. We knew where flexibility was required and where it was overhead.

The lesson that applies beyond chat

The three-chat-system problem wasn’t a chat problem. It was a feature-accretion problem. Each system was a reasonable response to a real need at a specific moment. Basic chat was right for April 2023. Assistants were right for November 2023. Chatbots were right for January 2024. But nobody stepped back to ask: what happens when we have three of these?

The cost of not asking that question was eight months of migration work, a transitional system (Magic Chat) that added complexity instead of removing it, and countless hours debugging inconsistencies between systems that were supposed to do the same thing.

If you’re building something that users will interact with in a conversational way, start with one thread model. Make it flexible. Add type fields, metadata, and extension points. But keep it one system. The architectural convenience of “let’s just build a new one for this use case” feels fast in the moment. The merge bill comes due later, and it’s always more expensive than you think.

Alexey Suvorov

CTO, AIWAYZ

10+ years in software engineering. CTO at Bewize and Fulldive. Master's in IT Security from ITMO University. Builds AI systems that run 100+ microservices with small teams.


