Tlon Messenger · Product Lead
2021–Present
Last updated December 2025
Over the past four years, I have led product development for Tlon, navigating the company through turbulent product evolutions and dramatic PR shifts. The mission driving our work at Tlon is a firm belief in the viability of decentralized networks and true human connection, free of middlemen and advertisers.
Tlon's flagship product is a "personal server" which allows users to host their own real-time messaging service on a decentralized network. We build native smartphone clients which interact with this server and provide push notifications, app badges, and background sync. Users can sign up for a hosted experience where Tlon runs their individual server in a single-tenant environment; others may choose to run their individual server on their own and connect via the apps.
The Groups 2.0 rewrite
In early 2022, Tlon began a complete rewrite of our social software product, which had gained significant attention in the decentralized web community during the 2020 pandemic. The rewrite touched both the back-end architecture and the front-end user experience, and it introduced a swath of "table-stakes" features that Discord and Slack users had come to expect. Because of the architecture changes, the migration would take major community groups offline and risked massive userspace breakages.
We launched through a carefully orchestrated phased rollout: developer alpha in July 2022, which empowered third-party builders with new tools and capabilities to start preparing their applications; public beta in November, providing early adopters the opportunity to experience improved features and offer feedback; and mainnet OTA (over-the-air update to the entire live network) in December 2022. The team worked around the clock in shifts across time zones, from Australia to Europe to the US East and West Coasts.
However, our rush to roll out such a large, breaking change resulted in data loss for our users. As it turned out, we had a naive understanding of the shape of user data across our distributed network, and our migration scripts failed to handle many edge cases, resulting in missed inserts, dropped subscriptions, and crash loops.
While we worked tirelessly to resolve these problems, we learned a crucial lesson about the importance of thorough testing and of understanding user data before implementing major changes. The experience taught us to anticipate potential issues rather than merely react to them. Despite the immediate challenges, we began to enforce discipline, make better-informed decisions, and rebuild trust with our users.
A long winter of jank and churn
The platform promised commercial product quality on a decentralized back-end with peer-to-peer networking, but it quickly fell short: backlogs loaded unpaginated when joining a group, connections were unstable almost constantly, and events did not propagate through group hosts properly. Our team shrank from 12 to 6.
To add to these issues, our fast-paced sprints caused our feature space to balloon beyond what our simple manual testing could cover, leading to frequent bugs and feature instability. The pressure to keep releasing new features without sufficient test coverage created a cycle: bugs became more prevalent, urgent fixes consumed our limited team bandwidth, and no time was left for thorough testing and stabilization. Each feature we raced to finish added complexity, which made the platform harder to manage and more prone to errors.
Through 2023 and early 2024, we believed shipping faster and adding features was the solution. We launched mobile apps, in-app signup, role-based permissions, rich media embeds, and a Contacts system. We addressed complaints, checked roadmap items, and hit OKRs.
But underneath the surface, something was fundamentally wrong. A refactor that began in mid-2023, intended to fix our horizontal scaling issues by separating publisher and subscriber roles, kept slipping. What was supposed to be a three-week project dragged on for months, repeatedly shelved as we diverted resources to ship the next feature. The mobile v5 and v6 releases through mid-2024 showed improvement in our process, but we were still building an increasingly complex structure on foundations that couldn't support the weight.
Finally, a modicum of stability
By mid-2025, it became clear that the recurring bugs were not isolated issues but symptoms of deeper, systemic problems. The architecture we implemented in 2022 created reinforcing loops in which flaws fed on themselves and compounded instability. Missing sequence numbers, full-history loads, and ad-hoc push notifications were not just glitches but indicators of underlying structural weaknesses. In August 2025, we stopped feature work and turned to these structural problems.
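To illustrate why sequence numbers matter here, the following is a minimal sketch and not Tlon's actual implementation. It assumes a host stamps each event with a per-channel sequence number; the names Event, Subscriber, and receive are hypothetical. With such a number, a subscriber can notice a dropped event and request only the missing range instead of reloading the full history.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    seq: int   # hypothetical per-channel sequence number assigned by the host
    body: str


@dataclass
class Subscriber:
    last_seq: int = 0                            # highest sequence number applied so far
    applied: list = field(default_factory=list)  # bodies applied in order

    def receive(self, event: Event):
        """Apply an in-order event, or return the (start, end) range to fetch first."""
        if event.seq <= self.last_seq:
            return None  # duplicate or replay: ignore it
        if event.seq > self.last_seq + 1:
            # Gap detected: request only the missing events, not the whole history.
            return (self.last_seq + 1, event.seq - 1)
        self.applied.append(event.body)
        self.last_seq = event.seq
        return None


sub = Subscriber()
sub.receive(Event(1, "hello"))
sub.receive(Event(2, "world"))
missing = sub.receive(Event(5, "late"))  # events 3 and 4 never arrived
print(missing)  # (3, 4)
```

Without a sequence number, the subscriber in this sketch would have no way to tell that events 3 and 4 were lost, which is roughly how silently dropped messages and expensive full reloads can arise.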
We identified our lack of comprehensive test coverage as one of many root causes. We invested substantially in a more robust end-to-end testing framework, which was essential for confident engineering and smooth, predictable releases.
We also made the critical decision to root-cause every issue that arose, rather than applying superficial fixes. This meant dedicating time to thoroughly investigate and understand the underlying causes of problems, rather than just addressing the symptoms. By doing so, we were able to implement more effective and lasting solutions, preventing the same issues from recurring and improving the overall stability of the platform.
From September to December 2025, our v7.1.0 to v8.1.0 releases brought stable push notifications, app icon badging, group templates, privacy and safety improvements, and group management features without destabilizing the platform. Notably:
- Push notification reliability rose to 99%
- Monthly bug count dropped by 75%
- Release cadence shifted from daily hotfixes to bi-weekly/monthly updates
Features like blocking, role management, and join requests each shipped in a single release rather than dragging on for months. Long-standing bugs such as stuck channels and navigation loops disappeared. Releases became disciplined and theme-based.
A lean team of six can rival larger ones not by speed, but by pausing to fix fundamentals. The journey from chaos to sustainability taught me that progress sometimes means saying no until the foundation is strong.