Categories
Architecture Chat Collaboration Integration Site Reliability SOA Troubleshooting

Connecting Supply with Demand

(10/18-5/20)

Challenge

Ongoing reliability issues with 3rd party chat solution crucial for business operation w/o documentation and integration monitoring.

Action

The key aspect of the Decorist experience is the connection between the Client (Demand-side) and Designer (Supply-side.) To facilitate that connection, many years prior, the business had included a (at that time) nascent Chat-as-a-Solution provider as part of the user experience; it made sense to “Buy” instead of “Build.”

Architecture Overview

The chat UX is loaded into an iFrame. When users chat, their payload is posted to the 3rd party’s backend. The 3rd party then fires a webhook to Decorist which tracks the event in the DB and fires a transactional email depending on business logic.

The Problem

The ways the bugs manifested boiled down to:

  • chat UX not appearing (like below, often as the result of a 3rd party deploy gone bad)
  • emails not sending because webhooks not called

Lacking integration monitoring, issues often bubbled up through first-tier support.

Chat not loading

Improving the Dependency

Noticing the reliability issues, I first delegated to one experienced engineer to triage and then another.

I also dug in on my own and discovered/remedied bugs in our own webhooks while also providing data-backed reliability outage information to 3rd party, escalating to 3rd party’s CTO when necessary

Almost quarterly, as a company, the decision to continue using the 3rd party is re-visited given the reliability issues. Each time, though all stakeholders are aware of the pain collectively experienced, the decision has been made to punt replacing the solution.

Results

  • Ensured remediation of 3rd party issues in 24-36h, even on lowest support tier.
Categories
Architecture Backend Chat Cloud demand-side eCommerce Frontend Innovation Prototyping Strategy supply-side

Launched MVP

I was brought on initially part-time into a two-sided marketplace in the holistic services space to assist the team to go from an environment where code wasn’t being shipped to a place that could launch an MVP.

Prior to joining, I commissioned my own Market Landscape (via UpWork) around holistic services potential; one finding: 2015 global revenue = ~$40B

Better Engineering

When I arrived, the POC had been built by an offshore team in WordPress via cPanel and PHPAdmin and transferred from AWS to Digital Ocean. There was no source control. The first thing I did was `git init`.

I spent time attempting to decompose the production instance into a Docker container (of WordPress and MySQL.) I learned how tricky it is when an application is not already 12-factored to spin-up an alternative instance having functional parity. In case you didn’t know, much of the application state of a WordPress site is maintained as JSON config blobs IN THE DATABASE.

Seeing opportunity to prioritize feature creation (in support of product-market fit) in lieu of operational excellence, I put the Docker initiative on hold and spun up dev and staging instances using cPanel and PHPMyAdmin, simply copying the prod DB.

dev instance after trying to decouple

CaaS : Chatroom as a Service

First Iteration

My initial task was to integrate a chat UX with a broadcast streaming experience being built out and powered by Wowza. I settled on Converse.js and ejabberd given decent documentation, an acceptable UX, and (what seemed to be) easy integration; delivering the following POC

chat, working

Next Iteration

With the web server running under CentOS on one Droplet, I had thought it smart to run ejabberd on a separate instance, installing v14 on an Ubuntu Droplet. I ran into issues opening port 5281 from the CentOS VPS to Ubuntu so I doubled-back on my strategy, opting to co-host the daemons on one Droplet (CentOS.) I then ran into package manager issues and wound up having to build and install ejabberd 18 from scratch while upgrading OTP to v20.

While this was happening, the junior engineer was working to get the Wowza live-broadcast video solution working and time wasn’t on our side to get the MVP launched.

(Additionally, at one point, I looked at using the YouTube/Hangout API didn’t find support for our use case.)

Attempt to leverage YouTube/Hangouts

Once I got ejabberd 18 built, installed, and running, we started experiencing timeout issues; manifesting in FE with a spinner that spun for 1.5 mins before the user was finally able to get into the chatroom. I dove into the ejabberd code, and realized that there was something blocking going on server-side. I never did figure out what was happening, but about that same time I realized that the junior engineer had integrated Twilio Video for the 1:1 experience and Wowza Streaming for the 1:many broadcast experience.

Seeing opportunity to reduce waste and consolidate coding paradigms, I researched Twilio Video and realized that we could, for the MVP, likely replace the Wowza experience for both of the 1:1 and 1:many experiences WHILE ALSO leveraging Twilio Chat to replace Converse.js/ejabberd.

Twilio Video, 1:many POC

Next Iteration

I found and customized React Programmable Chat

Twilio Chat, POC

The next step was to integrate with Wowza.

Wowza UX, now w/Twilio Chat

As their were some production issues with the Wowza implementation, I forged ahead creating an alternative Twilio Video React component using React Scripts.

1:1 UX, Twilio Video Chat (desktop and mobile)

At soft (internal) launch, I posited the Twilio solution and the other team members were delighted.

Though the Twilio Video quickstart project was useful, I ran into challenges around the video codecs (VP8 and H.264) between Chrome and Safari, namely getting desktop and mobile video interaction to work as expected. Digging around produced Github links helping lead to resolution (e.g. simply switching to H.264 to ensure cross-browser support.)

On 4/6/18, we had our first successful supply-side (1:many) broadcast, powered by Twilio Video and Chat.

MVP

After helping the team achieve MVP launch and when funding could not be raised in a timely fashion, I moved on.

Results

  • Swooped in when feature-delivery was woefully behind, built out (desktop/mobile) UX, and launched MVP to paying customers.