Ship new web-based, VR-powered eCom experience as requirements changed.
Action
Act 1 : 3/19-5/19
While on my trip to India in April 2019, I formed a Tiger Team of one of my best Lead Engineers, a project manager, and two junior engineers, coaching them to see the similarity between what the business was asking for and existing components of the system (PLP, PDP, NUX, and Checkout), and setting a plan in motion to deliver an MVP by the 6/26/19 deadline.
They went heads-down and we successfully shipped v1 (below) on 5/29/19.
eCom Landing Page
Business priorities shifted and the project was moth-balled, leading to Act 2.
Act 2 : 8/19-10/19
Having shifted focus to more product-based eCom (see Act 1), the business decided to leverage existing shop-the-room modeling infrastructure in a more user-friendly, web-based purchase flow.
While the original plan was to have them spin up a completely new POC with a new checkout flow, I intervened, meeting with the remote Technical Project Manager and Architect to provide guidance on the existing monolith marketplace system, knowing it could serve as enough of a "buy" to meet requirements without having to "build" a custom solution.
This guidance around component re-use saved $40K in redo work for the non-primary, remote web team, which then shipped the web-based, VR-powered shop-the-room experience.
Act 3 : 3/20-10/20
Under a tight deadline, I coached the Pakistani team to iterate on and improve perceived and actual load times using CSS sprites, caching via HTTP headers, a loading spinner, and gzip compression, in order to get a usable UI to market sooner:
Landing Page
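On the caching and compression side, a minimal sketch of the kinds of changes involved, assuming a Rails/Rack stack (the controller and cache lifetime below are hypothetical):

```ruby
# config/application.rb — gzip every response at the Rack layer
config.middleware.use Rack::Deflater

# app/controllers/rooms_controller.rb (hypothetical controller)
class RoomsController < ApplicationController
  def show
    # Cache-Control headers let browsers and CDNs reuse the heavy
    # room imagery instead of re-fetching it on every visit.
    expires_in 12.hours, public: true
    @room = Room.find(params[:id])
  end
end
```

Combined with CSS sprites (one image download instead of many) and a spinner to improve perceived wait, these were cheap wins relative to deeper rewrites.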
Drilling down, a user looking to design a nursery can swap out items, made possible by a compositing technique using Three.js and photo spheres:
Room Page
Lastly, recognizing future strategic value-add within corporate partnerships, I guided the team to decouple the frontend as a Single Page App for iframe embedding. Having already decreased page load times and introduced progressive enhancement / graceful degradation, I led the SPA strategy.
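Embedding the SPA in partner sites also means the backend must permit framing. A sketch of the header side of that, assuming a Rails backend (the partner domain is a placeholder):

```ruby
# config/initializers/embedding.rb
# Rails sends X-Frame-Options: SAMEORIGIN by default, which blocks
# third-party iframes; drop it and allow named partners via CSP instead.
Rails.application.config.action_dispatch.default_headers.delete("X-Frame-Options")

Rails.application.config.content_security_policy do |policy|
  policy.frame_ancestors :self, "https://partner.example.com"
end
```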
Leading by example to own and improve systems as sole ENGR having SRE/DevOps/Frontend/Backend experience.
Action
Mar 2019
Watching our AWS costs rise ~8% monthly…
Costs Rising
I learned about and subscribed to Reserved Instances to realize cost savings on our hosting spend:
Dec 2019
Though not leading to cost savings or revenue generation, part of my responsibilities has been database administration: jumping in when the production DB would spike (as below), figuring out whether a runaway process needed to be terminated, whether a slow query was bringing it to its knees, whether a cron job was introducing load, or whatever else needed to be done to keep the site up.
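Triage usually started at the database itself. A rough sketch of the first check, run from a Rails console and assuming MySQL behind ActiveRecord (the process id below is a placeholder):

```ruby
conn = ActiveRecord::Base.connection

# List running statements, longest-running first, to spot the culprit.
rows = conn.select_all("SHOW FULL PROCESSLIST").to_a
rows.sort_by { |r| -r["Time"].to_i }.each do |row|
  puts format("%-8s %-12s %6ss  %s", row["Id"], row["User"], row["Time"], row["Info"])
end

# Terminate a runaway query once identified (placeholder id).
conn.execute("KILL 12345")
```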
Or when bots would crawl the site, bringing it down, necessitating an IP block:
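At the application layer, such a block can be expressed with the Rack::Attack gem. A sketch (the IP is a placeholder, and in practice the block may instead live at the load balancer or firewall):

```ruby
# config/initializers/rack_attack.rb — requires the rack-attack gem
class Rack::Attack
  # Drop all requests from a known-abusive crawler IP.
  blocklist("block abusive crawler") do |req|
    req.ip == "192.0.2.10"
  end

  # Backstop: throttle any single IP to 300 requests per 5 minutes.
  throttle("req/ip", limit: 300, period: 5 * 60) { |req| req.ip }
end
```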
Or when digging into the logs to find that a route was 500ing and had to be fixed:
Mar 2020
Using Cloudcraft, I diagrammed our AWS infrastructure, identifying and deleting 1,000 unused SQS queues.
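An audit like that can also be scripted with the AWS SDK for Ruby; a sketch (treating an empty queue as "unused" is a heuristic here, and the region and deletion call are placeholders to review before running):

```ruby
require "aws-sdk-sqs"  # gem: aws-sdk-sqs (AWS SDK for Ruby v3)

sqs = Aws::SQS::Client.new(region: "us-east-1")

# Walk every queue and flag those with no visible messages.
sqs.list_queues.queue_urls.each do |url|
  attrs = sqs.get_queue_attributes(
    queue_url: url,
    attribute_names: ["ApproximateNumberOfMessages"]
  ).attributes

  if attrs["ApproximateNumberOfMessages"].to_i.zero?
    puts "candidate for deletion: #{url}"
    # sqs.delete_queue(queue_url: url)  # uncomment once verified
  end
end
```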
Also identified and deleted numerous unused RDS snapshots:
All changes led to yet another 37% reduction in MoM AWS costs:
Results
Saved the company 115% of my salary in 2019 through process improvements.
Inherited a two-fold challenge: 1) a less-than-optimal user experience with a 9.01s Avg. Page Load Speed, and 2) a culture that did not yet value performance engineering.
Action
Identified initial frontend and backend low-hanging fruit (e.g., page structure, image resolutions, N+1 queries, lazy-loading). Identified and delegated KRs to FE leads. Introduced a process to methodically follow up each week, pressing the case for performance engineering over months.
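The N+1 class of fix, for instance, looks like this in Rails (model and association names here are illustrative):

```ruby
# N+1: one query for the products, then one more per product for its images.
Product.limit(24).each { |p| p.images.first }

# Eager-loaded: two queries total, regardless of page size.
Product.includes(:images).limit(24).each { |p| p.images.first }
```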
Below, a sampling …
Some initial efforts shaved ~3s off the Homepage:
Lighthouse score soared from 5 to 61:
There were multiple speed improvements of up to 50% on various pages across the site as a result of backend query improvements, and a +300% YoY bump in SEO traffic (verified by a 3rd-party SEO consultant).
After Dec 2019, we shifted focus to non-converted UXes, having addressed low-hanging fruit for all UXes.
Sept 2018 to Dec 2019
Results
Record Avg. Site Page Load Speed of the previous four years: 9.03s (Sep 2018) to 4.13s (Dec 2019).
During June and July 2018, I supported leadership (CTO/VP ENGR/Head of Platform/VP PROD/CEO) at Brazilian startup Pipefy around engineering topics such as the codebase, Product Engineering, SDLC process, culture, roadmap(s), transitioning from a monolith to a Service-Oriented Architecture, platform strategy, scalability, maintainability, and uptime.
Among other contributions, I:
Coached CTO and VP ENGR around operational excellence, particularly thinking in terms of AS-IS versus TO-BE.
Introduced idea of maturing IT processes towards forecasting, in particular via a Capacity Plan.
Provided thought-leadership around managing remote teams, partly out of my own experience, partly as informed by best practices.
Results
Identified strategies and tactics to qualitatively improve processes.
For the client, I probed their UX using multiple tools to determine low-hanging fruit, worked with the CTO and VP PROD to understand resourcing constraints given the product roadmap, and enumerated several tactics in a (prioritized) phased approach towards improving performance.
Heuristic discoveries included FE perf bottlenecks such as:
multiple inline JS snippets causing slowdowns
unoptimized JS libs, including React components
multiple 3rd party JS libs that were no longer necessary
JS libs loaded with neither async nor defer (see the sketch after this list)
retrieval of multiple styling resources
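On the async/defer point, the fix is marking script tags so they no longer block parsing and rendering. If the stack were Rails, the view-helper form would look like this (bundle names are placeholders):

```ruby
# app/views/layouts/application.html.erb (helper calls shown as Ruby)
# defer: download in parallel, execute in order after the DOM is parsed.
javascript_include_tag "application", defer: true

# async: fine for independent scripts (e.g. analytics) with no ordering needs.
javascript_include_tag "analytics", async: true
```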
I found that the greatest opportunity to optimize existed in two key user experiences, at ~6s and ~4.5s average page load respectively.
During one iteration, changes led to a 52% reduction in the Webpack bundle of React components, improving page speed by 10% or better across four key user experiences.
During another iteration, one key UX was sped up by 39.4% without Cache, 53.8% with Cache.
During a final iteration, changes led to a 20% page speed improvement on one key user experience and 10% or better on two others.
Results
Introduced Performance Engineering mindset leading to 20% page-load speedup.
With no one on staff having formal performance engineering experience or training, I led the effort to address a 6s sitewide page load average. Over approximately six months, stealing time from each sprint as I could given other responsibilities, I assessed bottlenecks using performance tools, including procuring a contract for Insights, Browser Pro, and Synthetics.
In a nutshell, assessing some of the pages with roughly 13s page loads that were skewing the overall site's average, I determined that the bulk of the performance issues were in blocking JS libs, then crafted a remediation strategy and established SMART goals for baselining/measuring the impact of changes quarter-over-quarter, with the intent to drop from 6s to 3.5s by the end of 2017. Below you'll see a spreadsheet I kept, noting the average Page Load Speed every Monday morning. Finally, in May, we addressed the bulk of the blocking JS libs and reversed the upward trend.
The CEO was amazed at how much faster the site felt.
Results
Intro'd performance engineering, leading to a full one-second page-load-speed reduction and an 11% bump in mobile web conversions.
A remnant of the legacy codebase from the company's origins eight years prior, the two most important routes on the platform for quality assurance had never been moved from the company's original Merb app to its companion (upgraded) Rails experience, dating back to a first movement towards SOA three years earlier.
They were rightly regarded with trepidation given their dependence on MooTools when the rest of the platform had moved to jQuery, especially as the Jasmine suite covering them had been mothballed about 18 months before and those same MooTools libs were tightly coupled across two platform applications.
Towards a future of less frontend tech debt, I seized on the opportunity to champion and shepherd the project (as a Q2 engineering goal) through to completion.
Over the course of three months, I led the project scoping and weekly communication around engineering effort, and architected and led the implementation, interfacing with Product and Design to ensure quality in light of the absent test coverage.
Results
Delivered on a decoupling strategy for static asset management / Webpack'd bundles while collaboratively iterating with the VPE/VP PROD and Sr. Engineers.
Contributors, as they are called, are the 5M+ people around the world who do work on CrowdFlower's platform. The application that enables them to do work is one of the company's most heavily trafficked as well as most complicated, blending a Rails backend with MooTools, jQuery, and RequireJS on the frontend.
The application’s UX
…had largely stayed the same for the previous five years. In Q1/2014, we decided to enhance it by making it more interactive, towards engaging our users more and conveying just how much work there is in our system.
Working with the Product Manager and an external Designer, we came up with the following high-resolution mock:
Because the application is so heavily used, we knew we couldn't merely throw the switch on a new design overnight, both from a community-management standpoint and an application-performance one. Instead, we chose a strategy that was a first at the company: using A/B testing to determine a design that would perform as well as, if not better than, the original.
Our key metric in that regard had to do with contributors' performance after being exposed to the new UX, particularly the messaging around our forthcoming gamification and introduction of Levels. In the beginning, we did not have the infrastructure to determine the value of that metric, so we simply settled on 'clicks' as a (conversion) proxy to understand whether the new design was having an impact.
Infrastructure
Without an A/B testing framework in place, I needed to choose one. As requirements were not yet concrete, I did some due diligence vetting several options, writing up a review of A/B testing frameworks for Rails.
It became obvious that Vanity was best suited to our needs. (Since it doesn't yet have the ability to throttle the percentage of traffic receiving experiments, I augmented it with Flipper.)
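A minimal sketch of how the two compose, based on Vanity's experiment DSL and Flipper's percentage-of-actors gate (experiment, metric, and feature names here are illustrative):

```ruby
# experiments/contributor_dashboard.rb — Vanity experiment definition
# (assumes a matching `metric "Clicks"` defined under experiments/metrics/)
ab_test "Contributor dashboard" do
  description "Control vs. redesigned contributor dashboard"
  metrics :clicks
  alternatives :control, :redesign
end

# One-time setup: let only 10% of actors into experiments at all.
Flipper.enable_percentage_of_actors(:dashboard_experiments, 10)

# app/controllers/dashboard_controller.rb (hypothetical controller)
class DashboardController < ApplicationController
  def show
    # current_user must respond to #flipper_id for the actor gate.
    @variant =
      if Flipper.enabled?(:dashboard_experiments, current_user)
        ab_test(:contributor_dashboard)  # Vanity assigns and persists the alternative
      else
        :control
      end
  end
end
```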
Once that was in place, we could begin iterating on the design, knowing with confidence how we were impacting the user experience.
Server-side
We knew we wanted the experience to be snappy, but completely replacing the existing experience with a Rich Internet Application was far out of the scope for the first month, particularly as there were infrastructure changes to be made to retrofit the stack with A/B Testing. We decided to make progress iteratively over several sprints.
In our first test, we pitted the control (original) against a bare-bones implementation of the high-resolution mock as the new design.
original
The new version out-performed control (in terms of clicks) 21.3% vs 20.3% (at 95% confidence) so I continued to iterate on the implementation, coming up with the following
Calculating the overall satisfaction of other contributors with a task (denoted by the stars) proved too inefficient in this iteration; that variant wound up losing.
Client-side
On the assumption that we needed to make the experience snappier in order to drive engagement, it was obvious that we would need more (and faster) interaction and, therefore, an interactive client-side implementation.
As what was essentially a completely parallel product, leveraging only some of the infrastructure the server-side rendition was utilizing, I began to flesh out the following
Further refinement (and actual data) was necessary to get it looking more like the high-res mock (and like its server-side-rendered peer)
At this point, we implemented and integrated our own homemade badging solution, beginning to display badges in the following iteration
Testing the impact of particular messaging was also of interest, so we added a Guiders variation as well. At this time we also leveraged Google Analytics Events on the Guider buttons to track how far the user got in our messaging.
Letting the experiments run a few days with sufficient traffic, we found that the client-side-rendered version performed no worse than the server-side-rendered version (23.9% vs 22.9%) and that having guiders also performed not significantly worse (23.1% vs 23.7%), so we decided to keep both.
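For intuition on these significance calls: under the hood this is a comparison of two conversion proportions. A back-of-envelope version of the same check (the visitor counts below are placeholders, not our real traffic):

```ruby
# Two-proportion z-test: is the variant's click rate really above control's?
def z_score(clicks_a, visitors_a, clicks_b, visitors_b)
  p_a = clicks_a.to_f / visitors_a
  p_b = clicks_b.to_f / visitors_b
  pooled = (clicks_a + clicks_b).to_f / (visitors_a + visitors_b)
  (p_a - p_b) / Math.sqrt(pooled * (1 - pooled) * (1.0 / visitors_a + 1.0 / visitors_b))
end

# 21.3% vs 20.3% only clears |z| >= 1.96 (95% confidence) at high volume:
puts z_score(10_650, 50_000, 10_150, 50_000)  # => ~3.9 with these assumed counts
```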
By that time, the new version was out-performing control (the original design) 22.2% vs 20.7% (at 99% confidence), so a decision was made to roll the new experience out to 100% of contributors, doing some polishing (copy/styling) work before finally settling on the following
Results
Used A/B testing to upgrade the company's most highly-trafficked page (5+M views/month), increasing user engagement by 5% and saving $2K/month (in Bunchball costs) by rolling our own simple badging solution.