In my previous role as SVP-Engineering at Gojek, I often thought about what it took for engineering teams to deliver world-class execution consistently. Everyone speaks about the common practices: engaging the right teams on the right problems, hiring top-tier talent, and prioritising so that teams focus on business-critical problems. Apart from these, I have learnt that the best engineering organisations share three characteristics that manufacture success:
Visibility into ways of working to move fast on feature delivery
(close to) Predictable culture to reduce friction and tribal knowledge
Great tooling to enable teams to focus on critical problems
This matters even more today, when teams must meet planned outcomes on time while working with many moving parts in a distributed, remote setup. But this is easier said than done. What really prevents top-tier talent from delivering world-class execution?
Over time, and after many learning moments at Gojek, we built an internal platform to measure engineering performance. This helped us ensure we were achieving the desired outcomes on time. Here are the biggest challenges I have seen engineering teams face on the way to world-class execution:
Finding the accelerator pedal: The software development process in organisations has many stages, each with its own complexities. Brainstorming sessions lead to precise user stories, which then head into development. Code is then tested and finally deployed to production. Each stage of this software development life-cycle (SDLC) is intricate and requires special focus. As a manager or team lead in charge of multiple services, it is impossible to keep track of every piece of code developed and deployed across your teams.
How, then, do I understand where to push to speed up feature delivery? How do I ensure my teams are focussing on the right priorities? How do I ensure we are shipping fast with high quality? And finally, while my teams are figuring out these myriad questions, how do we ensure sustainability and prevent burnout? Lack of visibility into execution progress leads to unforeseen bottlenecks. These can often cause delays or failures in new feature development and launches.
Keeping the lights on: While our team grew from 15 to 100 and beyond, our services grew much faster. Within a couple of years, we were running hundreds of services. The top ones were always in focus, but many services remained untouched for months at a time. Services tend to become snowflakes due to ageing, different teams working on services at various points, and inconsistent standards and practices.
When such a snowflake service fails, it is hard to debug and correct quickly. Regular maintenance and upgrades take longer than anticipated. Lack of common patterns leads to higher cost and bandwidth allocation. The cost of an old service breaking the system and causing a poor customer experience is much greater than the lost revenue alone. It takes time away from higher priority tasks, and engineering teams spend countless hours chipping away to find the issue.
Finding the needle, before moving it: Dev productivity tools have proliferated in the market over the last decade. Each stage of the SDLC today has tens of tools to help teams write better code, test more, deploy faster, and monitor releases. Each tool has its own independent analytics, with data that helps in understanding the effectiveness of that function. As a result, the data from these engineering systems and tools is very fragmented.
Searching for the right dataset to look at is a real issue: there isn't a holistic global view of the delivery pipeline. Powerful insights on system performance aren't available without combining and viewing datasets together. Engineering and product managers spend hours wrangling spreadsheets, trying to derive correlations that deliver insights.
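To make the point about combining datasets concrete, here is a minimal sketch in Python. It assumes two hypothetical exports, one from a code hosting tool and one from a deployment tool, that share a commit SHA. The field names, services, and timestamps are invented for illustration and do not describe the platform we built at Gojek.

```python
from datetime import datetime
from statistics import median

# Hypothetical exports from two separate tools, keyed by commit SHA.
# All field names and values are illustrative assumptions.
merges = [
    {"sha": "a1", "service": "payments", "merged_at": "2021-03-01T10:00:00"},
    {"sha": "b2", "service": "payments", "merged_at": "2021-03-02T09:00:00"},
    {"sha": "c3", "service": "search", "merged_at": "2021-03-01T12:00:00"},
]
deploys = [
    {"sha": "a1", "deployed_at": "2021-03-01T16:00:00"},
    {"sha": "b2", "deployed_at": "2021-03-03T09:00:00"},
    {"sha": "c3", "deployed_at": "2021-03-01T14:00:00"},
]


def lead_times_by_service(merges, deploys):
    """Join both datasets on SHA; return median merge-to-deploy hours per service."""
    deployed_at = {
        d["sha"]: datetime.fromisoformat(d["deployed_at"]) for d in deploys
    }
    hours = {}
    for m in merges:
        if m["sha"] not in deployed_at:
            continue  # merged but not yet deployed, so no lead time to record
        delta = deployed_at[m["sha"]] - datetime.fromisoformat(m["merged_at"])
        hours.setdefault(m["service"], []).append(delta.total_seconds() / 3600)
    return {service: median(values) for service, values in hours.items()}


result = lead_times_by_service(merges, deploys)
print(result)
```

Neither export can answer "how long does a merged change take to reach production?" on its own. The answer only appears once the two are joined, which is the kind of work that otherwise ends up in a spreadsheet.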
The bottom line: In 2018, Stripe estimated developer inefficiency to have a $300 billion impact on global GDP. The leading reasons? Maintaining legacy systems, poor prioritisation of tasks and building custom code patches. As the need for technology outgrows our ability to recruit engineering talent, this problem is bound to become starker. Tech companies will need solutions that streamline their system performance and effectiveness.