
Fleet First for Better Developer Experience and Faster Software Delivery  

Take a lesson in developer experience, automation, and managing software at scale from Google and Spotify.

The success of platform engineering proved that investing in developer experience boosts software engineering productivity and grows the bottom line. Now, the discipline of fleet management aims to further ease software supply chain management with automation at scale.


Fleet management enables developers to deliver secure, easy-to-manage apps and services faster. It’s a win-win-win for developers, users, and the business. 

 

 

 

What is fleet management?

 

Fleet management is an approach originally developed by Google for managing multiple Kubernetes clusters and groups of clusters as one project, or “fleet.” Through automation, engineers can make a production-wide change to the entire fleet rather than to individual clusters spread across multiple projects.


So what does this mean for developers and the broader business? The idea behind fleet management is to build automation tools that can safely and reliably update thousands of repositories at once, enabling organisations to maintain the health of the software and infrastructure continuously. Spotify, an early adopter of fleet management, calls its approach fleet-first thinking. 
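To make the idea concrete, here is a minimal Python sketch of the planning step such a tool might perform: given a fleet of repositories, work out which ones need a dependency bump. The `Repo` class, the repository names, and the version numbers are hypothetical and chosen only for illustration; this is not Spotify's or Google's tooling, and it assumes purely numeric dotted version strings.

```python
from dataclasses import dataclass


@dataclass
class Repo:
    """Hypothetical, simplified view of one repository in the fleet."""
    name: str
    dependencies: dict[str, str]  # package name -> pinned version


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '2.17.1' into (2, 17, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def plan_fleet_update(
    repos: list[Repo], package: str, target: str
) -> dict[str, tuple[str, str]]:
    """Return {repo name: (current version, target version)} for every
    repository that pins `package` below `target`."""
    plan = {}
    for repo in repos:
        current = repo.dependencies.get(package)
        if current is not None and parse_version(current) < parse_version(target):
            plan[repo.name] = (current, target)
    return plan


# Illustrative fleet; names and versions are made up for the example.
fleet = [
    Repo("checkout-service", {"log4j-core": "2.14.1"}),
    Repo("search-service", {"log4j-core": "2.17.1"}),
    Repo("playlist-ui", {"react": "18.2.0"}),
]

for name, (old, new) in plan_fleet_update(fleet, "log4j-core", "2.17.1").items():
    print(f"{name}: {old} -> {new}")
```

In this sketch only `checkout-service` ends up in the plan. A real fleet tool would go on to open a change in each affected repository, run its tests, and merge or roll back automatically; those steps are deliberately left out here.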

 

 

 

Fleet management drives scalable, secure software 

 

Fleet management is predicated upon the idea that automation is critical to growth and performance. This is especially true for Spotify. With an expanding product and business, the number of components in production can grow to thousands, much faster than developer headcount. In these situations, developer effort often goes to managing and maintaining work that could be automated, like pull requests, version updates, security fixes, and migrations.


Fleet management builds on the idea of an infrastructure platform for developer productivity. It allows teams that build end-user products to concentrate on their core mission. 


Evan Bottcher, Chief Architect at REA Group, gave a great description of the key elements of a digital platform, which are also comparable to those of fleet management:


“A foundation of self-service APIs, tools, services, knowledge, and support, which are arranged as a compelling internal product.”


The comparison makes a lot of sense when considering how fleet management operates. The model emphasizes moving up the stack (as always!), encompassing both development and maintenance tasks. The infrastructure platform team provides the platform on which development teams build their software and also takes on its maintenance, making select parts of the application layer part of the infrastructure platform. This allows developers to focus on producing new and improved code while automated processes push code changes to their components, and across the entire fleet of components, without requiring human interaction.


Automating maintenance removes toil and enables developers to focus on the things that matter more to the business. For example, Splunk’s annual report, State of Observability 2024: Charting the Course to Success, reveals that leading organisations spend about 38% more time on innovation versus maintenance tasks. When developers don’t need to worry about maintaining infrastructure, they can drive innovation and deliver better products. 


According to Spotify, the response was nearly unanimous: 95% of its developers say that software quality has improved with fleet management. Through fleet management, the security fix for the Log4j vulnerability was rolled out in nine hours. The team can push new features faster, internal and external software libraries are updated daily, and updates to the internal service framework used by backend services, which used to take around 200 days, now take fewer than 7 days. Spotify was also able to reallocate 200 developer positions from maintenance to software development.

 

 

Unlocking fleet management success 

 

Building an effective infrastructure platform for fleet management and the organisation to back it up is an ambitious but rewarding goal. It requires greater maturity than directly providing infrastructure for services. Like any bold technological move, lasting success requires large, intentional investments and foundational competencies.

 

Adopting fleet management doesn’t have to involve a major shift in operations, but it does require the appetite and willingness to grow and continuously develop competencies. Scaling has become easier with new technologies and tools that enable automation. From a talent perspective, this approach extends the current skillsets of SREs, platform engineers, software engineers, and data scientists. 

 

The key to success lies in your organisation’s underlying engineering values, principles, and practices. If you’ve already adopted platform engineering, you can build upon it to implement fleet management. Having adopted it is a sign that your engineering culture values developer experience and innovation. It also eases developers’ concerns about letting go of control, because they have already done that once and have seen how successful it can be.

 

It’s also important to consider the size of the codebase. Startups are unlikely to implement fleet management at first but may consider it as the codebase grows. A useful indicator is the level of investment in “keep the lights on” activities that contribute to the daily uptime of the system or application, that is, the point at which maintaining code starts to take up more of developers’ time.

 

To reap the benefits of fleet management, consider a few things upfront. 

 

Articulate the business case. Commitment to an internal developer productivity platform comes down to economics: its efficiency, quality, and time-to-market benefits should outweigh the costs of development, talent, and missed opportunities over its lifecycle. Take into account the capabilities of commercially available tools and services. Build or buy?

 

Get buy-in. As with any major new initiative, the biggest obstacle to fleet management adoption is often getting buy-in across the organisation. Spotify, for example, anticipated a cultural shift as it asked its development teams to let go of some control over tasks that would now be handled automatically. In the end, the developers were very happy with the changes because they were able to see for themselves that automation worked. The ‘why’ was clear, which highlights the importance of showing and communicating the benefits to the team. Leaders can create different channels for dialogue and opportunities to educate teams beforehand, minimizing hesitation, earning developers’ trust, and getting stakeholders excited about the technology and their future.

 

Allow voluntary adoption. Give teams the choice to opt out. Martin Fowler reminds us not to forget that you are building a product designed to delight your customers, which in this case are your product development teams. To keep them happy, be disciplined about product ownership of your infrastructure platform. This also helps keep services loosely coupled, which benefits the platform when services are replaced with new-generation services or commercial offerings. What’s more, when an infrastructure platform organisation depends on development teams’ appreciation of the platform’s benefits, it is under pressure to keep them happy, which guards the team's product-thinking discipline.

 

Standardize code repositories. One crucial step is to get every component up to a base level. There should also be strong consistency in the codebase(s) in terms of frameworks, declarations of dependencies, software bill of materials (SBOM), and manifest declarations. Being in control of your dependencies is key. Ideally, the implementation of declarative infrastructure and platform engineering will have already helped with this. 

 

Adopt continuous integration/continuous deployment. Continuous integration/continuous deployment (CI/CD) is another step toward successful fleet management, as it enables engineering teams to be flexible and make changes often and quickly. Automated testing is one component of that. Before implementing fleet management, engineering teams should have a high level of automated testing, particularly integration testing, to make changes and deploy components without any human interaction. They should also be able to easily and quickly roll back any changes. 

 

Build a best-in-class observability practice. An observability practice acts as a foundation for digital success. An advanced practice goes further, enabling organisations to progress from being reactive and chasing down issues to being proactive, innovative, and forward-thinking, ideally predicting issues before they even happen. These advanced teams can build reliable, consistent frameworks for engineering and IT success. Fleet management adoption can be a natural fit with an advanced observability practice.
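As a rough illustration of the repository standardization step above, the following Python sketch audits a fleet against a base level before any automation is switched on. The required file names (`dependencies.lock`, `sbom.json`, `ci.yaml`) and the repository names are assumptions made up for this example, not a standard; a real baseline would reflect your own frameworks and manifest formats.

```python
# Hypothetical baseline: these file names are illustrative assumptions,
# not something prescribed by any particular tool or by this article.
REQUIRED_FILES = {
    "dependency manifest": "dependencies.lock",
    "SBOM": "sbom.json",
    "CI pipeline": "ci.yaml",
}


def missing_baseline_items(repo_files: set[str]) -> list[str]:
    """Return the names of baseline items a repository does not yet have."""
    return [
        item
        for item, filename in REQUIRED_FILES.items()
        if filename not in repo_files
    ]


def audit_fleet(fleet: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each non-compliant repository to the baseline items it lacks."""
    report = {}
    for name, files in fleet.items():
        missing = missing_baseline_items(files)
        if missing:
            report[name] = missing
    return report


# Illustrative input: repository name -> set of files found at its root.
example_fleet = {
    "checkout-service": {"dependencies.lock", "sbom.json", "ci.yaml"},
    "legacy-reports": {"ci.yaml"},
}

for repo, gaps in audit_fleet(example_fleet).items():
    print(f"{repo} is missing: {', '.join(gaps)}")
```

A report like this gives the platform team a simple list of repositories to bring up to the base level first, since automated fleet-wide changes are only as reliable as the consistency of the repositories they touch.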

 

To learn more about the advanced practices of leading observability teams, including platform engineering, read our report, State of Observability 2024: Charting the Course to Success. It covers how platform engineering is driving IT efficiency and competitive differentiation.
