Our site is more resilient since using Splunk.
The engineering team’s limited visibility across Rent the Runway’s complex microservices architecture led to outages and disruptions in the customer experience.
With Splunk, Rent the Runway has complete visibility across the company’s complex multi-cloud landscape in a single console, helping teams improve MTTR by 94%, prevent unplanned downtime and offer exceptional customer experiences.
In 2009, Rent the Runway disrupted the trillion-dollar fashion industry by pioneering the “Closet in the Cloud,” offering customers an unlimited range of designer styles to rent, wear and return — or keep. Publicly traded on Nasdaq, the first and largest shared designer closet has come a long way from its startup days when it simply rented cocktail dresses for special occasions. Now offering clothing and accessories for every day and every occasion via a la carte rental, subscription and resale, Rent the Runway offers hundreds of thousands of women access to designer styles from hundreds of brands via its website and app.
As Rent the Runway scaled, it needed stronger application performance monitoring. With its focus on reverse logistics, the company had built a highly efficient warehouse management system, but as operations expanded rapidly, keeping them running smoothly and minimizing downtime became more important. Rent the Runway turned to Splunk several years ago to upgrade its application performance monitoring. By investing in improved observability, the company has gained deeper insight into application performance and is better able to identify bottlenecks and address issues proactively. The upgrade significantly improved application uptime and made warehouse management more seamless.
With Splunk, Rent the Runway’s engineering, application operations and infrastructure teams can find and fix problems faster, which keeps digital systems resilient so the company can continue to serve customers seamlessly and efficiently.
As a circular logistics operation, Rent the Runway has had a unique business model since its startup days. “In traditional e-commerce, you send out the goods and they don’t come back,” says Stephanus Meiring, VP of engineering for Rent the Runway. “The majority of what we send out comes back to us, and we then need to get it ready to go out again to another customer.”
To keep operations running seamlessly, Rent the Runway relies on dozens of complex services across its multi-cloud architecture to keep tabs on everything from user journeys on the brand’s website to garments that need repairs or a stain treatment. With instant visualizations on Splunk dashboards, teams have a one-stop shop for critical metrics across the company’s sprawling environment, which enables them to identify and repair problems before they affect the customer experience.
“If a customer can’t get a dress or is having issues checking out due to a bug in the experience, we need to know so we can address it quickly,” says staff software engineer Shane Ryan. “We’ve leveled up our monitoring game in the last four to five years with Splunk. We no longer have to wait for our user systems to alert us to issues, as we’ve seen the value of alerting across our front and backend systems. Now we can get ahead of the issues — and when there is an incident, it doesn’t have to be all hands on deck.”
Before, it wasn’t uncommon to need two dozen developers on a call when an incident occurred. Aki Yamada, staff engineer with Rent the Runway from the early days, recalls: “Before we started using Splunk, every resolution was bespoke — logging into production machines to analyze logs and run scripts — but Splunk enables us to answer questions about application history with simple queries.”
Now with full visibility across warehousing and consumer apps, teams can monitor what they need to manage and involve fewer people for incident resolution. The result: increased customer satisfaction — and an improved employee experience. “I don’t remember the last time someone has woken up over Thanksgiving to deal with an outage,” says Meiring. “Holidays used to be tumultuous from a tech perspective due to increased customer demand. Since we’ve upped our usage and adoption of Splunk, we haven’t had a single major outage, and the last critical incident was resolved in less than 15 minutes.”
Before Splunk, teams relied on a patchwork of tools for infrastructure monitoring, which increased operational risk. Even a 10-minute outage could potentially have a detrimental impact on workload and the customer experience. Now with Splunk Synthetic Monitoring, engineers save time by setting up integrations with CI/CD platforms to roll back buggy releases before they impact customers or the company’s warehouse operations.
While teams previously relied on manual processes to remediate an issue, Splunk’s unified security and observability platform immediately sets remediation in motion when something goes wrong. “We think observability is a way to show our loyalty to our customers,” says Matt Pumphrey, staff engineer at Rent the Runway.
With Splunk, Rent the Runway has more digitally resilient systems that have helped teams fix problems faster and shaped exceptional customer experiences. “Splunk has helped drive our successes,” says Meiring. “By having everyone on the same page, we can focus on business improvements. We have seen the impacts in our engineering team: We believe that our site has never been more resilient.”
Rent the Runway looks forward to applying Splunk solutions to more use cases across the company, including expanding machine learning and artificial intelligence in incident management, with the aim of further improving customer experiences and giving valuable time back to teams. “By picking up the real signal from the noise, our goal is to get to the root cause of problems even faster,” says Pumphrey.
The company is also eager to make teams more autonomous through self-service, so that new engineers can get up to speed quickly and spin up new services within a matter of weeks. “Splunk’s tools have given us a better ability to empower our engineers, which has driven our culture to a better place,” says Meiring.