We have all been there. You are trying to buy a ticket and the app crashes, or you are booking a holiday and the next web page takes forever to load and appears to hang. Our frustration level increases and, if it continues, we exit and go elsewhere. With banking apps we won’t move straight away, but repeated bad experiences will be remembered and will eventually make us move.
The reliance on apps and web platforms, as well as digital transformation as a whole, has accelerated rapidly over the last few years, with the pandemic being perhaps the biggest driver of this change. From searching for information, getting the latest train times, weather forecasts, and traffic alerts to buying cinema, theatre and plane tickets and paying bills through banking apps, these web apps and platforms are varied and span all elements of our lives today. When they stop working, crash or are slow, we can’t complete what we wanted to do. B2B platforms are no different; if they are not working, or have performance issues and errors, employees can’t complete the tasks they need to do.
The impact on your business can be high - from users exiting the platform without buying, or an outage that stops all transactions, both of which result in significant drops in revenue, through to the high cost, time and resources involved in finding and fixing the issue. There are other impacts to consider too - when users can’t transact online they might call the call centre, where there is likely to be a cost to service each call as well as longer waits for customers to get through. Customers will remember their experiences, particularly if they happen more than once, and can easily decide that next time they will use a competitor, where possible, rather than come back. That customer is then lost indefinitely.
Issues with B2B platforms can fundamentally stop businesses from running too - for example, problems with a logistics platform can stop customer orders from being fulfilled and cause delivery SLAs to be missed, which can result in fines.
Our expectations as users are very simple:
However, delivering these simple aims is a huge challenge for the teams that build and manage these platforms. Environments today are extremely different even from those of a few years ago; everything is typically in the cloud, spread across thousands of microservices and containers, utilising the latest tech innovations and heavily reliant upon third-party services. The pace of change is fast, with potentially multiple releases per day managed by large, distributed teams. These environments are also ephemeral, meaning that components could last for only a few minutes before being destroyed. These approaches provide multiple advantages, a key one being the ability to deliver innovation to your customers quickly. Marketing events can be coordinated with app and web platform updates to drive more users to your platform, for example. This constant change, though, increases the risk of something going wrong or breaking, and that can undo all of the innovation in an instant: your users do not see the benefit, at least not immediately, and before you know it, developers are spending time troubleshooting and fixing issues rather than innovating.
And finally, all of this tech has to work so that your users can execute their transactions on your platforms, and sadly they do not care about the complexity that sits behind those platforms or indeed how they are managed. This is why it is so important to observe and understand the user’s experience of the platform. It provides the key indication of whether you have a problem that needs attention because it is affecting users. The last thing you want is to hear about platform issues via your users and social media! Adding to the mix of challenges, most monitoring and observability is focused on the back-end application and infrastructure and not on the users and their experiences.
First of all, let’s define what the user experience actually is: imagine you are on a web platform to search for and buy a holiday - you will load URLs, click on links, add items to a basket and so on, to purchase that holiday. Perhaps as you go through the buying journey, you encounter numerous challenges, ranging from the page taking a long time to load in your browser through to not being able to select the holiday package you want because of an error. Maybe you are able to get to the payment page but you can’t make the payment. Alternatively, you may have had no issues and everything was great. Either way, this journey, good or bad, is what we mean by the experience of the platform. Having visibility of the users’ experience means that you can immediately understand how the platform is performing for your users and answer that critical question of whether you have a problem that needs to be fixed. The same is true for mobile apps - understanding how they are performing for the user and whether they are crashing is key to knowing the user experience of them.
This visibility provides answers to key questions about your platform:
The front-end is in itself a complex engine with multiple moving parts, and addressing its performance issues can be an extremely difficult task. What’s more, users access platforms and the front-end through many different browser types and versions, running on different platforms and OSs, as well as through mobile apps downloaded to Android and Apple devices, again with different versions of the underlying OS.
There are hundreds of variables that can impact performance and, combined with a lack of visibility, they make it difficult to prioritise what to fix and how to do so. How the page is designed, constructed and built in the browser, complex JavaScript components, a variety of images, third-party providers including content delivery networks, and how the front-end interacts with the back-end all combine into a cocktail of front-end complexity, with multiple variables to manage.
Engineering teams may also be prioritising other issues on the platform and, in many cases, can be unaware of issues impacting their end users due to a lack of visibility. Monitoring the back-end alone doesn’t provide a complete picture of the app’s health and user experience. A great example of why front-end and user visibility is needed was the Fastly CDN outage and you can read more about that here.
So what is needed is the ability to observe and monitor the user's experience and the front-end. Gartner defines this part of Observability as DEM or digital experience management, although over the years it has been called many names - end-user experience management, user experience management, digital experience monitoring etc. - and these are still used today. It is a major component of full-stack visibility and ties in nicely with metrics, traces and logs - more about this later on - which are the three pillars of observability. Splunk provides two key approaches to observe the front-end and user experience:
The visibility above can also be metricised so that you can track key metrics about user performance, including page response time and the web vital metrics. The front-end is also linked to the back-end, as each front-end call is traced through to the back-end, with further drill-downs into the relevant logs, thus speeding up troubleshooting and solving issues.
The Splunk O11y platform has a comprehensive DEM solution that provides visibility into the front-end and the user experience:
This visibility is easily provided without the need to install anything within the platform. From the screenshot below, you can quickly see the availability or uptime of your platform and how it has been performing over time. Has there been an outage that needs further investigating or a blip in performance over the last 30 days? If there has, we can drill down into the details as to why.
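To make the idea of availability over time concrete, here is a minimal Python sketch of how uptime could be derived from the results of a periodic synthetic check. It is an illustration only, not how the Splunk platform computes it: the `CheckResult` record and the functions `availability_percent` and `outage_windows` are hypothetical names introduced for this example, and the sketch assumes each check run simply reports pass or fail.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CheckResult:
    """One run of a hypothetical synthetic uptime check (assumed shape)."""
    timestamp: datetime
    ok: bool            # True if the check succeeded
    duration_ms: float  # response time observed by the check


def availability_percent(results: list[CheckResult]) -> float:
    """Return the share of successful runs as a percentage."""
    if not results:
        raise ValueError("no check results supplied")
    passed = sum(1 for r in results if r.ok)
    return 100.0 * passed / len(results)


def outage_windows(results: list[CheckResult]) -> list[tuple[datetime, datetime]]:
    """Group consecutive failed runs into (first_failure, last_failure) windows."""
    windows: list[tuple[datetime, datetime]] = []
    start = None
    last = None
    for r in sorted(results, key=lambda r: r.timestamp):
        if not r.ok:
            if start is None:
                start = r.timestamp  # a new run of failures begins here
            last = r.timestamp
        elif start is not None:
            windows.append((start, last))  # a passing run closes the window
            start = None
    if start is not None:
        windows.append((start, last))  # failures continued to the final run
    return windows
```

Feeding in a month of five-minute check results would give a single uptime figure plus the specific windows worth drilling into, which is the kind of question the availability view is there to answer.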
By drilling down into a synthetic test, you can get much deeper insight as depicted in the screenshot below. You can see immediately the components that make up the page, the key Google web vital metrics as well as links to the back-end traces for each of those components, therefore linking the front-end to the back-end for quicker troubleshooting. Any issues here can be acted upon and addressed, quickly and efficiently.
Don’t forget that this visibility can be provided for API tests as well as simple uptime tests.
Once RUM is deployed, the performance of the browser-based or mobile app is captured and reported. The screenshot below shows this visibility for a browser-based app and you can quickly see which pages are being used, how frequently they are being accessed and their performance, with a detailed analysis of the core web vitals of each page. Any changes to the platform can be visualised to see if there is a negative impact on the front-end and the users.
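As a rough illustration of what an analysis of core web vitals involves, the Python sketch below rates a page against Google's published thresholds for LCP, INP and CLS, evaluated at the 75th percentile of user samples as Google recommends. The function names and the input shape (a dictionary of raw samples per metric) are assumptions made for this example; it is not a description of how Splunk RUM stores or scores the data.

```python
from __future__ import annotations

import math

# Google's published thresholds: (upper bound for "good", upper bound for "needs improvement").
# LCP and INP are in milliseconds; CLS is a unitless score.
THRESHOLDS = {
    "LCP": (2500.0, 4000.0),
    "INP": (200.0, 500.0),
    "CLS": (0.1, 0.25),
}


def percentile_nearest_rank(values: list[float], pct: float) -> float:
    """Return the pct-th percentile using the nearest-rank method."""
    if not values:
        raise ValueError("no samples supplied")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]


def rate(metric: str, value: float) -> str:
    """Classify a single value as good, needs improvement or poor."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"


def rate_page(samples: dict[str, list[float]]) -> dict[str, tuple[float, str]]:
    """Rate each metric of a page at the 75th percentile of its samples."""
    report = {}
    for metric, values in samples.items():
        p75 = percentile_nearest_rank(values, 75)
        report[metric] = (p75, rate(metric, p75))
    return report
```

Comparing the report for a page before and after a release is one simple way to see whether a change has had a negative impact on users.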
You can drill down further into any page, to get further info on it either to help troubleshoot an issue or for performance optimisation. By drilling down into the checkout page, there is a range of additional info that can be seen - from where the users are coming from (to determine if there is a location-based issue) through to which browser and OS type and version the user is using to access the page. You can understand whether a problem is affecting all users and locations or a particular browser type.
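The question of whether a problem affects everyone or only one browser or location comes down to segmenting page views by a dimension and comparing error rates. The Python sketch below shows the idea under stated assumptions: each page view is a plain dictionary with hypothetical `browser` and `error` fields, and a segment is flagged when its error rate is at least twice the overall rate. Both the field names and the factor of two are illustrative choices, not anything taken from the product.

```python
from __future__ import annotations

from collections import defaultdict


def error_rate_by(views: list[dict], dimension: str) -> dict[str, float]:
    """Return the error rate per value of the given dimension (e.g. 'browser')."""
    totals: dict[str, int] = defaultdict(int)
    errors: dict[str, int] = defaultdict(int)
    for view in views:
        key = view[dimension]
        totals[key] += 1
        if view["error"]:
            errors[key] += 1
    return {key: errors[key] / totals[key] for key in totals}


def segments_above_overall(
    views: list[dict], dimension: str, factor: float = 2.0
) -> list[str]:
    """Return segments whose error rate is at least `factor` times the overall rate."""
    if not views:
        return []
    overall = sum(1 for v in views if v["error"]) / len(views)
    if overall == 0:
        return []  # nothing is failing, so nothing stands out
    rates = error_rate_by(views, dimension)
    return sorted(key for key, rate in rates.items() if rate >= factor * overall)
```

If no segment stands out, the problem is probably affecting all users; if one browser or country is flagged, that narrows down where to look first. The same function can be called with a different dimension, such as a hypothetical `country` field, to check for a location-based issue.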
From this view, you can pivot into a full user session analysis and measure the customer impact of every resource, image, route change and API call, with complete visibility across every user session. Session replay provides deeper visibility into the user’s experience, with video reconstructions that are correlated with the session waterfall, so you can truly understand what the user was doing.
From working and learning with our customers, I have put together some recommended best practices below which go through a suggested approach to achieve this front-end and user visibility:
Try Splunk O11y for yourself by signing up for a free trial. Check out the links below for some great further reading: