In Part 1 of Dashboard Design, we reviewed dashboard layout design and provided some templates to get started. In Part 2, we’ll walk through how to choose the right visualization type for your data and how to choose colors for your visualizations.
There are many visualization types and configurations to choose from. In general, keep your visualizations as simple and straightforward as possible to avoid distraction and highlight only the most important information. Too much unnecessary information on the page can be overwhelming and can misdirect focus to unimportant details.
Charts with time on the horizontal axis typically display time series data. They represent data over a period of time and are useful for understanding trends, highlighting anomalies, and comparing multiple series.
A basic time series chart can be displayed using the `timechart` command. For example, if you’d like to investigate which days of the week have the most sales:
index=_internal | timechart span=1d count
A line chart is the best time series chart to understand continuous trends.
To determine whether a certain person had any noticeable differences in sales:
index=_internal | timechart span=1d count by user
A line chart is the best time series chart to understand and compare trends.
Stacked charts represent the accumulation of more than one data series. If one data series is of central importance, position it directly on the axis so that its development over time is easiest to see. Taking the example above, a stacked chart combines both earlier charts into one: it shows how many sales are made in total each day, with an additional breakdown by user for further investigation of the trends:
A column chart is the best time series chart to understand discrete data points, especially when the values are summed.
If we don’t care to see the total number of sales, but want to clearly compare who is making the most sales per day, we can use a stacked chart because the accumulation of all data adds to a whole in this case:
An area chart is the best time series chart to understand continuous quantities.
Bar charts are typically used to compare data from one period or point in time across multiple categories. Because the bars share an axis, the categories are easy to compare against a common baseline.
A basic categorical chart can be displayed using the `stats` command. For example, if you’d like to determine which product had the most sales last year:
index=_internal | stats count by product
If you’d like to add a second dimension, the `chart` command is useful. For example, to compare sales across products, with an additional breakdown by person:
index=_internal | chart count by user product
Only use a pie chart if you have a single series and would like to highlight how the partial categorical elements add up to a whole.
For example, we can highlight the percentage Mary contributed to sales last year:
index=_internal | stats count by user
To see how a single element measures up to one or more thresholds, you can use a single value radial or a gauge.
For example, display the current sales compared to the sales goal for the year:
index=_internal | stats count
Or, if you would also like to gauge against the amount required to match costs and highlight the profit:
If you can, use simple single values to display metrics as they communicate data in a straightforward way.
Correlation charts are used to investigate the relationship between two or more variables.
The most useful visualization for displaying the relationship between two quantitative variables is a scatter chart. For three quantitative variables, bubble charts or parallel coordinates are good choices. For example, you can investigate possible correlations between number of stores, sales made, and cost. In the following bubble chart, we can see “Temperature” and “Sales” are linearly correlated for “Product 1”, but they are not correlated for “Product 2”.
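If you want a number to back up what a scatter or bubble chart suggests, a correlation coefficient is one option. The following is a minimal Python sketch, not part of the original dashboard, that computes Pearson's r for two series. The sample values are made up for illustration and are not the data behind the chart described above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical values, chosen only to illustrate the two cases.
temperature = [10, 15, 20, 25, 30]
product_1_sales = [20, 30, 40, 50, 60]  # rises in step with temperature
product_2_sales = [40, 10, 50, 10, 40]  # no linear relationship

print(pearson_r(temperature, product_1_sales))
print(pearson_r(temperature, product_2_sales))
```

A value near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates little or none. Pearson's r only captures linear relationships, so it complements the chart and does not replace it.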
Charts with links are a great way to show relationships between entities. Sankey diagrams are best used to show amounts moving between specific entities such as money flow between countries, data amounts between applications, and more.
Alternatively, link graphs show connections between distinct values in various fields. The aggregated node values allow users to quickly visualize highly connected data points that may be otherwise overlooked.
Maps and choropleth SVGs can be used to emphasize relationships between elements in certain locations or areas.
A map is used to show geospatial data. Bubble layers are a great way to visualize count distributions across geographic locations, while marker layers identify specific locations. Alternatively, choropleth layers visualize relationships between elements in certain areas if counts can be aggregated. Additional tooltip fields can be helpful for adding labels or other qualitative data about specific markers or regions.
If you have a custom shape, such as a building floor plan, any custom Scalable Vector Graphic (SVG) image can be used to display the corresponding contextual data.
This section introduces best practices for choosing colors for your visualizations, which comes down to finding a mapping from data values to colors. The design choices for that mapping depend on whether the data attribute is categorical or ordinal. Ordinal attributes can be further distinguished by whether the data is sequential or diverging. Colors can be understood through three channels: luminance, hue, and saturation[1]. Luminance and saturation are magnitude channels, which makes them effective for encoding ordinal data. Hue, in contrast, is an identity channel, which is very effective for representing categorical data and showing groupings.
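To make the three channels concrete, here is a small Python sketch that splits a hex color into hue, lightness, and saturation using the standard library `colorsys` module. It is an illustration only: the function name `describe_color` is hypothetical, and the HLS lightness value is a rough stand-in for luminance, not a perceptual measurement.

```python
import colorsys


def describe_color(hex_color):
    """Split an sRGB hex color such as '#ff0000' into HLS components."""
    hex_color = hex_color.lstrip("#")
    # Convert each two-digit hex channel to a float in the range 0..1.
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4))
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return {
        "hue_deg": round(h * 360),   # identity channel: which color it is
        "lightness": round(l, 2),    # magnitude channel (approximates luminance)
        "saturation": round(s, 2),   # magnitude channel
    }


# A pure red and a darker red share a hue and differ only in lightness.
print(describe_color("#ff0000"))
print(describe_color("#800000"))
```

Two colors with the same hue and different lightness read as "more" or "less" of the same thing, which is why the magnitude channels suit ordinal data. Two colors with different hues read as different things, which suits categories.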
Categorical colormaps use different color hues to encode categories. Hue is one of the most effective visual encodings for categories, compared with alternatives such as shape or size. The following is an example of mapping the three categories “California,” “Washington,” and “Oregon” into blue, green, and yellow colors.
Studies show that the number of discriminable colors for coding small, separated regions is usually limited to between six and twelve bins. Choosing colors from opponent color axes is recommended, and the chosen colors should be close in luminance. Avoid using a large number of colors, especially when the colored regions are small and not contiguous. In the following example[2], 21 colors are used to encode chromosomes. The colors are distinguishable when the regions are large and similar colors are next to each other (right), but only twelve bins of colors can be distinguished from each other when used in the small regions (left).
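The two guidelines above, a limited number of colors and similar luminance, can be checked with a few lines of code. The sketch below uses the relative luminance formula from the WCAG 2.x definition for sRGB colors. The function `check_palette` and its `max_spread` threshold are assumptions made for this example, and the limit of twelve colors is taken from the upper bound mentioned above.

```python
def relative_luminance(hex_color):
    """Relative luminance of an sRGB hex color (0.0 = black, 1.0 = white)."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve, as specified in the WCAG 2.x definition.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def check_palette(palette, max_colors=12, max_spread=0.2):
    """Return a list of warnings for a categorical palette.

    max_spread is a hypothetical threshold for this sketch, not a standard.
    """
    warnings = []
    if len(palette) > max_colors:
        warnings.append(f"{len(palette)} colors exceeds the limit of {max_colors}")
    values = [relative_luminance(c) for c in palette]
    spread = max(values) - min(values)
    if spread > max_spread:
        warnings.append(f"luminance spread {spread:.2f} exceeds {max_spread}")
    return warnings


# Black and white differ as much in luminance as two colors can.
print(check_palette(["#000000", "#ffffff"]))
```

An empty list means the palette passed both checks. Treat the result as a quick sanity check and still review the palette visually at the size it will be displayed.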
A sequential colormap encodes a range of values from a minimum to a maximum value. The following marker gauge visualization example linearly maps the increasing bin values of “Custom Churning Rate” to a decreasing lightness of purples.
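The linear mapping described here can be sketched in a few lines of Python. The endpoint colors `LIGHT_PURPLE` and `DARK_PURPLE` are arbitrary values picked for illustration, not the colors used in the visualization above, and `sequential_color` is a hypothetical helper name.

```python
# Hypothetical endpoint colors as (R, G, B) tuples.
LIGHT_PURPLE = (242, 240, 247)
DARK_PURPLE = (84, 39, 143)


def sequential_color(value, vmin, vmax, light=LIGHT_PURPLE, dark=DARK_PURPLE):
    """Linearly map a value in [vmin, vmax] to a color between light and dark."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Normalize to 0..1 and clamp values that fall outside the range.
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))
    # Interpolate each RGB channel separately.
    return tuple(round(a + t * (b - a)) for a, b in zip(light, dark))


# Higher values produce darker purples.
for rate in (0, 25, 50, 75, 100):
    print(rate, sequential_color(rate, 0, 100))
```

Interpolating in RGB is the simplest approach but is not perceptually uniform. If equal steps in the data should look like equal steps in color, a perceptual color space is a better fit.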
A diverging colormap has two hues at the endpoints and a neutral color at the midpoint. For example, the choropleth map below uses blue and red hues to represent increased and decreased revenue, respectively.
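A diverging colormap can be thought of as two sequential colormaps that meet at the neutral midpoint. The following Python sketch shows one way to implement that idea. The three colors and the function name `diverging_color` are assumptions for this example and are not taken from the map described above.

```python
# Hypothetical colors as (R, G, B) tuples.
DECREASE_RED = (202, 0, 32)
NEUTRAL = (247, 247, 247)
INCREASE_BLUE = (5, 113, 176)


def _lerp(start, end, t):
    """Interpolate between two RGB colors, with t in the range 0..1."""
    return tuple(round(a + t * (b - a)) for a, b in zip(start, end))


def diverging_color(value, vmin, vmid, vmax):
    """Map a value to red below vmid, neutral at vmid, and blue above it."""
    if not vmin < vmid < vmax:
        raise ValueError("expected vmin < vmid < vmax")
    # Clamp values that fall outside the range.
    value = min(vmax, max(vmin, value))
    if value < vmid:
        # Distance below the midpoint controls how red the color is.
        return _lerp(NEUTRAL, DECREASE_RED, (vmid - value) / (vmid - vmin))
    # Distance above the midpoint controls how blue the color is.
    return _lerp(NEUTRAL, INCREASE_BLUE, (value - vmid) / (vmax - vmid))


# Revenue change in percent, with 0 as the neutral midpoint.
for change in (-10, -5, 0, 5, 10):
    print(change, diverging_color(change, -10, 0, 10))
```

Passing the midpoint explicitly matters when the data is not symmetric. For example, with a range of -5 to 20, the neutral color should still sit at 0 and not at the middle of the range.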
Color choices should take into account the common problem of color blindness. The following shows an example where a poorly chosen color palette can cause perception issues for people with deuteranopia (red-green color blindness). People with normal color vision can distinguish the diverging red and green colors, but this is difficult for people with deuteranopia.
Red-green color blindness is not limited to telling red apart from green. Depending on the type of color blindness, the confusion can extend to red and black, blue and purple, and brown and green. Tools like Colorblindly (used above) can help you check your color palette. Avoid relying on the hue channel alone to encode information, and avoid mapping information to colorblind-unfriendly colors. If such colors are a strong convention in your field, you can also use other visual channels, such as shape or size, to encode the data.
References:
[1] Tamara Munzner. Visualization Analysis and Design. A K Peters Visualization Series, CRC Press, 2014.
[2] Sinha AU, Meller J. Cinteny: flexible analysis and visualization of synteny and genome rearrangements in multiple organisms. BMC Bioinformatics. 2007.
This blog was co-authored by Tara Yuen and Chuan Wang.