Last April, we were at Algolia’s Tech Lunch meetup in Paris to talk about visual perception and its applications to the design of data visualizations. You can watch the video or read the notes below.
In The Unreasonable Effectiveness of Mathematics (1980), Richard Hamming introduces the notion of “unthinkable thoughts”:
“Just as there are odors that dogs can smell and we cannot, as well as sounds that dogs can hear and we cannot, so too there are wavelengths of light we cannot see and flavors we cannot taste. Why then, given our brains wired the way they are, does the remark “Perhaps there are thoughts we cannot think,” surprise you? Evolution, so far, may possibly have blocked us from being able to think in some directions; there could be unthinkable thoughts.”
In this talk, we focus on the notion of a cognitive system, defined as a user aided by a cognitive tool. Since reasoning inside our heads alone is very limited, we need to enhance our intellectual abilities by putting information “into the world”, as Don Norman explains in The Design of Everyday Things. Then, by looking at, understanding, and acting on that information with the aid of tools, we refine our mental representation and in turn put more of it “into the world”.

The visual brain is an essential part of this process, especially in data visualization. By understanding its basic mechanisms, we can derive a few good practices to apply to the design of our visualizations.
Visual perception offers the highest-bandwidth channel: we acquire much more information through visual perception than through all of the other channels combined, thanks to our 20 billion neurons dedicated to this task. Moreover, the processing of visual information is, in its first stages, a highly parallel process.
Neurons in the visual areas are attuned to three main distinct channels: form (orientation and shape), color, and motion. For each of these channels, neurons specialise in low-level features such as edge detection, perception of saturation and hue, or speed of motion.
The fovea is the part of the retina that receives the image from the thin cone of vision at the center of our field of view. It is the region where vision is the clearest and most precise.

The first observation we can draw from the arrangement of visual neurons with regard to the field of view is that the mapping of neurons corresponding to the foveal area is very dense.

The second, and more interesting, observation is that the spatial mapping of neurons in our visual brain preserves the structure of the field of view. Roughly speaking, the spatial arrangement of two objects is “equivalent” to the arrangement of the visual neurons that will process them. This explains why our spatial awareness is so good, and why spatial positioning and grouping should be used as the primary way of encoding information in visualizations.
An interesting parallel can be drawn with artificial neural networks for computer vision. Neurons at the top of the figure beside learned to detect edges, while neurons at the bottom learned to recognise color (Krizhevsky et al., ImageNet Classification with Deep Convolutional Neural Networks, 2012). These two sets of neurons were distributed across two different graphics processors, and their recognition tasks were split automatically. This is not much different from the way our visual brain operates, separating channels to increase the parallel processing of simpler, independent tasks.
Preattentive processing occurs before conscious attention. Preattentive features are processed very quickly and in parallel, within around 10 milliseconds. The nice property of preattentive features is that the speed of their recognition and analysis does not depend on the number of distractors around them. In the example above, the red dots can be spotted at a glance. But if you try to count them, the process becomes much slower, because counting requires conscious attention. Shape and size are also preattentive features.
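You can reproduce this pop-out effect yourself. Here is a minimal sketch, assuming Python with NumPy and Matplotlib; the dot counts and colors are illustrative choices, not values from the talk:

```python
import numpy as np
import matplotlib.pyplot as plt

# Scatter a field of gray "distractor" dots with a few red "target" dots.
# The red targets pop out preattentively: spotting them takes roughly the
# same time no matter how many gray distractors surround them.
rng = np.random.default_rng(seed=42)
n_distractors, n_targets = 200, 5

xy = rng.random((n_distractors + n_targets, 2))
colors = ["lightgray"] * n_distractors + ["red"] * n_targets

fig, ax = plt.subplots(figsize=(6, 6))
ax.scatter(xy[:, 0], xy[:, 1], c=colors, s=40)
ax.set_axis_off()
plt.show()
```

Counting the red dots, by contrast, forces serial, attentive scanning, which is exactly the slowdown described above.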
The figure beside shows more preattentive features.
Not all features are equally strong: the strongest effects involve color, orientation, size, contrast, and motion or blinking.

Also, their effectiveness depends on the variety of the distractors, like the gray dots in the previous figure: if a theoretically efficient feature is used but the distractors look very similar, the effect will be weaker.

Although size and position are very effective, there are no strict rules about using one feature or another in particular. Strong visual metaphors also reinforce the understanding of a visualization, which is just as important as its raw “perceptibility”.
Redundant coding is the combination of several features for expressing the same property. In this example, size and color represent the same dimension in the data. Redundant coding, when used in conjunction with preattentive features, further enhances the speed and accuracy of perception.
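As a quick illustration, here is a minimal sketch, assuming Python with Matplotlib; the data and scaling factors are illustrative. A single data dimension drives both marker size and color:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=0)
x, y = rng.random(30), rng.random(30)
value = rng.random(30)  # the single data dimension to encode

fig, ax = plt.subplots()
# Redundant coding: `value` drives both marker size and color,
# so the two preattentive channels reinforce each other.
sc = ax.scatter(x, y, s=50 + 400 * value, c=value, cmap="viridis")
fig.colorbar(sc, label="value")
plt.show()
```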
Visual queries are questions that can be answered through pattern matching. They are formulated when a user looks at a visualization and tries to solve a problem: the user performs a pattern-matching search on the visualization to answer the query.

When a visual query is formulated, it tunes the visual cortex and optimizes it for answering that precise query. This partly explains why we are blind to changes we do not choose to observe, but it also explains why the resulting search is so efficient.

To be effective, a visualization must offer ways to be explored that are best solved by visual queries. The main problems to be solved must stimulate the formation of visual queries that use preattentive processing.
Let’s look at a practical example. On the right is a visualization from Dataveyes’ portfolio, showing job positions. The farther away they are from the current position at the center, the more intermediary positions are needed to reach them. The size of the bubbles represents the number of people who currently occupy a given position.

After conducting user research and prioritising user needs, some queries were chosen to be put forward in the visualization. The user is primarily looking to answer two questions: “starting from my current position, where can I go?” and “are there many people doing this job who moved like me before?”.

These queries need to become visual queries so they can be answered by pattern matching. This is what has been done here: looking for the next positions is encoded by 2D positioning and connectedness, and finding out whether many people hold a given position is encoded by size.
This interface does not help users compare the sizes of the bubbles; this choice is deliberate and based on users’ needs. The interface on the right, another example from our portfolio, shows a better way to stimulate this kind of comparison.
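As a rough illustration of the career-graph encoding described above, and not the actual Dataveyes interface, here is a hypothetical sketch using networkx and Matplotlib. Connectedness and 2D position answer “where can I go?”, while node size answers “how many people hold this job?”; the job names and headcounts are made up:

```python
import networkx as nx
import matplotlib.pyplot as plt

# Hypothetical career graph: edges are possible transitions from the
# current position, node size encodes how many people hold each job.
headcount = {"current": 120, "lead": 45, "manager": 80,
             "architect": 25, "consultant": 60}
G = nx.Graph()
G.add_edges_from([("current", "lead"), ("current", "manager"),
                  ("lead", "architect"), ("manager", "consultant")])

# 2D position + connectedness answer "where can I go from here?"
pos = nx.spring_layout(G, seed=3)
# Size answers "how many people are in this job?"
sizes = [10 * headcount[n] for n in G.nodes]

nx.draw(G, pos, with_labels=True, node_size=sizes, node_color="skyblue")
plt.show()
```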
Memory plays an important role, not only in helping a user remember a set of learned codes for reading a visualization, but also, at a lower level, in aiding the resolution of visual queries. Let’s see how short-term and long-term memory play a role in understanding visualizations, and how you should think about the user’s memory when designing a visualization.
Let’s look at long-term memory first. It can be seen as a graph of concepts, where the activation of one concept fires related ones. New data is stored in the context of existing data, so that the raw amount of information to be memorised is reduced. This means that we rely heavily on our existing knowledge and build new concepts incrementally.
Working memory acts like a buffer that fetches information from long-term memory and receives visual information from the iconic buffer. It can be thought of as a small amount of RAM that retains information from one fixation to the next. It appears to be limited to 3 to 5 simple objects, each described by a set of characteristics such as color, shape, texture, etc. This capacity depends on the complexity of the objects, among other factors.

Instead of seeing visual working memory as separate from long-term memory, it can be seen as the triggering of a subgraph of concepts within long-term memory.
We experience the world in a very rich visual way despite our very limited visual working memory because we mainly interpret the world as we already know it. The actual information we perceive with each fixation is very limited, but it is placed in the context of a well-known world.

Placing visual cues on a visualization that help the user recall a certain context is a good way of dealing with the limits of visual working memory. Also, using very distinctive features will help separate the objects the user is looking for, and will ease his mental load.
Knowledge formation does not always require repetition: a single exposure to a new concept can be enough for it to be understood and retained. New knowledge is formed by building on top of existing knowledge, which is why metaphors work so well: we find explanations and understanding in mechanisms we already know. The use of real-world metaphors such as physics is very powerful because it allows us to transfer existing knowledge to something totally new and to find bridges between concepts.
By drawing a 3D graph of a function whose minimum we are looking for, we introduce a metaphor for the gradient descent algorithm that refers to gravity. This helps the user “feel” how a ball would roll down the function’s surface and into a hole representing a local minimum. Thanks to this metaphor, when implementing the algorithm you would already know that you have to follow the steepest slope for your next iteration, and that you may not find the global minimum.
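To make the metaphor concrete, here is a minimal numerical sketch of gradient descent in Python; the bowl-shaped function, starting point, and learning rate are illustrative choices for this example, not values from the talk:

```python
import numpy as np

# Gradient descent on an illustrative bowl-shaped function
# f(x, y) = x**2 + 2*y**2. The "ball rolling downhill" metaphor:
# each step moves against the gradient, i.e. down the steepest slope.
def grad(p):
    x, y = p
    return np.array([2 * x, 4 * y])

p = np.array([3.0, -2.0])   # starting position on the surface
learning_rate = 0.1

for step in range(50):
    p = p - learning_rate * grad(p)  # follow the steepest descent

print(p)  # close to the minimum at (0, 0); on a non-convex surface,
          # the ball could instead settle in a local minimum
```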
Epistemic actions are actions taken by the user to resolve a visual query. The simplest epistemic action is eye movement: by moving our eyes, we discover information bit by bit. Epistemic actions also include user interactions: when a user zooms in on a map, hovers over an element to reveal a tooltip, drags a timeline, etc., he takes actions to narrow his search and find information.
Since our visual working memory is limited, epistemic actions should provoke changes that are minimal yet sufficient. For example, it is not necessary to switch to a new page to get details about a piece of content if hovering is enough. Likewise, zooming on a map should not make the interface blink or refresh, because all of the context, and the contents of visual working memory, would be lost. That is why we should design interfaces that encourage non-destructive actions and that respect established interaction conventions: hovering, scrolling, zooming, panning, etc.
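As an example of such a minimal, non-destructive action, here is a sketch of a hover tooltip, assuming Python with Matplotlib. The event handling is standard Matplotlib, but the data and labels are illustrative. Details appear in place, without refreshing the scene:

```python
import numpy as np
import matplotlib.pyplot as plt

# A hover tooltip that reveals details without redrawing the scene,
# so the user's visual context (and working memory) is preserved.
rng = np.random.default_rng(1)
x, y = rng.random(20), rng.random(20)

fig, ax = plt.subplots()
points = ax.scatter(x, y)
tooltip = ax.annotate("", xy=(0, 0), xytext=(10, 10),
                      textcoords="offset points",
                      bbox={"boxstyle": "round", "fc": "white"})
tooltip.set_visible(False)

def on_hover(event):
    contains, info = points.contains(event)
    if contains:
        i = info["ind"][0]
        tooltip.xy = (x[i], y[i])
        tooltip.set_text(f"point #{i}")
        tooltip.set_visible(True)
    else:
        tooltip.set_visible(False)
    fig.canvas.draw_idle()  # lightweight redraw, not a full refresh

fig.canvas.mpl_connect("motion_notify_event", on_hover)
plt.show()
```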
In the introduction, we mentioned cognitive systems formed by a user and a visualization, in particular an interactive visualization. The figure beside summarizes the concepts we have introduced so far:
In this example from Dataveyes’ portfolio, the action of sliding changes the visualization mode. It allows users to explore different simulation modes for monitoring the energy consumption of a home.

Even though sliders are best suited for moving through quantitative rather than qualitative data, the slider works here thanks to the metaphor of morphing or modulation, suggested by the blue blades reshaping around the clock.

This example shows an effective combination of metaphors and epistemic actions, encouraging the user to discover the data set in a playful way.
We have seen how some of the perception mechanisms of our visual brain operate, and how to get the most out of them by respecting their features. But perception is only part of what makes a data visualization effective. It also needs to be useful, complete, true, and engaging.
We have seen that the formation of knowledge and understanding is incremental, because of how knowledge is stored in long-term memory. This raises the question of balancing the effectiveness of a well-established representation against the introduction of new paradigms that may be harder for users to get into.
At Dataveyes, we think that if we respect the good practices presented in this talk, and if the visualization is useful enough for its intended users, then new types of visualization can be introduced. Complex data sets and specific problems or areas require visualizations with a degree of customisation. When introducing a new visualization, we try to use metaphors to rely on what users already know, and keep good practices for efficient perception and memory load management in mind.
Most of the principles and scientific results presented in this talk are taken from Colin Ware’s Information Visualization (Third Edition, 2013).