Purpose of visual display of information
Today anyone can produce reports of quantitative business information in the form of tables and graphs. But because technology has made this so easy, many of us have forgotten the main purpose: to provide the reader with important, meaningful and useful insights.
The content must compose a coherent whole that addresses a specific set of closely related needs for a well-defined audience.
Tables and graphs can be used for four purposes:
- Analysing
- Monitoring
- Planning
- Communicating
Tables and graphs are part of a family of display methods known as charts. Other types of charts are diagrams and maps. Graphs are perceived by our visual system; tables interact primarily with our verbal system.
Numbers
Quantitative messages are always about relationships: relationships between some measure of quantity and one or more associated categories of interest to the business. There are two types of data: quantitative and categorical. Quantitative values measure things; categorical values subdivide the things being measured into useful groups. Different types of relationships require different types of displays.
Categorical subdivisions can relate to one another in the following ways:
- Nominal
- Ordinal
- Interval
- Hierarchical
A nominal relationship is one in which the subdivisions of a single category are discrete and have no intrinsic order. An ordinal relationship has a prescribed order. An interval relationship consists of a series of individual, sequential numerical ranges. A hierarchical relationship involves multiple categories that are closely related to each other as separate levels in a ranked arrangement.
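The interval relationship can be illustrated with a small sketch that sorts values into sequential, equal-width ranges. The order amounts, the bin width and the helper name `interval_label` are hypothetical and chosen only for illustration.

```python
# Hypothetical order amounts, invented purely for illustration.
order_amounts = [12, 47, 55, 99, 100, 130, 180, 250]

# Assumed width of each interval; any positive whole number would do.
BIN_WIDTH = 100


def interval_label(value, width=BIN_WIDTH):
    """Return the label of the range that contains value, e.g. '100-199'."""
    lower = (value // width) * width
    return f"{lower}-{lower + width - 1}"


# Count how many values fall into each interval category.
counts = {}
for amount in order_amounts:
    label = interval_label(amount)
    counts[label] = counts.get(label, 0) + 1

for label, count in counts.items():
    print(label, count)
```

Each label is a categorical subdivision, even though it is derived from numbers, and the labels keep a natural sequence.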
Categorical subdivisions can also relate to one another by virtue of the quantitative values associated with them: ranking, ratio, correlation.
Statistics provides several methods for data reduction or summarization (aggregation):
- Measures of average:
  - Mean – the sum of all the values divided by the number of values.
  - Median – sort the values and find the value that falls in the middle of the set.
  - Mode and midrange – the mode is the value that occurs most often in the set; the midrange is the value midway between the highest and lowest values in the set.
- Measures of distribution:
  - Range – subtract the lowest value from the highest value.
  - Standard deviation – measures how the values in a set are distributed relative to the mean.
- Measures of correlation
- Measures of ratio:
  - Sentence
  - Fraction
  - Rate – most common
  - Percentage – most common
- Measures of money:
  - Over time – adjust for inflation.
  - Multiple currencies – convert them into a common one.
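The measures of average and distribution listed above can be sketched with Python's standard `statistics` module. The sample values are hypothetical, and the choice of the population standard deviation (`pstdev`) rather than the sample version is an assumption, since the notes do not say which one is meant.

```python
import statistics

# Hypothetical sample, invented purely for illustration.
values = [2, 3, 3, 5, 7, 10]

# Measures of average
mean = statistics.mean(values)               # sum of values / number of values
median = statistics.median(values)           # middle of the sorted values
mode = statistics.mode(values)               # value that occurs most often
midrange = (min(values) + max(values)) / 2   # midway between lowest and highest

# Measures of distribution
value_range = max(values) - min(values)      # highest minus lowest
std_dev = statistics.pstdev(values)          # assumption: population standard deviation

print(mean, median, mode, midrange, value_range, round(std_dev, 2))
```

With an even number of values, the median is the average of the two middle values, which is why it need not be a value that occurs in the set.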
Distributions, and with them standard deviations, are mostly graphed as a frequency polygon. When the values are normally distributed, it forms a bell-shaped curve, formally called a normal distribution. About 68% of the values fall within one standard deviation above and below the mean, 95% fall within two standard deviations and 99.7% fall within three. This is called the empirical rule.
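The percentages of the empirical rule can be checked against the standard normal distribution. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of values within k standard deviations above and below the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.2%}")
```

The figures quoted in the rule are rounded; the computed shares are slightly different, and they apply only to data that really are normally distributed.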
We can measure correlation in a particular way and express it as a single value: the linear correlation coefficient. It measures the direction and degree of the linear relationship between two paired sets of values.
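As a sketch, the linear correlation coefficient can be computed with the usual Pearson formula. The function name `linear_correlation` and the paired values are hypothetical; the sign of the result gives the direction and its size (between 0 and 1 in absolute terms) gives the degree.

```python
import math


def linear_correlation(xs, ys):
    """Pearson linear correlation coefficient of two paired sets of values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two paired sets with at least two values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two sets vary together around their means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How each set varies on its own.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one set is constant")
    return covariation / (spread_x * spread_y)


# Hypothetical paired values, invented purely for illustration.
ad_spend = [1, 2, 3, 4, 5]
r_up = linear_correlation(ad_spend, [2, 4, 6, 8, 10])    # rises together
r_down = linear_correlation(ad_spend, [10, 8, 6, 4, 2])  # moves in opposite directions

print(round(r_up, 3), round(r_down, 3))
```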
Tables and graphs
When you want to convey quantitative information that consists of one or two numbers, do it in written language. There is elegance in simplicity. Quantitative values are numbers. Categories identify what the quantitative values measure. Sometimes numbers simply categorize information and have no quantitative meaning.
A table is a structure for organizing and displaying information. Data are arranged in columns and rows and are encoded as text (words and numbers).
When to use tables:
- Look up individual values
- Compare individual values
- Use precise values
- The quantitative information involves more than one unit of measure
A graph is a method for displaying quantitative information that exhibits the following characteristics:
- Values are displayed within an area delineated by one or more axes.
- Values are encoded as visual objects positioned in relation to the axes.
- Axes provide scales and assign values and labels to the visual objects.
Maps are two-dimensional representations of the physical world. They were the first graphs. René Descartes was the first to use two-dimensional visual grids for numbers only. William Playfair, a British social scientist, created many graphing techniques in the 18th century.
When to use graphs:
- The message is contained in the shape of the values.
- To reveal relationships among multiple values.
Tables – variations
Five relationships, divided into two types:
- Quantitative-to-categorical relationships
  - One set of quantitative values to one set of categorical subdivisions
  - One set of quantitative values to the intersection of multiple categories
  - One set of quantitative values to the intersection of hierarchical categories
- Quantitative-to-quantitative relationships
  - One set of quantitative values associated with multiple categorical subdivisions
  - Distinct sets of quantitative values associated with the same categorical subdivisions
These relationships can be displayed in two fundamental table layouts:
- Unidirectional – categorical items are laid out in one direction only (across columns or down rows)
- Bidirectional – categorical items are laid out in both directions
One of the advantages of bidirectional tables is that they can display the same information using less space.
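A minimal sketch of the difference: the same values are first held as a unidirectional list (one row per combination of categories) and then laid out bidirectionally, with one category down the rows and the other across the columns. The regions, quarters and amounts are hypothetical.

```python
# Unidirectional layout: one row per (region, quarter) combination.
# Hypothetical sales figures, invented purely for illustration.
rows = [
    ("North", "Q1", 120),
    ("North", "Q2", 135),
    ("South", "Q1", 90),
    ("South", "Q2", 110),
]

regions = sorted({region for region, _, _ in rows})
quarters = sorted({quarter for _, quarter, _ in rows})
lookup = {(region, quarter): amount for region, quarter, amount in rows}


def bidirectional_lines():
    """Lay regions down the rows and quarters across the columns."""
    header = f"{'Region':<8}" + "".join(f"{quarter:>6}" for quarter in quarters)
    lines = [header]
    for region in regions:
        cells = "".join(f"{lookup[(region, quarter)]:>6}" for quarter in quarters)
        lines.append(f"{region:<8}{cells}")
    return lines


table = bidirectional_lines()
print("\n".join(table))
```

The unidirectional version repeats each region and quarter label on every row; the bidirectional version states each label once, which is where the saving in space comes from.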
Graphs – variations
Graphs always display information about relationships. They can present complex relationships so that we can see them quickly and easily. Graph components include scales on axes, grid lines, bars and legends. Some components represent quantitative values, some represent categorical subdivisions and some play a supporting role.
Quantitative values can be encoded as:
- Points
- Lines
- Bars
- Shapes with 2-d area
A line is by definition always straight; if it is not, it is a curve. Lines are used to connect individual points and to display the trend of a series of data points.
Bars do one thing extremely well: they represent individual values discretely. Because the length of a bar is one of the attributes that encodes its quantitative value, its base should always begin at the value zero. The pie chart is part of a larger family of area graphs. In a pie chart, the perimeter of the circle serves as a circular axis. Pie charts communicate poorly.
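Why the base of a bar should begin at zero can be shown with a little arithmetic. In this sketch the function name `bar_length_ratio` and the two values are hypothetical; it compares how much longer one bar is drawn than another for different base values.

```python
def bar_length_ratio(smaller, larger, base=0.0):
    """Ratio of the drawn bar lengths when the bars start at the given base."""
    if base >= smaller:
        raise ValueError("base must be below the smaller value")
    return (larger - base) / (smaller - base)


# Hypothetical values: 55 is only 10% larger than 50.
true_ratio = bar_length_ratio(50, 55)                 # bars start at zero
truncated_ratio = bar_length_ratio(50, 55, base=45)   # bars start at 45

print(round(true_ratio, 2), round(truncated_ratio, 2))
```

With a zero base the second bar is drawn 1.1 times as long as the first, matching the data. With a base of 45 it is drawn twice as long, so the length no longer encodes the value accurately.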
Quantitative values can also be encoded in other ways, such as the varying sizes of circles in a bubble chart.
Categorical subdivisions can be encoded as:
- Position (along x-axis)
- Color
- Point shape
- Fill pattern
- Line style
The most common attribute used to identify categorical subdivisions is position. The second is color. Fill patterns are used only with bars.
There are seven types of relationships that business graphs are typically used to display:
- Nominal
- Time series
- Ranking
- Part-to-whole
- Deviation
- Distribution
- Correlation
A nominal comparison displays a series of discrete quantitative values so they can be easily seen and compared. Time series are quantitative values associated with categorical subdivisions of time. If you describe a relationship with words such as change, rise, increase, fluctuate, grow, decline, decrease or trend, you are probably talking about a time series.
Ranking is about sequential relationships. Phrases such as larger than, smaller than, equal to, greater than and less than are typical of rankings. In part-to-whole relationships, the common unit of measure is percentage. Phrases such as percent or percentage of total, share and accounts for x percent are connected with part-to-whole.
Deviation is about how one set of values differs from a primary set of values. Typical words are plus or minus, variance, difference and relative to. When a graph displays the distribution of a single set of values, the relationship is called a frequency distribution. Typical words are frequency, distribution, range, concentration, normal curve, normal distribution and bell curve.
Correlation is about how two paired sets of values are connected, in which direction and to what degree. Typical phrases are relates to, increases with, decreases with, changes with, varies with, caused by, affected by and follows.
- Nominal comparisons: vertical bars, horizontal bars and data points.
- Time series: lines, points and lines, and vertical bars. Do not use horizontal bars.
- Ranking: vertical bars, horizontal bars and data points.
- Part-to-whole: vertical and horizontal bars. Use stacked bars if you need to; don’t use pie charts.
- Deviation: horizontal bars (not when combined with time series), vertical bars, lines, and points and lines.
- Distribution can be:
  - Single (one set of values, a frequency distribution or histogram): vertical bars and lines.
  - Multiple (not more than five value sets): lines, or bars and points (boxes).
- Correlation: scatter plots and correlation bars.
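The recommendations above can be kept as a simple lookup, for example in a reporting script that has to pick a graph type. This is only a sketch: the dictionary name, the keys and the helper `encodings_for` are hypothetical, and the entries restate the list rather than add to it.

```python
# Recommended encodings per relationship type, restating the list above.
RECOMMENDED_ENCODINGS = {
    "nominal": ["vertical bars", "horizontal bars", "points"],
    "time series": ["lines", "points and lines", "vertical bars"],
    "ranking": ["vertical bars", "horizontal bars", "points"],
    "part-to-whole": ["vertical bars", "horizontal bars"],
    "deviation": ["horizontal bars", "vertical bars", "lines", "points and lines"],
    "single distribution": ["vertical bars", "lines"],
    "multiple distribution": ["lines", "bars and points"],
    "correlation": ["scatter plot", "correlation bars"],
}


def encodings_for(relationship):
    """Return the recommended encodings for a relationship type."""
    key = relationship.strip().lower()
    if key not in RECOMMENDED_ENCODINGS:
        raise KeyError(f"unknown relationship type: {relationship!r}")
    return RECOMMENDED_ENCODINGS[key]


print(encodings_for("Time series"))
```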
Visual perception
Approximately 70% of the sense receptors in our bodies are dedicated to vision. Graphs and tables are visual means of communication. Sensation is a physical process, involving the receipt of a stimulus. Perception is a cognitive process, involving the interpretation of the physical stimulus in an effort to make sense of it. For a visual stimulus to occur, there must be light.
Our eyes jump to the next point of fixation in ¼ of a second. This is called saccadic eye movement. We don’t see images with our eyes; we see them with our brains. We have three types of memory: buffer memory, working memory and permanent memory. They are analogous to the three types of memory that process visual information in our brains:
- Iconic memory
- Short-term memory
- Long-term memory
Iconic memory is part of preattentive processing. It is an extremely fast process of recognition and nothing more. Short-term memory has two fundamental characteristics: it is temporary and it has limited storage capacity. That is why graphs are better for large amounts of information, since they can communicate them in a way that can be perceived all at once, while tables are limited to the purpose of look-up. Long-term memory stores information through an intricate network of links and cross-references, like indexes in a computer, that help us find information and retrieve it back into short-term memory when we need it.
Preattentive processing is the early stage of visual perception that occurs below the level of consciousness at an extremely high speed and is tuned to detect a specific set of visual attributes. Colin Ware has organized preattentive attributes into four categories:
- Form
  - Orientation
  - Line length
  - Line width
  - Size
  - Shape
  - Curvature
  - Added marks
  - Enclosure
- Color
  - Hue
  - Intensity
- Spatial position
  - 2-d position
- Motion
  - Flicker
  - Direction
Color is made up of three separate attributes: hue (red, green, blue, …), saturation and lightness (both intensity features).
Preattentive attributes of motion are very powerful attention-getters.
Preattentive attributes that can be useful in graphs, and whether they can be perceived quantitatively (Y = yes, N = no):
- Form
  - Orientation (N)
  - Line length (Y)
  - Line width (Limited)
  - Size (Limited)
  - Shape (N)
  - Curvature (N)
  - Added marks (N)
  - Enclosure (N)
- Color
  - Hue (N)
  - Intensity (Limited)
- Spatial position
  - 2-d position (Y)
Size is limited because we sometimes have trouble estimating the size of an object and, consequently, the value it encodes. The same goes for color intensity. Text is easiest to read when it is black on a white background; white text on a black background also works, and sometimes even red or dark blue text on a white background.
We can distinguish preattentively between no more than eight different hues, about four different orientations and about four different sizes; all the other visual attributes should be limited to fewer than 10 distinct values. Preattentive processing generally cannot handle more than one visual attribute of an object at a time.
Gestalt principles:
- Proximity
- Similarity
- Enclosure
- Closure
- Continuity
- Connection
General design
The primary objective of visual design is to present content to your readers in a manner that highlights what’s important.
Two main objectives are:
- Highlight the data
- Organize the data
Edward Tufte: “Above all else show the data.” He introduced the concept of the data-ink ratio: the ratio of data ink to total ink. It should be as close to 1 as possible, without loss of data information. We can improve a design either by reducing the non-data ink or by enhancing the data ink. Reducing non-data ink is about subtracting unnecessary non-data ink and de-emphasizing and regularizing the non-data ink that remains. In doing so we bring out elegance. The Latin term eligere means to choose out or to select carefully.
Tables and graphs consist of three visual layers:
- The data as the top prominent layer
- Supporting components (grid lines)
- The background
Use thin lines and soft, neutral colors for the second and third layers.
To enhance the data ink, you can subtract unnecessary data ink and emphasize the most important data ink. Some of the preattentive visual attributes used for emphasizing data ink:
- Line width
- Orientation
- Size
- Enclosure
- Hue
- Color intensity
Organizing the data:
- Group the data
- Prioritize the data
- Sequence the data
Tables primarily use the Gestalt principles of proximity and continuity. Graphs use similarity and connection to group data. You can make text, tables or graphs appear more prominent by locating them at the top left of the page or screen. Pointers, especially arrows, are not subtle, so you should use them with discretion to avoid visual clutter.
The role of text:
- Label
- Introduce
- Explain
- Reinforce
- Highlight
- Sequence
- Recommend
- Inquire
Some information is so important that you should say it more than once and in more than one way. Recommendations for action are best communicated in words.
Text should be included on every page of every report to answer the following:
- What
- When
- Who
- Where
A good title is invaluable. When is about the range of dates the information represents or the point in time when the information was collected. Who is about letting people know whom to contact.
Sequence the data using left to right and top to bottom positioning, using visual highlighting and using sequential labeling.
Table design
The components that we combine to construct tables and graphs: data and support components. Tables encode data as text.
Data components: categorical subdivision, quantitative values and complementary text. Support components are: white space and page breaks, rules and grids and fill color.
Table terminology. The term body generally refers exclusively to the rectangular area that contains the quantitative values. Rows that summarize information contained in preceding rows are called footers. These can be used to summarize the entire table or to summarize a subset of rows, in which case they are called group footers. When a header spans multiple columns, it is called a spanner header.
Use a one-to-one ratio of data height to white space height. Rules and grids can be used to delineate columns and rows, group subsets of data and highlight subsets of data. The problem is that they break up the data. When you use rules, be sure to subdue them visually in relation to the data by keeping the lines as thin and light as possible. When white space alone can’t be used to effectively delineate columns and rows in tables, fill shades and hues work better than grids and rules.
Time series should be arranged across the columns. Rankings look more natural when arranged from top to bottom.
Whatever the reason for breaking the data into groups, keep in mind the following design practices:
- Use vertical white space between the groups but only enough to make the break noticeable.
- Repeat the column headers at the beginning of each group.
- Don’t vary the structure of the table from group to group.
- When you group data based on multiple categories, position the group headers on the same row.
- When you want your readers to examine each group of data in isolation, start each group on a new page.
To enable easy comparison between individual members of a particular set of categorical subdivisions, arrange them across multiple columns or to the right of the other columns of categorical subdivisions.
Data sequencing in a table is sorting. Numbers and dates both have a natural order that is meaningful. Alphabetical order is useful in tables for data that have no built-in meaningful order.
Formatting text involves orientation and alignment. Numbers that represent quantitative values, as opposed to those that are merely identifiers, should always be aligned to the right. Dates are best aligned to the left, using a format that keeps the number of characters in each portion of the date constant. Text that expresses neither numbers nor dates works best when aligned to the left. Use the same number of decimal digits throughout a column, so that both the decimal point and the final digit align to the right.
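The rule for numbers can be sketched in a few lines: format every value with the same number of decimal digits, then pad on the left to a common width. The amounts and the choice of two decimal digits are hypothetical.

```python
# Hypothetical amounts, invented purely for illustration.
amounts = [1234.5, 87.25, 5.0, 100200.75]

# Same number of decimal digits for every value in the column.
formatted = [f"{amount:.2f}" for amount in amounts]

# Right-align by padding on the left to the width of the longest entry.
width = max(len(text) for text in formatted)
aligned = [text.rjust(width) for text in formatted]

for line in aligned:
    print(line)
```

Because every entry has the same number of decimal digits, right-aligning the text also lines up the decimal points, which makes the magnitudes easy to compare down the column.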
Use a currency sign only with summary values, not with every number.
Fonts should be as legible as possible and the same font should be used throughout. The most legible fonts tend to have a clean and simple design. Examples of legible fonts are serif fonts (Times New Roman, Palatino, Courier) and sans-serif fonts (Arial, Verdana, Tahoma).
When the summary values are more important to your message or to your readers than the detail, it often makes sense to place them in a header.
Graph design
The strong visual nature of graphs requires a number of unique design practices. Always avoid 3-d and encode quantities to correspond accurately to the visual scale. You can only use two attributes of visual perception to reliably encode quantitative information: line length and 2-d position. Line length in the form of bars and 2-d position in the form of points and lines.
Like tables, graphs are constructed using components that fall into two categories: data and support components.
Different sizes, shapes and hues are only three of the visual attributes that you can use to distinguish different sets of points. When points overlap, increase the size of the graph or decrease the size of the points; you can also make them transparent. If you choose points and lines in combination, make sure that the points are not obscured by the lines.
A bar consists of two ends: the one that marks the value, called the endpoint and the one that forms the beginning, called the base. Bars should begin at zero.
Lines that are used to encode values in graphs fall into four categories: standard lines, high-low lines, trend lines and reference lines. Standard lines are particularly good at displaying values that change through time as well as the overall shape of that change. High-low lines connect the maximum and minimum quantitative values across multiple data sets at each location along a categorical scale. Trend lines display the overall course of quantitative values that are spread across a series of categorical subdivisions. Reference lines are used to display a set of values against which others can be compared or to mark a point of interest along a categorical scale.
Scale lines and axes are intimately related. They divide axes into increments of equal length. Quantitative scales can be standard (linear) or logarithmic. The same distance anywhere along a logarithmic scale represents the same percentage change.
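The property of the logarithmic scale can be verified numerically. In this sketch the helper name `log_position` and the sample values are hypothetical, and a base-10 logarithm is assumed; any base gives the same conclusion.

```python
import math


def log_position(value):
    """Position of a value along a base-10 logarithmic scale."""
    if value <= 0:
        raise ValueError("a logarithmic scale cannot show zero or negative values")
    return math.log10(value)


# Two doublings (a 100% increase each) at very different magnitudes.
distance_small = log_position(20) - log_position(10)
distance_large = log_position(200) - log_position(100)

# On a standard (linear) scale the same two steps differ by a factor of ten.
linear_small = 20 - 10
linear_large = 200 - 100

print(round(distance_small, 4), round(distance_large, 4), linear_small, linear_large)
```

Both doublings cover the same distance on the logarithmic scale, while on the linear scale the second step is ten times as long as the first.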
Tick marks may appear on the inner side or outer side of the axis or across the axis.
You can discard legends when the data subdivisions that need labels are grouped together so that you can place a label right next to each set. Legends should be placed outside of the data region.
The upper limit to categorical subdivisions is between five and eight.
Axes give dimensionality to graphs.
You should make graphs wider than they are tall. A white data region is generally the best background for your data objects. Grid lines should always be visually subdued. They can be used to enhance the look-up and comparison of data or the perception and comparison of localized patterns.
Solution for multiple variables
It is easy to combine multiple sets of quantitative data in a single graph when they all use the same unit of measure.
Sometimes graphs can be arranged closely together so they can be examined together. The most important thing is to keep the graphs consistent with one another. Graphs that display correlations have quantitative scales on both axes; they may be arranged horizontally, vertically or in both directions as a matrix.


