Why and how to apply semiotics to data visualization

Ingrid Pino
8 min read · Oct 15, 2021

In a world full of signs and symbols, everyone practices semiotics daily, trying to interpret their meaning. And as visual literacy — the ability to interpret data visually — becomes more relevant in different areas and professions, so does navigating the meaning of symbols and being able to communicate data through them. That is especially challenging because some symbols are almost universal, while others have significant cultural variations.

As the study of how meaning is created through signs, semiotics applies to every sign and symbol found in data visualization. A sign is anything that communicates a meaning beyond the sign itself, while a symbol represents something to a certain group of people, so symbolic meaning can differ according to context. In the field of data visualization, especially in empirical studies, the perspective of semiotics is relevant for two main reasons:

“On the one hand, it has become increasingly urgent to examine what Kennedy and Hill (2017) define as the “visual sensibilities” (p. 2) that are at work in the ways in which ordinary people respond culturally and engage emotionally with data and their visualizations. On the other hand, professional and institutional uses of data visualization techniques must be examined in the light of their underlying histories, conventions, and changes over time and across contexts.” (Aiello, 2019).

Understanding the symbols used in data visualization is crucial to understanding how the data will be interpreted. The past uses of semiotic resources, especially through social practices, define the range of their possible uses (Van Leeuwen, 2005, apud Aiello, 2019). Consequently, the description and interpretation of symbols result from cultural processes and power relations. The first elements that come to mind are likely those most obviously used as symbols in data visualization, such as pictographs and icons.

Pictographs are pictorial symbols for a word or phrase, and they represent data using images. In modern data visualization, they were popularized through the Isotype (International System of Typographic Picture Education), a method of showing data in standardized and abstracted pictorial symbols, created in the 20th century by Otto and Marie Neurath. As stated by Otto Neurath, “to remember simplified pictures is better than to forget accurate figures”. That summarizes well how visual data can be more meaningful and impactful than numbers in a table.

Examples of the Isotype from children’s books by Marie Neurath. ‘The magic knife’ was a term used by Marie in If You Could See Inside (1948) to explain cross-section drawings, which give the appearance of cutting through an object, such as a building, to reveal its inner workings. In Railways under London (1948) a cross-section of the London Underground is combined with Isotype pictograms. (Picturing science for children: the power of Marie Neurath’s designs, 2019)

Having created the foundations of the Isotype in Vienna, co-creator Marie Neurath first tested its effectiveness with non-Western audiences in projects developed in the 1950s at the request of the Nigerian government. She “traveled several times to Africa to learn more about the culture and began testing initial designs for their cultural effectiveness” (Forrest, 2020), which is reflected in her account of the reaction to the Isotype in a classroom: “I saw with pleasure how the children eagerly examined the pictures… Of course, the symbols had to ‘speak’ to the Nigerians, just as they had to the Viennese; men, women, children had to look as they did [in Africa]; houses could not have chimneys; but in the essential rules of transformation [1], nothing needed to be changed.” What is most fascinating about that account is that it recognizes both the universality of some symbols and the cultural peculiarities that demand small adjustments to them.

The usage of pictographs is more common nowadays in the area of information design, especially to create infographics. Those are collections of images, charts and minimal text to provide an overview of a topic. This kind of visual representation of information is focused on making complex subjects easier to understand.

“The deadliest animal in the world”, infographic published by Bill Gates in 2014 to raise awareness about fighting malaria.

One specific type of pictorial symbol is the icon: in computing, an icon is a pictogram or ideogram that helps the user navigate a computer system. Besides the icons in data visualization software that help users find and execute different functions, it has become common to apply icons within visualizations themselves, to help explain the data or convey its message more quickly. That is especially useful when the intention is to draw attention to information that might require action, for example using up and down arrow icons, usually color-coded in green and red, to show how a metric has grown or declined in comparison to previous periods.

Icons exist on the user interface of data visualization tools to help identify which charts to create or which functions to execute (partial screenshot from the UI of Google Data Studio).
Some resources to visualize data may include arrow icons to show growth in comparison to previous periods (partial screenshot from a dashboard created on Google Data Studio).
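The arrow-icon convention described above can be sketched in a few lines of Python. This is only an illustrative sketch, not how Google Data Studio implements it: the function name `growth_indicator`, its parameters, and the returned dictionary keys are all hypothetical. The colors are left as parameters because the meaning attached to green and red depends on the audience and context.

```python
def growth_indicator(current, previous, up_color="green", down_color="red"):
    """Pick an arrow icon and a color for a metric compared to a previous period.

    Hypothetical helper for illustration only; the default colors follow the
    green/red convention mentioned in the text and can be overridden.
    """
    if previous == 0:
        raise ValueError("previous period value must be non-zero")

    # Percent change relative to the previous period.
    change = (current - previous) * 100 / abs(previous)

    if change > 0:
        icon, color = "▲", up_color
    elif change < 0:
        icon, color = "▼", down_color
    else:
        icon, color = "=", "gray"  # no change: neutral sign and color

    # The label keeps the number next to the icon, so the icon never has to
    # carry the meaning on its own.
    return {"icon": icon, "color": color, "label": f"{icon} {abs(change):.1f}%"}


if __name__ == "__main__":
    print(growth_indicator(120, 100))
    print(growth_indicator(90, 100))
```

Keeping the numeric label alongside the icon is a deliberate choice in this sketch: readers who do not share the color convention can still read the direction and size of the change.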

Because icons are so easy to include in data visualization software, they are occasionally used merely as illustration, without added meaning or functionality. Such icons should be used in moderation, as they can distract the user from the data rather than increase readability.

Example of dashboard that uses icons to illustrate each section of data being displayed. Google Data Studio Template Library.

On the other hand, depending on how familiar a symbol is in a given context, an icon can even replace text in a data visualization.

The version at the top has both icons and text to identify the device types (desktop, mobile and tablet), while the versions at the bottom have only text or only icons.

Contemporary data visualization examples like the works of information designers Giorgia Lupi and Stefanie Posavec are quite deeply rooted in semiotics. In their project Dear Data, symbols were used to collect and visualize data. Over the course of a year, they measured different types of personal data related to their lives and then visualized them as a drawing on a postcard, which they then sent to one another as a type of “slow data” transmission of their journaling.

A page from “Observe, Collect, Draw! A Visual Journal”, a book by Giorgia Lupi and Stefanie Posavec based on their project Dear Data, which encourages the collection and visualization of personal data through symbols.

While data visualization practitioners benefit from being aware of semiotics and the common meaning of symbols, breaking away from dominant ‘visual sensibilities’ can also promote social action and social change (Aiello, 2019). That approach recognizes data visualization as an evolving field, grounded in principles of visual design, but also shaped by what is perceived by day-to-day practitioners as “best practices”[2], which may vary according to the professional and cultural context:

“For example, in addition to outlining guidelines for ‘good’ data visualization design, data visualization designers and their students can use a social semiotic approach to examine the histories of particular semiotic resources (e.g. colour, but also shape or layout) as well as understand how these may be used in different social and cultural contexts.” Van Leeuwen (2008), apud Aiello (2019).

The works of Nadieh Bremer and Shirley Wu explore semiotic resources extensively, especially in their project Data Sketches. In the namesake book, they document the process of trial and error behind complex data visualizations, which often includes experimentation with semiotic resources. Two examples of different kinds of exploration from the Data Sketches project are “Film Flowers” and “Figures in the Sky”. While “Film Flowers” experiments with how new shapes and color combinations can be created with data, “Figures in the Sky” is at its core about social semiotics, showing the similarities and differences between the shapes that different cultures see in the night sky.

Part of the “Film Flowers” visualization created by Shirley Wu (2016).
Part of the “Figures in the Sky” visualization created by Nadieh Bremer (2018).

It is not always intentional or evident at first sight how semiotics is imbued in data visualization. In a data exploration done in 2021, I was aiming to give project management a better high-level view of how each team’s work was affecting the others. The data had multiple levels of dependency between tasks from different teams. My default is to stay away from most circular charts, especially variations of pie charts [3]. But I set that aside for a moment and drafted a sunburst chart: segmented per task, with multiple levels of dependency, and color-coded per team. The sunburst chart worked well to show at a glance the multi-level dependency and how each team’s work affected the others. That was especially useful to identify and solve cross-team bottlenecks at the team level (and not just at the level of individual tasks blocking others). When I shared those ideas online [4], Nancy Cox, market researcher and my writing mentor, brilliantly noticed a semiotic layer in the use of a circular chart in that context, making a wonderful association with a core aspect of teamwork: circles, she wrote, “are also a metaphor for conversations”, which are “so necessary in projects!”.

My whiteboard draft of a sunburst chart to help project management see at a high level how each team’s work was affecting the others. In this example, the angle of each slice depends on how many tasks that team is blocking. Looking at the first (innermost) level, the smallest slice is the purple team, blocking only 1 task from the green team. The largest slices are from the orange and blue teams, each one blocking 3 tasks. Focusing on those, while the orange team blocks 1 task from each team in the next level of dependency (arcs in the middle), the blue team is blocking 2 from the purple team, and one of those even blocks another task from the blue team in the last level. The purple team is the most affected by tasks blocked by other teams, and it is the one blocking the fewest tasks from other teams.
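The rule from the caption, where each slice’s angle depends on how many tasks a team is blocking, can be sketched as a simple proportion. This is a minimal sketch under assumptions: the function name `slice_angles` is hypothetical, and the counts below are only the three stated in the caption (purple 1, orange 3, blue 3). The caption does not say whether other teams also appear on the first level, so the resulting angles illustrate the calculation rather than reproduce the actual chart.

```python
def slice_angles(blocked_counts):
    """Convert per-team counts of blocked tasks into slice angles in degrees.

    Hypothetical helper: each team gets a share of the full circle
    proportional to the number of tasks it is blocking.
    """
    total = sum(blocked_counts.values())
    if total == 0:
        raise ValueError("at least one blocked task is needed to draw slices")
    return {team: 360 * count / total for team, count in blocked_counts.items()}


# Counts taken from the caption; any other first-level teams are omitted
# because the caption does not give their numbers.
first_level = {"purple": 1, "orange": 3, "blue": 3}

if __name__ == "__main__":
    for team, angle in slice_angles(first_level).items():
        print(f"{team}: {angle:.1f} degrees")
```

The same proportion can be applied again inside each slice to size the arcs of the next dependency level.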

The meaning of resources relevant to data visualization, like color, can vary substantially according to the cultural and social context; that meaning is shaped by cultural and political forces and changes over time [5]. Semiotic innovation happens when the dominant meaning of symbols is questioned and the ‘rules’ of semiotics are broken or changed (Van Leeuwen, 2005, apud Aiello, 2019), which can even inspire social change. In turn, “a social semiotic framework (…) can and ought to be extended further to inventorize, situate, and transform the semiotic resources associated with data visualization” (Aiello, 2019).

The application of semiotics to data visualization has yet to be studied and systematized more thoroughly. That should be done by investigating the characteristics of visualization design and how they shape the way people make sense of the messages that data visualizations communicate.

[1] In the “Vienna method of pictorial statistics”, work developed by Marie and Otto Neurath together with Gerd Arntz and that later became the Isotype, the role of Transformer was somewhere between a data scientist and an information designer, or what in the early 21st century would be a data visualization specialist. Marie explained the work of the transformer: “From the data given in words and figures, a way has to be found to extract the essential facts and put them into picture form. It is the responsibility of the transformer to understand the data, to get all necessary information from the expert, to decide what is worth transmitting to the public, how to make it understandable, how to link it with general knowledge or with information already given in other charts. In this sense, the transformer is the trustee of the public. He has to remember the rules and to keep them, adding new variations where advisable, at the same time avoiding unnecessary deviations which would only confuse. He has to produce a rough chart in which many details have been decided: title; arrangement, type number, and color of symbol; caption, etc. It is a blueprint from which the artist works.” (Forrest, 2020).

[2] In the context of data visualization, Aiello defines as “best practices” the approaches that are widely accepted and prescribed as being most effective and sound (2019).

[3] The main reasons are that (1) they visually distort the data, and (2) we perceive length more accurately than area.

[4] The original publication is available on LinkedIn.

[5] Aiello (2019) highlights the works of Michel Pastoureau regarding the sociological history of colors and their meanings.
