Web-Based Visualization Part 1: The D3.js Key Concept

by Stephen Thomas · Nov 5, 2012 · 1 comment

Those of us in the user interface community often obsess over the smallest details: shifting an element one pixel to the left or finding the perfect transparency setting for our drop shadows. Such an obsession is well and good; it can lead to a superior user experience. But sometimes perhaps we should take a step back from the details. In many cases, our interfaces exist to provide understanding. If the user doesn’t understand the information, then it doesn’t matter how pretty the interface looks. Fortunately, there’s an entire science devoted to understanding information, the science of data visualization. And on the web there is one particular tool that its practitioners rely on far more than any other: the excellent D3.js library from Mike Bostock. In a series of posts, we’ll take a close look at this library and its enormous power.

The Power of Visualization

The remarkable power of data visualization was evident long before computers and the web. While begging forgiveness from those who have seen this before, let me present an oft-cited example from over a century ago. I could, if I wanted, tell you a simple, sobering fact:

In the Russian campaign of 1812, Napoleon began his march on Moscow with an army of 422,000 men. He ended his retreat with about 10,000 men.

Instead of telling you this fact, though, I could let Charles Joseph Minard show you:

Minard’s poster shows the march of Napoleon’s army on a map. The Russo-Polish border is on the left, and Moscow is in the upper right. The advance is tan while the retreat is black. Most crucially, the army’s width on the map is proportional to its size. Comparing that width at the start and end of the campaign is viscerally and heartbreakingly powerful. Authorities such as Edward Tufte have said that Minard’s poster “may well be the best statistical graphic ever drawn.”

Visualization draws its power from a more primitive and immediate part of our brains. It relies on perception rather than cognition. To appreciate my original statement of the campaign, for example, you had to perform (at least subconsciously) some non-trivial mathematics. Simply comparing quantities isn’t enough. Because the numbers 422,000 and 10,000 are so different, some notion of logarithms is essential to understanding the difference. It’s true that most of us can make those calculations without explicitly thinking about them, but they still require that our brains do some real work. Visually, however, the contrast between the widths of the tan and black bands on the map is immediate; no mathematics required. It is this kind of immediate and powerful understanding that effective visualization provides.
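
To put rough numbers on that mental arithmetic, here is a small sketch. The bandWidth() helper and the 100-pixel maximum are assumptions made up for illustration; they are not taken from Minard’s chart.

```javascript
// The two figures quoted above.
var start = 422000;
var end = 10000;

// How many times larger the army was at the start than at the end.
var ratio = start / end;                       // 42.2

// The same difference expressed in orders of magnitude (base-10 logarithm).
var orders = Math.log(ratio) / Math.LN10;      // about 1.63

// Hypothetical helper: a band whose width is proportional to army size.
function bandWidth(men, maxMen, maxWidth) {
    return maxWidth * men / maxMen;
}

console.log(ratio, orders.toFixed(2));
console.log(bandWidth(start, start, 100), bandWidth(end, start, 100).toFixed(2));
```

If the full army were drawn 100 pixels wide, the returning army would be a little over 2 pixels wide, which is the contrast the eye picks up without any calculation.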

What the Web Brings

If, even today, the greatest example of data visualization dates from the 19th century, should we worry about it at all in the context of web design and development? Of course we should, because the web brings two key factors that Minard could only have wished for: interactivity and real-time data. Both of these factors let us significantly enhance the static visualizations of the past.

Interactivity

Here’s a static image of a visualization that explores data far less consequential than war:

The visualization examines the frequencies of rides between San Francisco neighborhoods using the Über service. The static image bears more than a passing resemblance to Minard’s map, with volume represented by width and color distinguishing different paths. A static image, however, doesn’t do the Über visualization justice; it becomes much more effective when you view it on the web. In a web browser, moving your mouse over various components in the visualization highlights those components, making it much easier to isolate relevant parts of the data. For another excellent interactive example, visit the home page for Crossfilter. That example lets users dynamically manipulate the data that powers the visualization.

Interactive visualizations like these examples give the user the power to change the perspective. And they give the designer the power to present many perspectives at once. As a designer, you no longer have to worry about presenting exactly the right visualization; rather, you can give users control to make the visualization most relevant to their needs.

Real Time

Another characteristic of the web is its ability to provide real-time updates to the data. As an example, here’s another static image that doesn’t really do justice to the full visualization.

Find out more about this visualization from Andrew Weeks. The exciting aspect of this visualization is its timeliness. The presentation gives you an understanding of the server’s performance exactly as it is happening. The web has always had its share of continuously updating data; think of how many “real time” stock tickers you’ve seen. Now, however, we’re combining that data with visualization that gives users a much better understanding of the information.

What D3.js Brings

Of course, harnessing interactivity and real time data, much less creating effective visualizations in the first place, aren’t necessarily easy tasks for designers and developers. But thankfully, the web also brings something else: open source software. The preeminent open source software for data visualization is undoubtedly Mike Bostock’s D3.js. D3.js is

  • a Javascript library to help you create spectacular visualizations,
  • with full interactivity using standard HTML/CSS/Javascript events,
  • that can be automatically updated in real time as underlying data changes.

And unlike many open source libraries, D3.js is ready for production use today. In fact, many stellar organizations are already using D3.js for their online content.

If you’ve heard anything about D3.js, however, you might have heard that it’s hard to use.

Once you pass the learning curve…

D3 has a steep learning curve…

…it has a steep learning curve…

There is a huge learning curve…

…the learning curve was quite steep.

Despite its prevalence, I don’t agree with the idea that D3 has a steep learning curve. It is unique enough to require a different perspective, but I think the learning curve is actually quite small. In fact, I think there is really only one key concept that you have to master to use D3 effectively, and once you see it explained, that concept isn’t actually that difficult. The rest of this post is my attempt to prove that contention. If I’m right, you should be well on your way to using D3.js after 15 minutes of reading.

D3.js as Assembly Line

For me, the most obvious metaphor for D3.js is an assembly line. We feed it a bunch of parts and some assembly instructions, and it produces visualizations.

* Image source: Office national du film du Canada. Photothèque. Bibliothèque et Archives Canada

Perhaps the best way to appreciate the assembly line metaphor is by example. In the simple code that follows, we won’t produce a stunning visualization. In fact, the result is so boring that it’s not worth showing the final outcome. But we will use D3.js, and we will see the assembly line concept on which all great D3 visualizations build.

Assembly Plant: HTML Document

The first thing our assembly line needs is a manufacturing plant, some place to perform the assembly. In our case, of course, that plant is simply an HTML document. Of course, we need to include the D3.js library in the document. For this example, though, that’s all we need. Here’s a bare bones HTML5 document that includes D3.js.

<!DOCTYPE html>
<html>
<head><meta charset="utf-8" /><script type="text/javascript" src="d3.js"></script></head>
<body></body>
</html>

Assembly Line Part A: Data

Now that we have a place to do the work, we need parts. For D3.js, there are really only two parts: data and HTML elements. We’ll start with the data. Here’s some Javascript code to retrieve a collection of numbers (formatted according to Javascript Object Notation, or JSON). (If you’re relatively new to Javascript, you might want to check out a post on Javascript Idioms that explains some of the commonly used features of Javascript that don’t appear in many traditional programming languages.)

d3.json("numbers.json", function(numbers) {
    // numbers has the values we want to "visualize"
    // numbers[i].value is the value itself
    // numbers[i].key is a unique key for each
});

Although they’re not critical and we’re certainly not required to use them, D3 does include several functions to help get the data we need. d3.json() is part of the D3 library. It retrieves JSON-formatted data from a URL, in this case numbers.json.

When the data is available, D3 calls the indicated (callback) function; the parameter to that function (numbers in this case) is the data returned from the URL.
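
To make the shape of the data concrete, here is a minimal sketch that runs without a server. The sample contents of numbers.json and the fakeJson() stand-in are assumptions for illustration; fakeJson() is not part of D3, it only imitates the callback style used above. (As far as I know, later major versions of D3 changed d3.json() to return a promise rather than accept a callback, so check the documentation for the version you load.)

```javascript
// Hypothetical contents of numbers.json: an array of {key, value} objects.
var sampleJson = '[{"key":"a","value":10},{"key":"b","value":25},{"key":"c","value":7}]';

// Stand-in for d3.json(): parses the text and hands the result to a callback.
// (Illustrative only; the real function fetches a URL asynchronously.)
function fakeJson(text, callback) {
    callback(JSON.parse(text));
}

var summary = [];
fakeJson(sampleJson, function(numbers) {
    // numbers[i].key and numbers[i].value are available here
    for (var i = 0; i < numbers.length; i++) {
        summary.push(numbers[i].key + "=" + numbers[i].value);
    }
});

console.log(summary.join(", ")); // a=10, b=25, c=7
```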

Assembly Line Part B: Elements

Now that D3 has gotten the first part for our assembly line, it’s time to take care of the second part. The second part of the assembly is, of course, HTML elements. Here’s the code that gets the HTML elements for us.

d3.json("numbers.json", function(numbers) {
    // numbers has the values we want to "visualize"
    // numbers[i].value is the value itself
    // numbers[i].key is a unique key for each
    // and we want to associate each point with a <p>
    var ptags = d3.select("body").selectAll("p").data(numbers);
});

With D3 we have complete flexibility in choosing which HTML (or, for that matter, SVG) elements we want to use in our assembly line. In this example, we’re being as simple as possible and sticking with standard paragraph (<p>) tags. We tell D3 we want to use <p> elements within the <body> tag for our data values. (The function names will make sense in a bit.)

And then we tell D3 to create a one-to-one correspondence between data values and <p> elements. Remember that our data values (part A) are contained in the Javascript array numbers.
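
One way to picture that correspondence is the toy model below. It is not D3’s code: bindByIndex() and the plain objects standing in for <p> tags are made up for illustration. The one detail borrowed from D3 is that, as far as I know, the bound value is kept on the element in a property named __data__.

```javascript
// Toy model of binding without a key: element i is paired with data value i.
function bindByIndex(elements, data) {
    var update = [], enter = [], exit = [];
    var n = Math.max(elements.length, data.length);
    for (var i = 0; i < n; i++) {
        if (i < elements.length && i < data.length) {
            elements[i].__data__ = data[i];  // pair an existing element with its value
            update.push(elements[i]);
        } else if (i < data.length) {
            enter.push(data[i]);             // data value with no element yet
        } else {
            exit.push(elements[i]);          // element with no data value
        }
    }
    return { update: update, enter: enter, exit: exit };
}

// One <p> already on the page, two data values arriving.
var existing = [{ tag: "p" }];
var result = bindByIndex(existing, [{ key: "a", value: 10 }, { key: "b", value: 25 }]);

console.log(result.update.length, result.enter.length, result.exit.length); // 1 1 0
```

The value left over in the enter group is the one that still needs a <p> tag.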

Optional: “Serial Numbers” for the Data

The next step is optional in many visualizations (particularly for static ones), but it can be essential for visualizations in which the data might change. Sticking with the assembly line metaphor, we’re going to add a serial number to the data parts.

d3.json("numbers.json", function(numbers) {
    // numbers has the values we want to "visualize"
    // numbers[i].value is the value itself
    // numbers[i].key is a unique key for each
    // and we want to associate each point with a <p>
    var ptags = d3.select("body").selectAll("p").data(numbers, function(d) { return d.key; });
});

The call to D3’s data() takes an optional second parameter. That parameter is a function that returns a key, or unique identifier, for a given data value.

This function is important if we need D3 to distinguish different data values from each other. (By default, D3 uses their index in the data set.)
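
Here is a rough sketch of why the key matters. The keyedJoin() function is a simplified model written for this post, not D3’s implementation, and the sample keys are invented. It compares the keys already on the page with the keys in a fresh data set.

```javascript
// Toy keyed join: sort keys into "enter", "update", and "exit" groups.
function keyedJoin(existingKeys, data, keyFn) {
    var incoming = Object.create(null);
    var enter = [], update = [], exit = [];
    data.forEach(function(d) {
        var k = keyFn(d);
        incoming[k] = true;
        if (existingKeys.indexOf(k) === -1) {
            enter.push(k);   // new value, needs a new element
        } else {
            update.push(k);  // value whose element is already on the page
        }
    });
    existingKeys.forEach(function(k) {
        if (!incoming[k]) {
            exit.push(k);    // element whose value has disappeared
        }
    });
    return { enter: enter, update: update, exit: exit };
}

var onPage = ["a", "b", "c"];
var fresh = [{ key: "b", value: 30 }, { key: "c", value: 7 }, { key: "d", value: 12 }];
var j = keyedJoin(onPage, fresh, function(d) { return d.key; });

console.log("enter=" + j.enter + " update=" + j.update + " exit=" + j.exit);
// enter=d update=b,c exit=a
```

Matching the same two sets by index alone would pair all three existing elements with the three new values, so the element that used to represent "a" would quietly be reused for "b" and nothing would be reported as added or removed.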

Assembly Instructions

Okay, now we have the two parts we need. It’s time to create the instructions that define how those two parts go together to create a visualization.

d3.json("numbers.json", function(numbers) {
    // numbers has the values we want to "visualize"
    // and we want to associate each point with a <p>
    var ptags = d3.select("body").selectAll("p").data(numbers, function(d) { return d.key; });

    // what to do when data becomes available
    ptags.enter().append("p").text(function(d) { return "Value: " + d.value; });
});

Here’s what that code is telling D3: If there isn’t a <p> tag for a particular data value, …

… create a new <p> tag for the data value, …

… and set the text contents of the newly created <p> tag to show the value of the data.

That’s it! We’ve just created a D3 visualization. We have a bit of cleanup left to do, but the essential actions are in place. We use d3.json() to get part A, selectAll("p") to get part B, data() to bind them together, and text() to explain how the finished product is constructed.
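
For a sense of what ends up in the document, the sketch below imitates the last step with plain strings. appendParagraphs() and the sample values are hypothetical; the real work is done by D3’s enter(), append(), and text() on actual DOM elements.

```javascript
// Toy version of enter().append("p").text(fn): one <p> per new data value.
function appendParagraphs(page, entering, textFn) {
    entering.forEach(function(d) {
        page.push("<p>" + textFn(d) + "</p>");
    });
    return page;
}

var page = appendParagraphs(
    [],
    [{ key: "a", value: 10 }, { key: "b", value: 25 }],
    function(d) { return "Value: " + d.value; }
);

console.log(page.join("\n"));
// <p>Value: 10</p>
// <p>Value: 25</p>
```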

As an aside, note that D3’s append() returns the newly created element, not the container to which it was appended. This return value is different from the jQuery append() function, so be careful when chaining methods.
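
A short sketch of that difference follows. The addParagraph() wrapper is a made-up name, and d3 is passed in as an argument only so the snippet does not depend on a global; select(), append(), and text() are the D3 calls used earlier in this post.

```javascript
// Adds one <p> to the body and returns the selection for that <p>.
function addParagraph(d3, message) {
    var body = d3.select("body");
    var p = body.append("p");  // p refers to the new <p>, not to <body>
    p.text(message);           // so this sets the text of the <p>
    return p;
}

// In a page that has loaded D3:
//     addParagraph(d3, "Hello");
// With jQuery, $("body").append("<p>") hands back the body instead,
// so a chained .text() call would replace the contents of the body.
```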

Waste Removal

Okay, part of the cleanup we still need to implement is waste removal. For dynamic visualizations, data that once was part of the visualization may be removed. Here’s how we handle that case:

d3.json("numbers.json", function(numbers) {
    // numbers has the values we want to "visualize"
    // ...
    // what to do if data disappears
    ptags.exit().remove();
});

If there isn’t a data value (any more) for a particular <p> tag, …

… then just remove the <p> tag from the document.

Start the Day Shift

Now that our assembly line is set up and ready to go, it’s time to put it into motion.

function update() {
    d3.json("numbers.json", function(numbers) {
        // numbers has the values to "visualize"
        // ...
    });
}
setInterval(update, 5000);

The easiest way to get everything running continuously is to put all the previous code in a function, …

… and call that function periodically, in our case every 5 seconds.
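
One detail worth knowing: setInterval() waits a full interval before the first call, so with the code as written the page stays empty for the first 5 seconds. A possible variation, sketched here under a hypothetical startPolling() name, draws once right away and also returns a way to stop the updates.

```javascript
// Calls update() immediately, then again every intervalMs milliseconds.
// Returns a function that cancels the periodic updates.
function startPolling(update, intervalMs) {
    update();                               // first draw, no waiting
    var id = setInterval(update, intervalMs);
    return function stop() {
        clearInterval(id);
    };
}

// Usage, assuming update() is the function defined above:
//     var stop = startPolling(update, 5000);
//     ...later, to halt the assembly line...
//     stop();
```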

And the Night Shift

Now that our assembly line is running continuously, we can look back at some of the code we’ve written and see how it makes sense in that context.

function update() {
    d3.json("numbers.json", function(numbers) {
        // numbers has the values to "visualize"
        // associate each point with a <p>
        var ptags = d3.select("body").selectAll("p").data(numbers, function(d) { return d.key; });

        // what to do when data becomes available
        ptags.enter().append("p").text(function(d) { return "Value: " + d.value; });

        // what to do if data disappears
        ptags.exit().remove();
    });
}

Every time we update the visualization, D3 will automatically identify the <p> tags that are already present in the HTML document.

And D3 will automatically match the (possibly) updated data set to the pre-existing <p> elements.

If there isn’t a <p> tag for a particular data value, create a new <p> tag for the data value.

If there isn’t a data value (any more) for a particular <p> tag, then just remove the <p> tag from the document.

At this point we have a fully up-and-running visualization. That’s a good time to step back and see what D3 has allowed us to accomplish.

  • D3 automatically keeps track of data values and associated HTML (or, as we’ll see in later posts, SVG) elements.
  • As the data set changes, D3 identifies elements that need to be created and elements that are no longer needed.
  • All the designer has to do is tell it what to do with those elements.
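
To watch those three points play out over two rounds of data, here is a toy simulation. It is a sketch, not D3: the page is modeled as a plain object that maps each key to the text of its <p> tag, and the sample data sets are invented.

```javascript
// Toy page update: add paragraphs for new keys, drop paragraphs for missing keys.
function update(page, numbers) {
    var seen = Object.create(null);
    numbers.forEach(function(d) {
        seen[d.key] = true;
        if (!(d.key in page)) {
            page[d.key] = "Value: " + d.value;  // enter: no <p> yet for this value
        }
    });
    Object.keys(page).forEach(function(k) {
        if (!seen[k]) {
            delete page[k];                     // exit: <p> whose value is gone
        }
    });
    return page;
}

var page = Object.create(null);
update(page, [{ key: "a", value: 10 }, { key: "b", value: 25 }]);
update(page, [{ key: "b", value: 99 }, { key: "c", value: 7 }]);

console.log(Object.keys(page).join(",")); // b,c
console.log(page.b);                      // Value: 25
```

Notice that "b" still reads "Value: 25" after its value changed to 99. The simulation mirrors the code in this post, which only sets text on newly created paragraphs; as far as I can tell, refreshing existing ones would take an additional text() call on the matched selection.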

That’s pretty amazing for just a few lines of Javascript. Even more important, though, is that D3 hasn’t at all constrained us in what we can do with the elements we’ve created. They’re just standard HTML, so we can style them with CSS, make them interactive with Javascript, link them to other pages, and so on. And, if we’re web developers, we don’t have to learn a new language to do those things. We already know HTML, CSS, and Javascript. And there’s something else you might have noticed about the code we put together. It explicitly identifies elements when they’re added (enter()) or removed (exit()) from the visualization. If that has you thinking about powerful animations, you’re not alone. As we’ll see, D3 has some pretty nifty “transitions” up its sleeve as well.
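
As a small taste of the styling and interactivity mentioned above, here is a sketch. The decorate() name and the threshold parameter are hypothetical; style() and on() are D3 selection methods, and ptags is assumed to be a selection of <p> elements with {key, value} data bound to them.

```javascript
// Colors large values red and makes a paragraph bold when it is clicked.
function decorate(ptags, threshold) {
    return ptags
        .style("color", function(d) {
            return d.value > threshold ? "red" : "black";
        })
        .on("click", function() {
            // inside the listener, `this` is the clicked <p> element
            this.style.fontWeight = "bold";
        });
}

// In a page that has loaded D3 and bound data to its paragraphs:
//     decorate(d3.selectAll("p"), 20);
```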

Next Steps

Now perhaps you’re thinking that having the power to style, animate, and make elements interactive is nice, but it’s also a lot of work. Well, don’t worry: D3 has you covered there as well. In addition to its core function as an assembly line for visualizations, D3 includes a set of useful toolkits to help with those tasks. We’ll cover several of those tools in later posts, as we explore D3 further. Planned topics include:

  • Creating simple and complex graphs with D3 and Scalable Vector Graphics
  • Using D3’s advanced layouts for unique visualizations
  • D3, geography, and maps

About the author

Stephen Thomas

Stephen Thomas is the User Experience Architect for Dell SecureWorks, where he dabbles in user research, usability testing, design, and front end development for Dell's managed security services.

1 Comment to “Web-Based Visualization Part 1: The D3.js Key Concept”

  • Otto Borden says:

    Thank you! This article is excellent!

    August 18th, 2013 at 5:42 am