Data can come from many different places: websites, news feeds, spreadsheets, databases, and so on. Let's say you've decided to make a map of the world's flowers. After searching online you might find a PDF version of a flower encyclopedia, or a spreadsheet of flower genera, or a JSON feed of flower data, or a REST API that provides geolocated lat/lon coordinates, or some web page someone put together with beautiful flower photos, and so on and so forth. The question inevitably arises: “I found all this data; which should I use, and how do I get it?”
In this case, someone else has done all the work for you. They've gathered data about flowers and built a library with a set of functions that hands you the data in an easy-to-understand format. This library, sadly, does not exist (not yet), but there are some that do.
Let's take another scenario. Say you’re looking to build a visualization of Major League Baseball statistics. You can't find a library to give you the data, but you do see everything you’re looking for at mlb.com. If the data is online and your web browser can show it, shouldn't you be able to get the data? Passing data from one application (like a web application) to another comes up again and again in software engineering. One way to do this is an API, or “application programming interface”: a means by which two computer programs can talk to each other. Now that you know this, you might decide to search online for “MLB API”. Unfortunately, mlb.com does not provide its data via an API. In this case you would have to load the raw source of the website itself and search through it yourself for the data you’re looking for. While possible, this solution is much less desirable given the considerable time required to read through the HTML source and to write code that parses it.
This is how it might look if you typed it into your code directly (the quotes are no longer necessary).
An object can contain, as part of itself, another object. Below, the value of “brother” is an object containing two name/value pairs.
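As a minimal sketch of this kind of nesting, here is a JavaScript object literal. Only the "brother" property comes from the text above; the variable name person and every value are made up for illustration.

```javascript
// Hypothetical data: every name and value here is illustrative.
const person = {
  name: "Olympia",
  age: 3,
  // The value of "brother" is itself an object with two name/value pairs.
  brother: {
    name: "Elias",
    age: 6
  }
};

// Chain dot notation to reach into the nested object.
console.log(person.brother.name); // Elias
```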
To compare this with a data format like XML, the preceding JSON data would look like the following (for simplicity I'm avoiding the use of XML attributes).
You might find an array as part of an object. Below the value of “favorite colors” is an array of strings.
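A small sketch of an array inside an object could look like this. The "favorite colors" property name comes from the text; the variable name profile and the values are assumptions for illustration.

```javascript
// Hypothetical data: the values are illustrative.
const profile = {
  name: "Olympia",
  // The value of "favorite colors" is an array of strings.
  "favorite colors": ["purple", "blue", "pink"]
};

// A property name that contains a space needs bracket notation.
const colors = profile["favorite colors"];

// Array elements are then read by index, as with any other array.
console.log(colors[0], colors.length);
```

Because the name contains a space, it has to stay quoted in the object literal and be read with brackets rather than a dot.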
A great place to find a selection of JSON data sources to play with is corpora, a GitHub repository maintained by Darius Kazemi. For example, here’s a JSON file containing information about birds in Antarctica.
loadJSON() can be called in preload() or used with a callback. I'm using callbacks in just about all my examples, so let's follow that syntax here.
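A minimal p5.js sketch of the callback form might look like the following. The file name birds.json and the variable birdData are assumptions for illustration; loadJSON() and setup() are p5.js functions, and gotData is simply the name chosen for the callback.

```javascript
// Holds the parsed JSON once it arrives (hypothetical variable name).
let birdData = null;

// p5.js calls setup() once when the sketch starts.
function setup() {
  // 'birds.json' is a placeholder path; gotData runs when loading finishes.
  loadJSON('birds.json', gotData);
}

// Callback: p5.js passes the parsed JSON in as the first argument.
function gotData(data) {
  birdData = data;
  console.log(data);
}
```

Nothing that depends on the data should run before gotData() has been called, since loading happens asynchronously.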
The data from the JSON file is passed into the argument data in the gotData callback. Then it becomes a bit of detective work. How is the data structured: a single object? An array of objects? An object full of arrays of objects? Let’s look at a snippet from the birds of Antarctica.
If the JSON file is loaded into the variable data, the way you access that data is no different than if you had said:
For example, if you wanted to display the description and link it to the source you would say:
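Here is a sketch of both ideas in plain JavaScript. The object below is a hypothetical stand-in for the loaded file: the property names description, source, birds, and members follow the text, while family, the helper sourceLink(), and the abbreviated values are assumptions.

```javascript
// Hypothetical stand-in for the loaded file; values are abbreviated and illustrative.
const data = {
  description: "Birds of Antarctica, grouped by family",
  source: "https://en.wikipedia.org/wiki/List_of_birds_of_Antarctica",
  birds: [
    { family: "Albatrosses", members: ["Wandering albatross", "Grey-headed albatross"] },
    { family: "Cormorants", members: ["Antarctic shag", "Imperial shag"] }
  ]
};

// Build a link whose text is the description and whose target is the source.
function sourceLink(data) {
  return '<a href="' + data.source + '">' + data.description + '</a>';
}

console.log(sourceLink(data));
// In a p5.js sketch, createA(data.source, data.description) adds a similar link to the page.
```

Whether the object was typed by hand or handed to a callback by loadJSON(), data.description and data.source are read the same way.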
Since birds is an array of objects, you can use a for loop just the way you always do with arrays. Each element of the array is itself an object with properties that can be accessed, like members (which is also an array!).
Here’s what this looks like:
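One way to sketch the nested loops, assuming the structure described above. The sample object, the family property, and the helper name listBirds() are illustrative assumptions, not the author's code.

```javascript
// Hypothetical sample shaped like the file described above.
const data = {
  birds: [
    { family: "Albatrosses", members: ["Wandering albatross", "Grey-headed albatross"] },
    { family: "Cormorants", members: ["Antarctic shag"] }
  ]
};

// Collect one line of text per bird, labeled with its family.
function listBirds(data) {
  const lines = [];
  const birds = data.birds;
  // Outer loop: one object per family.
  for (let i = 0; i < birds.length; i++) {
    const members = birds[i].members; // members is also an array
    // Inner loop: one string per bird in that family.
    for (let j = 0; j < members.length; j++) {
      lines.push(birds[i].family + ": " + members[j]);
    }
  }
  return lines;
}

console.log(listBirds(data));
```

In a p5.js sketch, the body of listBirds() could sit inside gotData() and call createP() or text() for each line instead of pushing to an array.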
What makes something an API versus just some data you found, and what are some pitfalls you might run into when using an API?
An API (Application Programming Interface) is an interface through which one application can access the services of another. These can come in many forms. Openweathermap.org is an API that offers its data in JSON, XML, and HTML formats. The key element that makes this service an API is exactly that offer; openweathermap.org's sole purpose in life is to offer you its data. And not just offer it, but allow you to query it for specific data in a specific format. Let's look at a short list of sample queries.
One thing to note about openweathermap.org is that it does not require that you tell the API any information about yourself. You simply send a request to a URL and get the data back. Other APIs, however, require you to sign up and obtain an access token. The New York Times API is one such example. Before you can make a request, you'll need to visit The New York Times Developer site and request an API key. Once you have that key, you can store it in your code as a string.
You also need to know what the URL is for the API itself. This information is documented for you on the developer site, but here it is for simplicity:
search() function you might say:
This isn't just guesswork. Figuring out how to put together a query string requires reading through the API's documentation. For The New York Times, it’s all outlined on the Times' developer website. Once you have your query, you can join all the pieces together and pass the result to loadJSON(). Here is a tiny example that simply displays the most recent headline.
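A rough sketch of joining the pieces is shown here. The endpoint URL, the q and api-key parameter names, and the response shape are assumptions based on the Article Search API as I understand it, so check them against the developer site; the key is a placeholder, and buildQuery() and firstHeadline() are hypothetical helper names.

```javascript
// Placeholder key: request a real one from the developer site.
const apiKey = "YOUR_API_KEY";
// Assumed Article Search endpoint; confirm it in the current documentation.
const api = "https://api.nytimes.com/svc/search/v2/articlesearch.json";

// Join the endpoint, search term, and key into one request URL.
// The term is assumed to contain only URL-safe characters.
function buildQuery(term) {
  return api + "?q=" + term + "&api-key=" + apiKey;
}

// Pull the first headline out of a response, assuming the shape
// { response: { docs: [ { headline: { main: "..." } } ] } }.
function firstHeadline(data) {
  return data.response.docs[0].headline.main;
}

console.log(buildQuery("rainbow"));
// In a p5.js sketch: loadJSON(buildQuery("rainbow"), gotData);
// and inside gotData(data): createP(firstHeadline(data));
```

Whether the first document is also the most recent one depends on the sort order the API applies, which the documentation should spell out.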
Some APIs require a deeper level of authentication beyond an API access key. Twitter, for example, uses an authentication protocol known as “OAuth” to provide access to its data. Writing an OAuth application requires more than just passing a string into a request. There are some examples this week that use server-side programming in Node to perform the authentication.
Certain characters are invalid in URLs. For example, let’s say you were querying Wordnik for the words “bath towel”. You would have to say bath%20towel. You could do this yourself with a regex or use URI encoding with encodeURI(). Here is more documentation and an example below.
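Both approaches can be sketched in a few lines. The example.com address is a made-up URL used only to show the encoding; it is not a real Wordnik endpoint.

```javascript
const phrase = "bath towel";

// Option 1: replace every space yourself with a regex.
const byHand = phrase.replace(/ /g, "%20");

// Option 2: let encodeURI() handle spaces and other invalid characters.
const encoded = encodeURI(phrase);
console.log(encoded); // bath%20towel

// encodeURI() can also be applied to a whole URL (hypothetical address).
const url = encodeURI("https://example.com/search?q=" + phrase);
console.log(url); // https://example.com/search?q=bath%20towel
```

The regex version only handles spaces, so encodeURI() is usually the safer choice.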
encodeURI does not encode the following characters: , / ? : @ & = + $ #. This is as it should be, since these characters have special meanings in URLs. However, if you wanted to have a $ or / as part of some text you are passing into a key/value pair, you would want to encode these characters. For this, encodeURIComponent() can be used.
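The difference is easiest to see side by side. The sample text and the example.com address below are made up for illustration.

```javascript
// Text that contains characters with special meaning in a URL.
const value = "$5/day & up";

// encodeURI() leaves $, / and & alone, so they would be read as URL syntax.
console.log(encodeURI(value)); // $5/day%20&%20up

// encodeURIComponent() encodes them, so the text survives as a single value.
console.log(encodeURIComponent(value)); // %245%2Fday%20%26%20up

// Encode only the value, not the whole URL (hypothetical address).
const url = "https://example.com/search?q=" + encodeURIComponent(value);
console.log(url);
```

Applying encodeURIComponent() to an entire URL would also encode the :, / and ? that the URL itself needs, which is why it is used on individual values only.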