A to Z, F15

Week 4 Notes

All examples (client-side)

Examples requiring server-side

Data

This material is excerpted partially (and adapted for JavaScript) from Learning Processing.

Data can come from many different places: websites, news feeds, spreadsheets, databases, and so on. Let's say you've decided to make a map of the world's flowers. After searching online you might find a PDF version of a flower encyclopedia, or a spreadsheet of flower genera, or a JSON feed of flower data, or a REST API that provides geolocated lat/lon coordinates, or some web page someone put together with beautiful flower photos, and so on and so forth. The question inevitably arises: “I found all this data; which should I use, and how do I get it?”

If you are really lucky, you might find a JavaScript library that hands data to you directly with code. Maybe the answer is to just download this library and write some code like:

function setup() {
  var fdb = new FlowerDatabase();
  var sunflower = fdb.findFlower("sunflower");
  var h = sunflower.getAverageHeight();
}  

In this case, someone else has done all the work for you. They've gathered data about flowers and built a library with a set of functions that hands you the data in an easy-to-understand format. This library, sadly, does not exist (not yet), but there are some that do.

Let's take another scenario. Say you’re looking to build a visualization of Major League Baseball statistics. You can't find a library to give you the data but you do see everything you’re looking for at mlb.com. If the data is online and your web browser can show it, shouldn't you be able to get the data? Passing data from one application (like a web application) to another is something that comes up again and again in software engineering. A means for doing this is an API or “application programming interface”: a means by which two computer programs can talk to each other. Now that you know this, you might decide to search online for “MLB API”. Unfortunately, mlb.com does not provide its data via an API. In this case you would have to load the raw source of the website itself and manually search for the data you’re looking for. While possible, this solution is much less desirable given the considerable time required to read through the HTML source as well as program algorithms for parsing it.

The goal of these notes is to give you an overview of techniques, ranging from the more difficult manual parsing of data, to the parsing of standardized formats, to the use of an API designed specifically for JavaScript itself. Each means of getting data comes with its own set of challenges. The ease of using a JavaScript library is dependent on the existence of clear documentation and examples. But in just about all cases, if you can find your data in a format designed for a computer (spreadsheets, XML, JSON, etc.), you'll be able to save some time in the day for a nice walk outside.

JSON

The data exchange format that all of this week's examples focus on is called JSON (pronounced like the name Jason), which stands for JavaScript Object Notation. Its design was based on the syntax for objects in the JavaScript programming language (and is most commonly used to pass data between web applications) but has become rather ubiquitous and language-agnostic. Working with it in JavaScript is incredibly convenient.

JSON data comes in one of two forms: an object or an array. The syntax for each is nearly identical to the syntax you see in JavaScript itself.

Let's take a look at a JSON object first. A JSON object is identical to a JavaScript object (without functions). It’s a collection of variables, each with a name and a value (a "name/value pair"). Just about the only difference is that each name must be written as a string enclosed in double quotes. For example, the following is JSON data describing a person:

{
  "name":"Olympia",
  "age":3,
  "height":96.5,
  "state":"giggling"
}

This is how it might look if you typed it into your code directly (the quotes around the names are no longer necessary).

var person = {
  name: "Olympia",
  age: 3,
  height: 96.5,
  state: "giggling"
}
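When JSON arrives as text (from a file or a web request, say), JavaScript's built-in JSON.parse() turns it into an object like the one above, and JSON.stringify() goes the other way. The following is a minimal sketch; the variable names are only for illustration.

```javascript
// JSON text, exactly as it might appear in a file or an API response.
var jsonText = '{"name":"Olympia","age":3,"height":96.5,"state":"giggling"}';

// JSON.parse() converts the text into a regular JavaScript object.
var person = JSON.parse(jsonText);
console.log(person.name);     // Olympia
console.log(person.age + 1);  // 4 (numbers stay numbers)

// JSON.stringify() converts an object back into JSON text.
// The names come back wrapped in double quotes.
var roundTrip = JSON.stringify(person);
console.log(roundTrip === jsonText); // true
```

loadJSON(), covered later in these notes, does the parsing step for you, so you rarely need to call JSON.parse() yourself in a p5.js sketch.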

An object can contain, as part of itself, another object. Below, the value of “brother” is an object containing two name/value pairs.

{
  "name":"Olympia",
  "age":3,
  "height":96.5,
  "state":"giggling",
  "brother":{
    "name":"Elias",
    "age":6
  }
}

For comparison, here is how the preceding JSON data would look in a format like XML (for simplicity I'm avoiding the use of XML attributes).

<?xml version="1.0" encoding="UTF-8"?>
<person>
  <name>Olympia</name>
  <age>3</age>
  <height>96.5</height>
  <state>giggling</state>
  <brother>
    <name>Elias</name>
    <age>6</age>
  </brother>
</person>

Multiple JSON objects can appear in the data as an array. A JSON array is simply a list of values (primitives or objects). The syntax is identical to JavaScript syntax. Here is a simple JSON array of integers:

[1, 7, 8, 9, 10, 13, 15]

You might find an array as part of an object. Below, the value of “favorite colors” is an array of strings.

{
  "name":"Olympia",
  "favorite colors":["purple","blue","pink"]
}
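Once data like this is in a JavaScript variable, nested objects and arrays are reached one step at a time. Here is a small sketch; the sample object combines the "brother" and "favorite colors" examples above, and the variable names are my own.

```javascript
// Sample data combining the nested object and the array shown above.
var olympia = {
  "name": "Olympia",
  "brother": { "name": "Elias", "age": 6 },
  "favorite colors": ["purple", "blue", "pink"]
};

// Dot notation walks into a nested object one level at a time.
var brotherName = olympia.brother.name;

// A name containing a space can't follow a dot, so use brackets and a string.
var colors = olympia["favorite colors"];
var firstColor = colors[0];

console.log(brotherName);   // Elias
console.log(firstColor);    // purple
console.log(colors.length); // 3
```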

A great place to find a selection of JSON data sources to play with is corpora, a GitHub repository maintained by Darius Kazemi. For example, here’s a JSON file containing information about birds in Antarctica.

Loading JSON into your code

Now that I've covered the syntax of JSON, I can look at using the data in JavaScript and p5.js. The first step is simply loading the data with loadJSON(). loadJSON() can be called in preload() or used with a callback. I'm using callbacks in just about all my examples, so let's follow that syntax here.

function setup() {
  loadJSON('birds_antarctica.json', gotData);
}

function gotData(data) {
  // The JSON is now in data!
  console.log(data);
}
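For comparison, the preload() approach mentioned above might look like the following. This is a sketch that assumes a version of p5.js that supports preload(); the variable name birdData is my own.

```javascript
// Holds the loaded JSON so that every function in the sketch can reach it.
var birdData;

function preload() {
  // In preload(), loadJSON() is called without a callback. p5.js waits for
  // the file to finish loading before it runs setup().
  birdData = loadJSON('birds_antarctica.json');
}

function setup() {
  // By the time setup() runs, the JSON is available in birdData.
  console.log(birdData);
}
```

The tradeoff is that nothing else in the sketch starts until the file has loaded, whereas a callback lets the sketch begin right away and handle the data whenever it arrives.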

The data from the JSON file is passed into the argument data in the gotData callback. Then it becomes a bit of detective work. How is the data structured: a single object? An array of objects? An object full of arrays of objects? Let’s look at a snippet from the birds of Antarctica.

{
  "description": "Birds of Antarctica, grouped by family",
  "source": "https://en.wikipedia.org/wiki/List_of_birds_of_Antarctica",
  "birds": [
    {
      "family": "Albatrosses",
      "members": [
        "Wandering albatross",
        "Sooty albatross",
        "Light-mantled albatross"
      ]
    },
    {
      "family": "Cormorants",
      "members": [
        "Antarctic shag",
        "Imperial shag",
        "Crozet shag"
      ]
    }
  ]
}
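Some of that detective work can be done in code with Array.isArray() and Object.keys(). Below is a rough sketch; describe() is a hypothetical helper of my own (not part of p5.js), and the sample is a trimmed copy of the bird data.

```javascript
// Hypothetical helper: summarize what kind of value you are looking at.
function describe(value) {
  if (Array.isArray(value)) {
    return 'array of ' + value.length;
  }
  if (value !== null && typeof value === 'object') {
    return 'object with keys: ' + Object.keys(value).join(', ');
  }
  return typeof value;
}

// A trimmed copy of the bird data above.
var sample = {
  "description": "Birds of Antarctica, grouped by family",
  "birds": [
    { "family": "Albatrosses", "members": ["Wandering albatross", "Sooty albatross"] }
  ]
};

console.log(describe(sample));             // object with keys: description, birds
console.log(describe(sample.birds));       // array of 1
console.log(describe(sample.birds[0]));    // object with keys: family, members
console.log(describe(sample.description)); // string
```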

If the JSON file is loaded into the variable data, the way you access that data is no different than if you had said:

var data = {
  "description": "Birds of Antarctica, grouped by family",
  "source": "https://en.wikipedia.org/wiki/List_of_birds_of_Antarctica"
  // etc
}

For example, if you wanted to display the description and link it to the source you would say:

createA(data.source, data.description);

And since birds is an array of objects, you can use a for loop just the way you always do with arrays. Each element of the array is itself an object with properties, like family and members (which is also an array!), that you can access.

for (var i = 0; i < data.birds.length; i++) {
  var family  = data.birds[i].family;
  createElement('h2', family);
  var members = data.birds[i].members;
  for (var j = 0; j < members.length; j++) {
    // members is an array, so index it with square brackets and the inner counter j
    createDiv(members[j]);
  }
}

Here’s what this looks like:

Birds of Antarctica, grouped by family

Albatrosses

Wandering albatross
Sooty albatross
Light-mantled albatross

Cormorants

Antarctic shag
Imperial shag
Crozet shag
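The same nested traversal also works outside the browser if you collect text instead of creating page elements. Here is a sketch under that assumption; listBirds() is a hypothetical helper and the data is a trimmed copy of the file above.

```javascript
// Trimmed copy of the bird data above.
var birdData = {
  "birds": [
    { "family": "Albatrosses", "members": ["Wandering albatross", "Sooty albatross"] },
    { "family": "Cormorants", "members": ["Antarctic shag", "Imperial shag"] }
  ]
};

// Hypothetical helper: build one line of text per bird.
function listBirds(data) {
  var lines = [];
  // Outer loop: one pass per family.
  for (var i = 0; i < data.birds.length; i++) {
    var group = data.birds[i];
    // Inner loop: one pass per bird in that family (note the separate counter j).
    for (var j = 0; j < group.members.length; j++) {
      lines.push(group.family + ': ' + group.members[j]);
    }
  }
  return lines;
}

var lines = listBirds(birdData);
console.log(lines.length); // 4
console.log(lines[0]);     // Albatrosses: Wandering albatross
console.log(lines[3]);     // Cormorants: Imperial shag
```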

APIs

What makes something an API versus just some data you found, and what are some pitfalls you might run into when using an API?

An API (Application Programming Interface) is an interface through which one application can access the services of another. These can come in many forms. Openweathermap.org is an API that offers its data in JSON, XML, and HTML formats. The key element that makes this service an API is exactly that offer; openweathermap.org's sole purpose in life is to offer you its data. And not just offer it, but allow you to query it for specific data in a specific format. Let's look at a short list of sample queries.

http://api.openweathermap.org/data/2.5/weather?lat=35&lon=139
A request for current weather data for a specific latitude and longitude.
http://api.openweathermap.org/data/2.5/forecast/daily?q=London&mode=xml&units=metric&cnt=7&lang=zh_cn
A request for a seven day London forecast in XML format with metric units and in Chinese.
http://api.openweathermap.org/data/2.5/history/station?id=5091&type=day
A request for historical data for a given weather station.
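Each of these requests is a URL whose final part, everything after the question mark, is a list of name/value pairs. As a small sketch of how such a URL breaks apart, here is the first sample query run through JavaScript's built-in URL object (available in modern browsers and in Node); the variable name is my own, and no request is actually sent.

```javascript
// One of the sample queries from above.
var request = new URL('http://api.openweathermap.org/data/2.5/weather?lat=35&lon=139');

console.log(request.hostname);  // api.openweathermap.org
console.log(request.pathname);  // /data/2.5/weather
console.log(request.search);    // ?lat=35&lon=139

// searchParams gives access to each name/value pair. The values are strings.
console.log(request.searchParams.get('lat')); // 35
console.log(request.searchParams.get('lon')); // 139
```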

One thing to note about openweathermap.org is that it does not require that you tell the API any information about yourself. You simply send a request to a URL and get the data back. Other APIs, however, require you to sign up and obtain an access token. The New York Times API is one such example. Before you can make a request, you'll need to visit The New York Times Developer site and request an API key. Once you have that key, you can store it in your code as a string.

// This is not a real key
var apiKey = "40e2es0b3ca44563f9c62aeded4431dc:12:51913116";

You also need to know what the URL is for the API itself. This information is documented for you on the developer site, but here it is for simplicity:

var url = "http://api.nytimes.com/svc/search/v2/articlesearch.json";

Finally, you have to tell the API what it is you are looking for. This is done with a “query string,” a sequence of name/value pairs describing the parameters of the query. The string begins with a question mark, and the pairs are joined with ampersands. This functions similarly to how you pass arguments to a function. If you wanted to search for the term "JavaScript" from a search() function you might say:

search("JavaScript");

Here, the API acts as the function call, and you send it the arguments via the query string. Here is a simple example asking for a list of the oldest articles that contain the term "JavaScript" (the oldest of which turns out to be May 12th, 1852).

// The name/value pairs that configure the API query are: (q,JavaScript) and (sort,oldest)
var query = "?q=JavaScript&sort=oldest";

This isn't just guesswork. Figuring out how to put together a query string requires reading through the API's documentation. For The New York Times, it’s all outlined on the Times' developer website. Once you have your query you can join all the pieces together and pass it to loadJSON(). Here is a tiny example that simply displays the most recent headline.

function setup() {

  var apiKey = "sample-key";
  var url = "http://api.nytimes.com/svc/search/v2/articlesearch.json";
  var query = "?q=JavaScript&sort=newest";

  // Here, I format the call to the API by joining the URL with the query string and the API key.
  loadJSON(url+query+"&api-key="+apiKey, gotData);

  function gotData(data) {
    // Grabbing a single headline from the results.
    var headline = data.response.docs[0].headline.main;
    createP(headline);
  }
}
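When a query has more than a couple of parameters, joining strings by hand gets error-prone. One option is a small helper that builds the query string from an object. This is a sketch only; buildUrl() is a hypothetical function of my own, not part of p5.js or The New York Times API.

```javascript
// Hypothetical helper: join a base URL with an object of name/value pairs.
function buildUrl(base, params) {
  var pairs = Object.keys(params).map(function (name) {
    // Encode both sides so spaces and symbols in the values are safe in a URL.
    return encodeURIComponent(name) + '=' + encodeURIComponent(params[name]);
  });
  // The query string starts with ? and the pairs are joined with &.
  return base + '?' + pairs.join('&');
}

var fullUrl = buildUrl('http://api.nytimes.com/svc/search/v2/articlesearch.json', {
  'q': 'JavaScript',
  'sort': 'newest',
  'api-key': 'sample-key' // placeholder, not a real key
});

console.log(fullUrl);
// http://api.nytimes.com/svc/search/v2/articlesearch.json?q=JavaScript&sort=newest&api-key=sample-key
```

The result could then be passed to loadJSON() in place of the hand-joined string.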

Some APIs require a deeper level of authentication beyond an API access key. Twitter, for example, uses an authentication protocol known as “OAuth” to provide access to its data. Writing an OAuth application requires more than just passing a string into a request. There are some examples this week that use server-side programming in Node to perform the authentication.

Encoding URLs

Certain characters are invalid in URLs. For example, let’s say you were querying wordnik for the words “bath towel”. A URL can't contain a space, so you would have to say bath%20towel. You could do this yourself with a regex or use URI encoding with encodeURI(). Here is more documentation, and an example follows.

var query = 'http://api.wordnik.com/v4/word.json/bath towel/api_key=apikeyblahblahblah';
// Encode the query before you ask for it
var encoded = encodeURI(query);
loadJSON(encoded, callback);

encodeURI() does not encode the following characters: , / ? : @ & = + $ #. This is as it should be, since these characters have special meanings in URLs. However, if you wanted to have a $ or / as part of some text you are passing into a key/value pair, you would want to encode these characters. For this, encodeURIComponent() can be used.
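To see the difference side by side, here is a short sketch using a made-up search phrase. The example.com address is a placeholder, not a real API.

```javascript
// A phrase containing characters that have special meanings in a URL.
var phrase = 'rock & roll/$5';

// encodeURI() is for whole URLs: it leaves &, / and $ alone.
var whole = encodeURI(phrase);
console.log(whole);     // rock%20&%20roll/$5

// encodeURIComponent() is for a single value: it encodes those characters too.
var component = encodeURIComponent(phrase);
console.log(component); // rock%20%26%20roll%2F%245

// Encode only the value, then attach it to the rest of the URL.
var searchUrl = 'http://example.com/search?q=' + component;
console.log(searchUrl); // http://example.com/search?q=rock%20%26%20roll%2F%245
```

If the phrase were passed through encodeURI() instead, the & would survive and the server would read everything after it as the start of a second parameter.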