Download source code - 329.9 KB

Introduction

From time to time, you may find yourself needing to supply data to a web page in order to populate a report webpage, etc. that you can't fetch directly from a database, but still want to have some measure of "live" data.

In circumstances like these, you can publish your data to a Google spreadsheet from where its data can be made available. From that point, it's really easy to read the data asynchronously and convert grids of information into arrays of JavaScript objects.

The Google Sheets API is easy enough to get information about, and there are plenty of wrapper libraries around to provide convenient options to help you as programmers. However, these sometimes provide far more capability than you actually need and have their own learning curve to overcome.

This article address the need for basic batch loading of grids of data in your google sheets that represent lists of data objects. It will be done by giving your one single method in an ES5 style module. You can find it in the attached source code file "dalGoogleSheetReader.js".

Background

From time to time, there is a need to build elements of an application front end before the necessary work at the back end systems has been completed. This is especially true when creating proof of concepts applications where there is no appetite (or budget!) to involve the back end database managers.

You always have the option of hard coding some sample data (JSON files, etc.) into such things, but that is just wasted effort even for a prototype/PoC application.

The data that you want to work with often already exists in various old applications and there is nearly always a way to get an extract from such applications into an Excel spreadsheet or similar and from there, it is a trivial task to load it into a Google spreadsheet.

At that point, you are nearly ready to pull the data into your project and show people your progress using real data. That way, when your user community see it, they can focus on what your application is doing and not on spurious made up records.

Ingredients

In order to use the library, there are four things you need to be able to get and use the data:

A Google Spreadsheet Identifier - the spreadsheet must be accessible by sharing link (or public)
An API Key - Generate this in the Google developers console
A list of ranges of data in the sheet - e.g., ['Sheet1!A1:D', 'Sheet2:B2:F6', ...]
A callback function to receive a list of arrays filled with the data objects from the ranges specified in (3)

Once you have these, you only to write one line of code!

simpleGoogleSheetUtility.googleSheetBatchDataReaderAsPromise(sheetId, key, aRanges).then(
  fCallbackExpectingAnArrayOfArraysHoldingJSObjects
)

Clearly, there is a lot of work going on behind the scenes, which I will discuss below. But, before that, let us take a look at an example web page I built that uses this. You can find it in the attached archive.

The example files have been written so that they will work using the file protocol so you just have to extract the zip and open the index.html file with a modern browser. (The code uses some modern ES2015 syntax so it won't work with IE11 without transpiling!)

The example has two sections where the utility is used.

Fetching Data
In Practice Example

How Does the Library Use Google Sheets

Both examples use the Google spreadsheet available at this link.

The third argument passed to the function is an array of strings (A1 Range specifications) used to identify a grid of data in the Google sheet like this:

The picture above shows the grid from "Student!A1:C".

The Google sheet API takes such range specifications and sends you back a JSON object containing grids of data as a two dimensional array of strings. You'll get something like this:

{
  "spreadsheetId": "1lH1TL9T1Ad_9irqkQVadrIkVdA_7Nkpnfn7CAUINY04",
  "valueRanges": [
    {
      "range": "Student!A1:C14",
      "majorDimension": "ROWS",
      "values": [
        [
          "Person-Id",
          "Forename",
          "Lastname"
        ],
        [
          "AA",
          "Andrew",
          "Alpha"
        ],
. . . .
     ]
  }
}

However, this is not what we want to work with!

The information is here, but what we want is to convert the above into something like this:

[
  [
    { "Person-Id": "AA", "Forename": "Andrew", "Lastname": "Alpha" },
    { "Person-Id": "BB", "Forename": "Belinda", "Lastname": "Beta" },      
// More objects . . . . 
  ],
// Additional arrays of objects
]

The advantage to this structure is that you can write code to reference the information about a record by a property name and not some index offset that would all too easily be broken by insertion of a column into your sheet data.

Transforming the Sheet Data

Before we examine the particulars of the JavaScript function and its implementation, I think it's worth showing the pseudo code of the transformation process:

create a "result array list"
for each "valueRange" from google-sheet
  extract the 2D array in "values"

  use the 1st row as a property name map

  for each subsequent row
    create an object using the map of property names and the cells as values 
    add the object to an array
add the array of objects to the "result array list"    
return the "result array list" as a "Resolved Promise"

Now for the real code!

Understanding the Module Code

The namespace "simpleGoogleSheetUtility" and method "googleSheetBatchDataReaderAsPromise" only take 67 lines of code in total including liberal comments, but I won't go through every line here. That would be boring, you have access to the code in the download so open it up in your favourite editor where you'll get all the nice syntax colouring you'll be used to! What I shall do here is focus on the key concepts I am using.

The Revealing Module Pattern

The first thing to point out is that the namespace I have used, simpleGoogleSheetUtility, is actually just an implementation of the Revealing Module Pattern. This is fairly well known and there are plenty of tutorials on the internet that you can take a look at. However, to save you jumping over to your preferred search engine to find an article, I have found one for you: The Revealing Module Pattern

The essence of this particular implementation is:

var simpleGoogleSheetUtility = (function () { // eslint-disable-line no-unused-vars
  'use strict'
  /* globals fetch R */
  // 60 lines of code here

  return {
    googleSheetBatchDataReaderAsPromise: googleSheetBatchDataReaderAsPromise
  }
})()

As you can see, the only method that I have exposed in the module is googleSheetBatchDataReaderAsPromise.

I make no apology for the rather long names that I use. I have previously read Robert C. Martin's book "Clean Code" and I (mostly) agree with the advice it contains. Basically, I like the fact that in a number of months when I come to look at the code using this module, I know that what I will get from the method is a Promise. So, when you use it in an application, you know to expect to write code like:

simpleGoogleSheetUtility.googleSheetBatchDataReaderAsPromise(
  sheetId,
  apiKey,
  ['Sheet1!A1:C', 'Sheet2!B2:F6']).then(
    function (aObjectLists) {
      console.log(`There are ${aObjectLists.length} list of objects from the google sheet.`)
    }
  )

The googleSheetBatchDataReaderAsPromise Function

The actual work of the function comes down to making a call on a Google REST API endpoint. This is going to be an asynchronous operation. Code to use such endpoints can become quite nested, especially if you have several actions that must follow each other. To help make this code clear, I have resorted to the modern ES2015 syntactic sugar of async and await.

The function itself is declared as:

async function googleSheetBatchDataReaderAsPromise (sheetId, apiKey, aRanges)

What this means in practice is that whatever gets returned from the function will be wrapped in a Promise object. It also means that inside the function, we can use the await keyword. This will let us write asynchronous code to look like it's normal synchronous code. It prevents us having to nest then functions in a hard to read pyramid of complex code.

The three input parameters are used to build a URL. I am not going to describe that part of the code, it's just string manipulation. In the example application, this initially gets called and builds the following URL:

https://sheets.googleapis.com/v4/spreadsheets/1lH1TL9T1Ad_9irqkQVadrIkVdA_7Nkpnfn7CAUINY04/values:batchGet?key=AIzaSyARrbqEv9KsydppJNtIjk5ndmw0dfJ7dV8&ranges=Student!A1:C&ranges=Class!A1:C&ranges=Student_Class!A1:I

Where things get interesting is in the call to the fetch function. fetch is available in modern browsers and is essentially a standardised wrapper around the old XmlHttpRequest type services we used previously (or more probably jQuery's $.ajax method!)

const fetchOptions = {
  method: 'GET',
  mode: 'cors',
  cache: 'default'
}

let response = await fetch(url, fetchOptions)
let oSheetBatchJson = await response.json()

What is interesting is the fact that fetch is an asynchronous function. This is far easier to read and understand when you come back to the code later than a whole bunch of nested thenable function calls.

Having got the data into oSheetBatchJson, we just need to transform it into the desired format and return it as a resolved Promise.

return oSheetBatchJson.valueRanges.map(googleSheetConvertValueRangeToObjectArray)

The function googleSheetConvertValueRangeToObjectArray is private to our module. It's sole purpose is to transform a batch item of data from the Google sheets format to our required array of objects.

  function googleSheetConvertValueRangeToObjectArray (valueRange) {
    const aValues = valueRange.values
    const aHeaders = R.compose(
      R.map(prop => R.trim(prop)),
      R.head
    )(aValues)

    const fnTransformArrayPairToObject = function (aProperty, aValues) {
      let ret = {}
      // NOTE: The G Sheets API will return "ragged" arrays of strings. It is necessary
      //     : to check both input arrays are long enough to make properties from.

      for (let i = 0; i < aProperty.length && i < aValues.length; i++) {
        ret[aProperty[i]] = R.trim(aValues[i])
      }
      return ret
    }

    // Since aHeaders is not going to change we can curry the transformer
    const fnTransformArrayRowToObject = R.curry(fnTransformArrayPairToObject)(aHeaders)

    let aObjects = R.compose(
      R.map(fnTransformArrayRowToObject),
      R.tail
    )(aValues)

    return aObjects
  }

Up to this point, I have not mentioned the fact that this library actually has a dependency, Ramda.

Ramda - Functional Library

The Ramda library has a whole bunch of useful helper functions that make processing of objects and lists extremely easy. I don't want to get bogged down in the details of functional programming, that is not the purpose of this article and utility library, but it is important that you are aware that it is required. Also, I am far from an expert in the subject and am simply a student who uses Ramda because it is effective and clear in what it does.

The function is called for each valueRange entry in the oSheetBatchJson. The content of one of these was shown above.

The first thing we do here is pull out the 2D array of values.

We are expecting that the first row contains a list of property names. Given that this data comes from a Google sheet, I don't want to assume that the header names are well formed, so I want to trim any space from the ends of the text. So, I have used a small Ramda composition:

const aHeaders = R.compose(
   R.map(prop => R.trim(prop)),
   R.head
   )(aValues)

Ramda compose function works by passing some data through a sequence of functions from Right to Left. (Or bottom to top as shown here.)

The most important thing to remember about Ramda functions is that they NEVER alter the input data. So, reading right-to-left, what happens here is this:

The 2D array passes to R.head which pulls out the first row only and passes that to R.map, which in this case, processes an input array item by item with a function. That mapped function here is:

function (prop) { return R.trim(prop) }

Finally, the result of the mapping is passed to aHeaders.

So, in short, we go from aValues:

[
  [" a ", "b", "c ", " d"],
  ["a one", "B 1", "C1", "1D"],
  ["a two", "B 2", "C2", "2D"],
  ...
]

to aHeaders:

["a", "b", "c", "d"]

Importantly, aValues has not changed!

The next step is to make a new function that takes two arrays of values and makes an object out of them:

const fnTransformArrayPairToObject = function (aProperty, aValues) {
  let ret = {}
  // NOTE: The G Sheets API will return "ragged" arrays of strings. It is necessary 
  // : to check both input arrays are long enough to make properties from.
  for (let i = 0; i < aProperty.length && i < aValues.length; <code>i</code>++) {
    ret[aProperty[i]] = R.trim(aValues[i])
  }
  return ret
}

When it comes to using this transformation function over all the data, there is a useful functional facility known as currying where a function can be wrapped in another function where some of the arguments are already passed.

// Since aHeaders is not going to change we can curry the transformer
const fnTransformArrayRowToObject = R.curry(fnTransformArrayPairToObject)(aHeaders)

Finally, this curried version of the transformer is used in another composition to process the data rows:

let aObjects = R.compose(
  R.map(fnTransformArrayRowToObject),
  R.tail
)(aValues)

return aObjects

Again, we start with aValues which has the header row removed (R.tail), the remaining rows are then passed through the map function which applies the curried fnTransformArrayRowToObject function leaving us with an array of objects.

That is all there is to know about using the utility library. In one line of code and a .then handler, you will be able to work with object data arrays!

Google Sheet Data Gotchas

Although it is easy to get data from Google sheets like this, there are a few caveats that you should be aware of!

Ragged Arrays
Formatted Data

Ragged Data

The Google Sheets API tries to reduce unnecessary content by ignoring empty data trailing in a row.

For example, if you have a grid like this:

Id	Value	Comment
1	23.3
2	123	Expensive

The data array you would get would be:

[
  ["id","Value","Comment"],
  ["1", "23.3"], // This row has only 2 entries - It is ragged
  ["2", "123", "Expensive"]
]

This is not a problem as such, but if you need objects like this...

{"id": "1", "value": "23.3", "Comment": ""} instead of {"id": "1", "value": "23.3"}

...then you are going to have to wither process the object list or alter the input data on the sheet.

Note, when doing the later, my favoured technique is to use a Google sheet array formula to generate a column of pointless "1" values or similar to the right of the grid. That way, the API would give you something like:

[ 
  ["id","Value", "Comment",   "Pointless"], 
  ["1", "23.3",  "",          "1"],
  ["2", "123",   "Expensive", "1"] 
]

Formatted Data

By default, what you see in a Google sheet is what your data contains. If you have specific numeric formatting (Currency, etc.), you must expect to get that in your data!

For example, if you have a grid like this:

Id	Value	Comment
1	£23.30
2	£123.00	Expensive

The data array you would get would be:

[
  ["id","Value","Comment"],
  ["1", "£23.30"], 
  ["2", "£123.00", "Expensive"]
]

Also, all the data you get is text! If you want something else, then you will have to further process the results.

It is possible to change this behaviour in the options passed to the fetch(), but this is the default behaviour.

See Method: spreadsheets.values.batchGet

Making Specific Data Structures

In the sample application, the "In Practice" example showed a Gantt chart from the Highcharts library. This library required a set of data that needs the start and end time code of some activity bars to display. The problem being that what was passed was text string like "yyyy-mm-dd".

It is not hard to convert this sort of data, but you do need to be aware that the library will give you formatted strings.

Since we have already been using Ramda to process lists, the example Gantt loader makes use of it too.

If you look at the sample in the download, the file index.js uses the utility library at line 66:

let chartConfig = { .....}
simpleGoogleSheetUtility.googleSheetBatchDataReaderAsPromise(sheetId, key, aRanges).then(
  function (o) {
    // THE CHART
    let aData = o[0] //First (and only) object array
    // The data from the googleapis will be {string}, but Highcharts requires 
    // the start and end to be Unix like time code
    // We transform the data using Ramda evolve function on each item "date" property

    let fnStringDateToTimecode = s => new Date(s).getTime()

    let oTransformMap = { start: fnStringDateToTimecode, end: fnStringDateToTimecode }
    let fnConvertData = R.curry(R.evolve)(oTransformMap)
    aData = aData.map(fnConvertData)
    chartConfig.series[0].data = aData
    Highcharts.ganttChart('divGantt', chartConfig)
  }
)

The key help from Ramda here is with its R.evolve function (see https://ramdajs.com/docs/#evolve).

This function takes two arguments, an object mapping property names to a conversion function and some object to be converted.

In our case, we build a transformation object with properties start and end that will use the transformation function fnStringDateToTimecode. This function turns a string date that can be parsed into the underlying time code.

Since we have an array of objects to process, we can use the built in Array map function to evolve each object. However, R.evolve expects two arguments, the second being the data . So, we use the R.curry to wrap the evolve function into a single parameter function where the transformation object is already applied. So, in just five lines of effective code, we have transformed an array of objects of strings into something that can be consumed by Highcharts Gantt!

Hopefully, you will be able to see how useful this can be in mocking up your next application and getting approval to build something really cool!

History

5^th May, 2020: Initial version