Google Cloud Platform: Storing Data on Google Cloud SQL

Ted Neward

0/5 (0 vote)

Jan 27, 2014

CPOL

8 min read

25382

Google Cloud Platform - Part 5: Google Cloud SQL

Part 5: Google Cloud SQL

Again, welcome back. If you’re just tuning in, you might find it easier to begin with the start of the series, beginning with Part 1 found here. If you’re one of those who likes walking into the theater somewhere after the second act, though, by all means, keep reading—we’re building an application on Google Cloud Platform, and in this installment, we’re going to do what most applications need to do: store data.

Google offers several ways of dealing with data storage: Google Cloud SQL, for those applications that want to store data in the time-honored fashion of the relational database and relational model, Google Cloud Storage, for bulk data storage, and Google Cloud Datastore, a non-relational "NoSQL" data storage system. Google Cloud Storage is going to typically be used for large-item storage, such as binaries and/or large images, and as such, generally won’t be something the developer codes in an atomicity smaller than a "file." As such, it generally won’t be something that’s in competition with Google Cloud SQL or Google Cloud Datastore. Between those two, however, the usual "SQL vs. NoSQL" opinions range, and much as the various technical pundits and evangelists might want to disagree, none is "superior" to the other. In fact, many applications may find it beneficial to make use of both (or all three) as part of an application, a technique that is sometimes called "polyglot persistence" or "poly-store". Google Cloud SQL has one significant drawback in that it doesn’t have a free-usage tier, whereas Google Cloud Datastore does, but since most production-quality applications will leave the free-usage tier behind fairly quickly, this shouldn’t really be considered in the architectural decision-making process between the two except in very rare, light usage cases.

All of them are pretty straightforward from the Java developers’ perspective, and we’ll discuss all three over the next three issues. However, one of them has to go first, so while you were reading Part 4, we flipped a coin backstage, and Google Cloud SQL won the toss, so it gets to go first.

Google Cloud SQL

Using Google Cloud SQL in Google Cloud Platform is really not a whole lot different from using a relational database in a traditional JavaEE application. For that reason, most of what a Java developer knows about JDBC and JavaEE will be fully applicable here. In fact, Google Cloud SQL is, at its heart, a MySQL instance running on Google servers, so most of the particulars of working with MySQL will apply to working with Google Cloud SQL; for that reason, any developer who needs more details about the particular nuances of SQL in Google Cloud SQL (degrees of SQL-92 or SQL-99 support, or schema data type support, and so on) should take some time and visit the MySQL website. The local app server that the Google App Engine SDK runs contains support for running Google Cloud SQL databases locally, so once again, once the Google App Engine SDK is installed on the developer’s machine, no other downloads are really necessary (with only a few caveats to that statement, which we’ll see in a few paragraphs).

To create the SQL instance, journey on over to your application’s console in the cloud (at http://cloud.google.com/console, if you don’t remember from Part 2, when we created the application endpoint within the Google Cloud Platform), and click on the project link listed there. Look along the left side of the following page, and click on "Cloud SQL". The big red "New Instance" button does pretty much exactly what it says it does, and the next page contains important settings like tier size (which relates directly to billing, so choose this carefully) and backup windows and so on. Once those settings are good, click the blue "Confirm" button on the right, and voila, a Google Cloud SQL instance is up and running, assuming you have billing enabled in your Google Account.

Note that Google Cloud Platform has a restriction that states that the Google Cloud SQL instance must be located within the same region as the Google App Engine application that uses it, so if the application is running on servers located in the US, the Google Cloud SQL instance must also be running on US servers. (There’s a whole host of legal reasons, many of which are due to the different privacy laws in the US and the EU, why any application running would want to do this, not to mention the negative architectural implications of a US application having to run to the EU for its data, so this really isn’t a limitation, per se.)

Once the SQL instance is created, the story essentially becomes one that is highly familiar to any JavaEE developer: using your JDBC-based persistence tool of choice, whether that is Hibernate, JPA, iBATIS, DTOs, raw JDBC, or (Heaven help us all) your own home-grown object/relational mapping layer, connect to the database using the JDBC URL given in the administrative console, and "do the relational thing". If the O/R-M library/framework doesn’t build out the schema based on the persistent classes described in your application, Google provides either an admin UI (called "SQL Prompt") for scripting out the schema (described at https://developers.google.com/cloud-sql/docs/sql_prompt), or by using the command-line SQL tool that comes as part of the Google App Engine SDK called "google_sql" (described at https://developers.google.com/cloud-sql/docs/commandline).

Note that the Google App Engine SDK itself has support for JDO and JPA interfaces working against Google Cloud SQL, but any other JDBC-based persistence tool of choice will need to be downloaded/provided by the developer. This means that if, for example, the company has a corporate standard using Hibernate for relational data access, the Hibernate JARs (tuned to the MySQL flavor of Hibernate) will need to be downloaded and included as part of the WAR that App Engine will upload to the Google Cloud Platform environment.

For this particular article, I’m going to stick with plain ol’ JDBC, only because it’s a "lowest common denominator" and helps demonstrate how to work with the Google Cloud SQL more directly. Any relational-facing technology that sits on top of JDBC (which, when talking about things running on top of the JVM, realistically means all of them) will be able to work with Google Cloud SQL just fine.

JDBC Connections and Drivers

To connect to a Google Cloud SQL instance, once it’s been created in the Google Cloud Platform control panel, the next step is pretty familiar: as with any JDBC application, the Java code needs to load the correct JDBC driver into the JVM, and open a connection using a JDBC URL. Fortunately, this is pretty straightforward JDBC work; unfortunately, there’s a slight difference in the URL when running locally or in the Google Cloud Platform:

    String url = null;
    try {
      if (SystemProperty.environment.value() ==
          SystemProperty.Environment.Value.Production) {
        // Load the class that provides the new "jdbc:google:mysql://" prefix.
        Class.forName("com.mysql.jdbc.GoogleDriver");
        url = "jdbc:google:mysql://project-id:instance-name";
      } else {
        // Local MySQL instance to use during development.
        Class.forName("com.mysql.jdbc.Driver");
        url = "jdbc:mysql://127.0.0.1:3306";
      }
      url = url + "/guestbook?user=root";
    } catch (Exception e) {
      e.printStackTrace();
      return;
    }

As you can see, Google provides a handy way of determining the current running location, via the ubiquitous Java "system properties" mechanism, embodied in the com.google.appengine.api.utils.SystemProperty class.

Note also that the end of the URL is a "user=root" parameter, to allow the application to log into the database as the root-slash-administrative user. The Google Cloud SQL instance, like most relational database systems, allows for separate users/passwords and security restrictions, but since much of the time these aren’t used by web applications or REST endpoints (preferring instead to control authentication and authorization at the application level), the "root" user simplifies the auth/auth situation. Note that this mechanism relies on the Google Cloud SQL instance specifying which applications in App Engine can access the database—this is configurable in the Google Cloud SQL console. (Recall that when we created the SQL instance, it automatically assumed it was accessible from our Google App Engine application instance.)

If you (or, more likely, your IT security department) need user-level security at the database level, they can be added through the Google Cloud SQL console, the same one we used earlier to configure the Google Cloud SQL instance. And in that case, the user and their password would be passed using the three-argument version of DriverManager.getConnection(), rather than sending it through the URL.

From here, though, the story is pretty much 100% generic, plain-vanilla, no-surprises JDBC:

Connection conn = DriverManager.getConnection(url);
String statement = "INSERT INTO entries (guestName, content)" +
                   " VALUES( ? , ? )";
PreparedStatement stmt = conn.prepareStatement(statement);
stmt.setString(1, fname);
stmt.setString(2, content);
int success = 2;
success = stmt.executeUpdate();

// ...

ResultSet rs = conn.createStatement().executeQuery(
    "SELECT guestName, content, entryID FROM entries");
while (rs.next()) {
    String guestName = rs.getString("guestName");
    String content = rs.getString("content");
    int id = rs.getInt("entryID");
}

This should be familiar ground for any enterprise Java developer, so we’ll just leave the story off here.

Summary

The Google Cloud SQL story is a pretty straightforward one, which in many respects is exactly what we would want it to be—relational databases and SQL are a "known commodity" to most Java developers. Unless there is significant advantage to be had by doing something deeply different, it makes more sense to "go with the flow" of what existing relational-oriented technologies are available. Of course, this isn’t trying to suggest that one should always prefer the RDBMS over other, differently-"shaped" kinds of data storage systems like Google Cloud Datastore. And, as most RDBMS-using developers know, trying to store binaries—particularly large ones—into an RDBMS can be a particularly tricky experience.

Fortunately, Google provides other options for storage, such as Google Cloud Datastore, a non-relational data storage system that we’ll get into next time, and Google Cloud Storage, which we’ll investigate after that. In the meantime, you’ve got an easy-to-use RDBMS accessible to you, so happy coding!