Making TensorFlow Models Portable Using ONNX

9 Sep 2020 · CPOL · 5 min read
In this article, I provide a brief overview of TensorFlow 1.0 and TensorFlow 2.0 for those looking for a deep learning framework for building and training neural networks. I then show how to install the tf2onnx conversion package and convert TensorFlow models to the ONNX format. I also point out a mistake that is easy to make when specifying the shape of input parameters.

In this article in our series about using portable neural networks in 2020, you’ll learn how to convert a TensorFlow model to the portable ONNX format.

Since ONNX is not a framework for building and training models, I will start with a brief introduction to TensorFlow 1.0 and TensorFlow 2.0. This will be useful for engineers who are starting from scratch and are considering TensorFlow as a framework for building and training their models.

A Brief Introduction to TensorFlow

TensorFlow was originally created by the Google Brain team, and version 1.0 was released in February 2017. Its architecture makes it easy to deploy on almost any computational device: it can run on server clusters with CPUs, GPUs, or TPUs, as well as on mobile and edge devices. This deployment flexibility has made it a favorite for production environments. However, it has lost market share to PyTorch in research environments because PyTorch introduced a "Define-by-Run" scheme, in which the network is defined dynamically by the computations that occur within it. TensorFlow 1.0 is based on a "Define-and-Run" scheme, where the network is defined once and then fixed; the only thing that happens at run time is that data is fed through the network.

In an attempt to appeal to researchers, TensorFlow 2.0 introduced a "Define-by-Run" scheme of its own. It also incorporated Keras, an easy-to-use high-level API for building neural networks. Most of the examples in the TensorFlow 2.0 docs use Keras, but it is possible to build a neural network using only TensorFlow 2.0. You may want to do this if you need low-level control over the network or if you are migrating an existing TensorFlow 1.0 network to TensorFlow 2.0.

Installing and Importing the Converter

Before converting your TensorFlow models to ONNX, you will need to install the tf2onnx package, as it is not included with either version of TensorFlow. The following command installs this utility:

Shell
pip install tf2onnx

Once installed, the converter can be imported into your modules using the import below. However, as we will see in the next couple of sections, it is easier to use the tf2onnx utility from the command line.

Python
import tf2onnx

A Quick Look at a Model

Both TensorFlow 1.0 and TensorFlow 2.0 provide lower-level APIs that, for example, let you set weights and biases directly for finer control over model training. However, for the purpose of converting TensorFlow 1.0 and TensorFlow 2.0 models to ONNX, we will only concern ourselves with the code that specifies inputs, outputs, and the location of the model when it is saved in one of TensorFlow’s formats. (There is a full demo for this article that predicts numbers from handwritten samples in the MNIST dataset.)

TensorFlow 1.0 uses the placeholder method to create special variables that indicate the input and output of the model. Below is an example. To make converting to ONNX easier, it is best to specify a name when setting up placeholders. Here the input name is "input" and the output name is "output".

Python
# tf Graph input
X = tf.placeholder("float", [batch_size, num_input], name="input")
Y = tf.placeholder("float", [batch_size, num_classes], name="output")
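The shape argument is where a subtle mistake is easy to make: hard-coding the batch size, as the snippet above does with batch_size, bakes that dimension into the graph, and the converted ONNX model will then only accept batches of exactly that size. Using None for the batch dimension keeps it dynamic. The sketch below illustrates both, using TensorFlow 2.x's v1 compat layer and placeholder names invented for this example:

```python
import tensorflow as tf

tf1 = tf.compat.v1  # the placeholder API, via the compat layer on TensorFlow 2.x
tf1.disable_eager_execution()
tf1.reset_default_graph()

num_input, batch_size = 784, 128

# Easy mistake: a fixed batch dimension means the converted ONNX model
# will only accept batches of exactly 128 samples.
X_fixed = tf1.placeholder("float", [batch_size, num_input], name="input_fixed")

# Safer: None leaves the batch dimension dynamic after conversion.
X_dynamic = tf1.placeholder("float", [None, num_input], name="input_dynamic")

print(X_fixed.shape.as_list())    # [128, 784]
print(X_dynamic.shape.as_list())  # [None, 784]
```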

Once a session is created with a model in TensorFlow 1.0, it can be saved using the code below. What can be a little confusing is that the file format changed from TensorFlow 1.0 to TensorFlow 2.0. TensorFlow 1.0 uses "checkpoint" files to persist models. The sample below specifies a checkpoint file; the convention is to use the extension .ckpt. Other files will be saved to the same directory as the .ckpt file, so it is wise to create a directory just for the saved model.

Python
saver = tf.train.Saver()
save_path = saver.save(sess, './tensorflow/tensorflow_model.ckpt')
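To see which files saver.save actually produces, here is a minimal sketch. It assumes TensorFlow 2.x with the v1 compat layer and uses a throwaway temp directory in place of the article's ./tensorflow path:

```python
import os
import tempfile
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()
tf1.reset_default_graph()

# A trivial variable so the Saver has something to persist
v = tf1.get_variable("v", shape=[1])
saver = tf1.train.Saver()

ckpt_dir = tempfile.mkdtemp()
with tf1.Session() as sess:
    sess.run(tf1.global_variables_initializer())
    # save() returns the checkpoint prefix; several files share this prefix
    save_path = saver.save(sess, os.path.join(ckpt_dir, "tensorflow_model.ckpt"))

# The .meta file holds the graph definition; it is the file tf2onnx
# expects when converting a TensorFlow 1.0 checkpoint.
print(os.path.exists(save_path + ".meta"))   # True
print(os.path.exists(save_path + ".index"))  # True
```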

TensorFlow 2.0 uses the "SavedModel" format, which makes the conversion process a little easier. Below is the command to save a TensorFlow 2.0 model. Notice that a directory is specified rather than a file name. (Incidentally, Keras, which is incorporated into TensorFlow 2.0, uses the same format.)

Python
tf.saved_model.save(model, './tensorflow')
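As a quick sanity check, the snippet below builds a throwaway Keras model, saves it, and confirms the SavedModel layout on disk. This is a sketch assuming TensorFlow 2.x; the temp directory stands in for the article's ./tensorflow directory:

```python
import os
import tempfile
import tensorflow as tf

# A minimal Keras model (hypothetical stand-in for the article's MNIST model)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

export_dir = os.path.join(tempfile.mkdtemp(), "tensorflow")
tf.saved_model.save(model, export_dir)

# SavedModel is a directory, not a single file: saved_model.pb holds the
# graph, and variables/ holds the weights.
print(os.path.exists(os.path.join(export_dir, "saved_model.pb")))  # True
print(os.path.isdir(os.path.join(export_dir, "variables")))        # True
```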

Converting TensorFlow Models to ONNX

The easiest way to convert your TensorFlow models to ONNX is to use the tf2onnx tool from the command line. When used from the command line, tf2onnx converts a saved TensorFlow model to another file that represents the model in ONNX format. It is possible to run the conversion from code, but with TensorFlow models in memory, tf2onnx may have problems freezing your graph. Freezing is the process by which all variables in the graph are converted to constants; this is needed for ONNX because an ONNX model is an inference graph and contains no variables. The tf2onnx package contains a function named process_tf_graph, which you can try if you want to convert within code. However, if you end up getting the error message KeyError: tf.float32_ref, it may be best to convert files from the command line.

Below is the command to convert a TensorFlow 1.0 checkpoint file to ONNX. Notice that you need to find the meta file and pass it to tf2onnx. You also need to specify the input name and the output name.

Shell
python -m tf2onnx.convert --checkpoint ./tensorflow/tensorflow_model.ckpt.meta --output tfmodel.onnx --inputs input:0 --outputs output:0
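The :0 suffix in the command above can be surprising if you have only ever referred to ops by name. TensorFlow names tensors as op_name:output_index, so a placeholder named "input" produces the tensor "input:0". A minimal sketch using TensorFlow 2.x's v1 compat layer:

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()
tf1.reset_default_graph()

# A placeholder named "input"; its value is exposed as tensor "input:0",
# i.e. output 0 of the op named "input". That is why the command line
# passes --inputs input:0 rather than just "input".
X = tf1.placeholder("float", [None, 784], name="input")

print(X.name)  # input:0
```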

Below is the command to convert a TensorFlow 2.0 model. You need to specify the directory that was used to save the model to disk. (The model is not saved in a single file.) You also need to specify the ONNX output file, but you do not need to specify the input name and the output name.

Shell
python -m tf2onnx.convert --saved-model ./tensorflow --output tfmodel.onnx

Summary and Next Steps

In this article, I provided a brief overview of TensorFlow 1.0 and TensorFlow 2.0 for those looking for a deep learning framework for building and training neural networks. I then showed how to install the tf2onnx conversion package and convert TensorFlow models to the ONNX format. I also pointed out a mistake that is easy to make when specifying the shape of input parameters.

Since the purpose of this article was to demonstrate converting TensorFlow models to the ONNX format, I did not go into detail on building and training TensorFlow models. The code sample for this post contains code that explores TensorFlow itself. There is a demo for TensorFlow 1.0, which engineers with existing TensorFlow 1.0 models will find useful. There is also a TensorFlow 2.0 demo for those who want a lower-level API than what Keras provides.

Next, we’ll look at using ONNX Runtime from C#.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Keith Pijanowski
Technical Lead, Bank of New York Mellon
United States
Keith is a sojourner in the software industry. He has over 30 years of experience building and bringing applications to market. He has worked for startups and large enterprises in roles ranging from tech lead to business development manager. He is currently a senior engineer on BNY Mellon's Distribution Analytics team where he is building data pipelines from on-premise data sources to the cloud.
