Using Python Scripts from a C# Client (Including Plots and Images)

Thomas Weller

5.00/5 (37 votes)

Aug 26, 2019

CPOL

8 min read

79563

3367

Demonstrates how to run Python scripts from C#

Download sample project - 711.5 KB

Sample Image

Introduction

This article presents a class which lets you run Python scripts from a C# client (PythonRunner). The scripts can produce textual output as well as images which will be converted to C# Images. This way, the PythonRunner class gives C# applications not only access to the world of Data Science, Machine Learning, and Artificial Intelligence, but also makes Python's exhaustive libraries for charting, plotting, and data visualization (for e.g., matplotlib and seaborn) available for C#.

Background

General Considerations

I'm a C# developer for more than ten years now - and what can I say: Over the years, I deeply fell in love with this language. It gives me great architectural flexibility, it has large community support, and a wealth of third-party tools (both free and commercial, much of it is open-source) supporting almost every thinkable use case. C# is an all-purpose language and the number one choice for business application development.

In the last months, I started to learn Python for Data Science, Machine Learning, Artificial Intelligence, and Data Visualization - mostly because I thought that this skill will push my career as a freelance software developer. I soon realized that Python is great for the above tasks (far better than C# ever will be), but it is completely unsuitable for developing and maintaining large-scale business applications. So the question C# vs. Python (a widely discussed topic in the internet) completely misses the point. C# is for developers who do the job of business-scale application development - Python is for data scientists who do - well - data science, machine learning, and data visualization. There's not much that these two jobs have in common.

The task is not that C# developers additionally become data scientists or ML specialists (or the other way round). It would simply be too much to be an expert in both domains. This is IMHO the main reason why components like Microsoft's ML.NET or the SciSharp STACK won't become widely used. The average C# developer won't become a data scientist, and data scientists won't learn C#. Why would they? They already have a great programming language which is very well suited for their needs, and has a large ecosystem of scientific third-party libraries.

With these considerations in mind, I started searching for an easier and more 'natural' way to bring the Python world and the C# world together. Here is one possible solution ...

The Example App

Before we dive into details, a short preliminary note: I wrote this sample app for the sole purpose of demonstrating C#/Python integration, using only some mildly sophisticated Python code. I didn't care too much about the question if the ML code in itself is useful, so please be forgiving in that respect.

Having that said, let me shortly describe the sample application. Basically, it:

gives you a list of stocks that you can select from (6-30),
draws a summary line chart of the (normalized) monthly stock prices,
performs a so-called 'k-Means Clustering Analysis' based on their price movements and shows the results in a treeview.

Let's shortly step through the three parts of the application one by one...

Stock Selection

The DataGrid on the left side of the application's window presents you a list of available stocks that you can select from. You need at least six items to be selected before further action is possible (the maximum number of selected stocks is 30). You may use the controls on the top to filter the list. Also, the list is sortable by clicking the column headers. The Check Random Sample button randomly selects 18 stocks from the list.

Stock list with selected stocks

Adjust Other Parameters

In addition to stock selection, you may also adjust other parameters of the analysis: the analyzed date range and the number of clusters metaparameter for the k-Means analysis. This number cannot be greater than the number of selected stocks.

Bottom row of controls

Analysis Results

If you're done with stock selection and parameter adjustment, you can press the Analyze button in the bottom right corner of the window. This will (asynchronously) call Python scripts that perform the above described steps (draw a chart and perform a k-Means Clustering Analysis). On return, it will process and display the script 's output.

The middle part of the window is a chart with the prices of the selected stocks, normalized such that the price on start date is set to zero and the stock prices are scaled to percentage change from this starting point. The Image that results from running the script is wrapped in a ZoomBox control to enhance accessibility and user experience.

On the very right side of the window, a tree is shown with the processed results of the clustering analysis. It groups (clusters) the stocks based on their relative price movements (in other words: the closer two stocks move together, the more likely it is that they are in the same cluster). This tree is also used as a color legend for the chart.

Chart and tree

Points of Interest

Principal Structure of the Code

Generally, the project consists of:

the C# files
the Python scripts in a scripts subfolder (chart.py, kmeans.py, and common.py)
an SQLite database that is accessed both by C# code and Python scripts (stockdata.sqlite)

Other things to note:

On the C# side, the database is accessed using EF6 and the recipe from this Codeproject article.
Some WPF UI controls come from the Extended WPF Toolkit™.
Of course, a Python environment with all the required packages must be installed on the target system. The respective path is configured via the app.config file.
The C# part of the application uses WPF and follows the MVVM pattern. According to the three-fold overall structure of the application's main window, there are three viewmodels (StockListViewModel, ChartViewModel, and TreeViewViewModel) that are orchestrated by a fourth one (the MainViewModel).

The C# Side

The PythonRunner Class

The central component for running Python scripts is the PythonRunner class. It is basically a wrapper around the Process class, specialized for Python. It supports textual output as well as image output, both synchronously and asynchronously. Here is the public interface of this class, together with the code comments that explain the details:

/// <summary>
/// A specialized runner for python scripts. Supports textual output
/// as well as image output, both synchronously and asynchronously.
/// </summary>
/// <remarks>
/// You can think of <see cref="PythonRunner" /> instances <see cref="Process" />
/// instances that were specialized for Python scripts.
/// </remarks>
/// <seealso cref="Process" />
public class PythonRunner
{
    /// <summary>
    /// Instantiates a new <see cref="PythonRunner" /> instance.
    /// </summary>
    /// <param name="interpreter">
    /// Full path to the Python interpreter ('python.exe').
    /// </param>
    /// <param name="timeout">
    /// The script timeout in msec. Defaults to 10000 (10 sec).
    /// </param>
    /// <exception cref="ArgumentNullException">
    /// Argument <paramref name="interpreter" /> is null.
    /// </exception>
    /// <exception cref="FileNotFoundException">
    /// Argument <paramref name="interpreter" /> is an invalid path.
    /// </exception>
    /// <seealso cref="Interpreter" />
    /// <seealso cref="Timeout" />
	public PythonRunner(string interpreter, int timeout = 10000) { ... }

	/// <summary>
	/// Occurs when a python process is started.
	/// </summary>
	/// <seealso cref="PyRunnerStartedEventArgs" />
	public event EventHandler<PyRunnerStartedEventArgs> Started;

	/// <summary>
	/// Occurs when a python process has exited.
	/// </summary>
	/// <seealso cref="PyRunnerExitedEventArgs" />
	public event EventHandler<PyRunnerExitedEventArgs> Exited;

    /// <summary>
    /// The Python interpreter ('python.exe') that is used by this instance.
    /// </summary>
    public string Interpreter { get; }

    /// <summary>
    /// The timeout for the underlying <see cref="Process" /> component in msec.
    /// </summary>
    /// <remarks>
    /// See <see cref="Process.WaitForExit(int)" /> for details about this value.
    /// </remarks>
    public int Timeout { get; set; }

    /// <summary>
    /// Executes a Python script and returns the text that it prints to the console.
    /// </summary>
    /// <param name="script">Full path to the script to execute.</param>
    /// <param name="arguments">Arguments that were passed to the script.</param>
    /// <returns>The text output of the script.</returns>
    /// <exception cref="PythonRunnerException">
    /// Thrown if error text was outputted by the script (this normally happens
    /// if an exception was raised by the script). <br />
    /// -- or -- <br />
    /// An unexpected error happened during script execution. In this case, the
    /// <see cref="Exception.InnerException" /> property contains the original
    /// <see cref="Exception" />.
    /// </exception>
    /// <exception cref="ArgumentNullException">
    /// Argument <paramref name="script" /> is null.
    /// </exception>
    /// <exception cref="FileNotFoundException">
    /// Argument <paramref name="script" /> is an invalid path.
    /// </exception>
    /// <remarks>
    /// Output to the error stream can also come from warnings, that are frequently
    /// outputted by various python package components. These warnings would result
    /// in an exception, therefore they must be switched off within the script by
    /// including the following statement: <c>warnings.simplefilter("ignore")</c>.
    /// </remarks>
    public string Execute(string script, params object[] arguments) { ... }

	/// <summary>
	/// Runs the <see cref="Execute"/> method asynchronously. 
	/// </summary>
	/// <returns>
	/// An awaitable task, with the text output of the script as 
    /// <see cref="Task{TResult}.Result"/>.
	/// </returns>
	/// <seealso cref="Execute"/>
    public Task<string> ExecuteAsync(string script, params object[] arguments) { ... }

	/// <summary>
	/// Executes a Python script and returns the resulting image 
    /// (mostly a chart that was produced
	/// by a Python package like e.g. <see href="https://matplotlib.org/">matplotlib</see> or
	/// <see href="https://seaborn.pydata.org/">seaborn</see>).
	/// </summary>
	/// <param name="script">Full path to the script to execute.</param>
	/// <param name="arguments">Arguments that were passed to the script.</param>
	/// <returns>The <see cref="Bitmap"/> that the script creates.</returns>
	/// <exception cref="PythonRunnerException">
	/// Thrown if error text was outputted by the script (this normally happens
	/// if an exception was raised by the script). <br/>
	/// -- or -- <br/>
	/// An unexpected error happened during script execution. In this case, the
	/// <see cref="Exception.InnerException"/> property contains the original
	/// <see cref="Exception"/>.
	/// </exception>
	/// <exception cref="ArgumentNullException">
	/// Argument <paramref name="script"/> is null.
	/// </exception>
	/// <exception cref="FileNotFoundException">
	/// Argument <paramref name="script"/> is an invalid path.
	/// </exception>
	/// <remarks>
	/// <para>
	/// In a 'normal' case, a Python script that creates a chart would show this chart
	/// with the help of Python's own backend, like this.
	/// <example>
	/// import matplotlib.pyplot as plt
	/// ...
	/// plt.show()
	/// </example>
	/// For the script to be used within the context of this <see cref="PythonRunner"/>,
	/// it should instead convert the image to a base64-encoded string and print this string
	/// to the console. The following code snippet shows a Python method (<c>print_figure</c>)
	/// that does this:
	/// <example>
	/// import io, sys, base64
	/// 
	/// def print_figure(fig):
	/// 	buf = io.BytesIO()
	/// 	fig.savefig(buf, format='png')
	/// 	print(base64.b64encode(buf.getbuffer()))
	///
	/// import matplotlib.pyplot as plt
	/// ...
	/// print_figure(plt.gcf()) # the gcf() method retrieves the current figure
	/// </example>
	/// </para><para>
	/// Output to the error stream can also come from warnings, that are frequently
	/// outputted by various python package components. These warnings would result
	/// in an exception, therefore they must be switched off within the script by
	/// including the following statement: <c>warnings.simplefilter("ignore")</c>.
	/// </para>
	/// </remarks>
    public Bitmap GetImage(string script, params object[] arguments) { ... }

 	/// <summary>
	/// Runs the <see cref="GetImage"/> method asynchronously. 
	/// </summary>
	/// <returns>
	/// An awaitable task, with the <see cref="Bitmap"/> that the script
	/// creates as <see cref="Task{TResult}.Result"/>.
	/// </returns>
	/// <seealso cref="GetImage"/>
    public Task<Bitmap> GetImageAsync(string script, params object[] arguments) { ... }
}

Retrieving Stock Data

As already mentioned, the sample app uses a SQLite database as its datastore (which is also accessed by the Python side - see below). To this end, Entity Framework is used, together with the recipe found in this Codeproject article. The stock data are then put into a ListCollectionView, which supports filtering and sorting:

private void LoadStocks()
{
	var ctx = new SQLiteDatabaseContext(_mainVm.DbPath);

	var itemList = ctx.Stocks.ToList().Select(s => new StockItem(s)).ToList();
	_stocks = new ObservableCollection<StockItem>(itemList);
	_collectionView = new ListCollectionView(_stocks);

	// Initially sort the list by stock names
	ICollectionView view = CollectionViewSource.GetDefaultView(_collectionView);
	view.SortDescriptions.Add(new SortDescription("Name", ListSortDirection.Ascending));
}

Getting Textual Output

Here, PythonRunner is calling a script that produces textual output. The KMeansClusteringScript property points to the script to execute:

/// <summary>
/// Calls the python script to retrieve a textual list that 
/// will subsequently be used for building the treeview.
/// </summary>
/// <returns>True on success.</returns>
private async Task<string> RunKMeans()
{
	TreeViewText = Processing;
	Items.Clear();

	try
	{
		string output = await _mainVm.PythonRunner.ExecuteAsync(
			KMeansClusteringScript,
			_mainVm.DbPath,
			_mainVm.TickerList,
			_mainVm.NumClusters,
			_mainVm.StartDate.ToString("yyyy-MM-dd"),
			_mainVm.EndDate.ToString("yyyy-MM-dd"));

		return output;
	}
	catch (Exception e)
	{
		TreeViewText = e.ToString();
		return string.Empty;
	}
}

And here is some sample output produced by the script:

0 AYR 0,0,255
0 PCCWY 0,100,0
0 HSNGY 128,128,128
0 CRHKY 165,42,42
0 IBN 128,128,0
1 SRNN 199,21,133
...
4 PNBK 139,0,0
5 BOTJ 255,165,0
5 SPPJY 47,79,79

The first column is the cluster number of the k-Means analysis, the second column is the ticker symbol of the respective stock, and the third column indicates the RGB values of the color that was used to draw this stock's line in the chart.

Getting an Image

This is the method that uses viewmodel's PythonRunner instance for asynchronously calling the required Python script (the path of which is stored in the DrawSummaryLineChartScript property) together with the required script arguments. The result is then processed into a 'WPF-friendly' form, as soon as it becomes available:

/// <summary>
/// Calls the python script to draw the chart of the selected stocks.
/// </summary>
/// <returns>True on success.</returns>
internal async Task<bool> DrawChart()
{
	SummaryChartText = Processing;
	SummaryChart = null;

	try
	{
		var bitmap = await _mainVm.PythonRunner.GetImageAsync(
			DrawSummaryLineChartScript,
			_mainVm.DbPath,
			_mainVm.TickerList,
			_mainVm.StartDate.ToString("yyyy-MM-dd"),
			_mainVm.EndDate.ToString("yyyy-MM-dd"));

		SummaryChart = Imaging.CreateBitmapSourceFromHBitmap(
			bitmap.GetHbitmap(),
			IntPtr.Zero,
			Int32Rect.Empty,
			BitmapSizeOptions.FromEmptyOptions());

		return true;
	}
	catch (Exception e)
	{
		SummaryChartText = e.ToString();
		return false;
	}
}

The Python Side

Suppress Warnings

An important thing to note is that the PythonRunner class throws an exception as soon as the called script writes to stderr. This is the case when the Python code raises an error for some reason or the other, and in this case, it is desirable to re-throw the error. But the script may also write to stderr if some component issues a harmless warning, such as when something gets deprecated any time soon, or something is initialized twice, or any other minor issue. In such a case, we don't want to break execution, but simply ignore the warning. The statement in the snippet below does exactly this:

import warnings

...

# Suppress all kinds of warnings (this would lead to an exception on the client side).
warnings.simplefilter("ignore")
...

Parsing the Command Line Arguments

As we have seen, the C# (client) side calls a script with a variable number of positional arguments. The arguments are submitted to the script via the command line. This implies that the script 'understands' these arguments and parses it accordingly. The command line arguments that are given to a Python script are accessible via the sys.argv string array. The snippet below is from the kmeans.py script and demonstrates how to do this:

import sys

...

# parse command line arguments
db_path = sys.argv[1]
ticker_list = sys.argv[2]
clusters = int(sys.argv[3])
start_date = sys.argv[4]
end_date = sys.argv[5]
...

Retrieving Stock Data

The Python scripts use the same SQLite database as the C# code does. This is realized in that the path to the database is stored as an application setting in the app.config on the C# side and then submitted as a parameter to the called Python script. Above, we have seen how this is done both from the caller side as well as the command line argument parsing in the Python script. Here now is the Python helper function that builds an SQL statement from the arguments and loads the required data into an array of dataframes (using the sqlalchemy Python package):

from sqlalchemy import create_engine
import pandas as pd

def load_stock_data(db, tickers, start_date, end_date):
    """
    Loads the stock data for the specified ticker symbols, and for the specified date range.
    :param db: Full path to database with stock data.
    :param tickers: A list with ticker symbols.
    :param start_date: The start date.
    :param end_date: The start date.
    :return: A list of time-indexed dataframe, one for each ticker, ordered by date.
    """

    SQL = "SELECT * FROM Quotes WHERE TICKER IN ({}) AND Date >= '{}' AND Date <= '{}'"\
          .format(tickers, start_date, end_date)

    engine = create_engine('sqlite:///' + db)

    df_all = pd.read_sql(SQL, engine, index_col='Date', parse_dates='Date')
    df_all = df_all.round(2)

    result = []

    for ticker in tickers.split(","):
        df_ticker = df_all.query("Ticker == " + ticker)
        result.append(df_ticker)

    return result

Text Output

For a Python script, producing text output that is consumable from the C# side simply means: Printing to the console as usual. The calling PythonRunner class will take care of everything else. Here is the snippet from kmeans.py which produces the text seen above:

# Create a DataFrame aligning labels and companies.
df = pd.DataFrame({'ticker': tickers}, index=labels)
df.sort_index(inplace=True)

# Make a real python list.
ticker_list = list(ticker_list.replace("'", "").split(','))

# Output the clusters together with the used colors
for cluster, row in df.iterrows():

	ticker = row['ticker']
	index = ticker_list.index(ticker)
	rgb = get_rgb(common.COLOR_MAP[index])

	print(cluster, ticker, rgb)

Image Output

Image output is not very different from text output: First the script creates the desired figure as usual. Then, instead of calling the show() method to show the image using Python's own backend, convert it to a base64 string and print this string to the console. You may use this helper function:

import io, sys, base64

def print_figure(fig):
	"""
	Converts a figure (as created e.g. with matplotlib or seaborn) to a png image and this 
	png subsequently to a base64-string, then prints the resulting string to the console.
	"""
	
	buf = io.BytesIO()
	fig.savefig(buf, format='png')
	print(base64.b64encode(buf.getbuffer()))

In your main script, you then call the helper function like this (the gcf() function simply gets the current figure):

import matplotlib.pyplot as plt
...
# do stuff
...
print_figure(plt.gcf())

On the C# client side then, this little helper class, which is used by PythonRunner, will convert this string back to an Image (a Bitmap to be precise):

/// <summary>
/// Helper class for converting a base64 string (as printed by
/// python script) to a <see cref="Bitmap" /> image.
/// </summary>
internal static class PythonBase64ImageConverter
{
	/// <summary>
	/// Converts a base64 string (as printed by python script) to a <see cref="Bitmap" /> image.
	/// </summary>
	public static Bitmap FromPythonBase64String(string pythonBase64String)
	{
		// Remove the first two chars and the last one.
		// First one is 'b' (python format sign), others are quote signs.
		string base64String = pythonBase64String.Substring(2, pythonBase64String.Length - 3);

		// Convert now raw base46 string to byte array.
		byte[] imageBytes = Convert.FromBase64String(base64String);

		// Read bytes as stream.
		var memoryStream = new MemoryStream(imageBytes, 0, imageBytes.Length);
		memoryStream.Write(imageBytes, 0, imageBytes.Length);

		// Create bitmap from stream.
		return (Bitmap)Image.FromStream(memoryStream, true);
	}
}

History

24^th August, 2019: Initial version