Porting the Financial Times JavaScript String Compressor

10 Oct 2013 | GPL3 | 1 min read
Compress and encode a string to save loading time in the browser

Introduction

This tip shows how to compress a base64-encoded image string on the server before sending it to the visitor's browser, where it is decompressed and the image is rebuilt. The technique comes from an article entitled "Text re-encoding for optimising storage capacity in the browser."
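As a rough illustration of the final step, the browser can turn a decompressed base64 string back into an image through a data URI. The sketch below is an assumption about how that might look, not code from the original article: the helper names `isPackable` and `toDataUri` are hypothetical, the PNG MIME type is assumed, and the sample string is only the base64 form of the 8-byte PNG file signature rather than a complete image.

```javascript
// Hypothetical helper: true if every character fits in one byte,
// which is what the pair-packing scheme in this tip requires.
function isPackable(s) {
  for (let i = 0; i < s.length; i++) {
    if (s.charCodeAt(i) > 255) return false;
  }
  return true;
}

// Hypothetical helper: wrap base64 text in a data URI that an <img> can load.
// The MIME type is an assumption; use whatever format the server produced.
function toDataUri(base64, mimeType) {
  return 'data:' + mimeType + ';base64,' + base64;
}

const pngSignature = 'iVBORw0KGgo='; // base64 of the 8-byte PNG signature
console.log(isPackable(pngSignature)); // true
console.log(isPackable('\u2603'));     // false: the snowman's code is above 255
console.log(toDataUri(pngSignature, 'image/png'));
// In a page: document.querySelector('img').src = toDataUri(base64, 'image/png');
```

Base64 text only uses ASCII characters, so it always passes the `isPackable` check.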

Background

While developing an iOS app that loaded PDFs into the WebView, I was looking for a way to speed up the transfer of data to the device, and minimize the storage used by the app.

Each page of the PDF was split into an image which was then base64-encoded into a string and sent to the device from the Web Service (written in VB.NET and C#). I came across the original implementation in JavaScript from Financial Times Labs and ported it to C# to use server-side.

Using the Code

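CompressString takes a string whose characters all have codes below 256 (base64 text qualifies; long strings benefit most) and compresses it using method #2 described in the article. The code packs each pair of 8-bit characters into a single 16-bit character. A snowman character (code 9731) is prepended to the result to tell the browser-side script that the string needs to be decompressed.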
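The packing of a single pair can be worked through by hand. The following is a minimal sketch in JavaScript, with hypothetical function names (`packPair`, `unpackPair`), of the same arithmetic the C# code performs: the first character's code is multiplied by 256 and the second character's code is added.

```javascript
// Pack two 8-bit characters into one 16-bit character: high byte, then low byte.
function packPair(a, b) {
  return String.fromCharCode(a.charCodeAt(0) * 256 + b.charCodeAt(0));
}

// Reverse the packing: split one 16-bit character back into two characters.
function unpackPair(c) {
  const n = c.charCodeAt(0);
  return String.fromCharCode(Math.floor(n / 256)) + String.fromCharCode(n % 256);
}

const packed = packPair('H', 'e');
console.log(packed.charCodeAt(0)); // 18533, i.e. 72 * 256 + 101 = 0x4865
console.log(unpackPair(packed));   // "He"
```

Because each half must fit in 8 bits, a character with a code above 255 cannot be packed this way.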

Because two characters are packed into one, the compressed string has about half as many characters as the original, plus one for the snowman prefix.
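To see where the roughly 50% figure comes from, here is a small JavaScript sketch of the same packing (the function name `compress` is hypothetical) that compares string lengths before and after. It also prints UTF-8 byte counts via `TextEncoder`, since the number of bytes depends on the encoding used and can differ from the number of characters.

```javascript
const SNOWMAN = '\u2603'; // code 9731, the same marker the C# code prepends

// Sketch of the packing: pad to an even length, then fold each pair into one character.
function compress(s) {
  if (s.length % 2 !== 0) s += ' ';
  let out = SNOWMAN;
  for (let i = 0; i < s.length; i += 2) {
    out += String.fromCharCode(s.charCodeAt(i) * 256 + s.charCodeAt(i + 1));
  }
  return out;
}

const original = 'Hello world!';
const compressed = compress(original);
console.log(original.length);   // 12 characters
console.log(compressed.length); // 7 characters (6 packed + the snowman)
console.log(new TextEncoder().encode(original).length);   // 12 bytes as UTF-8
console.log(new TextEncoder().encode(compressed).length); // 21 bytes as UTF-8
```

The halving applies to the character count, which is the unit the quoted article title is concerned with (storage capacity in the browser).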

C#
using System;
using System.Collections.Generic;
using System.Text;
using System.IO;

namespace ftCompress
{
	class Program
	{
		// Input characters must all have codes below 256 (true for base64 text)
		public static string CompressString(string s)
		{
			// pad the string to an even length if necessary
			if (s.Length % 2 != 0) {
				s += " ";
			}

			// StringBuilder avoids re-copying the output on every append
			StringBuilder sbOut = new StringBuilder(s.Length / 2 + 1);

			// Prepend the snowman character to mark the string as compressed
			sbOut.Append((char)9731);

			for (int i = 0; i < s.Length; i += 2)
			{
				// (int)s[i] is JS string.charCodeAt(i)
				int intStr1 = s[i] * 256;
				int intStr2 = s[i + 1];

				// (char)n is JS String.fromCharCode(n)
				sbOut.Append((char)(intStr1 + intStr2));
			}

			return sbOut.ToString();
		}

		public static string DecompressString(string s)
		{
			// If empty or not prefixed with a snowman,
			// just return the (already uncompressed) string
			if (s.Length == 0 || s[0] != (char)9731) {
				return s;
			}

			StringBuilder sbOut = new StringBuilder((s.Length - 1) * 2);

			for (int i = 1, l = s.Length; i < l; i++) {
				int n = s[i];

				// high byte first, then low byte;
				// integer division already floors for non-negative n
				sbOut.Append((char)(n / 256));
				sbOut.Append((char)(n % 256));
			}

			// Note: an odd-length input comes back with its trailing pad space
			return sbOut.ToString();
		}

		static void Main(string[] args)
		{
			string strNew = CompressString("Hello world!");
			string js = "var strCompressed = '" + strNew + "';";

			Console.WriteLine (strNew);
			Console.WriteLine (DecompressString (strNew));
			System.IO.File.WriteAllText(@"ft-compressedString.js", js);
		} 
	}
}
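On the browser side, the generated `strCompressed` variable still has to be unpacked. The following is a minimal JavaScript sketch that mirrors the C# `DecompressString` above; it is not the original Financial Times code, and the function name `decompressString` is an assumption. The sample value is built by hand so the snippet runs without the generated file.

```javascript
function decompressString(s) {
  // No snowman prefix: the string was never compressed, so return it unchanged.
  if (s.length === 0 || s.charAt(0) !== '\u2603') return s;

  let out = '';
  for (let i = 1; i < s.length; i++) {
    const n = s.charCodeAt(i);
    // High byte is the first original character, low byte the second.
    out += String.fromCharCode(Math.floor(n / 256), n % 256);
  }
  return out;
}

// Stand-in for the variable defined by ft-compressedString.js
const strCompressed = '\u2603' +
  String.fromCharCode(0x4865, 0x6c6c, 0x6f20, 0x776f, 0x726c, 0x6421);

console.log(decompressString(strCompressed)); // "Hello world!"
console.log(decompressString('plain text'));  // "plain text"
```

If the original string had an odd length, the result ends with the space that was added as padding, so callers may want to trim it.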


Points of Interest

This is my first attempt at writing C#, and my first attempt at porting code from one language to another. I'm sure there are better implementations, and I'm happy to hear from anyone who actually uses this in a production system.

History

  • 10/10/13: Updated based on comments; added decompression method
  • 10/9/13: First version, written and tested in Visual Studio 2008 on Windows XP

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)


Written By
Web Developer, Golf Tailor LLC
United States

I build websites.

I like to build things, especially using WordPress.

I'm happy working in PHP, JS, HTML, CSS, and a bunch of other acronyms, but I'm happiest when I find a way to put the pieces together to create something useful.

You can see more of my work at http://coderbits.com/morganestes.

Comments and Discussions

 
Question: Does not work with other languages
Robert Krzysztof Winkler, 12-Oct-13 23:08

Answer: Re: Does not work with other languages
Morgan Estes, 14-Oct-13 5:38

General: Some comments
Axel Rietschin, 9-Oct-13 13:48

General: Re: Some comments
Morgan Estes, 10-Oct-13 7:15

General: Re: Some comments
Axel Rietschin, 11-Oct-13 5:06
In general you must adhere to the style and conventions of the language/environment you are using. Sometimes it's just style, sometimes there are good reasons.

Yes, s.Length & 1 should be (s.Length & 1) != 0, which is the same, but in C# there is no implicit conversion from int to bool.

s.Length & 1 isolates the last bit of the value by applying a bitwise AND. The result is 1 if and only if Length is odd.

General: Re: Some comments
Morgan Estes, 14-Oct-13 5:24
