A Simple CSS Parser

25 Feb 2012 | CPOL | 2 min read
A simple CSS parser designed to work with iTextSharp for HTML to PDF generation

Introduction

Cascading Style Sheets allow developers to create nice user interfaces for the web. They are easy to build, use, and maintain. iTextSharp can take advantage of CSS when using its built-in HTML to PDF functionality. Getting the style information from the CSS into iTextSharp requires the developer to read the CSS file and convert it to a Dictionary that iTextSharp can consume. This article illustrates a simple solution for that task. The download includes unit tests and an ASP.NET project which demonstrate how to use the CSSParser.

Background

While working on an HTML to PDF utility I found the need to parse Cascading Style Sheets. There are many CSS parsers on the internet but none fit my needs. I created this simple Regular Expression based CSS parser in C# to facilitate PDF generation in iTextSharp. The requirements for the CSS Parser are as follows:

Requirements

  1. Read a CSS file
  2. Store CSS in a Collection
  3. Query for the classes and their properties
  4. Query for the elements and their properties
  5. Easy to maintain and enhance
  6. Easily feed the style information into iTextSharp to turn HTML into PDF
  7. It should be lean
  8. Something another developer can use

Using the code

The CSSParser inherits from a generic List of KeyValuePair. The key of each entry is the CSS selector. The value is another list of key value pairs, one per declaration: the key is the CSS property name and the value is the CSS property value. I used a generic List instead of a Dictionary because a Cascading Style Sheet can list the same selector, or the same property within a selector, more than once.

C#
public partial class CSSParser : List<KeyValuePair<String,List<KeyValuePair<String,String>>>>, ICSSParser
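The difference between the two collection types can be seen in a small sketch. The class and method names below are illustrative only and are not part of the CSSParser; the point is just that a List accepts a repeated selector while Dictionary.Add rejects it.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateSelectorDemo
{
    // Returns how many entries a List keeps when the same selector is added twice.
    public static int CountListEntries()
    {
        var rules = new List<KeyValuePair<string, string>>();
        rules.Add(new KeyValuePair<string, string>(".title", "color: red"));
        rules.Add(new KeyValuePair<string, string>(".title", "font-size: 12px"));
        return rules.Count;
    }

    // Returns true when Dictionary.Add rejects the repeated selector.
    public static bool DictionaryRejectsDuplicate()
    {
        var rules = new Dictionary<string, string>();
        rules.Add(".title", "color: red");
        try
        {
            rules.Add(".title", "font-size: 12px");
            return false;
        }
        catch (ArgumentException)
        {
            // Dictionary.Add throws when the key is already present.
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CountListEntries());           // 2
        Console.WriteLine(DictionaryRejectsDuplicate()); // True
    }
}
```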

The core of the CSS parser is a regular expression which I found on Stack Overflow (http://stackoverflow.com/a/2694121/899290). The CSSGroups regular expression takes the style sheet and breaks it up into named groups: selector, name, and value. Before the CSS is parsed, the CSSComments regular expression is used to remove CSS comments from the file.

C#
public const String CSSGroups = @"(?<selector>(?:(?:[^,{]+),?)*?)\{(?:(?<name>[^}:]+):?(?<value>[^};]+);?)*?\}";

public const String CSSComments = @"(?<!"")\/\*.+?\*\/(?!"")";

private Regex rStyles = new Regex(CSSGroups, RegexOptions.IgnoreCase | RegexOptions.Compiled);
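To see what the named groups contain, the pattern can be run against a small style sheet outside the parser. This is a minimal sketch: the CssGroupsDemo class and its Flatten helper are hypothetical, and only the pattern itself comes from the article. Each rule produces one match, and the name and value groups hold one capture per declaration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CssGroupsDemo
{
    // Pattern copied from the article.
    public const string CSSGroups =
        @"(?<selector>(?:(?:[^,{]+),?)*?)\{(?:(?<name>[^}:]+):?(?<value>[^};]+);?)*?\}";

    // Returns one "selector|name=value" string per captured declaration.
    public static List<string> Flatten(string css)
    {
        var result = new List<string>();
        foreach (Match rule in Regex.Matches(css, CSSGroups))
        {
            string selector = rule.Groups["selector"].Value.Trim();
            CaptureCollection names = rule.Groups["name"].Captures;
            CaptureCollection values = rule.Groups["value"].Captures;
            for (int i = 0; i < names.Count; i++)
            {
                result.Add(selector + "|" + names[i].Value.Trim() + "=" + values[i].Value.Trim());
            }
        }
        return result;
    }

    public static void Main()
    {
        // Prints:
        // h1|margin=0
        // .title|color=red
        // .title|font-size=12px
        foreach (string line in Flatten("h1{margin:0}\n.title{color:red;font-size:12px}"))
        {
            Console.WriteLine(line);
        }
    }
}
```

The sample input is deliberately compact. With more whitespace around the declarations the captures can contain padding or stray fragments, which is why trimming and checking for empty strings before storing a pair is worthwhile.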

The Read method is responsible for parsing the values in the style sheet and filling the generic List. It uses the .NET Regex class to remove any comments and then adds one entry per rule, each holding the selector and its list of property name and value pairs.

C#
public void Read(String CascadingStyleSheet)
{
    this.StyleSheet = CascadingStyleSheet;

    if (!String.IsNullOrEmpty(CascadingStyleSheet))
    {
        //Remove comments before parsing the CSS. Don't want any comments in the collection.
        MatchCollection MatchList = rStyles.Matches(Regex.Replace(CascadingStyleSheet, 
            RegularExpressionLibrary.CSSComments, String.Empty));
        foreach (Match item in MatchList)
        {
            //Check for nulls
            if (item != null && item.Groups != null && 
                item.Groups[SelectorKey] != null && 
                item.Groups[SelectorKey].Captures != null && 
                item.Groups[SelectorKey].Captures[0] != null && 
                !String.IsNullOrEmpty(item.Groups[SelectorKey].Value))
            {
                String strSelector = item.Groups[SelectorKey].Captures[0].Value.Trim();
                var style = new List<KeyValuePair<String,String>>();

                for (int i = 0; i < item.Groups[NameKey].Captures.Count; i++)
                {
                    String className = item.Groups[NameKey].Captures[i].Value;
                    String value = item.Groups[ValueKey].Captures[i].Value;
                    //Check for null values in the properties
                    if (!String.IsNullOrEmpty(className) && !String.IsNullOrEmpty(value))
                    {
                        className = className.TrimWhiteSpace();
                        value = value.TrimWhiteSpace();
                        //One more check to be sure we are only pulling valid css values
                        if (!String.IsNullOrEmpty(className) && !String.IsNullOrEmpty(value))
                        {
                            style.Add(new KeyValuePair<String,String>(className, value));
                        }
                    }
                }
                this.Add(new KeyValuePair<String,List<KeyValuePair<String,String>>>(strSelector, style));
            }
        }
    }
}
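One behaviour of the comment-removal step is worth checking in isolation. In .NET the dot does not match a newline unless RegexOptions.Singleline is set, so a call to Regex.Replace with the CSSComments pattern and no options leaves a comment that spans several lines in place. The sketch below uses a hypothetical CommentStripDemo class to compare both settings; only the pattern is taken from the article, and whether Singleline suits a given style sheet is left as an assumption to verify.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentStripDemo
{
    // Pattern copied from the article.
    public const string CSSComments = @"(?<!"")\/\*.+?\*\/(?!"")";

    // Removes comments using the given regex options.
    public static string Strip(string css, RegexOptions options)
    {
        return Regex.Replace(css, CSSComments, String.Empty, options);
    }

    public static void Main()
    {
        string oneLine = "/* heading */h1{margin:0}";
        string multiLine = "/* heading\n   styles */h1{margin:0}";

        // A single-line comment is removed with the default options.
        Console.WriteLine(Strip(oneLine, RegexOptions.None));                // h1{margin:0}

        // A multi-line comment is left untouched with the default options.
        Console.WriteLine(Strip(multiLine, RegexOptions.None) == multiLine); // True

        // With Singleline the dot also matches '\n', so the comment is removed.
        Console.WriteLine(Strip(multiLine, RegexOptions.Singleline));        // h1{margin:0}
    }
}
```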

Once the list is populated it’s a simple matter of using LINQ or Lambda expressions to pull the information you need. The Classes and Elements properties expose the values of the style sheet as a Dictionary which can be fed to iTextSharp.

C#
public Dictionary<String, Dictionary<String,String>> Classes
{
    get
    {
        if (classes == null || classes.Count == 0)
        {
            this.classes = this.Where(cl => cl.Key.StartsWith("."))
                .ToDictionary(cl => cl.Key.Trim(new Char[] { '.' }), cl => cl.Value
                    .ToDictionary(p => p.Key, p => p.Value));
        }

        return classes;
    }
}

public Dictionary<String, Dictionary<String,String>> Elements
{
    get
    {
        if (elements == null || elements.Count == 0)
        {
            elements = this.Where(el => !el.Key.StartsWith("."))
                .ToDictionary(el => el.Key, el => el.Value
                    .ToDictionary(p => p.Key, p => p.Value));
        }
        return elements;
    }
}
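Because the underlying List may hold the same selector or property more than once, and ToDictionary throws an ArgumentException when it meets a repeated key, a style sheet with duplicates needs a merge step before it can become a Dictionary. The following is a minimal sketch of one way to do that, assuming a later declaration should override an earlier one. MergeDemo and Merge are hypothetical names and are not part of the CSSParser.

```csharp
using System;
using System.Collections.Generic;

public static class MergeDemo
{
    // Hypothetical helper: merges repeated selectors and lets the later
    // declaration of a property win.
    public static Dictionary<string, Dictionary<string, string>> Merge(
        IEnumerable<KeyValuePair<string, List<KeyValuePair<string, string>>>> rules)
    {
        var merged = new Dictionary<string, Dictionary<string, string>>();
        foreach (var rule in rules)
        {
            Dictionary<string, string> properties;
            if (!merged.TryGetValue(rule.Key, out properties))
            {
                properties = new Dictionary<string, string>();
                merged[rule.Key] = properties;
            }
            foreach (var property in rule.Value)
            {
                // The indexer overwrites an existing key instead of throwing.
                properties[property.Key] = property.Value;
            }
        }
        return merged;
    }

    // Builds a sample list in which ".title" appears twice.
    public static List<KeyValuePair<string, List<KeyValuePair<string, string>>>> Sample()
    {
        var first = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("color", "red")
        };
        var second = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("color", "blue"),
            new KeyValuePair<string, string>("font-size", "12px")
        };
        return new List<KeyValuePair<string, List<KeyValuePair<string, string>>>>
        {
            new KeyValuePair<string, List<KeyValuePair<string, string>>>(".title", first),
            new KeyValuePair<string, List<KeyValuePair<string, string>>>(".title", second)
        };
    }

    public static void Main()
    {
        var merged = Merge(Sample());
        Console.WriteLine(merged.Count);                 // 1
        Console.WriteLine(merged[".title"]["color"]);    // blue
        Console.WriteLine(merged[".title"]["font-size"]); // 12px
    }
}
```

Letting the last declaration win mirrors the usual cascade order for equal specificity, but it is a design choice; keeping the first value or collecting all values are equally possible.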

Using the CSS Parser

The CSSParser gives you two ways to read a Cascading Style Sheet: from a CSS file or from a string. The ReadCSSFile method reads a CSS file and populates the collections. To read a String containing CSS, call the Read method or pass the CSS to the constructor.

C#
void lnkParseCSSFile_Click(object sender, EventArgs e)
{
    CSSParser parser = new CSSParser();
    parser.ReadCSSFile(Server.MapPath("~/CSSParserStyle.css"));
    //Display the Original CSS with some formatting for the web
    this.divOriginalCSS.InnerHtml = parser.StyleSheet.FixLineBreakForWeb().FixTabsForWeb().FixSpaceForWeb();
    //Display the parsed CSS
    this.divParsedCSS.InnerHtml = parser.ToString();
    this.spnOriginalCSSLength.InnerText = parser.StyleSheet.Length.ToString();
    this.spnParsedCSSLength.InnerText = this.divParsedCSS.InnerHtml.Length.ToString();
}

Points of Interest

The CSSParser Elements and Classes properties target iTextSharp version 5.x.

History

  1. Version 1.0: Initial Release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
United States
I am a full stack software engineer and architect with the majority of my experience on the Microsoft Stack. I teach martial arts for a non-profit organization.
