Notepad RE (Regular Expressions)

22 Mar 2011, CPOL, 9 min read
Search and replace text in Notepad RE using Regular Expressions or normal mode. The editor supports drag and drop, file change notifications, and displays the line and column numbers. Unicode support is available too.
Screenshot - notepadre.png

Introduction

This is a simple Notepad replacement. Its main feature is Search and Replace, optionally using regular expressions. Regex support comes from the boost::regex library; note that the intention is for boost::regex to eventually become part of the C++ Standard Library. Replace All is also faster than in standard Notepad: it builds the new text in memory and replaces the entire contents of the edit window in one go when it has finished. This is much quicker than replacing every match in the edit window as you go along.
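The Replace All approach described above can be sketched roughly as follows. This is only an illustration, not the code from notepadreDoc.cpp: it uses std::regex as a stand-in for boost::regex so that it is self-contained, and the function name ReplaceAllInMemory is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the Notepad RE source): builds the complete
// replaced text in memory so the edit window only has to be updated once.
std::string ReplaceAllInMemory(const std::string& text,
                               const std::string& pattern,
                               const std::string& replacement)
{
    const std::regex re(pattern);
    // regex_replace copies the unmatched text, substitutes every match and
    // returns a brand new string; the original text is left untouched.
    return std::regex_replace(text, re, replacement);
}
```

The caller would then hand the returned string to the edit window in a single operation, rather than selecting and replacing each match in the control one at a time.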

As the development of Notepad RE progresses, more sophisticated features are being added.

Features

  • Find and Replace using regex - notepadreDoc.cpp
  • Find and Replace in normal mode - notepadreDoc.cpp
  • GREP/Find in Files capability - FindInFilesDlg.cpp
  • Multiple Undo/Redo - notepadreView.cpp
  • Dockable Find and Replace dialogs - MainFrm.cpp
  • Find will wrap from bottom to top -- or top to bottom, depending on search direction -- if necessary - notepadreDoc.cpp
  • If the file you are editing is changed by another process, you have the option of being asked if you want to reload - notepadreDoc.cpp
  • Line and column displayed in status bar - MainFrm.cpp, called from notepadre.cpp
  • You can drop files as path/filename from Explorer by clearing Options->Drop Files - MainFrm.cpp
  • You can drag and drop text to and from the edit window to and from other applications that support drag and drop - notepadreView.cpp
  • You can re-open an existing file, something that does not work in the standard CEditView class - notepadre.cpp
  • You can open a text file bigger than 1 MB - notepadreView.cpp
  • Unicode is supported - notepadreFile.cpp
  • You can open and re-save UNIX text files correctly - notepadreDoc.cpp
  • The Find/Replace dialog is written from scratch - FindReplaceDlg.cpp
  • Help file included - MainFrm.cpp
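The wrap-around behaviour of Find listed above can be illustrated with a minimal sketch for a forward search in normal (non-regex) mode. The helper name FindWithWrap is hypothetical and the code is not taken from notepadreDoc.cpp.

```cpp
#include <string>

// Hypothetical helper: searches forward from 'start'. If nothing is found
// before the end of the text, the search wraps and starts again from the top.
// Returns std::string::npos when 'needle' does not occur anywhere in 'text'.
std::string::size_type FindWithWrap(const std::string& text,
                                    const std::string& needle,
                                    std::string::size_type start)
{
    std::string::size_type pos = text.find(needle, start);

    if (pos == std::string::npos && start > 0)
    {
        // Wrap from the bottom of the text back to the top.
        pos = text.find(needle, 0);
    }

    return pos;
}
```

A backward search would mirror this with rfind(), wrapping from the top of the text to the bottom.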

What are Regular Expressions?

A Regular Expression is simply some text. I think it is safe to assume that anyone who has used a modern computer will have used Find and/or Replace dialogs in more than one application that allows text processing, whether it is Notepad, a word processing program, or a web browser. At the simplest level, a regular expression is no different from the text you type into the edit field of a Find dialog. Where regular expressions differ from normal text is that they give special meaning to certain characters, allowing you to specify textual 'patterns' rather than just literal text. The special characters are the following:

'.', '|', '*', '?', '+', '(', ')', '{', '}', '[', ']', '^', '$' and '\'.

These characters are often known as 'metacharacters' in the jargon of regular expressions. If you have ever typed something like...

*.txt

... or something similar, then you are already familiar with the concept of characters having special meaning in a piece (string) of text. Wildcards -- i.e. the characters '*' and '?' -- used when negotiating most computer file systems are a massively simplified version of regular expressions. As well as being able to match any character ('?') or any string ('*'), regular expressions allow you to specify ranges of characters that can match, repeating textual patterns, alternative matching patterns and even matching positions within text. Note that in the syntax of regular expressions, the wildcard character '?' becomes '.' and '*' becomes '.*'.
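To make the relationship between wildcards and regular expressions concrete, here is a small sketch that translates a wildcard into the equivalent regex using the two rules above. The function names are hypothetical, and std::regex is used in place of boost::regex purely to keep the example self-contained.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: translates a file system wildcard into regex syntax.
std::string WildcardToRegex(const std::string& wildcard)
{
    // The regex metacharacters listed earlier in this article.
    static const std::string meta = ".|*?+(){}[]^$\\";
    std::string out;

    for (char ch : wildcard)
    {
        if (ch == '?')
        {
            out += '.';     // '?' (any single character) becomes '.'
        }
        else if (ch == '*')
        {
            out += ".*";    // '*' (any string) becomes '.*'
        }
        else
        {
            // Any other metacharacter must be escaped to be matched literally.
            if (meta.find(ch) != std::string::npos) out += '\\';
            out += ch;
        }
    }

    return out;
}

// Returns true if the whole of 'name' matches the wildcard.
bool WildcardMatch(const std::string& wildcard, const std::string& name)
{
    return std::regex_match(name, std::regex(WildcardToRegex(wildcard)));
}
```

Note that the '.' in *.txt has to be escaped as \. in the regex, otherwise it would match any character instead of a literal full stop.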

If you have never used regular expressions before, then once you have learned the syntax you are in for a pleasant surprise. Once you have mastered their use, you will never look back! The official reference for the boost regular expression library is here. See this Regular Expression Primer for a very basic description. The book Mastering Regular Expressions is very good for when you really want to get in-depth!

Getting the Boost library

Visit Boost.org to obtain the boost regular expressions library.

Building the Boost Library

These instructions are for building under Visual C++ version 6.0

  • Download the ZIP file
  • Unzip the contents to C:\
  • From a command prompt:
    • C:\>"C:\Program Files\Microsoft Visual Studio\VC98\Bin\vcvars32.bat"
    • Ensure the environment variable 'include' includes the path "C:\Program Files\Microsoft Visual Studio\VC98\include"
    • Ensure the environment variable 'lib' includes the path "C:\Program Files\Microsoft Visual Studio\VC98\lib"
    • C:\>cd C:\boost_1_39_0\libs\regex\build
    • C:\boost_1_39_0\libs\regex\build>nmake /f vc6.mak
  • Wait until the build finishes (you might want to get a coffee..!)
  • Add C:\boost_1_39_0 to your includes (Tools, Options, Directories, Include files from the VC menu)
  • Add C:\boost_1_39_0\libs\regex\build\vc6 to your library path (Tools, Options, Directories, Library Files from the VC menu)

Getting the Microsoft HTML Help Workshop

Get it here.

Installing HTML Help

  • Download htmlhelp.exe from the link above
  • Run htmlhelp.exe, installing to the default directory C:\Program Files\HTML Help Workshop
  • Add C:\Program Files\HTML Help Workshop\include to your includes (Tools, Options, Directories, Include files from the VC menu)
  • Add C:\Program Files\HTML Help Workshop\lib to your library path (Tools, Options, Directories, Library Files from the VC menu)

Program Design

File Handling

Notepad RE supports ANSI, Unicode, Big Endian Unicode and UTF-8 file formats. Windows, UNIX and Macintosh line endings are also supported, including files with inconsistent line endings. The file handling routines are the trickiest part of Notepad RE.
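To give a flavour of what this involves, here is a minimal sketch of two of the tasks: guessing the encoding from a byte order mark (BOM) and converting mixed line endings to the Windows form. It is not the code from notepadreFile.cpp. The names are hypothetical, and it assumes that 'Unicode' means UTF-16 little endian, as in Windows Notepad, and that detection relies on the BOM alone.

```cpp
#include <cstddef>
#include <string>

enum class TextEncoding { Ansi, Utf8, Utf16LE, Utf16BE };

// Hypothetical helper: guesses the encoding from the byte order mark.
// Assumption: anything without a BOM is reported as ANSI, so a UTF-8 file
// saved without a BOM would need a further check that is not shown here.
TextEncoding DetectEncoding(const std::string& bytes)
{
    const auto b = [&bytes](std::size_t i)
    {
        return static_cast<unsigned char>(bytes[i]);
    };

    if (bytes.size() >= 3 && b(0) == 0xEF && b(1) == 0xBB && b(2) == 0xBF)
        return TextEncoding::Utf8;
    if (bytes.size() >= 2 && b(0) == 0xFF && b(1) == 0xFE)
        return TextEncoding::Utf16LE;   // 'Unicode'
    if (bytes.size() >= 2 && b(0) == 0xFE && b(1) == 0xFF)
        return TextEncoding::Utf16BE;   // 'Big Endian Unicode'
    return TextEncoding::Ansi;
}

// Hypothetical helper: converts Windows (\r\n), UNIX (\n) and Macintosh (\r)
// line endings, even when mixed in one file, to \r\n.
std::string NormaliseLineEndings(const std::string& text)
{
    std::string out;
    out.reserve(text.size());

    for (std::size_t i = 0; i < text.size(); ++i)
    {
        const char ch = text[i];

        if (ch == '\r')
        {
            // Treat \r\n as a single line ending, not two.
            if (i + 1 < text.size() && text[i + 1] == '\n') ++i;
            out += "\r\n";
        }
        else if (ch == '\n')
        {
            out += "\r\n";
        }
        else
        {
            out += ch;
        }
    }

    return out;
}
```

To re-save a UNIX file correctly, the original line ending style would also have to be remembered at load time so that it can be restored on save.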

Regular Expression Syntax

The regular expression syntax is now selectable under the Options menu.

Matching, Including Over More Than One Line

I've aimed to provide default search functionality with the maximum number of possibilities and the minimum number of surprises. The basic aim is to provide functionality based on vi, but with several improvements.

  • 'Char Classes' are supported (i.e. [[:CLASS:]] syntax is allowed)
  • 'Intervals' are supported (i.e. {x,y} syntax allowed)
  • 'Back References' are supported (i.e. \1, \2 etc. are allowed)
  • 'Escape in Lists' is supported (i.e. the \ character is the escape character inside [...])
  • + is supported (of course)
  • ? is supported (of course)
  • | is supported (of course)
  • Use Perl-like variables $1, $2, $3 etc. in the Replace field to use captured text
  • .* matches characters on the current line, like vi. To continue a match to the next line, follow .* with \r\n
  • Note that characters \r and \n are treated as whitespace. For example, if you use \s+ as part of your regex, you may be surprised to find you have matched text across lines
  • $ works like it does in vi, but may also be followed by \r\n if you want to match the 'newline' character
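Several of the points above can be tried out with a short sketch. It uses std::regex with its default ECMAScript syntax as a stand-in, which is an assumption: Notepad RE itself uses boost::regex with a selectable syntax, so details may differ. The helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the text of the first match of 'pattern'
// in 'text', or an empty string if there is no match.
std::string FirstMatch(const std::string& text, const std::string& pattern)
{
    std::smatch m;
    return std::regex_search(text, m, std::regex(pattern)) ? m.str()
                                                           : std::string();
}

// Hypothetical helper: swaps "key=value" pairs into "value=key" using the
// Perl-like variables $1 and $2 in the replacement text.
std::string SwapPairs(const std::string& text)
{
    static const std::regex re("(\\w+)=(\\w+)");
    return std::regex_replace(text, re, "$2=$1");
}
```

With the text "one two\r\nthree", the pattern o.* stops at the end of the first line, two.*\r\nthree continues the match onto the next line, and two\s+three also crosses the line break because \r and \n count as whitespace.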

References

  • "The C++ Programming Language Special Edition" by Bjarne Stroustrup
  • "Advanced Windows" by Jeffrey Richter, Microsoft Press
  • "The Essence of COM with ActiveX, a Programmer's Workbook" by David S. Platt, Prentice Hall
  • "Mastering Regular Expressions" by Jeffrey E. F. Friedl, O'Reilly
  • "Professional MFC with Visual C++ 6" by Mike Blaszczak, Wrox Press Inc

Future Work

  • Popup menu in Replace dialog for regex replace syntax
  • Investigate syntax highlighting
  • Use the std::tr1 interface to boost::regex
  • Use Microsoft Unicode routines when loading/saving
  • HEX view

This MFC version of Notepad RE will be improved until it is as close to Windows Notepad as possible. After that, I may rewrite it as a WTL program.

History

  • 23 July, 2003
    • Original version posted
  • 2 June, 2007: Version 1.1.0.1
    • Multiple Undo/Redo added
  • 4 June, 2007: Version 1.1.0.2
    • BUG FIX: Replace with empty string works again!
    • Group characters for undo
    • Undoing all changes sets modified flag to FALSE
    • Replacing a selection now treated as an atomic undo/redo
  • 10 June, 2007: Version 1.1.0.3
    • BUG FIX: Clear Undo history when toggling word wrap
  • 12 June, 2007: Version 1.1.0.4
    • BUG FIX: Forgot to add the OnKeyUp function!
  • 14 June, 2007: Version 1.1.0.5
  • 16 June, 2007: Version 1.1.0.6
  • 17 June, 2007: Version 1.1.0.7
    • Added first cut of Find in Files
  • 21 June, 2007: Version 1.1.0.8
    • If Modified flag set before toggling word wrap -- therefore flushing the undo buffer -- don't set to false if subsequently all edits are undone!
    • Various tweaks to Find in Files
  • 27 June, 2007: Version 1.1.0.9
    • BUG FIX: A sequence of replacements is no longer treated as one big transaction by Undo
  • 3 July, 2007: Updated help file
  • 4 July, 2007: Version 1.1.1.0
    • Added popup menu to Find and Replace dialogs for regex syntax
  • 6 July, 2007: Version 1.1.1.1
    • BUG FIX: A sequence of replacements is no longer treated as one big transaction by Redo
    • Finished popup menu in Find and Replace dialogs for regex syntax
  • 10 July, 2007: Version 1.1.1.2
    • Find in Files now sends output to a dockable toolbar
    • Changed tab order in Replace dialog
    • Changed 'Number' regex to be PERL mode friendly
  • 7 August, 2007: Version 1.1.1.3
    • BUG FIX: Shift-Del only creates one Undo entry now!
  • 8 August, 2007: Version 1.1.1.4
    • BUG FIX: Ctrl-C works again...
  • 10 October, 2007: Version 1.1.1.5
    • BUG FIX: Saving with word wrap enabled no longer saves too much text
    • Find in Files now runs in the background
  • 26 June, 2008: Version 1.1.1.6
    • BUG FIX: Check for Non-Windows line endings in CNotepadreFile::CountCharsUTF8() fixed
    • Help file correction (thanks har0ld)
    • Fixes to CRegexSyntaxDlg
  • 17 March, 2009: Version 1.1.1.7
    • Selected text copied to Find and Replace dialogs
  • 23 March, 2009: Version 1.1.1.8
    • Re-enabled ".LOG" support
  • 24 March, 2009: Version 1.1.1.9
    • Find in Files now supports multi-line matching
  • 26 March, 2009: Version 1.1.2.0
    • Sped up Find in Files (make sure you use \r\n for multi-line matching)
  • 27 March, 2009: Version 1.1.2.1
    • Fixed memory leak in PerformGrep()
    • Selected text copied to Find in Files dialog
    • Find in Files now opens with CFile::modeRead | CFile::shareDenyNone
  • 27 March, 2009: Version 1.1.2.2
    • PerformGrep() wasn't counting newlines from the beginning of the file!
  • 29 March, 2009: Version 1.1.2.3
    • More improvements to Find in Files (more responsive, displays progress, etc.)
  • 7 April, 2009: Version 1.1.2.4
    • Double clicking the results from Find in Files now goes to correct line even with word wrap enabled
  • 9 April, 2009: Version 1.1.2.5
    • Ensure only one line is shown per match in the Find in Files results
    • Enable checkbox for Whole Word Only for regex mode (This is a convenience feature. All that happens is that the regex is wrapped in \b(?:)\b)
  • 16 April, 2009: Version 1.1.2.6
    • Changed regex whole word only syntax depending on regex flavour. Still not perfect as some flavours do not support this feature at all, but at least most work correctly now.
  • 8 January, 2011
    • Updated zip file
  • 19 March, 2011: Version 1.1.2.8
    • Added support for loading and saving toolbar positions
  • 21 March, 2011: Version 1.1.2.9 
    • Uses SetWindowPlacement() as it is more accurate than MoveWindow()
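As an illustration of the Whole Word Only convenience feature mentioned under version 1.1.2.5, wrapping a regex in \b(?:)\b might look like the sketch below. The function names are hypothetical, and std::regex stands in for boost::regex, so only a Perl-like flavour is covered.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: wraps a regex so that it only matches whole words.
// The non-capturing group (?:...) keeps alternations such as cat|dog
// together and leaves the numbering of $1, $2 etc. unchanged.
std::string WrapWholeWord(const std::string& pattern)
{
    return "\\b(?:" + pattern + ")\\b";
}

// Returns true if 'text' contains a whole word match for 'pattern'.
bool ContainsWholeWord(const std::string& text, const std::string& pattern)
{
    return std::regex_search(text, std::regex(WrapWholeWord(pattern)));
}
```

Without the group, \bcat|dog\b would be read as "\bcat" or "dog\b", which would match inside words such as "hotdog".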

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
United Kingdom
I started programming in 1983 using Sinclair BASIC, then moved on to Z80 machine code and assembler. In 1988 I programmed 68000 assembler on the ATARI ST and it was 1990 when I started my degree in Computing Systems where I learnt Pascal, C and C++ as well as various academic programming languages (ML, LISP etc.)

I have been developing commercial software for Windows using C++ since 1994.
