I doubt you will need to handle all the Unicode characters in the range 0x0000 to 0x1F8FF, and I think the correct strategy for getting speedy look-up here is to pre-compute the possible character output for each character input ... not to calculate, and test, character by character.
Here's a sample implementation:
// maps each input character to its pre-computed output character;
// note the Dictionary needs both type arguments: Dictionary<char, char>
private Dictionary<char, char> nextCharLookUp = new Dictionary<char, char>();
private List<char> excludeChars;

private void buildCharLists(int startChar, int endChar)
{
    // characters we refuse to map at all
    excludeChars = new List<char>
    {
        '\'', '"', '(', ')', ','
    };

    // note: char is a 16-bit type, so endChar must not exceed 0xFFFF
    for (int i = startChar; i <= endChar; i++)
    {
        char c1 = Convert.ToChar(i);

        // on the last iteration there is no "next" character in the
        // range, so map the final character to itself rather than
        // reading past endChar
        char c2 = (i < endChar) ? Convert.ToChar(i + 1) : c1;

        // excluded characters never become keys
        if (excludeChars.Contains(c1)) continue;

        if (excludeChars.Contains(c2))
        {
            // the next character is excluded: map this one to itself
            nextCharLookUp.Add(c1, c1);
        }
        else
        {
            nextCharLookUp.Add(c1, c2);
        }
    }
}
Obviously, if you can pre-process your file/string before parsing it to remove any characters you don't want to handle, you'll save time.
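As a minimal sketch of such a pre-filter (it assumes the excludeChars list above has already been populated, and uses LINQ):

// requires: using System.Linq;
// strip every character on the exclude list before parsing begins
private string preFilter(string input)
{
    return new string(input.Where(c => !excludeChars.Contains(c)).ToArray());
}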
Left for you is to write some code that uses the nextCharLookUp Dictionary built here to parse your file/string data: oh yes, you'll still have to test for "forbidden" characters and do the right thing.
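A rough sketch of that usage might look like this (parseWithLookUp is just an illustrative name, and, as an assumption here, any character without an entry is passed through unchanged; you may prefer to drop or flag such characters instead):

private string parseWithLookUp(string input)
{
    var sb = new System.Text.StringBuilder(input.Length);

    foreach (char c in input)
    {
        // use the pre-computed mapping where one exists; characters
        // with no entry (the excluded ones) are passed through as-is
        sb.Append(nextCharLookUp.TryGetValue(c, out char next) ? next : c);
    }

    return sb.ToString();
}

Calling buildCharLists(0x0020, 0x007E) first, for example, would cover the printable ASCII range before you parse.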
Dictionary look-up in .NET is highly optimized, i.e., fast (under the hood, .NET generic Dictionaries are hash tables).