It's hard to decide internally which is similar and which is not. If you want to automate this, you need to give some AI to your program.
Well, in SQL Server there are some options which can take you close to the what you wanted to achieve but accurate as it requires AI to decide, as I already said.
In application you definitely got much power to build your own logic and achieve more accurate result but let's see how we can quicky use some SQL Server functions to get something smilar.
SOUNDEX (Transact-SQL)[
^]
DIFFERENCE (Transact-SQL)[
^]
I would suggest to read these two documentation carefully.
Example (taken from above documenation link):
SELECT SOUNDEX ('Smith'), SOUNDEX ('Smythe');
Result:
----- -----
S530 S530
We got same value as the result for both. Now let's use
DIFFERENCE
to compare these values.
SELECT DIFFERENCE ('Smith','Smythe');
Result:
-----
4
Ok, so what's "4" means here?
Quote:
The integer returned is the number of characters in the SOUNDEX values that are the same. The return value ranges from 0 through 4: 0 indicates weak or no similarity, and 4 indicates strong similarity or the same values.
SELECT DIFFERENCE ('Smith','Jones');
Result:
-----
2
Using your inputs-
SELECT DIFFERENCE ('chocolate','chocolates'),DIFFERENCE ('inch','innch');
Result:
----- -----
4 4
Hope, you get an idea about what I was trying to explain.
Please let me know in case of any further queries.