Quote:
Please do suggest which algorithm will fit the bill to correctly identify a given record(s) based on the set of values provided.
You are in the worst possible situation: flat text data implies brute force. Having the field names repeated in every record makes it even worse than the CSV format.
The only optimization you can expect is to check first the field that filters out the most records. But it will not change the processing time much.
So:
1) read all lines, no optimization
2) for each line, split on comma, no optimization
3) for each field, split on the equals sign, modest optimization possible depending on the filter.
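
The three steps above can be sketched roughly as follows. This is only an illustration in Python, assuming each line looks like name=value,name=value and that values contain no commas; the function names parse_record and find_records and the sample field names are made up for the example, not taken from your data.

```python
def parse_record(line):
    """Turn one 'name=value,name=value' line into a dict (assumed format)."""
    record = {}
    for field in line.strip().split(","):        # step 2: split on comma
        name, sep, value = field.partition("=")  # step 3: split on the equals sign
        if sep:                                  # ignore fields without '='
            record[name.strip()] = value.strip()
    return record


def find_records(lines, criteria):
    """Yield every record matching all (name, value) pairs in criteria.

    criteria is a list of pairs; put the most selective pair first so
    that all() can stop early on records that do not match.
    """
    for line in lines:                           # step 1: read all lines
        if not line.strip():
            continue                             # skip blank lines
        record = parse_record(line)
        if all(record.get(name) == value for name, value in criteria):
            yield record


if __name__ == "__main__":
    sample = [
        "id=1,name=Bob,city=Paris",
        "id=2,name=Ann,city=Rome",
        "id=3,name=Bob,city=Rome",
    ]
    for rec in find_records(sample, [("city", "Rome"), ("name", "Bob")]):
        print(rec)
```

Ordering the criteria only saves a few comparisons per record; every line is still read and split, which is why the gain stays modest.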