More CSV Files Than You Can Shake a Stick At
It has been estimated that there are more CSV files floating around the Internet and stashed within the innards of hard drives than there are longhorns in Laredo or Sylvester Stallone movie sequels. That being the case, you may want to make use of those hordes of data by making the content of those files available in the form of a collection of easily traversable data. So why not parse them and store their contents in a generic list?
It's easy - say you have a class named Platypus defined like so, mirroring the contents of the CSV file:
public class Platypus
{
public int Id { get; set; }
public String duckbillName { get; set; }
public double billLength { get; set; }
public double poisonToeLength { get; set; }
public String description { get; set; }
}
Just so you can better envision what's happening, the contents of the CSV file may be something like this:
- id,duckbill_name,billLength,poisonToeLength,description
- 1,Daffy Duckbill,3.14,0.17,Daffy has a great personality - or should I say platypusality
- 2,Plato,8.67,5.309,Plato has a platonic relationship with Daffy and has eyes like plates
You can now write a method to open the CSV file, parse it, populate the appropriate generic list, and then return it to the caller, like so:
public List<Platypus> GetPlatypus(string pondNum)
{
List<Platypus> platypus = new List<Platypus>();
const String PlatypusBase = @"C:\Duckbills\PlatypusPalace{0}.csv";
String fileToLoad = String.Format(PlatypusBase, pondNum);
using (StreamReader r = new StreamReader(fileToLoad))
{
string line;
while ((line = r.ReadLine()) != null)
{
string[] parts = line.Split(',');
if (parts[0] == "id") continue;
Platypus dbp = new Platypus();
dbp.Id = Convert.ToInt32(parts[0]);
dbp.duckbillName = parts[1];
dbp.billLength = Convert.ToInt32(parts[2]);
dbp.poisonToeLength = Convert.ToInt32(parts[3]);
dbp.description = parts[4];
platypus.Add(dbp);
}
}
return platypus;
}
Calling it is easy - assuming the CSV files are named in a pattern like "PlatypusPalace1", "PlatypusPalace2", "PlatypusPalace42", etc., this is all you need:
List<Platypus> platypi = new List<Platypus>();
platypi = GetPlatypus("1879");
And of course, spreadsheet files can be saved as CSV, so anything you have in an Excel (or Open Office Calc, etc.) file can be populated into class collections, and from there you can populate database tables, perhaps after scrubbing and normalizing the data.
A "big ball of information overflow mud" (45 gazillion columns in a spreadsheet that has had data added to it since the dawn of digital offices) can be rejuvenated by massaging and renovating the data via analysis, parsing, and normalization (dispatching the raw data into appropriate lookup tables, one-to-many tables, and many-to-many tables). Long live SQL!
Call to Action
If you find this tip useful, consider making a donation to POTWU (Platypi of the World Unite) or DUH (Duckbills Unlimited, Huh?). Alternatively, you could sew a set of mittens (we'd need four) for my pet Platypus.