Click here to Skip to main content
15,885,546 members
Please Sign up or sign in to vote.
1.00/5 (1 vote)
See more:
Hello Gurus

I have very complex csv file that contain data like below, infact I removed few columns, still it is a lengthy row but I want to give such complex row as example to get the best output..
10.0,"[""Categories""]",,0,,,,,"[""Automotive and Transport"", ""Business and Finance"", ""Chemicals and Materials"", ""Company Reports"", ""Consumer Goods and Services"", ""Country Reports"", ""Energy and Natural Resources"", ""Food and Beverage"", ""Government and Public Sector"", ""Healthcare"", ""Humanities Books"", ""Industry Standards"", ""Manufacturing and Construction"", ""Military Aerospace and Defense"", ""Pharmaceuticals"", ""Science Books"", ""Telecommunications and Computing""]","[""Automotive"", ""Commercial Aerospace"", ""Maritime"", ""Private Transport Services"", ""Public Transport"", ""Rail"", ""Banking"", ""Business"", ""Economics"", ""Finance"", ""Insurance"", ""Legal"", ""Property"", ""Adhesives and Sealants"", ""Advanced Materials"", ""Ceramics"", ""Chemicals"", ""Composites"", ""Glass"", ""Metals and Minerals"", ""Nanomaterials"", ""Packaging"", ""Plastics"", ""Pulp and Paper"", ""Recycling and Recycled Materials"", ""Rubber"", ""Textiles"", ""Wax"", ""Wood and Wood Products"", ""Arts and Crafts Supplies"", ""Baby and Child Products"", ""Clothing"", ""Consumer Electronics"", ""Consumer Services"", ""Household"", ""Media and Entertainment"", ""Personal Care Products"", ""Retail"", ""Sporting Goods and Equipment"", ""Toys"", ""Travel and Tourism"", ""ICON Country Market Reports"", ""Biofuels"", ""Combined Heat and Power (CHP)"", ""Electricity"", ""Energy Maps"", ""Energy Storage"", ""Environmental"", ""Fossil Fuels"", ""Mining"", ""Nuclear Power"", ""Pipelines"", ""Renewable Energy"", ""Utilities"", ""Water"", ""Agriculture"", ""Beverage"", ""Food"", ""Food Ingredients"", ""Food Processing"", ""Food Safety"", ""Food Service"", ""Tobacco"", ""Government"", ""International Relations"", ""Public Sector"", ""Biotechnology"", ""Healthcare Services"", ""Laboratory Equipment"", ""Medical Devices"", ""Library and Information Science"", ""Political Science"", ""Psychology"", ""Society and Social Sciences"", ""Construction"", ""Engineering"", ""Manufacturing and Industry"", ""Security Services"", ""Ammunition"", ""Biodefense"", ""C4ISR"", ""Homeland Defense"", ""Infantry Weapons and Equipment"", ""Military Aircraft"", ""Military Logistics"", ""Military Unmanned Systems"", ""Missiles and Missiles Technology"", ""Naval Vessels"", ""Radar Systems"", ""Tanks and Armored Vehicles"", ""Anesthetic Drugs"", ""Animal Pharmaceuticals"", ""Antibiotics"", ""Biopharmaceuticals"", ""Clinical Trials"", ""Drug Delivery"", ""Drug Discovery"", ""Drugs by Therapeutic Area"", ""Generic Drugs"", ""Oncology Drugs"", ""Over the Counter (OTC) Drugs"", ""Pharmaceutical Intermediates"", ""Pharmaceutical Manufacturing"", ""Pharmacoeconomics"", ""Vitamins and Dietary Supplements"", ""Biology and Life Sciences"", ""Chemistry"", ""Geography"", ""Mathematics"", ""Physics"", ""Computing and Technology"", ""Internet and E-Commerce"", ""Telecommunications and Networks""]","","width=device-width, initial-scale=1.0, maximum-scale=1.0",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"{""Content-Type"":""text/html; charset=utf-8"",""Via"":""1.1 varnish"",""Fastly-Debug-Digest"":""e8f4fd7caeda67cf3daa747e38e1310cb8d84d9e2af22e6156b3d792b4ef7821"",""Content-Length"":""400501"",""Accept-Ranges"":""bytes"",""Date"":""Wed, 05 Apr 2017 21:56:34 GMT"",""Age"":""16376"",""Connection"":""keep-alive"",""X-Served-By"":""cache-lhr6350-LHR, cache-iad2128-IAD"",""X-Cache"":""HIT, HIT"",""X-Cache-Hits"":""1, 1"",""X-Timer"":""S1491429394.073711,VS0,VE7"",""Cache-Control"":""no-cache, no-store, private, must-revalidate, max-age=0, max-stale=0, post-check=0, pre-check=0"",""Expires"":""0"",""Pragma"":""no-cache""}"


As you can see there additional commas as part of the enclosed double quotes.. I want to ignore them and replace the comma delimiter with any unique character. This data has many special characters like |, &, ^ etc. so I will have to decide about unique delimiter.

Looking for something in either powershell or vbscript

I did record a macro by opening the csv file in excel and then replace all commas with "##". This is working manually but when the macro is executed, I am able to see many rows with the commas enclosed in double quotes.

The file is actually huge about 7GB so can't really do it manually

What I have tried:

Set wb = Application.Workbooks.Open(ToPath, UpdateLinks:=0)

    DisplayAlerts = False
    ActiveSheet.Cells.Select
    Cells.Replace What:=",", Replacement:=";", LookAt:=xlPart, SearchOrder _
        :=xlByRows, MatchCase:=False, SearchFormat:=False, ReplaceFormat:=False
    wb.Close savechanges:=True
    Application.WindowState = xlNormal
Posted
Updated 31-May-17 8:10am
Comments
RedDk 31-May-17 13:34pm    
Well, for starters, with respect to that replacement of "inner" commas, one would expect better results if you widened the scope of the replacement string (by widening the scope of the target) first.

But sounds like a job for msdos 6.12! Sorry, don't know about puershell ...
Maciej Los 31-May-17 14:00pm    
DOS? No!
RedDk 31-May-17 14:28pm    
Ha! It's been so long I had to actually open a .csv to check my gut ... in order to recall that it was, indeed, a text file! So use whatever you've got around the house.
Maciej Los 31-May-17 13:59pm    
How does it related to: VBScript or PowerShell? You used VBA.
As to me, it looks like JSON data.

1 solution

 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900