Hello,

I have a .txt file from which I want to remove all the whitespace and line breaks. After that, I want to split each entry at the "-" and parse the two parts as DateTime values.


The .txt file looks like this (here each "*" stands for one whitespace character):
-------------------------------------
09:00-16:00
***********19:00-23:00
10:00-17:00
***********20:00-24:00
07:00-14:00
************18:00-22:00

04:00-09:00
*************12:00-18:00

04:00-10:00
***********14:00-19:00

05:00-12:00

16:00-20.00

06:00-13:00
***********17:00-21:00
-------------------------------

The result should look like this.
--------------------------------------------
09:00-16:00
19:00-23:00
10:00-17:00
20:00-24:00
07:00-14:00
18:00-22:00
04:00-09:00
12:00-18:00
04:00-10:00
14:00-19:00
05:00-12:00
16:00-20.00
06:00-13:00
17:00-21:00
----------------------------------------------

The data is actually extracted from a website, and the whole code looks like this:

void update() // extracts the schedule data from the website
       {
           HtmlWeb htmlweb = new HtmlWeb();
           HtmlAgilityPack.HtmlDocument document = htmlweb.Load("http://www.myrepublica.com/portal/index.php?action=pages&page_id=8");

           var extracted = document.DocumentNode.Descendants("span").Where(x => x.Attributes.Contains("style"));

           var schedule = "";
           foreach (var link in extracted)
           {
               // Saving the data to a variable
               schedule +="\n" + string.Format("{0}", link.InnerText);
           }

           var new_schedule = "";
           new_schedule = schedule.Substring(schedule.IndexOf("Group 1"), schedule.IndexOf("Substations"));
           label1.Text = new_schedule;


           StreamWriter writer = new StreamWriter("schedule.txt");
           writer.Write(label1.Text);
           writer.Close();
       }


       // button which loads the manipulated data.
       private void btn_load_Click(object sender, EventArgs e)
       {
           List<string> lst_schedule = new List<string>();
           StreamReader reader = new StreamReader("schedule.txt");

           while (reader.ReadLine() !=null)
           {
               lst_schedule.Add(reader.ReadLine());
           }
           reader.Close();
           foreach (string s in lst_schedule)
           {
               s.Replace(" ", string.Empty);
               s.Replace("\n", string.Empty);
               label1.Text += "\n" + s;
           }
       }

Any kind of help is appreciated.

Try to adapt this:
C#
using System;
using System.Text.RegularExpressions;
public class Program
{
    public static void Main()
    {
        string input = "09:00-16:00    19:00-23:00     10:00-17:00    20:00-24:00\n07:00-14:00    18:00-22:00    04:00-09:00    12:00-18:00";
        string pattern = "\\s+";
        string replacement = "\n";
        Regex rgx = new Regex(pattern);
        string result = rgx.Replace(input, replacement);
        Console.WriteLine(result);
    }
}
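
For the file-based case in the question, the same idea can be sketched without a regex. Two things in the question's loop are worth noting first: `reader.ReadLine()` is called twice per iteration, so every other line is thrown away, and `s.Replace(...)` returns a new string instead of changing `s`, so its result is lost. The sketch below avoids both by splitting the whole text on whitespace. `ScheduleCleaner`, `ExtractRanges`, and `Normalize` are hypothetical names chosen for illustration, not part of the original code.

```csharp
using System;

public static class ScheduleCleaner
{
    // Passing a null separator array makes Split treat every whitespace
    // character (spaces, tabs, \r, \n) as a delimiter. RemoveEmptyEntries
    // drops the empty pieces produced by runs of whitespace and blank lines.
    public static string[] ExtractRanges(string text)
    {
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    // Rebuilds the text with exactly one time range per line.
    public static string Normalize(string text)
    {
        return string.Join("\n", ExtractRanges(text));
    }
}
```

Assuming the file is still called schedule.txt, it could be used as `string cleaned = ScheduleCleaner.Normalize(System.IO.File.ReadAllText("schedule.txt"));`.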
 
Comments
BibhutiAlmighty 25-Jan-15 19:43pm    
string pattern = "\\s+";

What does this line of code mean?
Peter Leow 25-Jan-15 21:29pm    
In a regex, \s means any whitespace character and + means one or more. The first \ is only there to escape the second one inside the C# string literal. Read more: http://www.codeproject.com/Articles/9099/The-Minute-Regex-Tutorial
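
To make the escaping point concrete, here is a small sketch showing that the doubled backslash and a C# verbatim string give the regex engine the same pattern. `PatternDemo` and `Collapse` are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // In a regular string literal, "\\" is one backslash character.
    public const string Escaped = "\\s+";

    // In a verbatim string literal (@"..."), backslashes are taken as-is.
    public const string Verbatim = @"\s+";

    // Replaces every run of whitespace with a single line break.
    // Trim() first, so leading or trailing whitespace does not
    // turn into an extra line break at the start or end.
    public static string Collapse(string input)
    {
        return Regex.Replace(input.Trim(), Verbatim, "\n");
    }
}
```

Both constants hold the three characters `\s+`, so either form can be passed to `Regex`.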
BillWoodruff 27-Jan-15 3:37am    
+5
Peter Leow 27-Jan-15 3:40am    
Thank you, BillWoodruff.
I assume you want to extract the time values and parse them as DateTime.
The snippet below shows how you could do it:
string pattern = "(?<StartHours>[0-9]{2}):(?<StartMinutes>[0-9]{2})-(?<EndHours>[0-9]{2}):(?<EndMinutes>[0-9]{2})";

Regex regex = new Regex(pattern);

MatchCollection mc = regex.Matches(input);

DateTime now = DateTime.Now;

foreach(Match match in mc)
{
    int startHours = Convert.ToInt32(match.Groups["StartHours"].Value);
    int startMinutes = Convert.ToInt32(match.Groups["StartMinutes"].Value);
    int endHours = Convert.ToInt32(match.Groups["EndHours"].Value);
    int endMinutes = Convert.ToInt32(match.Groups["EndMinutes"].Value);

    DateTime dtStart = new DateTime(now.Year, now.Month, now.Day, startHours, startMinutes, 0);
    DateTime dtEnd = new DateTime(now.Year, now.Month, now.Day, endHours, endMinutes, 0);
}

(input is the content of your .txt file)
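
Two entries in the sample data may need extra care with this approach: "20:00-24:00" has an end hour of 24, which the DateTime constructor rejects (hours must be 0 to 23), and "16:00-20.00" uses a dot, which the pattern above does not match. A possible variation, assuming a dot should be read as a colon and "24:00" should mean midnight of the following day, is sketched below. `ScheduleParser` and `Parse` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ScheduleParser
{
    // [:.] accepts both "20:00" and the "20.00" variant from the sample data.
    private static readonly Regex RangePattern = new Regex(
        @"(?<sh>[0-9]{2})[:.](?<sm>[0-9]{2})-(?<eh>[0-9]{2})[:.](?<em>[0-9]{2})");

    // Returns one (start, end) pair per time range found in the input.
    public static List<Tuple<DateTime, DateTime>> Parse(string input, DateTime day)
    {
        var result = new List<Tuple<DateTime, DateTime>>();
        foreach (Match m in RangePattern.Matches(input))
        {
            // A TimeSpan may hold 24 hours, so adding it to the date
            // rolls "24:00" over to 00:00 of the next day instead of throwing.
            var start = new TimeSpan(int.Parse(m.Groups["sh"].Value),
                                     int.Parse(m.Groups["sm"].Value), 0);
            var end = new TimeSpan(int.Parse(m.Groups["eh"].Value),
                                   int.Parse(m.Groups["em"].Value), 0);
            result.Add(Tuple.Create(day.Date + start, day.Date + end));
        }
        return result;
    }
}
```

It could be called as `ScheduleParser.Parse(input, DateTime.Now)`. Because the regex skips everything between matches, the whitespace and blank lines do not have to be removed first.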
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


