I'm trying to get the title of the web page at a URL that the user enters in a text box, but I'm not sure what to pass in place of "source".

String title = Regex.Match(source, @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>", RegexOptions.IgnoreCase).Groups["Title"].Value;
label2.Text = title;



Main portion of my code:

        HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(this.textBox1.Text);

        using (HttpWebResponse myHttpWebResponse = (HttpWebResponse)myRequest.GetResponse())
        {
            if (myHttpWebResponse.StatusCode == HttpStatusCode.OK)
            {
                string message = "200 OK";
                string caption = "Status Code";
                MessageBoxButtons buttons = MessageBoxButtons.OK;
                DialogResult result;

                // Displays the MessageBox.
                result = MessageBox.Show(message, caption, buttons);

                Stream streamResponse = myHttpWebResponse.GetResponseStream();

                // Get stream object
                StreamReader streamRead = new StreamReader(streamResponse);

                Char[] readBuffer = new Char[256];
                // Read from buffer
                int count = streamRead.Read(readBuffer, 0, 256);
                while (count > 0)
                {
                    // get string
                    String resultData = new String(readBuffer, 0, count);

                    // Write the data
                    richTextBox1.Text += (resultData);

                    label3.ForeColor = System.Drawing.Color.Green;
                    label3.Text = " 200 OK ";
                    label3.Visible = true;

                    //string title = Regex.Match(source, @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>", RegexOptions.IgnoreCase).Groups["Title"].Value;
                    //label2.Text = title;
                    label2.Visible = true;

                    // Read from buffer
                    count = streamRead.Read(readBuffer, 0, 256);
                }
                // Release the response object resources.
                streamRead.Close();
                streamResponse.Close();
                myHttpWebResponse.Close();
            }
        }
    }
    catch
    {
        if (textBox1.Text != (@" ^(\b(https):(\/\/|\\\\)[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?|\bwww\.[^\s])"))
        {
            string message2 = "Status Code";
            string caption2 = "404 Not Found";
            MessageBoxButtons buttons2 = MessageBoxButtons.OK;
            DialogResult result2;
            result2 = MessageBox.Show(message2, caption2, buttons2);
            richTextBox1.Clear();
            richTextBox1.Text = "Page not found!";
            label3.Text = "Error 404 ";
            label3.ForeColor = System.Drawing.Color.Red;
            label3.Visible = true;
        }
    }
}


What I have tried:

String title = Regex.Match(textBox1.Text, @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>", RegexOptions.IgnoreCase).Groups["Title"].Value;
label2.Text = title;


This did not work. What can I change?
Updated 5-Oct-21 20:57pm
Comments
j snooze 5-Oct-21 17:18pm    
You need to retrieve the HTML source from the URL supplied in your text box. Right now it looks like you are trying to find the title by searching the URL string itself. You need to get the page's HTML first.
candijen 5-Oct-21 17:47pm    
Yes, I have an HTML request. I will edit the question and post that part, but I don't know how to display the title.
Afzaal Ahmad Zeeshan 5-Oct-21 18:51pm    
Have you tried to use HtmlAgilityPack? https://html-agility-pack.net/
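
As the first comment says, the regex has to run against the page's HTML, not against the URL. Here is a minimal sketch, assuming the whole response body is read into one string with ReadToEnd before matching (matching each 256-character chunk separately could miss a title that straddles two chunks). The names TitleSketch, ExtractTitle, and DownloadHtml are made up for illustration, and the pattern is the one from the question with the redundant backslashes before < and > dropped:

```csharp
using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

public static class TitleSketch
{
    // Pulls the text between <title> and </title> out of an HTML string.
    // Returns an empty string when no title element is found.
    public static string ExtractTitle(string html)
    {
        Match match = Regex.Match(
            html,
            @"<title\b[^>]*>\s*(?<Title>[\s\S]*?)</title>",
            RegexOptions.IgnoreCase);
        return match.Groups["Title"].Value.Trim();
    }

    // Hypothetical helper: downloads the page body as a single string,
    // using the same HttpWebRequest approach as the question.
    public static string DownloadHtml(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the form, this could be called as `string html = TitleSketch.DownloadHtml(textBox1.Text); label2.Text = TitleSketch.ExtractTitle(html);`, with `html` taking the place of `source`. Note that `WebRequest.Create` is flagged obsolete in .NET 6 and later; it is kept here only to match the code in the question.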

1 solution

Don't use a regex: it's not a good choice for HTML processing.
Instead, use the HtmlAgilityPack (https://html-agility-pack.net/) and it becomes a couple of lines of code:
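// Requires the HtmlAgilityPack NuGet package and "using HtmlAgilityPack;".
HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = web.Load(url); // url: the address from the text box
// SelectSingleNode returns null when the page has no <title> element.
string title = doc.DocumentNode.SelectSingleNode("//title")?.InnerText ?? string.Empty;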
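
To illustrate why a regex is fragile for this job, here is a small sketch that needs only the standard library. It applies the question's pattern (with the redundant backslashes dropped) to a made-up page in which an old title has been commented out. The class name, method name, and sample HTML are assumptions for the demo, not anything from the original code:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexPitfallDemo
{
    // Hypothetical page: an old title is commented out above the real one.
    public const string SampleHtml =
        "<html><head><!-- <title>Old title</title> -->" +
        "<title>Real title</title></head><body></body></html>";

    // Same idea as the question's regex: take the first <title>...</title> span.
    public static string FirstTitleByRegex(string html)
    {
        return Regex.Match(
            html,
            @"<title\b[^>]*>\s*(?<Title>[\s\S]*?)</title>",
            RegexOptions.IgnoreCase).Groups["Title"].Value.Trim();
    }
}
```

Calling `RegexPitfallDemo.FirstTitleByRegex(RegexPitfallDemo.SampleHtml)` returns "Old title", because the pattern has no notion of HTML comments. A parser that understands the document structure is not tied to raw text order in this way.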
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


