In my Windows Forms application I've made a text box where the user enters a website address and then clicks a button to screen scrape that website into a browser control. Now I want to save the scraped website to a SQL database. How would I go about doing this?

EDIT: Here's my code. wbHtmlpage is the name of my web browser control.

EDIT 2: btnRead is where I'm reading the data back from the database.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Net;
using System.IO;
using System.Data.SqlClient;

namespace FilmAndEntertainmentSystem
{
    public partial class Form2 : Form
    {
        public Form2()
        {
            InitializeComponent();
        }

        private string GetWebsiteHtml(string url)
        {
            WebRequest request = WebRequest.Create(url);
            // using blocks dispose the response, stream and reader even if reading fails
            using (WebResponse response = request.GetResponse())
            using (Stream stream = response.GetResponseStream())
            using (StreamReader reader = new StreamReader(stream))
            {
                return reader.ReadToEnd();
            }
        }

        private void btnGetHTML_Click(object sender, EventArgs e)
        {
            string html = this.GetWebsiteHtml(this.txtUrl.Text);
            this.wbHtmlpage.DocumentText = html;
        }

        private void btnScreenSave_Click(object sender, EventArgs e)
        {
            string html = this.GetWebsiteHtml(this.txtUrl.Text);
            this.wbHtmlpage.DocumentText = html;

            byte[] bytes = System.Text.Encoding.ASCII.GetBytes(html);

            // set up data connection

            SqlConnection cs = new SqlConnection("Data Source=MASTER\\MASTER;Initial Catalog=FilmDB;Integrated Security=True");

            // Set up adapter manager

            SqlDataAdapter da = new SqlDataAdapter();

            using (SqlCommand com = new SqlCommand("INSERT INTO Website (WebsiteImage) VALUES (@Image)", cs))
            {
                com.Parameters.AddWithValue("@Image", bytes);
                cs.Open();

                com.ExecuteNonQuery();

                cs.Close();
            }
        }

        private void btnRead_Click(object sender, EventArgs e)
        {
            string html = this.GetWebsiteHtml(this.txtUrl.Text);
            this.wbHtmlpage.DocumentText = html;

            byte[] bytes = System.Text.Encoding.ASCII.GetBytes(html);

            // set up data connection

            SqlConnection cs = new SqlConnection("Data Source=MASTER\\MASTER;Initial Catalog=FilmDB;Integrated Security=True");

            // Set up adapter manager

            SqlDataAdapter da = new SqlDataAdapter();

            // Data set
            DataSet ds = new DataSet();

            da.SelectCommand = new SqlCommand("Select WebsiteImage From Website Where WebsiteID = 3", cs);

            da.Fill(ds, "Website");

            response.ContentType = "Image";
            response.BinaryWrite(bytes);

            cs.Open();
            cs.Close();



        }
    }
}
Updated 16-May-11 5:35am

Depending on the size of the data, use an image column in your table, and read the page from the WebBrowser control's DocumentText property, which gives you the HTML for the page as a string.
Then just convert it to bytes:
byte[] bytes = System.Text.Encoding.ASCII.GetBytes(s);
and insert it into your database:
using (SqlCommand com = new SqlCommand("INSERT INTO myTable (scrapedData) VALUES (@SD)", con))
{
    // con is assumed to be a SqlConnection that has already been opened
    com.Parameters.AddWithValue("@SD", bytes);
    com.ExecuteNonQuery();
}
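
Put together, the save step might look something like the sketch below. It is not a definitive implementation: the `HtmlStore` class and its `BuildInsertCommand` and `SaveHtml` methods are hypothetical names, the column is assumed to be declared as `varbinary(max)` (Microsoft's documentation marks the `image` type for removal in a future version), and UTF-8 is swapped in for ASCII so that non-ASCII characters in the page survive.

```csharp
using System.Data;
using System.Data.SqlClient;
using System.Text;

public static class HtmlStore
{
    // Builds the parameterised INSERT; the connection may still be closed here.
    public static SqlCommand BuildInsertCommand(SqlConnection con, string html)
    {
        // Assumption: UTF-8 instead of ASCII, so accented characters are kept.
        byte[] bytes = Encoding.UTF8.GetBytes(html);

        SqlCommand com = new SqlCommand(
            "INSERT INTO myTable (scrapedData) VALUES (@SD)", con);
        // Size -1 means "max"; assumes the column is varbinary(max).
        com.Parameters.Add("@SD", SqlDbType.VarBinary, -1).Value = bytes;
        return com;
    }

    // Opens the connection, runs the INSERT and returns the rows affected.
    public static int SaveHtml(string connectionString, string html)
    {
        using (SqlConnection con = new SqlConnection(connectionString))
        using (SqlCommand com = BuildInsertCommand(con, html))
        {
            con.Open();                   // must be open before executing
            return com.ExecuteNonQuery();
        }
    }
}
```

From the form this could be called as `HtmlStore.SaveHtml(connectionString, this.wbHtmlpage.DocumentText);`, where `connectionString` is whatever string you already pass to `SqlConnection`.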


[edit]Passed string "s" to parameters instead of byte[] "bytes" - OriginalGriff[/edit]


"Yeah, I know, but how do I read it back into the web browser control?"


Response.ContentType = "image/JPEG";
Response.BinaryWrite(myBytesFromDataBase);



"It still isn't working; it's not recognizing the response or the BinaryWrite. I've put my code in the question so you can take a look."

Perhaps, if somewhere between reading the data from the database and writing it into the response, you actually used the info from the database, it might work better as an image than the HTML does...

        private void btnRead_Click(object sender, EventArgs e)
        {
            string html = this.GetWebsiteHtml(this.txtUrl.Text);
            this.wbHtmlpage.DocumentText = html;
            byte[] bytes = System.Text.Encoding.ASCII.GetBytes(html);
 
...

            da.SelectCommand = new SqlCommand("Select WebsiteImage From Website Where WebsiteID = 3", cs);
            da.Fill(ds, "Website");

--->>> Perhaps a bit of code to use your database info here, might help a bit!
 
            response.ContentType = "Image";
            response.BinaryWrite(bytes);
 
            cs.Open();
            cs.Close();
        }
I would also suggest that your connection "cs" might be more usefully opened before you try to read from it, rather than after... :laugh:
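
One way to act on both hints in a Windows Forms application is sketched below. It rests on assumptions and is not the answerer's code: `Response.BinaryWrite` belongs to ASP.NET pages and is not available on a form, so here the bytes read from the database are decoded back into a string that can be assigned to `DocumentText`. `HtmlLoader`, `DecodeHtml` and `LoadHtml` are hypothetical names, and UTF-8 is assumed to be the encoding used when the row was saved (use `Encoding.ASCII` instead if the save code encodes with ASCII).

```csharp
using System.Data;
using System.Data.SqlClient;
using System.Text;

public static class HtmlLoader
{
    // Turns the stored bytes back into HTML.
    // Returns null when there was no row (null) or no value (DBNull).
    public static string DecodeHtml(object dbValue)
    {
        byte[] bytes = dbValue as byte[];
        return bytes == null ? null : Encoding.UTF8.GetString(bytes);
    }

    // Reads the WebsiteImage column for one row and decodes it.
    public static string LoadHtml(string connectionString, int websiteId)
    {
        using (SqlConnection cs = new SqlConnection(connectionString))
        using (SqlCommand com = new SqlCommand(
            "SELECT WebsiteImage FROM Website WHERE WebsiteID = @Id", cs))
        {
            com.Parameters.Add("@Id", SqlDbType.Int).Value = websiteId;
            cs.Open();                              // open before reading
            return DecodeHtml(com.ExecuteScalar()); // first column, first row
        }
    }
}
```

With a helper like this, the body of `btnRead_Click` could shrink to `this.wbHtmlpage.DocumentText = HtmlLoader.LoadHtml(connectionString, 3) ?? string.Empty;`, which uses the data that actually came from the database.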
 
Comments
programmer1234 12-May-11 14:46pm    
What is the 's' meant to represent in the bytes code?
OriginalGriff 12-May-11 14:50pm    
It's what we technical bods call "a c*** up". Should have been the byte array.
Sorry about that...:blush:
programmer1234 12-May-11 14:51pm    
so what would go in there instead?
OriginalGriff 12-May-11 14:54pm    
Um...see revised answer?
OriginalGriff 12-May-11 14:55pm    
Ah! I see what you mean. I assumed that bit would be obvious: s is the string into which you loaded the WebBrowser control's DocumentText property.
You probably want to use a blob field - some web pages can be huge.
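
A note on the string-to-bytes step discussed in these comments: `Encoding.ASCII` replaces every character outside the 7-bit range with `?`, so a scraped page containing accented or non-Latin text would not survive the round trip through the database. The sketch below makes the difference visible; `EncodingDemo` and `RoundTrip` are hypothetical names used only for illustration.

```csharp
using System.Text;

public static class EncodingDemo
{
    // Encodes the text to bytes and decodes it again with the same encoding,
    // which is what saving to and loading from a binary column amounts to.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }
}
```

For example, `EncodingDemo.RoundTrip("caf\u00e9", Encoding.ASCII)` returns `"caf?"`, while the same call with `Encoding.UTF8` returns the original string. Whichever encoding you pick, use the same one when saving and when reading.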
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


