Introduction
Interacting with sites that use JavaScript to generate content has, until now, been either very complex or almost impossible. This tutorial demonstrates how to use the WebRobot v1.1 component to interact with the social bookmarking site digg, which relies heavily on JavaScript both to generate the displayed content and to handle interaction with it. You may click here to download the completed application, and you may also download a free trial version of the WebRobot v1.1 component here, or here for users of the .NET Framework 2.0.
First, we will create our instance of the WebRobot component, and enable AJAX mode:
Private wrobot As New foxtrot.xray.WebRobot

Private Sub Form1_Load(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles MyBase.Load
    wrobot.AJAX = True
End Sub

Private Sub Form1_Closing(ByVal sender As Object, _
        ByVal e As System.ComponentModel.CancelEventArgs) _
        Handles MyBase.Closing
    wrobot.Dispose()
End Sub
We created our instance of the WebRobot, enabled AJAX mode, and then, in the Closing event of our form, called the Dispose method to release all resources. Now, we will log in to digg:
wrobot.LoadPage("http://digg.com")
Dim loginform As foxtrot.xray.Form = wrobot.GetFormByContainsAction("login")
Dim userfield As foxtrot.xray.Input.Text = loginform.Fields(0)
Dim pswdfield As foxtrot.xray.Input.Password = loginform.Fields(1)
Dim sbmtfield As foxtrot.xray.Input.Submit = loginform.Fields(3)
userfield.Value = username
pswdfield.Value = password
sbmtfield.Click()
After loading the main page and filling out the login form, we clicked on the submit button. We could have used the WebRobot's SubmitForm method, but since this page may use JavaScript for form and button events, it is safer to simulate a click, so that any attached code gets interpreted. The Click method blocks until all actions are performed and any necessary page navigation is complete.
Now, we can start parsing the main page content to detect all the news items displayed. The WebRobot v1.1 component has an Element object and a FindElements method that allow sifting through the page. The Element object also exposes a Click method, to allow clicking on the elements you find after parsing. Let's look for news items:
Dim newsitems As New System.Collections.ArrayList
Dim elements() As foxtrot.xray.Element = wrobot.FindElements("div")
For Each item As foxtrot.xray.Element In elements
    Dim text As String = item.Text.TrimStart(vbCrLf.ToCharArray()).ToLower
    If (text.IndexOf("<div class=news-summary") = 0) Then
        newsitems.Add(item)
    End If
Next
Now, we have the DIVs containing our news items. Note the use of the Text property of the elements to search for the class of the DIV.
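The class check above depends on the element's markup beginning with its opening tag. As a minimal sketch, that test can be pulled into a standalone function. It assumes, as the loop implies, that the Text property returns the element's markup including its own tag. MarkupFilter and StartsWithTag are hypothetical names and not part of the WebRobot component:

```vbnet
Imports System
Imports Microsoft.VisualBasic

Public Module MarkupFilter
    ' Hypothetical helper: True when the markup, ignoring leading line
    ' breaks and letter case, begins with the given opening-tag prefix.
    Public Function StartsWithTag(ByVal markup As String, _
            ByVal prefix As String) As Boolean
        If markup Is Nothing OrElse prefix Is Nothing Then Return False
        Dim text As String = markup.TrimStart(vbCrLf.ToCharArray()).ToLower()
        Return text.StartsWith(prefix.ToLower())
    End Function
End Module
```

Inside the loop, the condition would then read StartsWithTag(item.Text, "<div class=news-summary").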
Now that we have our list of DIVs, we will parse the content from them:
For Each newsitem As foxtrot.xray.Element In newsitems
    Dim artinfo As New ArticleInfo
    ' Get the H3s in the item, to look for the title
    Dim titledata() As foxtrot.xray.Element = newsitem.FindElements("H3")
    Dim urldata() As foxtrot.xray.Element = titledata(0).FindElements("A")
    Dim ahref As String = urldata(0).Text
    ' Lazy quantifiers stop each group at its first closing delimiter
    Dim parser As New _
        System.Text.RegularExpressions.Regex("href=""(.*?)"".*?>(.*?)</", _
        System.Text.RegularExpressions.RegexOptions.IgnoreCase Or _
        System.Text.RegularExpressions.RegexOptions.Singleline)
    artinfo.URL = parser.Matches(ahref).Item(0).Groups.Item(1).Value
    artinfo.Title = parser.Matches(ahref).Item(0).Groups.Item(2).Value
    .
    .
    .
Next
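The URL and title extraction can also be sketched as a standalone helper that reports failure instead of throwing when the markup does not match. This is only an illustration: AnchorParser and TryParseAnchor are hypothetical names, and the pattern uses lazy quantifiers so that a second quoted attribute after href is not swallowed into the URL group.

```vbnet
Imports System.Text.RegularExpressions

Public Module AnchorParser
    ' Lazy groups stop at the first closing quote and the first "</".
    Private ReadOnly AnchorPattern As New Regex( _
        "href=""(.*?)"".*?>(.*?)</", _
        RegexOptions.IgnoreCase Or RegexOptions.Singleline)

    ' Hypothetical helper: fills url and title from an anchor's markup.
    ' Returns False, leaving both empty, when the markup does not match.
    Public Function TryParseAnchor(ByVal markup As String, _
            ByRef url As String, ByRef title As String) As Boolean
        url = ""
        title = ""
        If markup Is Nothing Then Return False
        Dim m As Match = AnchorPattern.Match(markup)
        If Not m.Success Then Return False
        url = m.Groups(1).Value
        title = m.Groups(2).Value
        Return True
    End Function
End Module
```

With such a helper, a news item whose anchor cannot be parsed could simply be skipped rather than stopping the whole loop with an exception.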
We found the URL and title of the story by searching within the DIV. Now, still inside the loop, we will find the number of diggs, the "digg it" link, and the digg discussion link for each news item:
Dim digginfo() As foxtrot.xray.Element = newsitem.FindElements("strong")
For Each item As foxtrot.xray.Element In digginfo
    Dim text As String = item.Text.TrimStart(vbCrLf.ToCharArray()).ToLower
    If (text.IndexOf("<strong id=diggs-strong-") = 0) Then
        parser = New System.Text.RegularExpressions.Regex(">(.*)</", _
            System.Text.RegularExpressions.RegexOptions.IgnoreCase Or _
            System.Text.RegularExpressions.RegexOptions.Singleline)
        artinfo.Diggs = _
            Integer.Parse(parser.Matches(text).Item(0).Groups.Item(1).Value)
    End If
Next
urldata = newsitem.FindElements("A")
For Each item As foxtrot.xray.Element In urldata
    If (item.Text.IndexOf("digg it") > -1) Then
        ' Keep the element itself so it can be clicked later
        artinfo.DiggLink = item
    ElseIf (item.Text.IndexOf("class=more") > -1) Then
        parser = New System.Text.RegularExpressions.Regex("href=""(.*?)"".*?>(.*?)</", _
            System.Text.RegularExpressions.RegexOptions.IgnoreCase Or _
            System.Text.RegularExpressions.RegexOptions.Singleline)
        artinfo.DiggMore = parser.Matches(item.Text).Item(0).Groups.Item(1).Value
    End If
Next
' Show the article in the list and remember its details
Dim litem As New ListViewItem(artinfo.Title)
articlelist(litem) = artinfo
ListView1.Items.Add(litem)
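Integer.Parse throws if the text between the tags is not a plain number. A more defensive variant might look like the sketch below. It assumes .NET Framework 2.0 or later for Integer.TryParse, and DiggCountParser and ParseDiggCount are hypothetical names used only for illustration:

```vbnet
Imports System.Text.RegularExpressions

Public Module DiggCountParser
    ' Captures the text between the opening tag and the first "</".
    Private ReadOnly CountPattern As New Regex(">(.*?)</", _
        RegexOptions.Singleline)

    ' Hypothetical helper: returns the number between the tags,
    ' or -1 when the markup holds no parsable count.
    Public Function ParseDiggCount(ByVal markup As String) As Integer
        If markup Is Nothing Then Return -1
        Dim m As Match = CountPattern.Match(markup)
        Dim count As Integer
        If m.Success AndAlso _
                Integer.TryParse(m.Groups(1).Value.Trim(), count) Then
            Return count
        End If
        Return -1
    End Function
End Module
```

The loop could then assign artinfo.Diggs = ParseDiggCount(text) and treat -1 as "count unknown".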
We have populated our form with the article info. Now, we add code to open the story link in a web browser when the user double-clicks an article:
Private Sub ListView1_DoubleClick(ByVal sender As Object, _
        ByVal e As System.EventArgs) Handles ListView1.DoubleClick
    If (ListView1.SelectedItems.Count > 0) Then
        Dim item As ListViewItem = ListView1.SelectedItems(0)
        Dim artinfo As ArticleInfo = articlelist(item)
        System.Diagnostics.Process.Start(artinfo.URL)
    End If
End Sub
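The handlers rely on an ArticleInfo class and an articlelist lookup whose definitions are not shown in this article. A minimal sketch of what they might look like follows. The member names mirror those used in the code above, but the types are assumptions, and ArticleStore is a hypothetical module name. In the real project, DiggLink would presumably be typed as foxtrot.xray.Element and articlelist would be a field of the form:

```vbnet
Imports System.Collections

' Hypothetical container for one parsed news item.
Public Class ArticleInfo
    Public Title As String       ' Story headline
    Public URL As String         ' Link to the story itself
    Public DiggMore As String    ' Link to the digg discussion
    Public Diggs As Integer      ' Current digg count
    Public DiggLink As Object    ' Clickable element; Nothing once dugg
End Class

Public Module ArticleStore
    ' Maps each ListViewItem (the key) to its ArticleInfo (the value).
    Public articlelist As New Hashtable()
End Module
```

Because ArticleInfo is a class, the object stored in the lookup is the same one the handlers modify, so changes to Diggs or DiggLink are visible the next time the item is selected.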
Now, we create a context menu, to be displayed whenever the user right-clicks on an article. This context menu will show the number of diggs (in MenuItem1), enable the user to digg the story (in MenuItem2), and also launch a browser instance with the digg discussion (in MenuItem3). First, we will add code to update the digg count and whether the news item has been dugg or not:
Private Sub ListView1_Click(ByVal sender As Object, _
        ByVal e As System.EventArgs) Handles ListView1.Click
    If (ListView1.SelectedItems.Count > 0) Then
        ListView1.ContextMenu = ContextMenu1
        Dim item As ListViewItem = ListView1.SelectedItems(0)
        Dim artinfo As ArticleInfo = articlelist(item)
        MenuItem1.Text = artinfo.Diggs.ToString & " Diggs"
        If (artinfo.DiggLink Is Nothing) Then
            MenuItem2.Text = "Dugg!"
            MenuItem2.Enabled = False
        Else
            MenuItem2.Text = "Digg this!"
            MenuItem2.Enabled = True
        End If
    Else
        ListView1.ContextMenu = Nothing
    End If
End Sub
Now, we can add code to digg a news item:
Private Sub MenuItem2_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles MenuItem2.Click
    If (ListView1.SelectedItems.Count > 0) Then
        ListView1.ContextMenu = ContextMenu1
        Dim item As ListViewItem = ListView1.SelectedItems(0)
        Dim artinfo As ArticleInfo = articlelist(item)
        If Not (artinfo.DiggLink Is Nothing) Then
            artinfo.DiggLink.Click()
            artinfo.DiggLink = Nothing
            artinfo.Diggs += 1
            MenuItem2.Text = "Dugg!"
            MenuItem2.Enabled = False
            MenuItem1.Text = artinfo.Diggs.ToString & " Diggs"
        End If
    End If
End Sub
Finally, we add code to load a browser window with the digg discussion link:
Private Sub MenuItem3_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles MenuItem3.Click
    If (ListView1.SelectedItems.Count > 0) Then
        Dim item As ListViewItem = ListView1.SelectedItems(0)
        Dim artinfo As ArticleInfo = articlelist(item)
        System.Diagnostics.Process.Start(artinfo.DiggMore)
    End If
End Sub
We have interacted with digg, simulating a real user clicking on links. Short of captchas, a web application has few ways to tell that a real user is not at the helm.
For more information on the WebRobot v1.1 component, visit http://foxtrot-xray.com/main/prod/dev/web_robot. You may also download the full documentation at http://foxtrot-xray.com/main/prod/dev/documentation.chm/view.