
Pulling HTML Information using VB.Net and Webbrowser

October 31, 2012 5:02:08 AM

First off, let me state that this isn't any kind of homework. I've been programming on my own time since January.

The purpose of the application:

I'm trying to make a custom web browser for the browser game Travian. Following their TOS, I'm only going to show information that is available to all players, regardless of whether they use "Gold" or not. I do not want to inject any scripts into the browser (as Greasemonkey does) to get the information, because Travian's TOS states that scripts that do this are forbidden.

My problem:

I'm trying to pull some information from the web page using the WebBrowser control in Visual Basic, but the information I want to pull looks like this in HTML:

  <div class="boxes-contents cf"><table id="production" cellpadding="1" cellspacing="1">
  <thead>
  <tr>
  <th colspan="4">
  Production per hour: </th>
  </tr>
  </thead>
  <tbody>
  <tr>
  <td class="ico">
  <img class="r1" src="img/x.gif" alt="Wood" title="Wood" />
  </td>
  <td class="res">
  Wood:
  </td>
  <td class="num">
  1320 </td>
  </tr>
  <tr>
  <td class="ico">
  <img class="r2" src="img/x.gif" alt="Clay" title="Clay" />
  </td>
  <td class="res">
  Clay:
  </td>
  <td class="num">
  1401 </td>
  </tr>
  <tr>
  <td class="ico">
  <img class="r3" src="img/x.gif" alt="Iron" title="Iron" />
  </td>
  <td class="res">
  Iron:
  </td>
  <td class="num">
  1230 </td>
  </tr>
  <tr>
  <td class="ico">
  <img class="r4" src="img/x.gif" alt="Wheat" title="Wheat" />
  </td>
  <td class="res">
  Wheat:
  </td>
  <td class="num">
  1112 </td>
  </tr>
  </tbody>
  </table>
  </div>

I've pulled other information, such as the village name and resource amounts, like this:

*NOTE* This is inside a Timer so that it refreshes the information every 2 ms, just as the website does. The Catch block is empty because you have to log in to the site first; the Try statement simply keeps the application from crashing while you log in.

  Try
      ' These elements only exist once the player has logged in.
      GroupBox3.Text = WebBrowser1.Document.All.Item("villageNameField").InnerHtml
      lblWood.Text = WebBrowser1.Document.All.Item("l1").InnerHtml
      lblClay.Text = WebBrowser1.Document.All.Item("l2").InnerHtml
      lblIron.Text = WebBrowser1.Document.All.Item("l3").InnerHtml
      lblWheat.Text = WebBrowser1.Document.All.Item("l4").InnerHtml
  Catch ex As Exception
      ' Deliberately ignored: the page is not ready yet (for example, the login screen is showing).
  End Try
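
An alternative to the empty Catch is to test for a missing document or element explicitly. The following is only a sketch under assumptions: `ReadElementText` is a hypothetical helper name (not part of the WebBrowser API), and it reads `InnerText` rather than `InnerHtml`, which returns the same value only when the element holds plain text.

```vbnet
Imports System.Windows.Forms

Module BrowserHelpers
    ' Hypothetical helper: returns the text of the element with the given id,
    ' or the fallback when the document or the element is not available yet
    ' (for example, while the login page is still showing).
    Public Function ReadElementText(browser As WebBrowser, id As String,
                                    Optional fallback As String = "") As String
        If browser.Document Is Nothing Then Return fallback

        Dim element As HtmlElement = browser.Document.GetElementById(id)
        If element Is Nothing OrElse element.InnerText Is Nothing Then Return fallback

        Return element.InnerText.Trim()
    End Function
End Module
```

It could be called from the Timer handler as `lblWood.Text = ReadElementText(WebBrowser1, "l1", lblWood.Text)`, which leaves the label unchanged until the element appears.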

I've tried using the same technique to get the production information, but because the cells all use the same "num" and "res" classes, it didn't work. I've never done anything like this in VB before, and any advice or help is greatly appreciated.

The question:

How can I pull the production information in the same way I did the resource information, given that the production cells have no ids and share the same class names?


November 5, 2012 10:55:36 PM

This is a simple parsing problem. Notice that after each item's name (Wheat, for example), the quantity is the next piece of text that is not enclosed in tags.
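
One way this could look with the WebBrowser control is to find the table by its id and walk its rows. This is a rough sketch, not a tested solution for the live site: `ReadProduction` is a made-up name, and it assumes every data row has exactly the three cells shown in the question (icon, label, number).

```vbnet
Imports System.Collections.Generic
Imports System.Windows.Forms

Module ProductionReader
    ' Reads the rows of <table id="production"> into a name -> amount map.
    ' Assumes each data row has three <td> cells: icon, label ("Wood:"), number.
    Public Function ReadProduction(doc As HtmlDocument) As Dictionary(Of String, Integer)
        Dim result As New Dictionary(Of String, Integer)
        If doc Is Nothing Then Return result

        Dim table As HtmlElement = doc.GetElementById("production")
        If table Is Nothing Then Return result

        For Each row As HtmlElement In table.GetElementsByTagName("tr")
            Dim cells As HtmlElementCollection = row.GetElementsByTagName("td")
            If cells.Count < 3 Then Continue For ' the header row uses <th>, not <td>

            ' Second cell holds the label, third cell holds the quantity.
            Dim label As String = If(cells(1).InnerText, "").Trim().TrimEnd(":"c)
            Dim amount As Integer
            If Integer.TryParse(If(cells(2).InnerText, "").Trim(), amount) Then
                result(label) = amount
            End If
        Next

        Return result
    End Function
End Module
```

Calling `ReadProduction(WebBrowser1.Document)` would then give a dictionary keyed by "Wood", "Clay", "Iron" and "Wheat". Going by cell position avoids having to tell the identically named "res" and "num" cells apart.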
November 13, 2012 11:00:30 AM

So can I use .InnerHtml to pull the information, or is there a better way?
November 13, 2012 4:16:09 PM

First, unless you're using Windows 8, it's not going to happen every 2 ms. It's going to happen at some multiple of roughly 15.6 ms, which is the default system timer tick. This is how the kernel works on Windows 7 and earlier.
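
If you want to see the granularity on your own machine, something like the console sketch below can measure it. It times `Thread.Sleep` rather than a WinForms Timer, so treat it as an illustration of timer resolution in general; `AverageSleepMs` is a name made up for this example, and the result depends on the OS version and on what else is running.

```vbnet
Imports System.Diagnostics
Imports System.Threading

Module TimerResolutionDemo
    ' Returns the average real duration, in milliseconds, of a sleep
    ' that was requested to last requestedMs milliseconds.
    Public Function AverageSleepMs(requestedMs As Integer, iterations As Integer) As Double
        Dim watch As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To iterations
            Thread.Sleep(requestedMs)
        Next
        watch.Stop()
        Return watch.Elapsed.TotalMilliseconds / iterations
    End Function

    Sub Main()
        ' Request 2 ms sleeps and report how long they really took on average.
        Console.WriteLine("Requested 2 ms, measured average: {0:F1} ms", AverageSleepMs(2, 50))
    End Sub
End Module
```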

Second, I'd suggest using the HttpWebRequest/HttpWebResponse objects and a regular expression to scrape the numbers you're looking for.
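
As a sketch of the regular expression half only: the pattern below is written against the HTML posted in the question and would need adjusting if the markup differs. `ParseProduction` is a hypothetical name, and fetching the page with HttpWebRequest is left out because it would need the cookies from a logged-in session.

```vbnet
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ProductionScraper
    ' Matches a label cell followed by its number cell, e.g.
    ' <td class="res"> Wood: </td> <td class="num"> 1320 </td>
    Private ReadOnly RowPattern As New Regex(
        "<td class=""res"">\s*(\w+):\s*</td>\s*<td class=""num"">\s*(\d+)\s*</td>",
        RegexOptions.IgnoreCase)

    ' Extracts resource name -> hourly production from the page HTML.
    Public Function ParseProduction(html As String) As Dictionary(Of String, Integer)
        Dim result As New Dictionary(Of String, Integer)
        For Each m As Match In RowPattern.Matches(html)
            ' Group 1 is the resource name, group 2 is the quantity.
            result(m.Groups(1).Value) = Integer.Parse(m.Groups(2).Value)
        Next
        Return result
    End Function
End Module
```

The same function can also be fed `WebBrowser1.DocumentText` if you stay with the WebBrowser control, although the browser may not return the markup exactly as the server sent it.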