Tuesday, July 14, 2009

Retrieve Content From Web Page - Screen Scraping

To retrieve the HTML source of a URL (a process known as screen scraping), .NET provides the WebClient class in the System.Net namespace. Here is a sample method that takes a URL and returns the page's HTML. Include the System.Net and System.Text namespaces.

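    private string GetPageContent(string url)
    {
        string src = string.Empty;

        try
        {
            // WebClient is disposable; the using block releases it when done.
            using (WebClient client = new WebClient())
            {
                // Decode the downloaded bytes as UTF-8.
                UTF8Encoding encoding = new UTF8Encoding();
                src = encoding.GetString(client.DownloadData(url));
            }
        }
        catch (Exception ex)
        {
            // Assumes this method lives in an ASP.NET page, where Response is available.
            Response.Write(ex.Message);
        }

        return src;
    }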
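
Outside an ASP.NET page there is no Response object to write the error to, so a variation may be handy. The following is only a sketch under a few assumptions: the class name PageFetcher and the method name TryGetPageContent are made up for illustration, the page is assumed to be UTF-8 encoded as in the method above, and returning null on failure is just one possible convention. It uses WebClient.DownloadString together with the WebClient.Encoding property instead of decoding the bytes by hand.

```csharp
using System;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // Hypothetical helper: returns the page source, or null if the download fails.
    public static string TryGetPageContent(string url)
    {
        // WebClient implements IDisposable, so "using" releases it promptly.
        using (var client = new WebClient())
        {
            // Assumption: the page is UTF-8 encoded, as in the method above.
            client.Encoding = Encoding.UTF8;
            try
            {
                // DownloadString fetches the resource and decodes it with client.Encoding.
                return client.DownloadString(url);
            }
            catch (WebException)
            {
                // Network errors, DNS failures, and HTTP error statuses land here.
                return null;
            }
        }
    }
}
```

A caller would check the result for null before using it, for example: string html = PageFetcher.TryGetPageContent(url); if (html == null) { /* report the failure */ }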


1 comment:

Bijayani said...

Hi,

I happened to see your post and found it quite informative. I would like to share a link where a software engineer has shared a tip on "Screen Scraping in ASP.NET". I am sharing it just for knowledge purposes.

Here is the link:
http://www.mindfiresolutions.com/Screen-Scraping-in-ASPNET-800.php

Hope you find it useful and of assistance.

Thanks,
Bijayani