Tuesday, December 6, 2011

C# Save image from HTML page in WebBrowser control

So you have a C# Windows application with a WebBrowser control in it, and once you have navigated to a webpage, you want to save the images from the loaded HTML page.

The best solution I found is here: http://stackoverflow.com/questions/2566898/save-images-in-webbrowser-control-without-redownloading-them-from-the-internet/8397454#8397454

The main ideas are:

- Add a reference in your project to Microsoft.mshtml (Project -> Add reference -> GAC -> Microsoft.mshtml).

- Add a 'using' statement at the top of your .cs file:

using mshtml;

- Use the following code to save all images in the HTML document:

  IHTMLDocument2 doc = (IHTMLDocument2)WebBrowser1.Document.DomDocument; // the underlying MSHTML DOM document of our WebBrowser1 control
  IHTMLControlRange imgRange = (IHTMLControlRange)((HTMLBody)doc.body).createControlRange(); // control range used to copy HTML image elements to the clipboard

  foreach (IHTMLImgElement img in doc.images) // walk through all the images in the DOM document
  {
    imgRange.add((IHTMLControlElement)img); // select the current image
    imgRange.execCommand("Copy", false, null); // copy the selected image to the clipboard
    imgRange.remove(0); // deselect it, so the range holds only one image at a time
    using (Bitmap bmp = (Bitmap)Clipboard.GetDataObject().GetData(DataFormats.Bitmap)) // read the image back from the clipboard
    {
      if (bmp != null) // GetData returns null when the clipboard holds no bitmap
        bmp.Save(@"C:\downloadedimages\" + img.nameProp); // save the bitmap to this folder on our hard drive
    }
  }
 

Note the WebBrowser1 name: you must change it to the name of your own WebBrowser control.
Also note the path where the image is saved. Some security measures should be taken here against unwanted file names, executables, etc., and you should make sure the path points to the correct location on your hard drive.
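As a minimal sketch of the file name precautions mentioned above, the helper below builds the save path from the image name before it is handed to bmp.Save. The class name ImagePathHelper, the method SafeImagePath, the list of allowed extensions, and the ".png" and "image" fallbacks are all my own assumptions for illustration and are not part of the original solution. Adjust the policy to your needs.

```csharp
using System;
using System.IO;

static class ImagePathHelper
{
    // Hypothetical helper: builds a save path that stays inside targetFolder.
    public static string SafeImagePath(string targetFolder, string nameProp)
    {
        // Treat both slash styles as separators and keep only the last segment,
        // so an input like "..\..\evil.png" loses its directory part.
        string name = (nameProp ?? "").Replace('\\', '/');
        name = name.Substring(name.LastIndexOf('/') + 1);

        // Replace characters that are not valid in file names on this system.
        foreach (char c in Path.GetInvalidFileNameChars())
            name = name.Replace(c, '_');

        // Fall back to a fixed name if nothing usable is left.
        if (name.Trim().Length == 0 || name == "." || name == "..")
            name = "image";

        // Example policy (an assumption): keep only common image extensions,
        // otherwise append ".png" so the file is never saved as e.g. ".exe".
        string ext = Path.GetExtension(name).ToLowerInvariant();
        string[] allowed = { ".png", ".jpg", ".jpeg", ".gif", ".bmp" };
        if (Array.IndexOf(allowed, ext) < 0)
            name += ".png";

        return Path.Combine(targetFolder, name);
    }
}
```

In the loop, the save call would then read bmp.Save(ImagePathHelper.SafeImagePath(@"C:\downloadedimages", img.nameProp)). The target folder must already exist, otherwise the save will fail.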

This saves the time and network traffic of downloading the image again. The alternative is to get or reconstruct the absolute URL of the image source, request it again, and write the response to a local file. That didn't work for me, as I didn't want to download the image a second time. Some servers also check for an existing session, cookie, or referrer in order to prevent direct access to images, so the alternative may not work at all.

The clipboard method is the fastest and the most intuitive: it is what you would do by hand, right-clicking an image on a webpage and saving it. Of course, doing it by hand only works if right-click is not disabled through JavaScript or some other method, a restriction which in our case is bypassed :)
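For comparison, here is a rough sketch of the re-download alternative described above. It is not the approach I used. The class and method names are made up for illustration, and passing the page's cookie string and URL along as Cookie and Referer headers is only an assumption about what a given server might check; it is no guarantee that the server will accept the request.

```csharp
using System;
using System.Net;

static class ImageDownloadSketch
{
    // Resolve a possibly relative <img src> value against the URL of the page.
    public static Uri ResolveImageUrl(string pageUrl, string src)
    {
        return new Uri(new Uri(pageUrl), src);
    }

    // Download the image again, sending the referrer and cookies some servers check.
    public static void Download(string pageUrl, string src, string cookies, string targetPath)
    {
        using (var client = new WebClient())
        {
            client.Headers[HttpRequestHeader.Referer] = pageUrl;
            if (!string.IsNullOrEmpty(cookies))
                client.Headers[HttpRequestHeader.Cookie] = cookies;
            client.DownloadFile(ResolveImageUrl(pageUrl, src), targetPath);
        }
    }
}
```

With a WebBrowser control, the page URL and cookie string could be taken from WebBrowser1.Url and WebBrowser1.Document.Cookie. Every image costs one more request this way, which is exactly what the clipboard method avoids.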



