Convert/Display PDF & Word Files as Images in .NET

Recently I had to write a code sample that displays Word and Acrobat (PDF) files on a Web page in a two-page view. We couldn’t rely on a plugin or application on the user’s machine to do so. The solution was to export each page to an image and then display the pages in any way we needed, which meant writing the document export code ourselves.

Word to Images

Converting Word files (.DOC & .DOCX) to images was fairly simple, although I suspect I took a longer route than necessary. The problem is that the older DOC format and the newer DOCX format have different APIs to work with. So instead of handling each format separately, I simply exported both to XPS and then used the XPS API to retrieve an image for each page. This last part was not obvious to me till I found the one line of code required to do this on a forum. I apologize for not linking to it, as I can’t find that link anymore. The credit for that part is wholly the original author’s.
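As a rough sketch of the Word-to-XPS-to-image route described above (this is not the code from the download, and not necessarily the one-liner from the forum), something like the following could work. It assumes C# 4 or later, so that optional COM arguments can be omitted. It also assumes references to PresentationCore, PresentationFramework, WindowsBase, ReachFramework and the Word interop assembly, and that the rendering runs on an STA thread. The class and method names (WordToImages, PageFileName, ExportToXps, RenderXpsPages) are made up for illustration.

```csharp
using System;
using System.IO;
using System.Windows.Documents;      // DocumentPaginator, DocumentPage, FixedDocumentSequence
using System.Windows.Media;          // PixelFormats
using System.Windows.Media.Imaging;  // RenderTargetBitmap, PngBitmapEncoder, BitmapFrame
using System.Windows.Xps.Packaging;  // XpsDocument
using Word = Microsoft.Office.Interop.Word;

public static class WordToImages
{
    // Hypothetical helper: output file name for a 1-based page number.
    public static string PageFileName(string outputDir, int pageNumber)
    {
        return Path.Combine(outputDir, string.Format("page-{0:D3}.png", pageNumber));
    }

    // Step 1: let Word itself export the document (.doc or .docx) to XPS,
    // so both formats go through the same code path.
    public static void ExportToXps(string wordPath, string xpsPath)
    {
        var app = new Word.Application();
        try
        {
            Word.Document doc = app.Documents.Open(wordPath, ReadOnly: true);
            doc.ExportAsFixedFormat(xpsPath, Word.WdExportFormat.wdExportFormatXPS);
            ((Word._Document)doc).Close(false);   // close without saving changes
        }
        finally
        {
            ((Word._Application)app).Quit(false); // always shut the Word process down
        }
    }

    // Step 2: render every page of the XPS file to a PNG. Returns the page count.
    // Must be called from an STA thread because it uses WPF visuals.
    public static int RenderXpsPages(string xpsPath, string outputDir, double dpi)
    {
        var xps = new XpsDocument(xpsPath, FileAccess.Read);
        try
        {
            FixedDocumentSequence sequence = xps.GetFixedDocumentSequence();
            DocumentPaginator paginator = sequence.DocumentPaginator;
            double scale = dpi / 96.0; // WPF page sizes are in 1/96 inch units

            for (int i = 0; i < paginator.PageCount; i++)
            {
                using (DocumentPage page = paginator.GetPage(i))
                {
                    var bitmap = new RenderTargetBitmap(
                        (int)Math.Ceiling(page.Size.Width * scale),
                        (int)Math.Ceiling(page.Size.Height * scale),
                        dpi, dpi, PixelFormats.Pbgra32);
                    bitmap.Render(page.Visual);

                    var encoder = new PngBitmapEncoder();
                    encoder.Frames.Add(BitmapFrame.Create(bitmap));
                    using (FileStream stream = File.Create(PageFileName(outputDir, i + 1)))
                    {
                        encoder.Save(stream);
                    }
                }
            }
            return paginator.PageCount;
        }
        finally
        {
            xps.Close();
        }
    }
}
```

A caller would run ExportToXps first and then RenderXpsPages on the resulting file. Going through XPS means the DOC and DOCX cases never need separate handling after the export step.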

PDF to Images

Converting PDF to images was a much bigger problem. There is no built-in way of doing this in .NET. Many 3rd party components can do it, but most of them cost a bomb. Some free ones, like PDFsharp, can iterate over pages, but they offer no way to export a complete page to an image short of walking through the entire structure of the page and redrawing everything yourself.

This is where I found the GFLAX library, a COM component. It requires Ghostscript for Windows to be installed on the machine as well. Once you register the DLL, you can reference it from your .NET code.
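For illustration only, here is a minimal sketch of what the Ghostscript dependency can do for PDF pages. Note that it does not use GFLAX at all: it swaps in a direct call to the Ghostscript console executable (gswin32c.exe on 32-bit Windows installs), so it is not the approach used in the download. The class name, method names and file paths are hypothetical.

```csharp
using System;
using System.Diagnostics;

public static class PdfToImages
{
    // Builds the Ghostscript command line that rasterises every page of a PDF
    // to a 24-bit PNG. A "%03d" in outputPattern is replaced by Ghostscript
    // with the page number (page-001.png, page-002.png, ...).
    public static string BuildArguments(string pdfPath, string outputPattern, int dpi)
    {
        return string.Format(
            "-dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -r{0} -sOutputFile=\"{1}\" \"{2}\"",
            dpi, outputPattern, pdfPath);
    }

    // Runs Ghostscript and waits for it to finish.
    // gsExePath is an assumption: the full path to gswin32c.exe on your machine.
    public static void Convert(string gsExePath, string pdfPath, string outputPattern, int dpi)
    {
        var info = new ProcessStartInfo(gsExePath, BuildArguments(pdfPath, outputPattern, dpi))
        {
            UseShellExecute = false, // start the process directly, without the shell
            CreateNoWindow = true    // no console window on a web server
        };

        using (Process gs = Process.Start(info))
        {
            gs.WaitForExit();
            if (gs.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "Ghostscript exited with code " + gs.ExitCode);
            }
        }
    }
}
```

Each page ends up as its own image file, which is the same shape of output the Word route produces, so the display code does not need to care where the pages came from.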

Code Sample

I’m attaching the entire code sample to this post as a download. The code is released under the open source BSD license. All external components (Word and Office interop assemblies, GFL, etc.) remain the copyright of their respective owners, and their license terms must be adhered to.

Usage

Once you download the attachment and extract it, open it in Visual Studio 2008. Make sure you’ve installed Ghostscript from the link above and run “regsvr32 GFLAX.dll” to register the GFLAX library. Then add a reference to Microsoft.Office.Interop.Word from the .NET tab (and remove the marked lines from Web.config), and a reference to GFLAX from the COM tab.

Run the application and upload .doc, .docx or .pdf files; you can then view them directly in the browser.

WordDisplay 
Two page display of an uploaded Word file. Works with PDF files too.

Comments

Can I download your PDF files?

December 15. 2009 18:30 | club penguin cheats United States |

Where can I download your PDF file?

December 20. 2009 21:43 | best weight loss pills United States |

You can download the code for this - not the PDF file itself. It should work with any PDF file.

December 21. 2009 07:36 | vinod United States |

Thank You

August 2. 2010 00:09 | Your love is my drug |
