PDF creation using C# (and Office) from RTF/DOC files

25 Feb 2004
Converts RTF and DOC files to PDF. The sample is part of a larger project that converts nearly everything; parts of it can be used to convert HTML, BMP, Lotus 1-2-3 documents...

Introduction

Some time ago I had to write a C# application that could convert documents into various formats. The hardest part was finding a way to create PDF files without using any third-party products. Here is a solution.

Background

The source shown here is taken from a larger conversion application. It is a stand-alone project for educational use and describes one possible way of converting documents. It took me a lot of work to figure this out, so please don't copy the code; drop me a line if you wish to use part of it. The comments in my original source are written in German; I had no time to build a proper release, sorry for that.

Using the code

The code listed below is the main part of the program. First, a brief look at the idea behind it. Crystal Reports (the .NET reporting system) is able to create PDF files. The only problem is that it can only create PDF files out of a database, and it requires an OLE object in the database. But if you take a (very) close look at CR (IDA :-) ), you will find that it is able to process BMP, EMF, and WMF. So we only have to insert this kind of data into a table (as a blob) and hand it over to CR. EMF can be created with PowerPoint, PowerPoint can read HTML, and WinWord can create HTML. The only problem left is the organization of our pages: we have to split the document manually. I did this by using a RichTextBox.

Now that we know the way, we can convert an RTF file into a PDF:

  1. We load the RTF file into a RichTextBox.
  2. We split it into parts. Every part is loaded into WinWord and saved as HTML, the WinWord header is destroyed, and the HTML page is loaded into PowerPoint and saved as EMF. The EMF file is written into an Access database as a blob object.
  3. Crystal Reports gets an rpt "template" and the database; the report is created and saved as PDF.
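
The page-splitting part of step 2 can be illustrated without any Office or Windows Forms dependency. The following is a minimal sketch, not the article's code: it assumes the vertical pixel position of every character is already known (in the article these values come from RichTextBox.GetPositionFromCharIndex) and returns the character index at which each page starts. The names PageSplitter and FindPageStarts are hypothetical, and the sketch splits exactly at the first character past the page boundary, whereas the article's loop splits one character earlier.

```csharp
using System;
using System.Collections.Generic;

public static class PageSplitter
{
    // Returns the character index at which each page starts.
    // charY[i] is the vertical pixel offset of character i from the top of
    // the document; the values are assumed to be non-decreasing.
    public static List<int> FindPageStarts(IReadOnlyList<int> charY, int pageHeight)
    {
        if (charY == null) throw new ArgumentNullException(nameof(charY));
        if (pageHeight <= 0) throw new ArgumentOutOfRangeException(nameof(pageHeight));

        var starts = new List<int>();
        if (charY.Count == 0) return starts;

        int pageStart = 0;
        starts.Add(pageStart);
        for (int i = 1; i < charY.Count; i++)
        {
            // Start a new page once a character lies more than pageHeight
            // pixels below the first character of the current page.
            if (charY[i] - charY[pageStart] > pageHeight)
            {
                pageStart = i;
                starts.Add(pageStart);
            }
        }
        return starts;
    }
}
```

For example, with the positions { 0, 0, 300, 600, 700, 1000, 1400 } and a page height of 650, FindPageStarts returns the start indices 0, 4, and 6.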

I tried to use Ole32 functions, but I didn't find a way to accomplish this. If you know a way in C#.NET, please let me know.

C#
private static void DoRTF2ALL(
   CrystalDecisions.Shared.ExportFormatType outTp)
 {
   int lastsplit = 0;
   int nextsplit = 0;
   int pageheight= 650;  // "length" of our page
   int pcount= 1;
   Point xx;
   object Unknown =Type.Missing;
   Word.Application newApp;
   PowerPoint.Application  app;
   PowerPoint.Presentation ppp;
   string[] TempEnt;



   RichTextBox rtf = new RichTextBox();
   rtf.Height=25000;// only the page size is required
   rtf.Width=4048;// should be enough
   //Console.WriteLine(rtf.Bounds.ToString());
   rtf.LoadFile(scrfile, RichTextBoxStreamType.RichText);
   nCoreHlp.EmptyDB(WorkDir + "\\" + Database);
   while ((lastsplit+1)<rtf.Text.Length) ////start page split
   {


     // cut away the first few pages
     rtf.SelectionStart = 0;
     rtf.SelectionLength =lastsplit;
     rtf.Cut();
     nextsplit = rtf.Text.Length; // default: the rest of the text fits on this page
     for (int r=0;r<=rtf.Text.Length;r++) // scan through the whole text
     {
       xx = rtf.GetPositionFromCharIndex(r);
       if (xx.Y > pageheight) // first character below the page boundary
           {nextsplit=r-1;break;}
     }
     lastsplit=lastsplit+nextsplit;// cut off the end
     rtf.SelectionStart = nextsplit;
     rtf.SelectionLength =rtf.Text.Length-nextsplit;
     rtf.Cut();
     rtf.SaveFile(WorkDir + "\\temp.rtf",
            RichTextBoxStreamType.RichText);
     //////////////////////////////////////////// insert db
     newApp = new Word.Application();
     newApp.Visible = false;

     object Source=WorkDir + "\\temp.rtf";
     object Target=WorkDir + "\\temp.html";

     newApp.Documents.Open(ref Source,ref Unknown,
       ref Unknown,ref Unknown,ref Unknown,
       ref Unknown,ref Unknown,ref Unknown,
       ref Unknown,ref Unknown,ref Unknown,
       ref Unknown,ref Unknown,ref Unknown,ref Unknown);

     object format = Word.WdSaveFormat.wdFormatHTML;// not XML; use it?
     newApp.ActiveDocument.SaveAs(ref Target,ref format,
       ref Unknown,ref Unknown,ref Unknown,
       ref Unknown,ref Unknown,ref Unknown,
       ref Unknown,ref Unknown,ref Unknown,
       ref Unknown,ref Unknown,ref Unknown,
       ref Unknown,ref Unknown);
     newApp.Quit(ref Unknown,ref Unknown,ref Unknown);

     //kill word head
     StreamReader sr;
     bool not=true;
     while (not)
     {
       try
       {

         sr = new StreamReader(WorkDir + "\\temp.html");
         not=false;
         StreamWriter sw = new StreamWriter(WorkDir + "\\temp.txt");
         String line;

         while ((line = sr.ReadLine()) != null)
         {
           if (line.CompareTo(
             "<meta name=ProgId content=FrontPage.Editor.Document>")!=0)
             sw.WriteLine(line); else
             //line.Replace //Marina
             sw.WriteLine("<meta name=ProgId content=Word.Documens>");
         }

         sr.Close();
         sw.Flush();
         sw.Close();

       }
       catch (Exception) { /* ignore the error and retry */ }
     }// kill word head end

     File.Delete(WorkDir + "\\temp.html");
     File.Move(WorkDir + "\\temp.txt", WorkDir + "\\temp.html");
     //File.Delete(WorkDir + "\\temp.txt");


     app = new PowerPoint.Application();
     ppp = app.Presentations.Open(WorkDir + "\\temp.html",
       /*Microsoft.Office.Core.MsoTriState.msoCTrue*/0,
       /*Microsoft.Office.Core.MsoTriState.msoTrue*/0,
       Microsoft.Office.Core.MsoTriState.msoFalse);// visible? always no
     ppp.SaveAs(WorkDir + "\\temp",
       PowerPoint.PpSaveAsFileType.ppSaveAsEMF,
       Microsoft.Office.Core.MsoTriState.msoFalse);
     app.Quit();

     // collect the output
     TempEnt = Directory.GetFiles(WorkDir + "\\temp\\", "*.emf");
     nCoreHlp.InsertDB1(WorkDir + "\\" + Database,TempEnt[0],pcount);
     pcount++;
     Console.WriteLine("1 Page Converted");// debug
     //Console.ReadLine();
     ////////////////////////////////////////////////// insert db off
     rtf.LoadFile(scrfile, RichTextBoxStreamType.RichText);

   }//////////// page split done
   ///Create PDF
   ReportDocument doc = new ReportDocument();

   doc.Load(WorkDir + "\\DtoD.rpt");
   doc.Database.Tables[0].Location = (WorkDir + "\\DtoD.mdb");

   doc.ExportOptions.ExportFormatType = outTp;
   doc.ExportOptions.ExportDestinationType =
           ExportDestinationType.DiskFile;
   //DiskFileDestinationOptions
   DiskFileDestinationOptions diskOpts = new DiskFileDestinationOptions();
   diskOpts.DiskFileName = dstfile;
   doc.ExportOptions.DestinationOptions = diskOpts;
   doc.Export();
   doc.Close();
   Directory.Delete(WorkDir + "\\temp\\", true);
   File.Delete(WorkDir + "\\temp.rtf");
   File.Delete(WorkDir + "\\temp.html");
 }
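
The header-rewriting loop in the listing above retries forever and swallows every exception. A bounded alternative might look like the following minimal sketch. The names HtmlHeaderFixer, ReadAllLinesWithRetry, and ReplaceProgId, as well as the attempt count and delay parameters, are assumptions made for illustration and are not part of the original project. Note that it compares lines ordinally, while the listing uses CompareTo.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

public static class HtmlHeaderFixer
{
    // Reads a file, retrying a limited number of times while an IOException
    // occurs (for example, because another process still holds the file).
    // After maxAttempts failures the last IOException propagates to the caller.
    public static string[] ReadAllLinesWithRetry(string path, int maxAttempts, int delayMs)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return File.ReadAllLines(path);
            }
            catch (IOException) when (attempt < maxAttempts)
            {
                Thread.Sleep(delayMs); // wait before the next attempt
            }
        }
    }

    // Replaces every line equal to oldMeta with newMeta and passes all
    // other lines through unchanged.
    public static IEnumerable<string> ReplaceProgId(
        IEnumerable<string> lines, string oldMeta, string newMeta)
    {
        foreach (string line in lines)
        {
            yield return string.Equals(line, oldMeta, StringComparison.Ordinal)
                ? newMeta
                : line;
        }
    }
}
```

Because the whole file is read into memory before anything is written, the result of ReplaceProgId can be written back to the same path with File.WriteAllLines, which avoids the temporary temp.txt file. A missing file also raises an IOException subclass, so it is retried as well before the error surfaces.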

Points of Interest

MS Office 2000 or < has to be installed. I included the Word and PowerPoint interfaces in the project; because of the file size, the C# DLLs are only in the bin/debug directory.

History

  • 1st version of the demo project.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.



Written By
Germany

Comments and Discussions

 
General: Word to PDF creater
Rajput Jitendra, 8-May-06 21:42

General: Re: Word to PDF creater
awen_k, 1-Sep-08 5:02

General: Re: Word to PDF creater
jayeshthummar, 13-Oct-19 21:35

Question: how to create DtoD.rpt
sridhar chatla, 13-Apr-06 6:08

General: error
sewedy, 3-Mar-06 14:51

General: Re: error
Doan Quang Minh, 16-Jan-07 20:17

General: Demo Code
Davey P, 13-Jan-06 5:31

Question: I would like to use a part of your implementation
vishalkmehta, 15-Nov-05 15:52

Hi Stefan!

I am intrigued by the topic and would like to use (with your permission) a part of your implementation that would do a conversion of DOC and PPT to a BMP or JPEG. Please let me know if you can provide some help.

Thanks much.

Best regards,
Vishal K. Mehta

General: bmp to pdf conversion
leojose, 26-May-05 20:24

General: Re: bmp to pdf conversion
J kam, 25-Oct-05 3:13

General: Bugs
Zu Luong, 29-Apr-05 8:28

Question: 3rd party products?
Kapslok, 27-Jun-04 23:41

Answer: Re: 3rd party products?
Ryan Beesley, 6-Mar-05 21:38

General: Don't know what to say...
Oha Ooh, 5-Mar-04 2:10

General: Re: Don't know what to say...
TravisO, 6-Sep-05 5:58

General: Re: Don't know what to say...
stormhun, 25-Jun-06 23:44

General: Re: Don't know what to say...
Heeut, 4-May-14 23:54

General: pdf creation with graphics on html/doc file
prafulla, 4-Mar-04 23:28

General: Re: pdf creation with graphics on html/doc file
Tugs22429-Jul-04 21:39

General: Additional info
MREBI, 26-Feb-04 11:02

General: Re: Additional info
Links234, 8-Mar-04 2:43

General: Re: Additional info
saber378866124-Aug-04 20:58

General: Re: Additional info
Dr. Federico De Martino, 24-Oct-06 1:13

General: Re: Additional info
kant77, 2-Jul-07 19:18
