Simple Word Document Viewer

10 Nov 2018 | CPOL | 2 min read
Simple Word Document File Viewer

Introduction

This article describes how to build a simple viewer for Microsoft Word documents in the .docx format.

It is useful when you need to display a Word document inside your own application.

In its current state the viewer is very simple and needs a lot more development. This article describes only the concept.

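The viewer depends on two major open source libraries:

  • DocX, which reads the .docx file into .NET objects
  • An RTF string builder library (called RTFlib in the rest of this article), which produces the RTF output

The viewer is written in Visual Basic .NET.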

Background

I was working on a project whose main data lives in a Word document. The only practical way to enter that data was to display the document on a form, select parts of it, and copy them for saving to the database.

I searched the internet for a Word document viewer and did not find one. All I found were libraries that read the .docx format and return the data as .NET objects. I chose DocX for this purpose.

Then I thought I could read the file and display it myself, so I searched for an RTF library and chose the String builder for RTF.

By combining these two libraries, I was able to build this viewer.

Using the Code

The viewer solution consists of two projects:

  • WordDocViewer, a Windows Forms application
  • WordFile, a class library project

The Windows Forms project is the host. It displays the RTF result on an MDI child form using a RichTextBox control.

In the class library project, DocX reads the document and RTFlib then builds the RTF result.
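
As a rough illustration of the host side, the sketch below shows how an RTF string can be displayed in a RichTextBox. The form name PreviewForm and the hard-coded RTF string are assumptions made for this example. In the viewer, the string would come from the class library instead.

```vbnet
Imports System
Imports System.Windows.Forms

' Hypothetical stand-in for the MDI child form; not the article's actual form.
Public Class PreviewForm
    Inherits Form

    Private ReadOnly Viewer As New RichTextBox With {.Dock = DockStyle.Fill, .ReadOnly = True}

    Public Sub New(Rtf As String)
        Me.Text = "RTF preview"
        Me.Controls.Add(Viewer)
        ' Assigning the Rtf property makes the control render the markup.
        Viewer.Rtf = Rtf
    End Sub
End Class

Module Program
    <STAThread>
    Sub Main()
        ' A tiny hand-written RTF document used only for this demo.
        Dim Sample As String = "{\rtf1\ansi{\b Hello} from the viewer.\par}"
        Application.EnableVisualStyles()
        Application.Run(New PreviewForm(Sample))
    End Sub
End Module
```

The only step that matters here is the assignment to the Rtf property. Everything else is scaffolding to make the sketch run on its own.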

The class library is very simple and has two classes:

  • Document represents the Word document. It loads the Word document file and parses the pages.
  • Page exists because the DocX library has no page class. I created one to keep the paragraphs of each page together.

I parse the pages by searching each paragraph for the line feed character. When I find one, I split the paragraph into two parts and treat the second part as a new paragraph that starts a new page.
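
The splitting idea can be sketched without DocX by working on plain strings. The module and function names below are hypothetical, and unlike the real Load function this sketch drops the line feed itself and works on String values rather than DocX paragraphs.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module PageSplitDemo
    ' Splits paragraph texts into pages: the first line feed found in a
    ' paragraph ends the current page, and the text after it starts the next.
    Function SplitIntoPages(Paragraphs As IEnumerable(Of String)) As List(Of List(Of String))
        Dim Pages As New List(Of List(Of String))
        Dim Current As New List(Of String)
        Pages.Add(Current)

        For Each Text As String In Paragraphs
            Dim Pos As Integer = Text.IndexOf(vbLf, StringComparison.Ordinal)
            If Pos >= 0 Then
                ' Text before the line feed closes the current page.
                Current.Add(Text.Substring(0, Pos))
                ' Text after the line feed opens a new page.
                Current = New List(Of String)
                Current.Add(Text.Substring(Pos + 1))
                Pages.Add(Current)
            Else
                Current.Add(Text)
            End If
        Next
        Return Pages
    End Function

    Sub Main()
        Dim Pages = SplitIntoPages({"Title", "End of page one" & vbLf & "Start of page two", "More text"})
        For I As Integer = 0 To Pages.Count - 1
            Console.WriteLine("Page " & (I + 1) & ": " & String.Join(" | ", Pages(I)))
        Next
    End Sub
End Module
```

With the sample input, the first page holds "Title" and "End of page one", and the second page holds "Start of page two" and "More text". Like the original approach, the sketch handles only the first line feed in each paragraph.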

This is the Load function:

VB.NET
Public Function Load(File As String) As Boolean
    ' Any exception from DocX (missing or invalid file) propagates to the caller.
    Me.Doc = DocX.Load(File)

    ' Start with a first page; page indexes are 1-based.
    Dim Page As Page = New Page With {._Index = Me.Pages.Count + 1}
    ' IndexOf returns an Integer; a Short could overflow on long paragraphs.
    Dim Pos As Integer = 0
    Dim Text As String = String.Empty

    Me.Pages.Add(Page)
    For Each Paragraph As Novacode.Paragraph In Me.Doc.Paragraphs
        If Paragraph.Text.Contains(vbLf) Then
            ' A line feed is treated as a page break inside this paragraph.
            Text = Paragraph.Text
            Pos = Text.IndexOf(vbLf)

            ' Remove the text after the line feed; what remains stays on the current page.
            Paragraph.ReplaceText(Text.Substring(Pos + 1), String.Empty)
            Page.Paragraphs.Add(Paragraph)

            ' The removed text becomes the first paragraph of a new page.
            Page = New Page With {._Index = Pages.Count + 1}
            Page.Paragraphs.Add(Paragraph.InsertParagraphAfterSelf(Text.Substring(Pos + 1)))
            Me.Pages.Add(Page)
        Else
            Page.Paragraphs.Add(Paragraph)
        End If
    Next

    Return True
End Function

To show images in the viewer, RTFlib's InsertImage function needs a Drawing.Image argument, so I created the GetImage function in the Document class.

VB.NET
Public Function GetImage(Picture As Novacode.Picture) As Drawing.Image
    ' Find the embedded image that backs this picture.
    Dim DocImage As Novacode.Image = Me.Doc.Images.Find(Function(T) T.Id = Picture.Id)
    If DocImage Is Nothing Then Return Nothing

    ' An image created with FromStream needs its stream to stay open for as long
    ' as the image is used, so copy it into a Bitmap before the stream is closed.
    Using Stream As IO.Stream = DocImage.GetStream(IO.FileMode.Open, IO.FileAccess.Read)
        Using Source As Drawing.Image = Drawing.Image.FromStream(Stream)
            Return New Drawing.Bitmap(Source)
        End Using
    End Using
End Function

Here is a captured image of the viewer:

[Screenshot: Word Viewer]

Points of Interest

The viewer is simple and easy to understand, and it should be easy to convert to C#.

History

  • 8th November, 2018: First release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
Syrian Arab Republic
Independent developer for 23 years

Main development language is VB.NET

Main development focus is web-based applications