Hi, I would like to extract text from a PDF along with its coordinates.

My code is:

C#
    public static void Main(string[] args)
    {
        PDDocument document = null;
        try
        {
            document = PDDocument.load(@"pdfpath");
            if (document.isEncrypted())
            {
                try
                {
                    document.decrypt("");
                }
                catch
                {
                }
            }
            Program printer = new Program();
            var allPages = document.getDocumentCatalog().getAllPages();
            //for (int i = 0; i < allPages.size(); i++)
            {
                PDPage page = (PDPage)allPages.get(0);
                PDStream contents = page.getContents();
                if (contents != null)
                {
                    printer.processStream(page, page.findResources(), page.getContents().getStream());
                }
                Console.ReadLine();
            }
        }
        finally
        {
            if (document != null)
            {
                document.close();
            }
        }
    }

    protected internal override void processTextPosition(TextPosition text)
    {
        string test = "String[" + text.getXDirAdj() + "," +
                text.getYDirAdj() + " fs=" + text.getFontSize() + " xscale=" +
                text.getXScale() + " height=" + text.getHeightDir() + " space=" +
                text.getWidthOfSpace() + " width=" +
                text.getWidthDirAdj() + "]" + text.getCharacter();
    }
}
Comments
Richard MacCutchan 13-Sep-14 4:47am    
Do you have a question?
Bala4 . V 13-Sep-14 5:40am    
The above code is not working
ZurdoDev 15-Sep-14 8:13am    
Why does it not work?
kbrandwijk 13-Sep-14 12:37pm    
What have you tried? Is the processTextPosition method called? What is the value of the string test while debugging? Did you put a watch on text to see its properties? Also, you have cut and pasted pieces of your code. Normally, there is no virtual method called processTextPosition in the regular Program class that contains Main, because that class is not derived from any base class.
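
To illustrate the point about the base class: the `override` only compiles, and the method is only called, if the class derives from the type that declares processTextPosition. The sketch below is one way this could look, assuming the IKVM build of PDFBox 1.8.x, where that type is `org.apache.pdfbox.util.PDFTextStripper`. The class name `TextLocationPrinter` and taking the PDF path from `args[0]` are placeholders that are not in the original post. It also writes each string to the console, since the posted code builds `test` but never outputs it. Decryption handling is left out for brevity.

```csharp
using System;
using org.apache.pdfbox.pdmodel;
using org.apache.pdfbox.pdmodel.common;
using org.apache.pdfbox.util;

// Hypothetical class name. The important part is the PDFTextStripper base class,
// which declares processTextPosition and calls it while parsing a page.
public class TextLocationPrinter : PDFTextStripper
{
    public static void Main(string[] args)
    {
        // Assumption: the PDF path is passed as the first command-line argument.
        PDDocument document = PDDocument.load(args[0]);
        try
        {
            var printer = new TextLocationPrinter();
            var allPages = document.getDocumentCatalog().getAllPages();
            for (int i = 0; i < allPages.size(); i++)
            {
                PDPage page = (PDPage)allPages.get(i);
                PDStream contents = page.getContents();
                if (contents != null)
                {
                    // The base class walks the content stream and invokes
                    // processTextPosition for each piece of text it finds.
                    printer.processStream(page, page.findResources(), contents.getStream());
                }
            }
        }
        finally
        {
            document.close();
        }
    }

    // Assumption: "protected override" is the accessibility the C# compiler
    // expects when the base class lives in another assembly. If the compiler
    // reports an access-modifier mismatch, use the modifier it asks for.
    protected override void processTextPosition(TextPosition text)
    {
        Console.WriteLine("String[" + text.getXDirAdj() + "," + text.getYDirAdj() +
            " fs=" + text.getFontSize() +
            " width=" + text.getWidthDirAdj() + "]" + text.getCharacter());
    }
}
```

If nothing is printed, setting a breakpoint inside processTextPosition should show whether the base class reaches it at all.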

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


