OCR scan part of a PDF document and write the value to a TextBox in C#

Any help or recommendations? I tried this, but it reads the text of all pages.


What I have tried:

// Requires: using System; using System.Text; using System.Windows.Forms;
//           using iTextSharp.text.pdf.parser; (for PdfTextExtractor)
private void Button1_Click(object sender, EventArgs e)
{
    using (OpenFileDialog ofd = new OpenFileDialog() { Filter = "PDF Files|*.pdf", ValidateNames = true })
    {
        if (ofd.ShowDialog() == DialogResult.OK)
        {
            try
            {
                iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(ofd.FileName);
                StringBuilder sb = new StringBuilder();
                // Page numbers are 1-based; this loop extracts the text of every page.
                for (int i = 1; i <= reader.NumberOfPages; i++)
                {
                    sb.Append(PdfTextExtractor.GetTextFromPage(reader, i));
                }
                textBox1.Text = sb.ToString();
                reader.Close();
            }
            catch (Exception ex)
            {
                MessageBox.Show(ex.Message, "Message", MessageBoxButtons.OK, MessageBoxIcon.Error);
            }
        }
    }
}
Comments
lmoelleb 2-Jul-19 6:47am
   
Start by being specific in your question. What do you want instead of all pages? A single page? A section of a page - if so, how is that section specified? If it is a single page and you do not know how to adjust this example, then I recommend you follow basic C# tutorials so you understand each line in this simple example - if you understand what each line is doing it is trivial to change it to a single page.
Goran Bibic 2-Jul-19 11:03am
   
Just first page
lmoelleb 2-Jul-19 11:45am
   
If you can't see where the code loops over the pages and processes them one by one, then you need to go back to the C# tutorials and run it through the debugger line by line. Copying code samples from various sources is fine, but always examine the code and learn how it works.
Goran Bibic 3-Jul-19 7:05am
   
The code works, but for the complete page. I need help with, for example, a rectangle at a specific location on a page, to scan the values there and put them into a textbox. Understand?
lmoelleb 3-Jul-19 7:09am
   
I recommend updating the question to be specific, don't just write comments here - why should people read that? Change the question to ask how to get text inside a rectangle in a PDF, and remove all the unimportant things like updating textbox as you clearly already know how to do that (your original code did it). The more specific you are, the more likely someone with the required knowledge can reply (or even more likely: you can google it).
Goran Bibic 4-Jul-19 3:23am
   
NEED TO SCAN FROM FIRST PAGE TO TEXTBOX SPECIFIED LOCATION 420X120 SIZE RECTANGLE 80x80....What is not clear?
Richard MacCutchan 2-Jul-19 7:41am
   
That is not OCR scan, but reading the text of the PDF.
Goran Bibic 2-Jul-19 11:03am
   
Reading the text of the PDF... sorry.
Richard MacCutchan 2-Jul-19 11:37am
   
So what exactly is the question? If you know how to extract the text from all pages, it must be a simple matter to extract it from a single page.
Goran Bibic 4-Jul-19 3:23am
   
NEED TO SCAN FROM FIRST PAGE TO TEXTBOX SPECIFIED LOCATION 420X120 SIZE RECTANGLE 80x80....What is not clear?
Richard MacCutchan 4-Jul-19 3:47am
   
Why do you keep talking about scanning when you are reading the text with iTextSharp?
Goran Bibic 6-Jul-19 8:42am
   
Ok. NEED TO read FROM the FIRST PAGE of the pdf document TO a TEXTBOX, from the specified LOCATION 420X120, SIZE of the reading RECTANGLE 80x80.... Is the question correct now?
Richard MacCutchan 6-Jul-19 9:46am
   
How many more times do you think you need to be told what the answer is? If you need only the first page then just read the first page, do not loop over all pages. And there is no way from the text content that you can guess what size rectangle the text will take up.
Goran Bibic 6-Jul-19 11:31am
   
https://www.youtube.com/watch?v=Q_JxpGzhNqQ ... need this BUT for pdf

Why do you write if you don't know the answer to my question?
Richard MacCutchan 6-Jul-19 13:38pm
   
Mainly because your question makes no sense. If you want to extract using OCR then you must start with an image of the text. That is a completely different issue from reading the text of a PDF file. So follow the Youtube video to see how it is done.
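
The advice in this thread to read only the first page instead of looping can be sketched as follows. This is a minimal sketch, assuming iTextSharp 5.x is referenced as in the question; the class and method names (PdfFirstPage, ReadFirstPage) are hypothetical and not part of the library.

```csharp
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper, not part of iTextSharp.
public static class PdfFirstPage
{
    // Returns the text of page 1 only. iTextSharp page numbers start at 1, not 0.
    public static string ReadFirstPage(string path)
    {
        PdfReader reader = new PdfReader(path);
        try
        {
            return PdfTextExtractor.GetTextFromPage(reader, 1);
        }
        finally
        {
            // Release the file even if extraction throws.
            reader.Close();
        }
    }
}
```

In the button handler from the question, the loop and the StringBuilder would then be replaced by something like textBox1.Text = PdfFirstPage.ReadFirstPage(ofd.FileName);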
lmoelleb 4-Jul-19 3:49am
   
Writing in all capitals is not a nice thing to do, stop it. I do not know how to do what you are requesting, so I am trying to give you advice on how you can improve the chance of someone else being able to help you. You are on the questions and answers page, but you appear to think it is a forum. It is not. Make it EASY for people to help you by ensuring the actual question (not comments, the question at the very top) is not filled with irrelevant stuff and clearly identifies what you have tried and where you are stuck, with as little code as possible (for example, you do not want to loop over pages, so why is it in your example code?).
Goran Bibic 6-Jul-19 8:42am
   
Ok. NEED TO read FROM the FIRST PAGE of the pdf document TO a TEXTBOX, from the specified LOCATION 420X120, SIZE of the reading RECTANGLE 80x80.... Is the question correct now?
lmoelleb 6-Jul-19 11:09am
   
No. It appears you believe you are using a forum. You are not. You are using the Q&A section. You should edit your original question to make it clear, not keep repeating yourself in the comments. And stop writing words in capitals, makes it hard to read. I am just trying to help you understand how to increase the chance of getting answers to your questions. If you do not want to follow the advice fine, I guess it is not important to you to get an answer.
Goran Bibic 6-Jul-19 11:30am
   
https://www.youtube.com/watch?v=Q_JxpGzhNqQ ... need this BUT for pdf
Sinisa Hajnal 3-Jul-19 2:25am
   
Goran, this is not a good question: you have working code, you have no error, and you don't understand it.
As the commenters above said, you have everything you need in the code above. Especially since you need only the first page (the easiest). Remove the loop, take only the first page (page number 1, since iTextSharp page numbers are 1-based) and you're done.
Goran Bibic 3-Jul-19 7:04am
   
I understand everything, my friend. This is code for the whole page. I need code for a rectangle that will scan exactly the marked part, only on the first page. The other pages don't matter to me.
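
For the request above (text inside a rectangle on the first page only), iTextSharp 5.x ships a region filter that can be combined with a text extraction strategy. The following is a sketch under stated assumptions, not a confirmed answer from the thread: it assumes the 420x120 location is the lower-left corner of the rectangle in PDF points (origin at the bottom-left of the page), which the question never specifies, and the class and method names (PdfRegionText, ReadRegion) are hypothetical.

```csharp
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper, not part of iTextSharp.
public static class PdfRegionText
{
    // Extracts the text found inside a rectangle on one page.
    // x, y: lower-left corner in PDF points (assumption: origin at bottom-left of the page).
    public static string ReadRegion(string path, int pageNumber,
                                    float x, float y, float width, float height)
    {
        // iTextSharp's Rectangle takes the lower-left and upper-right corners.
        iTextSharp.text.Rectangle region =
            new iTextSharp.text.Rectangle(x, y, x + width, y + height);

        // Only text chunks accepted by the region filter reach the extraction strategy.
        RenderFilter[] filters = { new RegionTextRenderFilter(region) };
        ITextExtractionStrategy strategy =
            new FilteredTextRenderListener(new LocationTextExtractionStrategy(), filters);

        PdfReader reader = new PdfReader(path);
        try
        {
            return PdfTextExtractor.GetTextFromPage(reader, pageNumber, strategy);
        }
        finally
        {
            reader.Close();
        }
    }
}
```

With the numbers from the comments, the call in the button handler might look like textBox1.Text = PdfRegionText.ReadRegion(ofd.FileName, 1, 420f, 120f, 80f, 80f); The filter works on whole text chunks, so a chunk that only partly overlaps the rectangle may be returned in full. If the coordinates were measured from the top-left corner (as in an image viewer), the y value would first have to be converted using the page height. As noted in the comments, this reads text already stored in the PDF; a scanned PDF that contains only images would need real OCR instead.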

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)