I need to find and replace a placeholder string in a PDF file. The PDF file is loaded with the iText library, and I have been trying to follow some code samples I have dug up, most of them written for the original Java implementation.
The problem is that the samples don't work for my PDF file. I get a PdfDictionary with PdfObjects, but when I try to filter out the objects that contain text I get no results. I know that there is text in there, because I first took a look at the contents of the file with a PDF parser. The parser will not allow me to make changes and write them back, but at least I know that there is something in there that can be found.
Taking a closer look at the PdfDictionary object, I found only one flavor of PdfObject in it: PdfIndirectReference. The name suggests that I must resolve these references to get objects which I can examine and modify, but I can't find any sample code for that.
What I have tried:
I have to work with an improvised setup with several computers and remote desktops at the moment, so I can't just post my experimental code right now. This is what I have:
1) Open a PdfReader (works)
2) Get a PdfDocument object with the reader (works)
3) Iterate through the pages of the document and get a PdfPage object (works)
4) (For each page) get a PdfDictionary from the page object (works)
5) Get Pdf objects from the dictionary with dictionary.Get(PdfName.Contents) (works)
6) Normally I would just have to iterate over the results from step 5), but I only get PdfIndirectReference objects. How can I resolve and edit these references?
using System.IO;
using System.Text;
using iText.Kernel.Pdf;

// BinaryFile is a byte[] holding the PDF file.
using (MemoryStream stream = new MemoryStream(BinaryFile))
using (PdfReader reader = new PdfReader(stream))
using (PdfDocument document = new PdfDocument(reader))
{
    int pages = document.GetNumberOfPages();
    for (int i = 1; i <= pages; i++) // page numbers are 1-based
    {
        PdfPage page = document.GetPage(i);
        PdfDictionary dict = page.GetPdfObject();
        PdfObject xcontent = dict.Get(PdfName.Contents);
        // /Contents can be a single stream instead of an array, so check the type before iterating.
        if (xcontent is PdfArray thearray)
        {
            foreach (PdfObject obj in thearray)
            {
                // Every obj I get here is a PdfIndirectReference, so this cast yields null.
                PdfStream strm = obj as PdfStream;
                if (strm != null)
                {
                    byte[] data = strm.GetBytes();
                    string test = new UTF8Encoding().GetString(data);
                }
            }
        }
    }
}
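
One possible way to get from the references to the actual streams is sketched below. It assumes iText 7 for .NET (namespace iText.Kernel.Pdf) and has not been tested against the file in question. The class name ContentStreamSketch and the helpers ReadContentStreams and Resolve are made up for illustration. Decoding the bytes as ISO-8859-1 is also an assumption, chosen only because it maps every byte to a character without loss.

```csharp
using System.Text;
using iText.Kernel.Pdf;

static class ContentStreamSketch
{
    // Hypothetical helper: returns the decoded bytes of each content stream of a page as a string.
    public static string[] ReadContentStreams(PdfPage page)
    {
        int count = page.GetContentStreamCount();
        string[] result = new string[count];
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1"); // assumption, see the note above
        for (int i = 0; i < count; i++)
        {
            // GetContentStream hands back the PdfStream itself, not the reference to it.
            PdfStream stream = page.GetContentStream(i);
            result[i] = latin1.GetString(stream.GetBytes());
        }
        return result;
    }

    // Hypothetical helper: resolves a single object by hand if it is an indirect reference.
    public static PdfObject Resolve(PdfObject obj)
    {
        return obj is PdfIndirectReference reference ? reference.GetRefersTo() : obj;
    }
}
```

Inside the loop from the snippet above, Resolve(obj) as PdfStream should then yield a stream instead of null. Two caveats: the text in a content stream is often stored in a font-specific encoding and split across several operators, so a plain string search may still miss the placeholder. Writing a change back would additionally require opening the PdfDocument with a PdfWriter and calling SetData on the stream.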