This is an example of how to get all the text from a PDF file:
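using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public string ReadPdfFile(string filename)
{
    // Open the document once and reuse the same reader for every page.
    PdfReader pdfReader = new PdfReader(filename);
    StringBuilder fullText = new StringBuilder();
    try
    {
        // iTextSharp page numbers start at 1.
        for (int nPage = 1; nPage <= pdfReader.NumberOfPages; nPage++)
        {
            // Use a fresh strategy per page so text does not carry over between pages.
            ITextExtractionStrategy its = new SimpleTextExtractionStrategy();
            string s = PdfTextExtractor.GetTextFromPage(pdfReader, nPage, its);
            // Normalize via the system default code page; characters it cannot represent are replaced.
            s = Encoding.UTF8.GetString(Encoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(s)));
            fullText.Append(s);
        }
    }
    finally
    {
        pdfReader.Close();
    }
    return fullText.ToString();
}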
To achieve what you want to do, you have to create your own extraction strategy. I once wrote mine to get the text by its position, based on LocationTextExtractionStrategy (it is in the iTextSharp source code).
You should create your own TextExtractionStrategy with your conditions.
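As a rough idea of what such a strategy could look like, here is a minimal sketch that derives from LocationTextExtractionStrategy and keeps only the text whose baseline starts inside a vertical band of the page. The class name BandTextExtractionStrategy, the helper ReadBand and the minY/maxY parameters are made up for illustration, and the condition itself is only an assumption about what you might need; it also assumes iTextSharp 5.x and the default PDF coordinates (points, with y growing upward from the bottom of the page). Replace the test in RenderText with your own conditions.

```csharp
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical strategy: forwards a text chunk to the base class only when
// its baseline starts between minY and maxY.
public class BandTextExtractionStrategy : LocationTextExtractionStrategy
{
    private readonly float minY;
    private readonly float maxY;

    public BandTextExtractionStrategy(float minY, float maxY)
    {
        this.minY = minY;
        this.maxY = maxY;
    }

    public override void RenderText(TextRenderInfo renderInfo)
    {
        // Vertical position of the start of the chunk's baseline.
        float y = renderInfo.GetBaseline().GetStartPoint()[Vector.I2];

        // Chunks outside the band are ignored, so they never reach the result.
        if (y >= minY && y <= maxY)
        {
            base.RenderText(renderInfo);
        }
    }
}

public static class BandExtractionExample
{
    // Hypothetical helper: returns the text of one page restricted to the band.
    public static string ReadBand(string filename, int pageNumber, float minY, float maxY)
    {
        PdfReader reader = new PdfReader(filename);
        try
        {
            ITextExtractionStrategy strategy = new BandTextExtractionStrategy(minY, maxY);
            return PdfTextExtractor.GetTextFromPage(reader, pageNumber, strategy);
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Because the filtering happens before the base class sees the chunk, LocationTextExtractionStrategy still takes care of ordering the remaining text by position when GetTextFromPage asks for the result.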
I hope this was useful.
Mauro.