I have tried PDFsharp and iTextSharp. I prefer PDFsharp; however, no one seems to have finished the sample code for extracting an image from a PDF file and saving it in JPEG format, including, most importantly for me, the height and width properties of the extracted image.

I have tried the sample code shown in the documentation, but it doesn't work for me. I am sure someone has completed the incomplete code shown in the documentation. Can you point me to that code?

I have successfully written code that creates PDF files with PDFsharp, using VB.NET in Visual Studio 2012, but I extracted the image using a form, which extracts the whole page. I want just the image.
Posted
Updated 16-Feb-22 4:38am
Comments
Patrice T 1-Dec-15 17:23pm    
Contact the author or ask on the users' forum.

Check this documentation and sample source code:
http://www.pdfsharp.net/wiki/ExportImages-sample.ashx?AspxAutoDetectCookieSupport=1

The sample source code is in C#, which you can convert to VB.NET using any online code converter, such as
http://converter.telerik.com

Hope it helps :)
 
 
Comments
Sergey Alexandrovich Kryukov 1-Dec-15 17:21pm    
Sure, a 5.
—SA
Sergey Alexandrovich Kryukov 1-Dec-15 17:26pm    
By the way, I just wrote a useful addition to your advice on translation; please see Solution 2.
—SA
Suvendu Shekhar Giri 1-Dec-15 22:03pm    
It's really useful. Thanks for sharing
Member 10628309 2-Dec-15 2:35am    
In my request I stated that the code in your reference doesn't work for me. I get no error messages, but it doesn't seem to export anything. Here is my VB.NET code. Can you spot my problem? Also, how do I get the JPEG file?

Imports PdfSharp.Pdf
Imports PdfSharp.Pdf.IO
Imports System.IO
Imports PdfSharp.Pdf.Advanced

Public Class Form1

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Const filename As String = "D:\ISRace\FinishData\MM PHRF.pdf"
        Dim document As PdfDocument = PdfReader.Open(filename)
        Dim imageCount As Integer = 0
        ' Iterate pages
        For Each page As PdfPage In document.Pages
            'Get resources dictionary
            Dim resources As PdfDictionary = page.Elements.GetDictionary("/Resources")
            If resources IsNot Nothing Then
                'Get external objects dictionary
                Dim xObjects As PdfDictionary = resources.Elements.GetDictionary("/XObject")
                If xObjects IsNot Nothing Then
                    Dim items As ICollection(Of PdfItem) = xObjects.Elements.Values
                    ' Iterate references to external objects
                    For Each item As PdfItem In items
                        Dim reference As PdfReference = TryCast(item, PdfReference)
                        If reference IsNot Nothing Then
                            Dim xObject As PdfDictionary = TryCast(reference.Value, PdfDictionary)
                            ' Is external object an image?
                            If (xObject IsNot Nothing AndAlso xObject.Elements.GetString("/Subtype") = "/Image") Then
                                ExportImage(xObject, imageCount)
                            End If
                        End If
                    Next
                End If
            End If
        Next
        System.Windows.Forms.MessageBox.Show(imageCount & " images exported.", "Export Images")
    End Sub

    Private Shared Sub ExportImage(image As PdfDictionary, ByRef count As Integer)
        Dim filter As String = image.Elements.GetName("/Filter")
        Select Case filter
            Case "/DCTDecode"
                ExportJpegImage(image, count)
                'Exit Select
                'Case "/FlateDecode"
                ' ExportAsPngImage(image, count)
                ' Exit Select
        End Select
    End Sub

    Private Shared Sub ExportJpegImage(image As PdfDictionary, ByRef count As Integer)
        Dim stream As Byte() = image.Stream.Value
        Dim fs As New FileStream([String].Format("Image{0}.jpeg", System.Math.Max(System.Threading.Interlocked.Increment(count), count - 1)), FileMode.Create, FileAccess.Write)
        Dim bw As New BinaryWriter(fs)
        bw.Write(stream)
        bw.Close()
    End Sub

End Class
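
Two things stand out in the VB.NET code above. It writes to a relative file name ("Image{0}.jpeg"), so any exported files land in the process's current working directory rather than next to the PDF. Also, ExportImage silently skips every filter other than /DCTDecode, so the message box can report 0 images even when the PDF contains some. Here is a minimal C# sketch of a helper that writes the bytes to an explicit folder and returns the full path, so you can see exactly where each file went. It assumes a hypothetical helper named ImageFileWriter.Save, which is plain .NET and not part of PDFsharp.

```csharp
using System;
using System.IO;

public static class ImageFileWriter
{
    // Hypothetical helper (not part of PDFsharp): writes raw image bytes to
    // <directory>/Image<index>.jpeg and returns the full path that was used.
    public static string Save(byte[] data, string directory, int index)
    {
        if (data == null) throw new ArgumentNullException("data");

        // Create the target folder if needed, so the location is never ambiguous.
        Directory.CreateDirectory(directory);
        string path = Path.GetFullPath(
            Path.Combine(directory, string.Format("Image{0}.jpeg", index)));

        // The using block flushes and closes the file even if Write throws.
        using (FileStream fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            fs.Write(data, 0, data.Length);
        }
        return path;
    }
}
```

In ExportJpegImage you could pass image.Stream.Value and a folder of your choice to such a helper, then show the returned path in the message box instead of only the count.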
Sorry, this is unrelated to your original question, but it might be a helpful addition to Solution 1.

There is a better way to translate any project (a whole project, really) between C# and VB.NET: the open-source tool ILSpy. You compile the assembly, open the resulting module in ILSpy, select a different language, and decompile it. The quality of the generated code exceeds expectations. More importantly, you get the code of the complete project, which should compile and work exactly like the original.

Please see my past answer for more detail: Code Interpretation, C# to VB.NET.

—SA
 
Comments
Suvendu Shekhar Giri 1-Dec-15 21:58pm    
Very useful. Thanks for sharing.
5ed !
Sergey Alexandrovich Kryukov 1-Dec-15 22:47pm    
Thank you, Suvendu Shekhar.
—SA
My code for getting all the images from a PDF:

PdfDocument document = PdfReader.Open(textBox_fichero.Text);
 PdfSharp.Pdf.PdfObject[] Todosobjetos = document.Internals.GetAllObjects();
 for(int I=0;I<Todosobjetos.Length;I++)
           {
               if(Todosobjetos[I].GetType().Name == typeof(PdfDictionary).Name)
               {
                   PdfDictionary Referencia = Todosobjetos[I] as PdfDictionary;
                   string Tipo = Referencia.Elements.GetString("/Type");
                   if(Tipo== "/XObject")
                   {
                       if(Referencia.Elements.GetString("/Subtype")=="/Image")
                               {
                                  string extension = "";
                                  ExportImage(Referencia, Nombre, ref extension);
                                  Contador_Imagenes++;
                               }
                   }
               }
           }


Then export each image. This works for JPEG (/DCTDecode) images and for /FlateDecode images, which are saved as .jpg and .png respectively:

public void ExportImage(PdfDictionary image, string nombre_fichero, ref string extension)
        {
            string filter;
           
            try
            {
                filter = image.Elements.GetName("/Filter");
            }
            catch
            {
                filter = "/DCTDecode";
            }

            try
            {
                switch (filter)
                {
                    case "/DCTDecode":
                        {
                            extension=".jpg";
                            ExportJpegImage(image, nombre_fichero);
                        }
                        break;
                    case "/FlateDecode":
                        {
                            extension = ".png";
                            ExportAsPngImage(image, nombre_fichero);
                        }
                        break;
                }
            }
            catch
            {
                Errores ++ ;           
            }

        }


private void ExportJpegImage(PdfDictionary ImagePdfDictionary, string nombre_fichero)
       {
           // Fortunately JPEG has native support in PDF and exporting an image is just writing the stream to a file.
           byte[] stream = ImagePdfDictionary.Stream.Value;

           if (CheckBox_InvertirColores.Checked)
           {
               using (MagickImage magickImage = new MagickImage(stream))
               {
                   magickImage.Negate();
                   magickImage.Write(nombre_fichero);
               }
           }
           else
           {
               // Write the JPEG bytes unchanged; the using block closes the file.
               using (FileStream fs = new FileStream(nombre_fichero, FileMode.Create, FileAccess.Write))
               {
                   fs.Write(stream, 0, stream.Length);
               }
           }
       }
       private void ExportAsPngImage(PdfDictionary image, string nombre_fichero)
       {
           int width = image.Elements.GetInteger(PdfImage.Keys.Width);
           int height = image.Elements.GetInteger(PdfImage.Keys.Height);
           var canUnfilter = image.Stream.TryUnfilter();
           byte[] decodedBytes;

           if (canUnfilter)
           {
               decodedBytes = image.Stream.Value;
           }
           else
           {
               PdfSharp.Pdf.Filters.FlateDecode flate = new PdfSharp.Pdf.Filters.FlateDecode();
               decodedBytes = flate.Decode(image.Stream.Value);
           }

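           // Derive the bits per pixel from the decoded size (despite its name, this
           // variable counts bits per pixel, not per colour component). The search
           // stops after 64 so it cannot loop forever when no value matches.
           int bitsPerComponent = 0;
           while (bitsPerComponent <= 64 && decodedBytes.Length - ((width * height) * bitsPerComponent / 8) != 0)
           {
               bitsPerComponent++;
           }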

           System.Drawing.Imaging.PixelFormat pixelFormat;
           switch (bitsPerComponent)
           {
               case 1:
                   pixelFormat = System.Drawing.Imaging.PixelFormat.Format1bppIndexed;
                   break;
               case 8:
                   pixelFormat = System.Drawing.Imaging.PixelFormat.Format8bppIndexed;
                   break;
               case 16:
                   pixelFormat = System.Drawing.Imaging.PixelFormat.Format16bppArgb1555;
                   break;
               case 24:
                   pixelFormat = System.Drawing.Imaging.PixelFormat.Format24bppRgb;
                   break;
               case 32:
                   pixelFormat = System.Drawing.Imaging.PixelFormat.Format32bppArgb;
                   break;
               case 64:
                   pixelFormat = System.Drawing.Imaging.PixelFormat.Format64bppArgb;
                   break;
               default:
                   throw new Exception("Unknown pixel format " + bitsPerComponent);
           }

           Array.Reverse(decodedBytes);

           Bitmap bmp = new Bitmap(width, height, pixelFormat);
           BitmapData bmpData = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), ImageLockMode.WriteOnly, bmp.PixelFormat);

           System.Runtime.InteropServices.Marshal.Copy(decodedBytes, 0, bmpData.Scan0, decodedBytes.Length);

           bmp.UnlockBits(bmpData);
           bmp.RotateFlip(RotateFlipType.Rotate180FlipNone);
           bmp.Save(nombre_fichero, System.Drawing.Imaging.ImageFormat.Png);
       }
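
On the height and width asked about in the question: ExportAsPngImage above reads them from the image dictionary with image.Elements.GetInteger(PdfImage.Keys.Width) and image.Elements.GetInteger(PdfImage.Keys.Height), and the same two calls should work for a /DCTDecode image before it is written out. If only the exported bytes are at hand, the size can also be read from the JPEG header itself. A minimal sketch of that, assuming a hypothetical helper named JpegSize.TryRead (plain .NET, not part of PDFsharp) that scans the marker segments for the start-of-frame header:

```csharp
using System;

public static class JpegSize
{
    // Hypothetical helper: walks the JPEG marker segments until it finds a
    // start-of-frame (SOFn) segment and reads the height and width stored there.
    // Returns false if the data is not a JPEG or no frame header is found.
    public static bool TryRead(byte[] data, out int width, out int height)
    {
        width = 0;
        height = 0;

        // Every JPEG stream starts with the SOI marker FF D8.
        if (data == null || data.Length < 4 || data[0] != 0xFF || data[1] != 0xD8)
            return false;

        int pos = 2;
        while (pos + 3 < data.Length)
        {
            if (data[pos] != 0xFF) { pos++; continue; }      // look for the next marker
            byte marker = data[pos + 1];
            if (marker == 0xFF) { pos++; continue; }         // fill byte before a marker

            // Markers that carry no length field: TEM, RST0-RST7 and SOI.
            if (marker == 0x01 || (marker >= 0xD0 && marker <= 0xD8)) { pos += 2; continue; }

            // End of image or start of scan data: no frame header was found.
            if (marker == 0xD9 || marker == 0xDA) return false;

            // The two length bytes are big-endian and include themselves.
            int length = (data[pos + 2] << 8) | data[pos + 3];

            // SOF0..SOF15, except DHT (C4), JPG (C8) and DAC (CC).
            bool isSof = marker >= 0xC0 && marker <= 0xCF
                         && marker != 0xC4 && marker != 0xC8 && marker != 0xCC;
            if (isSof)
            {
                if (pos + 8 >= data.Length) return false;
                // Layout: FF Cn, length (2), precision (1), height (2), width (2).
                height = (data[pos + 5] << 8) | data[pos + 6];
                width = (data[pos + 7] << 8) | data[pos + 8];
                return true;
            }
            pos += 2 + length;                               // skip to the next segment
        }
        return false;
    }
}
```

In ExportJpegImage this could be called with the stream byte array right before the file is written. The values in the PDF dictionary and in the JPEG header normally agree, so either source can be used.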
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


