I can convert the HTML string to PDF, but the PDF file shows a few empty half rows at the end of the page.


Image reference: https://i.stack.imgur.com/s6RKw.png



I am using the code below to convert the HTML string to PDF with iTextSharp:

C#
// Requires: using System.IO; using iTextSharp.text; using iTextSharp.text.pdf; using iTextSharp.tool.xml;
//HTML Construction
string html = @"HTML STRING COMES HERE";
using (var memoryStream = new MemoryStream())
{
    // Margins are left, right, top, bottom (in points)
    Document document = new Document(PageSize.LETTER, 15, 15, 20, 30);
    // Overrides the LETTER size above with A4 landscape
    document.SetPageSize(iTextSharp.text.PageSize.A4.Rotate());

    //var document = new Document(PageSize.A4, 50, 50, 60, 60);
    var writer = PdfWriter.GetInstance(document, memoryStream);
    document.Open();

    // Parse the XHTML and the CSS from in-memory streams
    using (var cssMemoryStream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(Constants.CssText)))
    {
        using (var htmlMemoryStream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(html)))
        {
            XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, htmlMemoryStream, cssMemoryStream);
        }
    }

    // Close the document before reading the bytes so the PDF is complete
    document.Close();

    pdf = memoryStream.ToArray();
    return pdf;
}


Please help me avoid those empty half rows, or suggest any other open-source library that can convert HTML to PDF with CSS styles.

What I have tried:

I have tried the above code and tried modifying the properties, but didn't find a solution.
Updated 12-Aug-20 8:45am
v3
Comments
Garth J Lancaster 12-Aug-20 23:39pm    
It would be nice to see some details of the HTML string. I see 2 possibilities:
1) Clean up the HTML string before your existing routine, so the code stays mostly as it is now, with an extra call to clean up (i.e. remove the empty rows). That could involve parsing the HTML, deleting the empty rows, and returning a string to use as you do now.
2) OR rewrite/extend .ParseXHtml() to do this. Although not quite the same, there are some thoughts on this here: https://stackoverflow.com/questions/36180131/using-itextsharp-xmlworker-to-convert-html-to-pdf-and-write-text-vertically
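A minimal sketch of option 1, assuming the half rows really do come from empty <tr> elements and that the HTML is well-formed XHTML (which ParseXHtml expects anyway). The class and method names (HtmlRowCleaner, RemoveEmptyRows) are hypothetical, and the definition of "empty" used here (no text and no <img>) is an assumption to adjust for the real markup. Note that XDocument.Parse rejects undeclared entities such as &nbsp;, so those would need replacing first.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class HtmlRowCleaner
{
    // Removes every <tr> that has no visible text and no images.
    // Hypothetical helper: "empty" is an assumption, not iTextSharp behaviour.
    public static string RemoveEmptyRows(string xhtml)
    {
        XDocument doc = XDocument.Parse(xhtml);

        // Materialize the list first so rows are not removed while enumerating.
        var emptyRows = doc.Descendants()
            .Where(e => e.Name.LocalName == "tr" && IsEmptyRow(e))
            .ToList();

        foreach (XElement row in emptyRows)
        {
            row.Remove();
        }

        return doc.ToString(SaveOptions.DisableFormatting);
    }

    private static bool IsEmptyRow(XElement row)
    {
        // XElement.Value concatenates the text of all descendants.
        bool hasText = !string.IsNullOrWhiteSpace(row.Value);
        bool hasImage = row.Descendants().Any(e => e.Name.LocalName == "img");
        return !hasText && !hasImage;
    }
}
```

The cleaned string would then be passed on in place of the original, e.g. html = HtmlRowCleaner.RemoveEmptyRows(html); before the htmlMemoryStream is created.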
DerekT-P 13-Aug-20 4:59am    
My personal experience, though possibly not with the latest versions, is that I couldn't get iTextSharp to render non-trivial HTML correctly, especially when using stylesheets (as opposed to inline styles). I switched to Pechkin, which relies on WebKit DLLs to render the HTML, and apart from some complications with background images I found it renders PDFs very well. The only downside is that it's 32-bit (maybe there is a 64-bit version, but I've yet to find it!). Pechkin is free, too.
Member 12406065 14-Aug-20 4:11am    
Hi Derek, I have now tried Pechkin. I am able to convert the HTML, but the output is not aligned properly. If I have 5 tables in the HTML, they are not split across pages properly: a table header ends up on one page and the rest of its contents on the next page.
DerekT-P 14-Aug-20 5:11am    
Try controlling page breaks in CSS:
@page {
    size: A4;
    margin: 0;
}
@media print {
    html, body {
        width: 210mm;
        height: 297mm;
    }
    tbody::after {
        content: ' ';
        display: block;
        page-break-after: always;
        page-break-inside: avoid;
        page-break-before: avoid;
    }
}
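To try rules like these without editing the HTML by hand, the CSS can be injected into the HTML string before it is handed to the converter. The sketch below is only an illustration: PrintCssInjector, InjectCss and TableBreakCss are hypothetical names, and the two rules in TableBreakCss are not from this thread. They are a commonly suggested alternative for WebKit-based converters (repeat the header row on each page, avoid breaking inside a row), and whether they help depends on the renderer.

```csharp
using System;

public static class PrintCssInjector
{
    // Assumed rules, not from the thread: repeat <thead> on every page
    // and ask the renderer not to split a row across pages.
    public const string TableBreakCss =
        "thead { display: table-header-group; } " +
        "tr { page-break-inside: avoid; }";

    // Inserts a <style> element just before </head>; if the document has
    // no </head>, the style element is prepended instead.
    public static string InjectCss(string html, string css)
    {
        string styleTag = "<style type=\"text/css\">" + css + "</style>";
        int headEnd = html.IndexOf("</head>", StringComparison.OrdinalIgnoreCase);
        return headEnd >= 0 ? html.Insert(headEnd, styleTag) : styleTag + html;
    }
}
```

Usage would look like html = PrintCssInjector.InjectCss(html, PrintCssInjector.TableBreakCss); before converting, or the same call with the CSS from the comment above as the second argument.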

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


