I have a requirement to download e-papers from newspaper sites. For example, we will have a newspaper link such as:

http://epaperbeta.timesofindia.com/index.aspx?eid=31808&dt=20150905 (this is just a demo link for testing)

We need to download the newspaper PDFs through code and save them in a folder. There will be several e-paper sites from which we will download newspaper PDFs.

Thanks..
Comments
What have you tried and where is the issue?
suhel_khan 6-Sep-15 2:36am    
No, Tadit, I haven't tried anything so far. I don't know how to get started.

1 solution

Hi.

It's a somewhat tricky process: you need to build specialized HTML parsing logic for each newspaper site you target.

Suppose you are targeting
http://epaperbeta.timesofindia.com/index.aspx?eid=31808&dt=20150905

Then figure out how the page obtains the URL of the PDF.

As a clue: the download link at the top of the page invokes the JavaScript code below.


JavaScript
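function downloadpdftest() {
    // Index of the slide (page) currently shown in the slider.
    var getslidevalue = parseInt(sudoSlider.getValue("currentSlide"), 10);
    var nextsudoslider = sudoSlider.getSlide(getslidevalue);
    // The src of the page image is the starting point for the PDF path.
    var nextslideid = nextsudoslider.find('img').attr('src');
    var fPath = nextslideid.toString();
    // Swap the image extension for .pdf (either case).
    fPath = fPath.replace(".JPG", ".pdf");
    fPath = fPath.replace(".jpg", ".pdf");
    // The PDF path uses "PagePrint" where the image path uses "Page".
    var currPDFName = fPath.replace("Page", "PagePrint");
    // Open the PDF in a popup window.
    window.open(currPDFName, 'PDF', 'left=150,top=10,width=750,height=700,scrollbars=yes,status=yes');
}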


Now try to generate currPDFName (the PDF URL) in C# by fetching the content of this page with the WebRequest or WebClient class.
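The string handling in downloadpdftest() could be ported to C# roughly as below. This is only a sketch: the class and method names (PdfUrlBuilder, FromImageSrc, ReplaceFirst) are made up for illustration, and it assumes the image path you find follows the same "Page" / ".jpg" pattern the script relies on. Note that JavaScript's replace() with a string pattern changes only the first match, while C#'s string.Replace changes every match, so the sketch uses a first-match helper to stay close to the script.

```csharp
using System;

public static class PdfUrlBuilder
{
    // Replaces only the first occurrence of oldValue, like JavaScript's
    // String.replace when it is given a plain string pattern.
    private static string ReplaceFirst(string text, string oldValue, string newValue)
    {
        int index = text.IndexOf(oldValue, StringComparison.Ordinal);
        if (index < 0)
        {
            return text; // nothing to replace
        }
        return text.Substring(0, index) + newValue + text.Substring(index + oldValue.Length);
    }

    // Mirrors downloadpdftest(): page image path in, PDF path out.
    public static string FromImageSrc(string imageSrc)
    {
        string path = ReplaceFirst(imageSrc, ".JPG", ".pdf");
        path = ReplaceFirst(path, ".jpg", ".pdf");
        return ReplaceFirst(path, "Page", "PagePrint");
    }
}
```

For a hypothetical image path such as "a/Page/001.jpg", PdfUrlBuilder.FromImageSrc returns "a/PagePrint/001.pdf".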


For code that fetches a page's HTML, see:
http://stackoverflow.com/questions/16642196/get-html-code-from-a-website-c-sharp


Then parse the HTML to generate the PDF URL.
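One possible way to do the parsing step is sketched below: pull the src of every img tag out of the downloaded HTML and resolve it against the page address. The names ImageSrcExtractor and Extract are hypothetical, and the regular expression is an assumption about how the markup looks. If the site fills in the slides from script, the raw HTML may not contain the page images at all, in which case you would need to look at the requests the page makes instead. A dedicated HTML parser is generally sturdier than a regular expression for real pages.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageSrcExtractor
{
    // Matches <img ... src="..."> or <img ... src='...'>.
    // This is a simple pattern, not a full HTML parser.
    private static readonly Regex ImgSrc = new Regex(
        @"<img\b[^>]*?\ssrc\s*=\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase);

    // Returns the absolute address of every image found in the HTML.
    public static List<Uri> Extract(string html, Uri pageUri)
    {
        var result = new List<Uri>();
        foreach (Match match in ImgSrc.Matches(html))
        {
            // Relative paths are resolved against the page address.
            result.Add(new Uri(pageUri, match.Groups[1].Value));
        }
        return result;
    }
}
```

You would pass in the HTML string you fetched with WebClient or WebRequest, then filter the returned addresses down to the page images before converting them to PDF URLs.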

Once you have the PDF URL, use the code below to download the PDF file.

C#
using System.Net;

// Download the PDF at the given URL and save it to a local file.
using (WebClient client = new WebClient())
{
    client.DownloadFile("http://www.irs.gov/pub/irs-pdf/fw4.pdf", @"C:\Temp.pdf");
}


http://stackoverflow.com/questions/2913830/download-pdf-programatically
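Since the question asks to save the PDFs into a folder, here is a small sketch that derives the file name from the PDF URL instead of hard-coding one path. PdfSaver, TargetPath and Download are hypothetical names, and the folder is whatever you choose. WebClient is used to match the code above; on newer .NET versions it is marked obsolete and the compiler may warn about it.

```csharp
using System;
using System.IO;
using System.Net;

public static class PdfSaver
{
    // Builds the local path: the chosen folder plus the file name taken from the URL.
    public static string TargetPath(string folder, string pdfUrl)
    {
        string fileName = Path.GetFileName(new Uri(pdfUrl).LocalPath);
        return Path.Combine(folder, fileName);
    }

    // Downloads the PDF into the folder and returns the path it was saved to.
    public static string Download(string pdfUrl, string folder)
    {
        Directory.CreateDirectory(folder); // does nothing if the folder already exists
        string target = TargetPath(folder, pdfUrl);
        using (var client = new WebClient())
        {
            client.DownloadFile(pdfUrl, target);
        }
        return target;
    }
}
```

Calling PdfSaver.Download once per generated PDF URL would then collect the pages of an edition in one folder.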
Comments
suhel_khan 6-Sep-15 2:58am    
Thanks buddy :)

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)