I have a text file of URLs like archive.org/details/ightrailwaycon00parkgoog, but I need to change each of them to a form like archive.org/download/lightrailwaycon00parkgoog/lightrailwaycon00parkgoog.pdf,
in which the word "details" is replaced by "download" and the part after the last
slash, x, is replaced by "x/x.pdf". I need to do that for all the URLs in the text file that have the general form archive.org/details/x. Please help me with that.

What I have tried:

Notepad and some other text editors, like Notepad++ and EditPad Pro.
Updated 2-Aug-18 2:41am

Use Notepad++, record a macro on the first line then play the macro to the end of the file.

Alternatively, load the text file into a database such as SQL Express or Access etc and write some SQL to make the changes then export back out to a text file.

Could do something similar in Excel too.

Edit - explicit instructions

1. Load the file into Notepad++
2. Position the cursor at the start of the 1st line
3. Click "Macro" on the menu (or use Alt-M)
4. Click "Start Recording"
5. Hit the End key (your cursor should now be at the end of line 1)
6. Use Ctrl+Shift+LeftArrow to highlight the last "word" of that line
7. Press Ctrl+C to copy that word
8. Press the End key again
9. Type /
10. Press Ctrl+V to paste the word into place
11. Type .pdf
12. Hit the DownArrow key then the Home key (cursor will now be at the start of line 2)
13. Click Macro, Stop Recording
14. Click Playback (or Ctrl+Shift+P) as many times as required, or use the "Run until end of file" option that I described in my comments below
15. When complete use Ctrl+H to get the Find & Replace dialog
16. Type /details/ in the "Find What" box
17. Type /download/ in the "Replace with" box
18. Click Replace All
19. Save the file
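
For anyone who would rather script it, the same edit the macro performs (append "/x.pdf" to each line, then swap /details/ for /download/) can be sketched in Python. This is only a minimal sketch under assumptions: the function names `to_download_url` and `convert_file` and the file names in the usage note are made up for illustration, and it assumes one URL per line. Lines that do not look like .../details/x are left unchanged.

```python
from pathlib import Path


def to_download_url(url: str) -> str:
    """Turn '.../details/x' into '.../download/x/x.pdf' (hypothetical helper)."""
    url = url.strip()
    # Split around the last '/details/' so that 'item' is the part after it.
    prefix, sep, item = url.rpartition("/details/")
    if not sep or not item or "/" in item:
        return url  # not of the expected form, so leave the line as it is
    return f"{prefix}/download/{item}/{item}.pdf"


def convert_file(src: str, dst: str) -> None:
    """Read URLs from src (one per line) and write the converted URLs to dst."""
    lines = Path(src).read_text(encoding="utf-8").splitlines()
    converted = [to_download_url(line) for line in lines]
    Path(dst).write_text("\n".join(converted) + "\n", encoding="utf-8")
```

Calling `convert_file("urls.txt", "urls_pdf.txt")` (both file names are placeholders) writes the result to a new file, so the original list stays untouched if something goes wrong.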

Alternatively in Excel
1. Paste your file into column A
2. In Column B type the last bit of the URL (the part after the last slash) for the first few rows - Excel will suggest the rest of the fill greyed out (this feature is called Flash Fill)
3. Click on the grey suggestions and Excel will fill in the whole of Column B for you
4. In cell C1 type =A1 & "/" & B1 & ".pdf"
5. Click in Cell C1 and double click the little solid square in the bottom right of the border highlighting the cell
6. Excel will fill in the whole of column C for you
7. When complete use Ctrl+H to get the Find & Replace dialog
8. Type /details/ in the "Find What" box
9. Type /download/ in the "Replace with" box
10. Click Replace All
11. Copy Column C and paste it back into your file using an editor of your choice
 
Comments
Member 13935422 2-Aug-18 7:51am    
Thanks. I'm not familiar with its tools. Anyway, I ran the editor, opened the text file, selected the first line, then clicked Start Recording, but it keeps
recording. What should I do now?
Member 13935422 2-Aug-18 7:57am    
Thanks. I'm not familiar with the editor's tools. Anyway, I ran the editor, opened the text file, selected the first line, and clicked Start Recording, but
it keeps recording. What should I do?
CHill60 2-Aug-18 8:31am    
Click on Macro again and click on "Stop Recording".
Then click on Macro again and click "Playback" or use Ctrl-Shift-P.
When you are sure it is running correctly, click Macro, Run a Macro Multiple Times then select "Run until end of file" in the dialog box and click "Run"

There are tutorials and other help resources available - click on the question mark on the menu bar
Member 13935422 2-Aug-18 9:01am    
I did that, but Playback is disabled.
I think my question was not clear enough. Let me explain it this way:
I have a text file containing URLs of the general form archive.org/details/x, like archive.org/details/sd,
archive.org/details/ghry6udg, and archive.org/details/socialsci. Now I want to turn all of them into the general form archive.org/download/x/x.pdf, so the second example above becomes
archive.org/download/ghry6udg/ghry6udg.pdf. Thanks.
CHill60 2-Aug-18 10:55am    
Playback is disabled because you didn't record the macro. I have updated my solution with step-by-step instructions.

With Notepad++ you can use regular expressions for search and replace.
See How to use regular expressions in Notepad++ (tutorial)
 
Comments
CHill60 2-Aug-18 8:31am    
Nice.
Member 13935422 2-Aug-18 9:54am    
Thanks. Could you please tell me in detail how I could do that in the case I explained above? The link you gave me is complicated.
Jochen Arndt 2-Aug-18 10:56am    
Yes, it is complex.

Perform the replacement in two steps:
The replacement of /details/ with /download/ is simple, as usual, and does not require a regex.

For the second, use a grouped match of the last slash and everything after it until the end of the line (untested: "\/([^/]+)$") and replace that with
"/" + match + "/" + match + ".pdf"
where match is referenced by \1 (the leading slash is needed because the matched text includes the slash):
/\1/\1.pdf

You have to check it because that is from scratch without testing it.
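
The two steps above can also be tried outside the editor. The following is a rough Python sketch of the same idea, not a tested Notepad++ recipe: the name `convert` and the compiled pattern `LAST_SEGMENT` are made up for illustration, and the pattern is a slightly stricter variant of the one above (it excludes whitespace from the match and tolerates trailing spaces or a carriage return at the end of a line). It assumes every line in the text is a URL of the form .../details/x; any other line that contains a slash would also get its last segment doubled.

```python
import re

# Last slash on a line plus the segment after it, up to the end of the line.
# The lookahead allows trailing spaces/tabs and an optional '\r' (CRLF files).
LAST_SEGMENT = re.compile(r"/([^/\s]+)(?=[ \t]*\r?$)", re.MULTILINE)


def convert(text: str) -> str:
    """Apply the two replacement steps to a block of text (hypothetical helper)."""
    # Step 1: plain text replacement, no regex needed.
    text = text.replace("/details/", "/download/")
    # Step 2: replace '/x' at the end of each line with '/x/x.pdf'.
    return LAST_SEGMENT.sub(r"/\1/\1.pdf", text)


if __name__ == "__main__":
    sample = "archive.org/details/sd\narchive.org/details/ghry6udg\n"
    print(convert(sample), end="")
```

Note that running the conversion a second time on already converted text would append another segment, so it should be applied once to the original list.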


archive.org/details/ightrailwaycon00parkgoog
archive.org/download/lightrailwaycon00parkgoog/lightrailwaycon00parkgoog.pdf

Indeed, a replace with RegEx seems appropriate.

Just a few interesting links to help with building and debugging RegEx.
Here is a link to RegEx documentation:
perlre - perldoc.perl.org
Here are links to tools to help build RegEx and debug them:
.NET Regex Tester - Regex Storm
Expresso Regular Expression Tool
RegExr: Learn, Build, & Test RegEx
Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript
This one shows you the RegEx as a nice graph, which is really helpful for understanding what a RegEx is doing:
Debuggex: Online visual regex tester. JavaScript, Python, and PCRE.
This site also shows the RegEx as a nice graph, but can't test what the RegEx matches:
Regexper
 
Comments
CHill60 2-Aug-18 11:05am    
:laugh: The OP can't handle recording a macro in NP++ and found the tutorial for regex in Notepad++ too complex.

I like the graph ones tho!
Patrice T 2-Aug-18 11:33am    
Thank you. I particularly like Debuggex, which also allows testing.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


