Unable to Decode HTML URL ASCII

Posted: Fri Mar 05, 2021 4:57 pm
by pinman
Sorry to have to ask via the forum but I'm really struggling trying to change the filenames (which contain HTML ASCII codes) of files that have been scraped from an archived guestbook on an old website.

There are thousands of files which reside in various sub-folders, but the filenames always follow this format:

comment.php%3fgb_id%3d1030

What I would like to do is remove all of the text in the filename that matches .php%3fgb_id%3d, replace it with a simple dash (-), and append the .html suffix to the end (after the numbers).

Is anyone able to point me in the right direction please?
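
For context, the %3f and %3d in these names are URL percent-escapes for "?" and "=". A minimal Python sketch of decoding them, using the standard library's urllib.parse.unquote (the helper name decode_name is made up for illustration):

```python
from urllib.parse import unquote


def decode_name(name: str) -> str:
    """Decode URL percent-escapes such as %3f and %3d in a filename."""
    return unquote(name)


print(decode_name("comment.php%3fgb_id%3d1030"))  # comment.php?gb_id=1030
```

The decoded form contains "?", which Windows does not allow in filenames, so replacing the whole escaped run with a dash (as requested above) is the more practical route.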

Re: Unable to Decode HTML URL ASCII

Posted: Fri Mar 05, 2021 7:53 pm
by BogStandard
This works ONLY if ".php%3fgb_id%3d" is always in the filename.

RegEx(1)
Match: (.*)\.php%3fgb_id%3d(.*)
Replace: \1-\2

Filters(12)
Mask: *.html
Folders: untick
Files: tick
Subfolders: untick (tick only when you have successfully tested)

Before
comment.php%3fgb_id%3d1030.html
comment.php%3fgb_id%3d1031.html
comment.php%3fgb_id%3d1032.html
comment.php%3fgb_id%3d1033.html
comment.php%3fgb_id%3d12345.html
comment.php%3fgb_id%3d12345a.html

After
comment-1030.html
comment-1031.html
comment-1032.html
comment-1033.html
comment-12345.html
comment-12345a.html
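
The Before/After lists above can be reproduced outside the renaming tool to check the pattern. This is only a sketch in Python, assuming the regex is applied to the name without its extension and the extension is then put back unchanged; the helper name rename_stem is made up for illustration:

```python
import os
import re

# The Match pattern from above; the dot before "php" is escaped here
# so that it only matches a literal dot.
PATTERN = re.compile(r"(.*)\.php%3fgb_id%3d(.*)")


def rename_stem(filename: str) -> str:
    """Apply the Match/Replace pair to the stem and keep the extension."""
    stem, ext = os.path.splitext(filename)
    return PATTERN.sub(r"\1-\2", stem) + ext


for old in ["comment.php%3fgb_id%3d1030.html",
            "comment.php%3fgb_id%3d12345a.html"]:
    print(old, "->", rename_stem(old))
```

Names that do not contain ".php%3fgb_id%3d" come back unchanged.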

For the example you have given this works, but do test it on a copy of a folder first.

Regards..

Re: Unable to Decode HTML URL ASCII

Posted: Sat Mar 06, 2021 6:48 am
by Luuk
Greetings everyone! This is a good example of why it's always better to first state the names as NowNames.ext ===> NewNames.ext
BogStandard's solution does work perfectly if the files are already named NowNames.html, which he had to assume since it's the most likely case.

Here is another solution based on BogStandard's, but this one assumes the "NowName.ext" you posted is complete, with that crazy long extension.
So it assumes that you want "Comment.php%3fgb_id%3d1030" ===> "Comment-1030.html".

In the menu, first put a checkmark in "Renaming Options, File/Folder Extensions, Rename File Extensions".
Then use a modified "Match" and "Replace" like this:
(.*)\.php%3fgb_id%3d(\d+)$
\1-\2.html
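
As a rough cross-check of this Match/Replace pair, here is a Python sketch that applies the same pattern to the full name (extension included) and only builds a dry-run list of renames for a folder tree. The names new_name and plan_renames are made up for illustration, and nothing is renamed on disk:

```python
import re
from pathlib import Path
from typing import List, Optional, Tuple

# Same pattern as the Match line above, applied to the whole filename.
PATTERN = re.compile(r"(.*)\.php%3fgb_id%3d(\d+)$")


def new_name(name: str) -> Optional[str]:
    """Return the new filename, or None if the name does not match."""
    match = PATTERN.match(name)
    if match is None:
        return None
    return f"{match.group(1)}-{match.group(2)}.html"


def plan_renames(root: Path) -> List[Tuple[Path, Path]]:
    """Walk root and its sub-folders and list (old, new) path pairs."""
    plan = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            target = new_name(path.name)
            if target is not None:
                plan.append((path, path.with_name(target)))
    return plan


print(new_name("Comment.php%3fgb_id%3d1030"))  # Comment-1030.html
```

Because of the (\d+)$ part, a name such as "comment.php%3fgb_id%3d12345a" is left alone. To actually rename, each pair from plan_renames could be passed to old.rename(new), ideally on a copy of the folder first.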

If these solutions don't help, please state what you need in the format "NowNames.ext" ===> "NewNames.ext"

Re: Unable to Decode HTML URL ASCII

Posted: Mon Mar 08, 2021 2:34 pm
by BogStandard
Ah, yes. I read something into pinman's post that he hadn't written, and I also didn't recognise the extension, which is a type I had not seen before.

I agree that clear Before and After examples make it easier to understand what is being requested.

Regards..