Removing specific parts of a filename


Postby martin0121 » Mon Sep 28, 2020 7:57 pm

I was wondering if anyone can help me streamline this a bit more. I'm still just learning but would like to be able to know how to make this work more effectively... Any help would be greatly appreciated.

I have a set of files like this....
2c2_657_John_Smith_95839401e97k7_1426.jpg
Each section may be a different number of characters.

I would like to choose which of the underscore-separated parts of the filename to keep (often several at once) and which to remove.
E.g. the end result could be:
2c2_657_1426.jpg
Or
2c2_657_John_Smith.jpg
Or any other variation.

I have figured out and used this to get me a working result, but through trial and error:
(.*)(.*)(.*)(_.*)(_.*)
\1\3\5

Could someone please tell me if this is the right way? If there is a better method, I'd love to get it working so that it can be interchangeable and so that I can understand it. Right now I feel like I'm taking stabs in the dark.

Thank you.
martin0121
 
Posts: 3
Joined: Mon Sep 28, 2020 7:43 pm

Re: Removing specific parts of a filename

Postby Luuk » Tue Sep 29, 2020 11:04 am

Greetings. Sorry, I'm not an expert with Regex(1), but when all the files have five "_", it's good for the match to contain all five "_" as well, and never to use "(.*)(.*)(.*)", because the first "(.*)" is greedy and leaves nothing for the next "(.*)" to capture. If a "_" should not be kept, move it outside the parentheses, so instead of "(_.*)" it can be "_(.*)".
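
To make the greediness point concrete, here is a minimal sketch that prints what each group of the pattern from the first post captures for the sample name. It uses Python's re module purely for illustration (an assumption: the renaming tool's own regex engine may differ in details), and the extension is left off the name for clarity.

```python
import re

# Sample name from the first post, without the ".jpg" extension.
name = "2c2_657_John_Smith_95839401e97k7_1426"

# The original pattern: three bare (.*) groups, then two groups that
# each begin with an underscore.
m = re.fullmatch(r"(.*)(.*)(.*)(_.*)(_.*)", name)

for number, text in enumerate(m.groups(), start=1):
    print(number, repr(text))
# 1 '2c2_657_John_Smith'
# 2 ''
# 3 ''
# 4 '_95839401e97k7'
# 5 '_1426'

# The replacement \1\3\5 therefore only drops the fifth field.
result = m.expand(r"\1\3\5")
print(result)  # 2c2_657_John_Smith_1426
```

The first group grabs as much as it can while still leaving two underscores for groups 4 and 5, so groups 2 and 3 always end up empty and the first four fields can never be separated from one another.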
Luuk
 
Posts: 692
Joined: Fri Feb 21, 2020 10:58 pm

Re: Removing specific parts of a filename

Postby therube » Tue Sep 29, 2020 6:53 pm

(I cannot check, but...)

If you're using the new RegEx method, and all of your names have 6 fields, I'm thinking you could use something like this:

Match:  %1_%2_%3_%4_%5_%6
Replace: 

In the Replace, simply include the parts that you want to retain.

So if all you wanted were the first & last name, use %3 %4 in the replace.
Or, %4, %3, if you wanted the name reversed.
Or, First name: %3, Last name: %4, if you also wanted some additional text in there.

See if something like that works?
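
For anyone who wants to try the same idea outside the renaming tool, the field-picking approach can be sketched in Python. This is not how the tool implements %1 to %6; it is only an illustration using str.split, and the names pick_fields, keep, and sep are hypothetical.

```python
def pick_fields(filename, keep, sep="_"):
    """Return filename with only the 1-based fields in `keep`, extension preserved."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:
        # No extension present: treat the whole name as the stem.
        stem, ext = filename, ""
    fields = stem.split(sep)
    # Field numbers are 1-based, like %1, %2, ... in the post above.
    chosen = [fields[i - 1] for i in keep]
    return sep.join(chosen) + dot + ext


sample = "2c2_657_John_Smith_95839401e97k7_1426.jpg"
print(pick_fields(sample, [1, 2, 6]))     # 2c2_657_1426.jpg
print(pick_fields(sample, [1, 2, 3, 4]))  # 2c2_657_John_Smith.jpg
```

Changing the order of the numbers in keep reorders the fields, for example [4, 3] would put the last name first.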
therube
 
Posts: 1314
Joined: Mon Jan 18, 2016 6:23 pm

Re: Removing specific parts of a filename

Postby Luuk » Tue Sep 29, 2020 9:09 pm

Yes, I tested therube's explanation with the newest version. It works exactly, and also makes the code much easier. You only need to put the checkmark in "Simple" for Regex(1).

Re: Removing specific parts of a filename

Postby martin0121 » Wed Sep 30, 2020 8:57 am

therube wrote: (I cannot check, but...)

If you're using the new RegEx method, and all of your names have 6 fields, I'm thinking you could use something like this:

Match:  %1_%2_%3_%4_%5_%6
Replace: 

In the Replace, simply include the parts that you want to retain.

So if all you wanted were the first & last name, use %3 %4 in the replace.
Or, %4, %3, if you wanted the name reversed.
Or, First name: %3, Last name: %4, if you also wanted some additional text in there.

See if something like that works?



That is fantastically simple and exactly what I was looking for, thank you.

Re: Removing specific parts of a filename

Postby martin0121 » Wed Sep 30, 2020 8:59 am

Luuk wrote: Greetings. Sorry, I'm not an expert with Regex(1), but when all the files have five "_", it's good for the match to contain all five "_" as well, and never to use "(.*)(.*)(.*)", because the first "(.*)" is greedy and leaves nothing for the next "(.*)" to capture. If a "_" should not be kept, move it outside the parentheses, so instead of "(_.*)" it can be "_(.*)".



I was wondering what was going on when I couldn't get rid of either of the first two sections regardless of what I replaced, so it must have been the first (.*) at work. Now I know. And I'll stay clear of them. Thanks.
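
One way to avoid bare (.*) groups while staying with ordinary regex syntax is to let each group match only non-underscore characters and to write the underscores outside the parentheses. The sketch below assumes Python's re module and a name with exactly six fields and no extension; the variable names are hypothetical, and the renaming tool's engine may behave differently in details.

```python
import re

# Six groups of "anything except an underscore", separated by literal
# underscores that sit outside the parentheses.
FIELDS = re.compile(r"^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$")

name = "2c2_657_John_Smith_95839401e97k7_1426"

# Keep fields 1, 2 and 6.
ids_only = FIELDS.sub(r"\1_\2_\6", name)
print(ids_only)   # 2c2_657_1426

# Keep fields 1 to 4.
with_name = FIELDS.sub(r"\1_\2_\3_\4", name)
print(with_name)  # 2c2_657_John_Smith
```

Because no group can run past an underscore, each numbered group always lines up with exactly one field, which makes the replacement interchangeable in the way the first post asked for.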

