Valid RegEx Not Working

Postby rjemery » Sun Mar 29, 2009 10:55 pm

I am using Bulk Rename Utility 32-bit Unicode version 2.7.1.1 on an XP Pro SP3 system.

I have file names of the form: First Last (Birthday).txt
I wish to convert them to: Last, First Birthday.txt

where Birthday is an eight-digit number of the form YYYYMMDD

To no avail, I have tried to use the following regular expression -- among other variations as well -- in RegEx (1):

Match: \(.*\) \(.*\) (\(........\))
Replace: \2, \1 \3

What am I doing wrong?

Is there some BRU option that needs to be enabled first?
rjemery
 
Posts: 17
Joined: Sat Sep 15, 2007 6:51 am

Re: Valid RegEx Not Working

Postby GMA » Sun Mar 29, 2009 11:56 pm

Hi, rjemery:
The problem is that the first four "\" aren't required: they make those parentheses match literal "(" and ")" characters instead of defining groups, so the expression can't match your filenames. Just remove them and it'll work fine. Also, I'd recommend using "\d{8}" (which means "find eight digits") instead of the dots:

MATCH: (.*) (.*) (\(\d{8}\))
REPLACE: \2, \1 \3


That's the expression you provided, but modified so it'll work. But when I look at this...

I wish to convert them to: Last, First Birthday.txt

...it seems like you also want to remove the parentheses from the resulting names, right? In that case you should use:

MATCH: (.*) (.*) \((\d{8})\)
REPLACE: \2, \1 \3
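
For anyone who wants to check the pattern outside BRU, here is a minimal sketch in Python. Python's `re` module is not the engine BRU uses, so this only suggests that the expression itself is sound. The `rename` helper and the sample name are made up for illustration, and the helper assumes it receives the file name without its extension.

```python
import re

# Same pattern as the MATCH box above: two space-separated fields,
# then eight digits wrapped in literal parentheses.
PATTERN = re.compile(r"(.*) (.*) \((\d{8})\)")

def rename(stem):
    """Hypothetical helper: turn 'First Last (Birthday)' into 'Last, First Birthday'.

    Names that do not match the pattern are returned unchanged.
    """
    return PATTERN.sub(r"\2, \1 \3", stem)

print(rename("First Last (20090329)"))
```

If this works but BRU still refuses the same expression, the cause is more likely on the BRU side than in the pattern.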


Best regards,

Gabriel.
GMA
 
Posts: 91
Joined: Sun Dec 02, 2007 1:30 pm
Location: Argentina

Re: Valid RegEx Not Working

Postby rjemery » Mon Mar 30, 2009 12:35 am

Gabriel,

Yes, I do wish to remove the extraneous parentheses. However, your suggested regex did not work either. I did manage to make use of (.*) (.*) (.)(........)(.)

It seems BRU does not like any \(

Choosing (.*) (.*) (.)(\s{8})(.) also failed. Yes, I made use of \s{8} instead of \d{8} because in some instances, months have a letter abbreviation embedded.

I have a new issue. If not already capitalized, I desire to capitalize the first letter of fields \1 and \2, while the remaining string is to be lowercase. How might I accomplish that?

Re: Valid RegEx Not Working

Postby GMA » Mon Mar 30, 2009 12:22 pm

rjemery wrote: Yes, I do wish to remove the extraneous parentheses. However, your suggested regex did not work either.

(.*) (.*) \((\d{8})\) will change "First Last (xxxxxxxx).txt" to "Last, First xxxxxxxx.txt", no doubt about it (where "xxxxxxxx" = 8 digits). Maybe there's something in the filename you provided as an example that doesn't actually reflect your real filenames.

rjemery wrote: It seems BRU does not like any \(

Don't know what to say... I always use "\(" and "\)" in BRU/RegEx without a problem. I created a dummy file named "First Last (20090329).txt" to test the expression and it's working just fine.

rjemery wrote: Choosing (.*) (.*) (.)(\s{8})(.) also failed. Yes, I made use of \s{8} instead of \d{8} because in some instances, months have a letter abbreviation embedded.

But (\s{8}) means "match 8 whitespace characters"... it'll never work. Also, you don't need "(.)"; you could just use a dot (although the correct way is \( and \) to match the parentheses) and groups 1-3 instead of 1-5 in the REPLACE box. Check "Additional features > Regular Expressions" in BRU's help file for more info on syntax and expressions. Maybe you could try simply using:

MATCH: (.*) (.*) \((.*)\)
REPLACE: \2, \1 \3
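
To make the difference between \d, \s and .* concrete, here is a small Python sketch. It uses Python's `re` module rather than BRU, and the two sample names (one with a letter month) are invented for illustration, so treat it as a rough check only.

```python
import re

# Invented sample names: one all-digit date, one with a letter month.
names = ["First Last (20090329)", "First Last (2009Mar29)"]

digits = re.compile(r"(.*) (.*) \((\d{8})\)")    # exactly eight digits
spaces = re.compile(r"(.*) (.*) \((\s{8})\)")    # exactly eight whitespace characters
anything = re.compile(r"(.*) (.*) \((.*)\)")     # whatever sits between the parentheses

def matches(pattern, name):
    """Return True when the pattern matches the whole name."""
    return pattern.fullmatch(name) is not None

for name in names:
    print(name, matches(digits, name), matches(spaces, name), matches(anything, name))
```

Only the last pattern accepts both names, which is why (.*) is the safer choice when the dates are not strictly YYYYMMDD.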


rjemery wrote: I have a new issue. If not already capitalized, I desire to capitalize the first letter of fields \1 and \2, while the remaining string is to be lowercase. How might I accomplish that?

Use Case (4) > Title.

---------------

Now, let me guess what the actual problem might be. Do your filenames have special characters in them? If yes, I'm afraid that's a known problem with BRU and its RegEx function. It'll only work with filenames that don't have accented or special characters. I know this firsthand because my first language is Spanish (lots of accents) and RegEx refuses to work with such filenames.
Hope that helped.

Gabriel.

Re: Valid RegEx Not Working

Postby rjemery » Mon Mar 30, 2009 5:31 pm

Gabriel,

Again, I am much appreciative for your kind help.

I find that for Match the use of any delimiter preceded by a backslash causes regex not to function in BRU. I have no further explanation for this.

No, the test case file names do not contain any accented, special or Unicode characters, but thanks for the warning should I encounter that situation in the future.

I have subsequently learned of the meaning of \d versus \s, but as I said above, the use of a backslash anywhere within Match negates regex completely in my implementation of BRU.

Regarding the use of Case (4) > Title, how do I limit Title to work on just First and Last and not the leading character of Birthday? Many dates don't follow the YYYYMMDD rule.

Re: Valid RegEx Not Working

Postby GMA » Mon Mar 30, 2009 8:59 pm

rjemery wrote: I find that for Match the use of any delimiter preceded by a backslash causes regex not to function in BRU. I have no further explanation for this.

Me neither. I can't think of any possible reason why backslashes in RegEx won't work, because it's something handled completely by BRU, not something that may or may not work depending on the user specs. Furthermore, you say you have v2.7.1.1 32-bit, XP Pro, no accents and/or special characters in the file names... exactly the same as me when I tested the expression. I have no further recommendations except temporarily removing "Bulk Rename Utility.ini" from BRU's folder so you can try with a clean default config. In fact, if you want, try also the following:

1. Create a file named "(123).txt"
2. Open it in BRU and try this RegEx:

MATCH: \(...\)
REPLACE: abc


That's as basic as it gets to check if BRU/RegEx can match a filename with parentheses. If that doesn't work, sorry, I'm out of ideas.

rjemery wrote: Regarding the use of Case (4) > Title, how do I limit Title to work on just First and Last and not the leading character of Birthday? Many dates don't follow the YYYYMMDD rule.

Mmm... that's too specific. The only way I can think of is using Case (4) > Excep. to leave out certain words, like:

Jan:Feb:Mar:Apr:Jun:Jul:Aug:Sep:Nov:Dec

The problem with that is, for example, that adding "Jan" to the exceptions list also affects other words like "Janet", "Jane", etc. (common names that you may have in the first part of your filenames). So, no, I'm afraid there's no ideal solution for that problem.

Gabriel.

Re: Valid RegEx Not Working

Postby rjemery » Tue Mar 31, 2009 2:02 am

Gabriel,

GMA wrote:
1. Create a file named "(123).txt"
2. Open it in BRU and try this RegEx:

MATCH: \(...\)
REPLACE: abc


Well, that worked, even without re-establishing the INI file. However, as soon as I tried to expand the test case, the first additional inclusion of a backslash killed any regex response.

I think we have beaten this issue to death. As long as there are alternate regex patterns that got the job done, it's time to move on.

I'm left with attempting to capitalize the first characters of the first and last names without affecting any other part of the fields or strings involved.

FWIW, for the name capitalization, I did try to match with: (.)(.*, )(.)(.*)$
and replace with: \u\1\2\u\3\4

but the two \u operators were interpreted literally. In any event, I got the job done by using the aforementioned regex expressions in vi and creating a rename batch command for each and every file.
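
For readers taking the scripted route, here is a minimal Python sketch of the same idea: reorder the fields and capitalize only the two name fields, leaving the date untouched. Python's `re` does not support \u in the replacement string either, so the sketch uses a replacement function instead. The `rename` helper and the sample name are hypothetical, and it assumes the name is passed in without its extension.

```python
import re

# Two space-separated name fields, then anything inside literal parentheses.
PATTERN = re.compile(r"(.*) (.*) \((.*)\)")

def rename(stem):
    """Hypothetical helper: 'first LAST (date)' -> 'Last, First date'."""
    def build(match):
        first, last, birthday = match.groups()
        # str.capitalize() uppercases the first character and lowercases the rest;
        # the date field is passed through unchanged.
        return f"{last.capitalize()}, {first.capitalize()} {birthday}"
    return PATTERN.sub(build, stem)

print(rename("john SMITH (2009mar29)"))
```

Because the capitalization is applied per group, a letter month in the date is not affected, which is the part Case (4) > Title cannot be limited to.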

Re: Valid RegEx Not Working

Postby GMA » Tue Mar 31, 2009 11:49 am

rjemery wrote: I think we have beaten this issue to death. As long as there are alternate regex patterns that got the job done, it's time to move on.

OK, that's up to you. If there's something else that comes to your mind I have no problem in at least trying to help.

rjemery wrote: FWIW, for the name capitalization, I did try to match with: (.)(.*, )(.)(.*)$
and replace with: \u\1\2\u\3\4

Yeah, that's a very good approach, but unfortunately it doesn't work in BRU; not sure why. I don't have a full picture of which expressions/patterns/etc. are or aren't supported by BRU. Normally, either \u\1\2\u\3\4 or \U\1\E\2\U\3\E\4 should work with your MATCH expression. I suppose only those listed in the RegEx section of BRU's help file are the ones that work. The MATCH field doesn't seem to support the use of the backslash other than for regrouping (i.e. only followed by numbers).
Anyway, at least not everything's bad since you seem to have found a way via a batch. Don't forget you can alternatively use the "File > Import Rename-Pairs" function (that has the advantage of previewing the results).
Best regards,

Gabriel.