Phone-book Style Names

Post by omni555 » Wed Aug 21, 2013 11:21 am

I have several thousand e-books stored in folders named for the authors. The problem is that the authors' names are structured as "John William Smith", so when doing a sort or a search, the author's first name is what gets used in the operation. I wish to rename the relevant folders (and some files as well) to appear in "phone-book" format, e.g. "Smith, John William".

Using ^([A-Z][a-z]*) ([A-Z][a-z]*) in the Match field and \2, \1 in the Replace field works for names like John Lane -> Lane, John, but Ed McBain results in Mc, Ed; Kevin O'Brien gives O, Kevin; and Edgar Allan Poe results in Allan, Edgar.

I fixed the McBain type problems (but not O'Brien) by using ^([A-z]*) ([A-z]*) in the match field, but that didn't solve everything...

Putting ^([A-z]*) ([A-z]*) ([A-z]*) in the Match field and \3, \1 \2 in the Replace field further allowed the proper renaming of the "John William Smith" type names to "Smith, John William", but now the 2-field names did not get processed (i.e., John Smith remained John Smith).

Other problem names include those with initials followed by periods (H. G. Wells did not rename, while H G Wells DID give Wells, H G). George R R Martin produced R, George R; I'm guessing THAT one could be fixed by adding another " ([A-z]*)" to the Match field and changing the Replace field to "\4, \1 \2 \3".

I'm thinking it should be fairly simple to modify the expression(s) in the Match field to allow non-alphabetic characters to be included (such as the period after an initial, apostrophe such as in O'Brien, or other special characters in the name that are valid in file-names), however I am not familiar with how to do this.

The other point I would like help with is how to modify the Match and Replace field expressions to process names of 2, 3, 4 or even more parts at the same time (such as John Smith, John William Smith, John William Randell Smith, etc.). I COULD do it by writing separate rules for each grouping of names and applying each rule only to the names it would work with, but I would much rather have a single rule that handles multiple-part names in one sweep. That would avoid having to search through all the names to select only those a rule applies to, and possibly mangling some that were wrongly selected. Essentially, I want to be able to take the surname and move it to the beginning of the file/folder name, while leaving everything else intact.

Thank you in advance for any help you can provide. :D


...
Since posting this request, I have done further experimentation and solved (?) several of the issues I was enquiring about.

I have changed the expression used to ([A-z.'-]*). This allows me to work with names such as O'Brien, McDonald, and Mary-Janice as single fields, and also allows processing of initials followed by periods (such as in H. G. Wells).
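
A quick way to see what this character class accepts, sketched with Python's re module as a stand-in for BRU's regex engine (an assumption; the helper name is_single_field is hypothetical). One caveat: [A-z] is an ASCII range running from A to z, so it also covers the six punctuation characters that sit between Z and a, including the underscore. [A-Za-z] is the stricter spelling.

```python
import re

# Class from the post: letters plus period, apostrophe and hyphen.
LOOSE = re.compile(r"[A-z.'-]*")
# Stricter spelling that excludes the punctuation between 'Z' and 'a'.
STRICT = re.compile(r"[A-Za-z.'-]*")

def is_single_field(pattern, text):
    """Return True if the whole text is matched as one name field."""
    return pattern.fullmatch(text) is not None

# Every field type mentioned in the post passes with either class.
for field in ["O'Brien", "McDonald", "Mary-Janice", "H.", "G."]:
    assert is_single_field(LOOSE, field)
    assert is_single_field(STRICT, field)

# The A-z range also lets an underscore through; A-Za-z does not.
print(is_single_field(LOOSE, "John_Smith"))   # True
print(is_single_field(STRICT, "John_Smith"))  # False
```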

I have further saved as "favorites" the following:

#1 Name swap 2.bru
Match: ^([A-z.'-]*) ([A-z.'-]*)
Replace: \2, \1

#2 Name swap 3.bru
Match: ^([A-z.'-]*) ([A-z.'-]*) ([A-z.'-]*)
Replace: \3, \1 \2

#3 Name swap 4.bru
Match: ^([A-z.'-]*) ([A-z.'-]*) ([A-z.'-]*) ([A-z.'-]*)
Replace: \4, \1 \2 \3

I still, however, need to apply these one at a time AND in reverse order. If I applied #1 first, any file/folder name containing more than 2 fields would be truncated to only 2: H. G. Wells gives G., H., Mary Janice Davidson gives Janice, Mary, and George R R Martin gives R, George. Names whose format falls outside those covered still either do not get processed or get corrupted: Tatiana de Rosnay results in Rosnay, Tatiana de and Robert Hans van Gulik gives Gulik, Robert Hans van.

So, STILL looking for a little help fine-tuning this project! :)

NameFormat --> Last, First

Post by truth » Fri Aug 23, 2013 7:08 pm

Another way is to match the last space, since everything after it should be the surname.
It's a wide match, but you can put * !*,* in 12Filter if you've already done any renaming.
The below filters out names with commas & moves the last 'word' to the beginning as the surname.
It also drops any redundant trailing spaces in the name, if that's an issue?
12Filter=* !*,*
(.+) ([^ ]+)
\2, \1
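
To see what the last-space match does with the names from the first post, here is a rough sketch in Python's re module (a stand-in for BRU's engine, so treat the details as assumptions). The function name surname_first is hypothetical, the comma check only imitates the * !*,* filter, and fullmatch is used on the assumption that the whole name should be consumed.

```python
import re

# Greedy (.+) runs to the LAST space, so group 2 is the final word.
LAST_WORD = re.compile(r"(.+) ([^ ]+)")

def surname_first(name):
    """Move the last word to the front; leave names with a comma alone."""
    if "," in name:  # stand-in for the '* !*,*' filter
        return name
    m = LAST_WORD.fullmatch(name)
    return f"{m.group(2)}, {m.group(1)}" if m else name

print(surname_first("John Smith"))             # Smith, John
print(surname_first("H. G. Wells"))            # Wells, H. G.
print(surname_first("George R R Martin"))      # Martin, George R R
print(surname_first("Robert Hans van Gulik"))  # Gulik, Robert Hans van
```

The last line shows the remaining weak spot: a lowercase particle such as "van" is still left behind with the given names.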

Going by your examples, & assuming 2-word surnames begin with a lowercase word, a 2nd regex fixes them.
It matches names with a comma that end with a lowercase 'word', and moves that word in front of the surname:
12Filter=*,*
(.*) ([a-z]+)
\2 \1
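
A small sketch of this second pass, again using Python's re module as an assumed stand-in for BRU. The name fix_particle is hypothetical, and the comma check imitates the *,* filter.

```python
import re

# Everything up to the last space, then a trailing all-lowercase word.
PARTICLE = re.compile(r"(.*) ([a-z]+)")

def fix_particle(name):
    """Move a trailing lowercase word (de, van, ...) in front of the surname."""
    if "," not in name:  # stand-in for the '*,*' filter
        return name
    m = PARTICLE.fullmatch(name)
    return f"{m.group(2)} {m.group(1)}" if m else name

print(fix_particle("Gulik, Robert Hans van"))  # van Gulik, Robert Hans
print(fix_particle("Rosnay, Tatiana de"))      # de Rosnay, Tatiana
print(fix_particle("Smith, John William"))     # Smith, John William (unchanged)
```

Names whose last word starts with a capital or ends in a period do not match, so they pass through untouched.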

Sorry, I don't see a 1-step solution, especially not knowing the possible surname formats.
I saw the failure examples but not the expected results, so I don't know if 3-word surnames are possible?
Should it have been van Gulik, Robert Hans or Hans van Gulik, Robert?
I feel like I've been short-changed somehow, I've only got 2 names!

EDIT: Since we know surnames max out at 2 words, & the 1st one must be lowercase:
There is a 1-step regex solution that won't touch any previously renamed files (with a comma).
While you don't need to 12Filter out commas, I'd probably do it anyway for a cleaner display:
^([^,]+?) ([a-z]+ [^, ]+$|[^, ]+$)
\2, \1

<----- Group1-----> Space <--------------------------- Group 2 ----------------------------------->
(NonCommasUntil) ....... (LowerCasesSpaceNonCommas/Spaces$ OR NonCommas/Spaces$)

The very 1st ^ just means BOL (beginning of line), while $ = EOL, end of line (i.e. the end of the file name).
A ^ at the start of a set negates it, for example: [^qz]+ matches everything up to the first q or z
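
The 1-step pattern can be tried outside BRU. This is a minimal sketch in Python's re module, which is assumed to treat this particular pattern the same way as BRU's engine; the function name phone_book is hypothetical.

```python
import re

# Group 1: lazy run of non-commas (the given names).
# Group 2: either "lowercase-word surname" or a single last word, up to the end.
ONE_STEP = re.compile(r"^([^,]+?) ([a-z]+ [^, ]+$|[^, ]+$)")

def phone_book(name):
    """Return 'Surname, Given names'; names that do not match are unchanged."""
    m = ONE_STEP.match(name)
    return f"{m.group(2)}, {m.group(1)}" if m else name

for name in ["John Smith", "H. G. Wells", "Kevin O'Brien",
             "Tatiana de Rosnay", "Robert Hans van Gulik",
             "Smith, John William"]:
    print(phone_book(name))
# Smith, John
# Wells, H. G.
# O'Brien, Kevin
# de Rosnay, Tatiana
# van Gulik, Robert Hans
# Smith, John William
```

The last name already contains a comma, so group 1 can never reach a space and the name is left alone, which is why the filter is optional here.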
Last edited by truth on Sun Aug 25, 2013 11:27 am, edited 2 times in total.

Re: Phone-book Style Names

Post by omni555 » Fri Aug 23, 2013 9:46 pm

Thanks for the reply, truth!!! I applied your suggestions and my frustration level with this project has gone WAY down! Not ALL problems solved, but a great start!

I didn't realize how the filters were used (that was a big help when working with groups of files/folders containing items I did not want processed). Also, I am still getting used to the syntax of the expressions. What I really need is a "BRU and Expression Syntax For Dummies" type manual!!! :)

Now I need to work with files/folders in the format

(first name) [(other middle names)] (last name) ( - ) (book title)

where the book title might contain any number of words, as well as a variety of characters like ", -, _, $, etc, for example

Bob W. J. Hacker - Mary's ($5.00) Shoes, in Black - A "Bargain"!

to get

Hacker, Bob W. J. - Mary's ($5.00) Shoes, in Black - A "Bargain"!

In general,

(last name) (, ) (first name) [(other middle names)] ( - ) (book title)

Essentially, I want to do the same as before, but leave everything after the " - " untouched.

Again, thanks for the help, and I'm sure I'll be posting more questions soon!!!

Re: Phone-book Style Names

Post by omni555 » Fri Aug 23, 2013 9:51 pm

truth wrote: Should it have been van Gulik, Robert Hans or Hans van Gulik, Robert?

Oops, sorry! It should be van Gulik, Robert Hans.

truth wrote: I feel like I've been short-changed somehow, I've only got 2 names!

Lucky you! I ended up with FOUR!!!

Re: Phone-book Style Names

Post by omni555 » Fri Aug 23, 2013 10:27 pm

OK!!! Got it!!!

Match: (.+) ([^ ]+) - (.+)

Replace: \2, \1 - \3

Filter: * !*,*

It WORKS!!! :)

Arthur C. Clarke - Childhood's End.mobi

produced

Clarke, Arthur C. - Childhood's End.mobi


One more thing to take care of... a second " - " in the name, as in

Sherrilyn Kenyon - The League 01 - Born Of The Night

now gives

01, Sherrilyn Kenyon - The League - Born Of The Night

OK, just need it to pick up the first " - " and ignore any others in the book title...
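
The cause is greedy matching: the leading (.+) grabs as much as it can, so the pattern locks onto the LAST " - " in the name. A small sketch in Python's re module (an assumed stand-in for BRU's engine; the helper swap is hypothetical) reproduces the result, and shows that making the first group lazy, (.+?), is one way to stop at the first " - " instead.

```python
import re

GREEDY = re.compile(r"(.+) ([^ ]+) - (.+)")   # pattern from this post
LAZY = re.compile(r"(.+?) ([^ ]+) - (.+)")    # first group made lazy

def swap(pattern, name):
    """Apply the '\\2, \\1 - \\3' replacement if the whole name matches."""
    m = pattern.fullmatch(name)
    return f"{m.group(2)}, {m.group(1)} - {m.group(3)}" if m else name

title = "Sherrilyn Kenyon - The League 01 - Born Of The Night"
print(swap(GREEDY, title))
# 01, Sherrilyn Kenyon - The League - Born Of The Night
print(swap(LAZY, title))
# Kenyon, Sherrilyn - The League 01 - Born Of The Night
```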

Name1 Name2 - Title --> Name2, Name1 - TitleUntouched

Post by truth » Sat Aug 24, 2013 5:29 am

OK, got it... New scenario, very similar, I wasn't sure at first.
If you filter out commas in this one, you won't be able to rename files like the shoes example (its title contains a comma).
Note you could have used (.+?) in your last Group1 to match at least 90% of what you probably needed.

Still no worries, regex matching is a lot more comprehensive, but you do have to look at more files.
The below works without the need to 12Filter; it can't touch any previously renamed files:
^([^,]+?) ([^, ]+)( - .+)
\2, \1\3

<-----Group1-----> ...... <------ Group2 -------><----------- Group3 --------->
(NonCommasUntil)Space(NonCommas/Spaces)(Space-SpaceEverythingElse)
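
A minimal sketch of this pattern in Python's re module, assumed to behave like BRU's engine for this expression (the function name author_first is hypothetical). It checks the two awkward cases from the thread: a title with commas and quotes, and a title with a second " - ".

```python
import re

# Group 1: lazy non-comma run (given names); group 2: last word before " - ";
# group 3: the first " - " and everything after it, kept untouched.
AUTHOR_TITLE = re.compile(r"^([^,]+?) ([^, ]+)( - .+)")

def author_first(name):
    """Return 'Surname, Given names - Title'; non-matching names are unchanged."""
    m = AUTHOR_TITLE.match(name)
    return f"{m.group(2)}, {m.group(1)}{m.group(3)}" if m else name

shoes = 'Bob W. J. Hacker - Mary\'s ($5.00) Shoes, in Black - A "Bargain"!'
print(author_first(shoes))
# Hacker, Bob W. J. - Mary's ($5.00) Shoes, in Black - A "Bargain"!

league = "Sherrilyn Kenyon - The League 01 - Born Of The Night"
print(author_first(league))
# Kenyon, Sherrilyn - The League 01 - Born Of The Night

# A second pass leaves an already renamed file alone.
print(author_first(author_first(league)) == author_first(league))  # True
```

Because group 1 cannot contain a comma and the pattern is anchored at the start, a name that already begins with "Surname," never matches, so no filter is needed.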

Don't jump the gun, it's best to err on the side of caution, so just give everything a final look-over.
I can't think of any issues, but if something turns up, just post back with details.

EDIT:
If you had already filtered-out commas with * !*,* since there were so many:
You could use *-*,* !*,*-* to ONLY show files with a comma after the dash (within Title)