Homogenizing TV Show Names

A swapping-ground for Regular Expression syntax

Postby linglingfool » Sun Oct 19, 2008 3:46 am

Hi,

I'm trying to use the program iPodifier, which requires that all show names it imports be in the format Showname-S01E01 Episode Name.ext. I'm trying to parse files in a variety of formats, including:

Showname.S01E01.foo.bar.ext
Showname.101.foo.bar.ext
Showname.1x01.foo.bar.ext

I came up with the regular expression
s0?(\d\d?)e([\d]{2})|0?(\d\d?)([\d]{2})|0?(\d\d?)x([\d]{2})
which checks out in Regex Coach as doing what I want (isolating the season and episode numbers into regex variables). However, when I run the files through BRU/BRC, I get a message that the files don't need to be renamed. Any ideas?
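
For anyone wanting to reproduce this outside Regex Coach, here is a minimal Python sketch of the same pattern. It assumes case-insensitive matching (the post doesn't say which flags were set), and the helper name season_episode is made up for illustration. It shows that each alternative fills its own pair of groups, so the season can land in group 1, 3 or 5, which may be part of why a single replacement string is awkward here.

```python
import re

# The pattern from the post. IGNORECASE is an assumption; the post does not
# say which options were enabled in Regex Coach.
PATTERN = re.compile(
    r"s0?(\d\d?)e([\d]{2})|0?(\d\d?)([\d]{2})|0?(\d\d?)x([\d]{2})",
    re.IGNORECASE,
)

def season_episode(name):
    """Return (season, episode) strings from whichever alternative matched."""
    m = PATTERN.search(name)
    if m is None:
        return None
    # Groups are numbered left to right across the alternation:
    # (1, 2), (3, 4) or (5, 6). Only one pair is filled per match.
    filled = [g for g in m.groups() if g is not None]
    return filled[0], filled[1]

samples = [
    "Showname.S01E01.foo.bar.ext",
    "Showname.101.foo.bar.ext",
    "Showname.1x01.foo.bar.ext",
]
for name in samples:
    print(name, PATTERN.search(name).groups(), season_episode(name))
```

Printing the raw groups makes it easy to see which pair was used for each filename.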

Thanks.
linglingfool
 
Posts: 1
Joined: Sun Oct 19, 2008 3:40 am

Re: Homogenizing TV Show Names

Postby Glenn » Mon Oct 20, 2008 8:25 pm

You haven't explained what exactly you are using in your Match and replace.
The regular expression you show matches the season/episodes for the sample filenames, but the first alternation,
s0?(\d\d?)e([\d]{2})
will match
S01E01
Why would you want to match a filename that is already in the correct format?
You should try to match only filenames which are incorrect and change those.

You also have to understand that Bulk Rename works a little differently in how it uses the match/replace than normal regex Search/replace. You must capture ALL parts of the filename that you want to retain, not just those to be changed.
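
To make the difference concrete, here is a rough Python model of that behaviour. It is only a sketch: bru_style_rename is a hypothetical helper, and it assumes the new name is built purely from the replacement template, as described above. It is not Bulk Rename's actual code.

```python
import re

def bru_style_rename(match_pattern, replace_template, name):
    """Assumed model: the new name is built only from the template,
    so any part of the filename that was not captured is lost."""
    m = re.search(match_pattern, name)
    if m is None:
        return name  # no match, so the file is left alone
    return m.expand(replace_template)

name = "Showname.1x01.foo.bar.ext"

# Capturing only the season/episode drops the rest of the name.
partial = bru_style_rename(r"(\d)x(\d{2})", r"S0\1E\2", name)

# Capturing everything to be kept preserves it.
whole = bru_style_rename(r"(.+?\.)(\d)x(\d{2})(\..*)", r"\1S0\2E\3\4", name)

# A normal search/replace, for contrast, keeps the unmatched text by itself.
normal = re.sub(r"(\d)x(\d{2})", r"S0\1E\2", name)

print(partial)
print(whole)
print(normal)
```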

From your regex, I deduce that the episode is always going to have 2 digits, and the season will be 1 or 2.
I am assuming you may have Season/episode numbers such as s1x01, 1e01, s101 etc.
Also I assume s and e can be either upper or lower case.

To simplify this as much as possible it's probably easiest not to capture existing S or E and supply our own in replacement.
This reduces it to adding S and E to 2 digit seasons, and adding S0 and E to one digit seasons.
This means you will have to make 2 renaming passes, one for each.
In order to not match correct filenames we use negative lookahead
(?!S\d{2}E\d{2})

To ignore case we use
(?i)
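
Here is a small Python sketch of those two pieces on their own. The pattern name needs_fix is made up for the example, and it uses [^.]+ for the show name (a simplification, not part of the passes in this post) so the lookahead is only ever tested right after the first dot.

```python
import re

# (?i) switches on case-insensitive matching for the whole pattern.
# (?!S\d{2}E\d{2}) fails the match if SnnEnn comes next, without
# consuming any characters.
needs_fix = re.compile(r"(?i)^[^.]+\.(?!S\d{2}E\d{2})")

for name in [
    "Showname.S01E01.foo.bar.ext",  # already correct, so no match
    "Showname.s01e01.foo.bar.ext",  # lower case is rejected too, thanks to (?i)
    "Showname.101.foo.bar.ext",     # needs renaming, so it matches
    "Showname.1x01.foo.bar.ext",    # needs renaming, so it matches
]:
    print(name, bool(needs_fix.match(name)))
```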

So, use this as your first pass
single digit season
Match
(?i)(.+?\.)(?!S\d{2}E\d{2})\D?(\d)\D?(\d{2}\..*)+
Replace
\1S0\2E\3

Then use this as your second pass
double digit season
Match:
(?i)(.+?\.)(?!S\d{2}E\d{2})\D?(\d{2})\D?(\d{2}\..*)+
Replace:
\1S\2E\3
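
If you want to sanity-check both passes before touching real files, something like the following Python sketch can help. Python's re module is not the same engine as Bulk Rename's, so treat agreement here as a hint rather than proof. The rename helper and the use of fullmatch to stand in for a whole-name match are assumptions.

```python
import re

# The two Match/Replace pairs from this post, unchanged.
PASS_1 = (r"(?i)(.+?\.)(?!S\d{2}E\d{2})\D?(\d)\D?(\d{2}\..*)+", r"\1S0\2E\3")
PASS_2 = (r"(?i)(.+?\.)(?!S\d{2}E\d{2})\D?(\d{2})\D?(\d{2}\..*)+", r"\1S\2E\3")

def rename(name):
    """Apply pass 1, then pass 2, each only if the whole name matches."""
    for pattern, template in (PASS_1, PASS_2):
        m = re.fullmatch(pattern, name)
        if m:
            name = m.expand(template)
    return name

for name in [
    "Showname.S01E01.foo.bar.ext",  # already correct
    "Showname.101.foo.bar.ext",     # single digit season
    "Showname.1x01.foo.bar.ext",    # single digit season with x
    "Showname.1201.foo.bar.ext",    # double digit season
]:
    print(name, "->", rename(name))
```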

This should work if all the filenames are similar to the samples you supplied.
Regex forum posters often forget to include examples of all possibilities and if this is the case you will have to modify it to suit the new criteria.
Anyway, this should get you started.

Cordially,
Glenn
Glenn
 
Posts: 28
Joined: Fri Apr 14, 2006 4:53 pm
Location: Winnipeg, Canada

Re: Homogenizing TV Show Names

Postby bobthedinosaur » Mon Sep 21, 2009 3:13 pm

Hi, sorry to resurrect an old thread but I'm trying to do the same thing that linglingfool was trying to do. I've tried using your code Glenn but it doesn't seem to work in the current version of Bulk Rename Utility.
Could you please help me with a regex that extracts at least the season number and episode number (and the show name, although that's not always possible and not critical) from an episode's filename and puts them back together, so as to homogenize the season's folder?

here are the examples of how episode filenames are provided by the distributors:
How I Met Your Mother - 407 - not a father's day.[Rocafresh].avi
HIMYM - 424.avi (name would be unavailable here)
aaf-km.s01e03.avi
aaf-penn.and.teller.bullshit.s07e06.hdtv.xvid.avi
dina-flashpoint204.avi
farscape - 1x02 - I, E.T..avi
knight.rider.2008.S01E04.HDTV.XviD-LOL.avi

thanks in advance!
bobthedinosaur
 
Posts: 2
Joined: Mon Sep 21, 2009 2:54 pm

Re: Homogenizing TV Show Names

Postby Glenn » Mon Sep 21, 2009 9:21 pm

Bob,
When you say it doesn't work, do you mean when applied to the original filenames in the first post, or do you mean with the filenames you have? I wouldn't expect it to work with your samples because many have a different structure than the ones the regex was designed for.
Could you be a bit more specific about what you would like the result to look like?
Also, is the 3rd line starting HIMYM and the line following all one name that got wrapped?

Cordially,
Glenn
Glenn
 
Posts: 28
Joined: Fri Apr 14, 2006 4:53 pm
Location: Winnipeg, Canada

Re: Homogenizing TV Show Names

Postby bobthedinosaur » Mon Sep 21, 2009 10:06 pm

Hi thanks for the quick reply!

Yes, I mean it doesn't work for my filenames, not the ones posted by linglingfool.
I figured the codes might work for my files as the episode and season syntaxes look the same but I honestly don't have a clue what I'm doing with regex.
As for the list of examples I gave you, each line ending in .avi is a complete filename; only in the 2nd example did I add a comment, inside the round brackets. Those are examples of the sort of filenames that I encounter, and I'm hoping regex can extract the info I need from them and give me something neat and tidy.

My intended output would be thus:
'showname' - SxExx - 'episode name'.avi
The show name I could add manually in BRU, as many filenames have truncated show names which I don't imagine a short line of code could interpret.
The episode name is a bit of info that is sometimes present in the filename. It's the title given to the specific episode of the show, and it usually comes after the episode number but could be followed by all sorts of junk info as well.
I hope I've given you the right info!

Thanks again!
bobthedinosaur
 
Posts: 2
Joined: Mon Sep 21, 2009 2:54 pm
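
The thread ends without a worked answer for these filenames. As a rough starting point only, here is a Python sketch (not BRU syntax) that pulls the season and episode out of the seven samples above and formats them as SxExx. The helper names season_episode and tag are made up, and the three-digit rule is a guess that happens to fit these samples: it deliberately ignores four-digit runs such as the year 2008, so a season 10+ written as 1201 would be missed.

```python
import re

# Tried in order; the first pattern that matches wins.
PATTERNS = [
    re.compile(r"s(\d{1,2})e(\d{2})", re.IGNORECASE),             # s01e03, S01E04
    re.compile(r"(?<!\d)(\d{1,2})x(\d{2})(?!\d)", re.IGNORECASE), # 1x02
    re.compile(r"(?<!\d)(\d)(\d{2})(?!\d)"),                      # 407, 204
]

def season_episode(filename):
    """Return (season, episode) as ints, or None if nothing is recognised."""
    for pattern in PATTERNS:
        m = pattern.search(filename)
        if m:
            return int(m.group(1)), int(m.group(2))
    return None

def tag(filename):
    """Format the numbers as SxExx, e.g. S4E07."""
    found = season_episode(filename)
    if found is None:
        return None
    season, episode = found
    return f"S{season}E{episode:02d}"

samples = [
    "How I Met Your Mother - 407 - not a father's day.[Rocafresh].avi",
    "HIMYM - 424.avi",
    "aaf-km.s01e03.avi",
    "aaf-penn.and.teller.bullshit.s07e06.hdtv.xvid.avi",
    "dina-flashpoint204.avi",
    "farscape - 1x02 - I, E.T..avi",
    "knight.rider.2008.S01E04.HDTV.XviD-LOL.avi",
]
for name in samples:
    print(tag(name), "<-", name)
```

The show name and episode title are left out on purpose, since the samples place them too inconsistently for a short pattern to recover reliably.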

