Dots in title of tv series

A swapping-ground for Regular Expression syntax

Postby notused » Sat May 16, 2015 7:35 pm

Hi,

I'd like some assistance with some regex that's been bothering me for quite some time.
I'm trying to rename files with this format:
Code:
Pielikers - 01x01 - I Like Pie.DIMENSION.English.C.updated.Addic7ed.com.srt

This works fine using the following regex (match expression, then replacement):
Code:
(.*) - (\d\d)x(\d\d) - ([^\.]*)
\1 S\2E\3 \4


However some series titles contain ".", for example:
Code:
Person of Interest - 04x13 - M.I.A..DIMENSION.English.C.updated.Addic7ed.com.srt

Which leaves me with:
Code:
Person of Interest S04E13 M.srt
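
For anyone who wants to reproduce the truncation outside the renamer, here is a minimal Python sketch. It assumes the tool applies the expression to the name without its extension and builds the new name from the replacement only, which is what the result above suggests. The `rename` helper is a made-up name for illustration.

```python
import os
import re

# Match expression and replacement exactly as posted above.
PATTERN = re.compile(r"(.*) - (\d\d)x(\d\d) - ([^\.]*)")
REPLACEMENT = r"\1 S\2E\3 \4"

def rename(filename):
    """Hypothetical helper: rebuild the stem from the replacement only."""
    stem, ext = os.path.splitext(filename)  # ext is ".srt" here
    match = PATTERN.search(stem)
    if match is None:
        return filename  # leave non-matching names untouched
    # Assumption: anything the pattern did not capture is dropped.
    return match.expand(REPLACEMENT) + ext

print(rename("Pielikers - 01x01 - I Like Pie.DIMENSION.English.C.updated.Addic7ed.com.srt"))
print(rename("Person of Interest - 04x13 - M.I.A..DIMENSION.English.C.updated.Addic7ed.com.srt"))
```

The last group, `[^\.]*`, stops at the first dot it meets, so "M.I.A." is cut down to "M".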


The only characteristic I can find in ".DIMENSION.English.C.updated.Addic7ed.com.srt" is that the sequence starts with a dot and contains no whitespace at all.
Several parts of that sequence may change ("DIMENSION", "English", "C", "updated"), and more fields might be added to the file name in the future.
I could hardcode it all (for example, look for 6 dots at the end and take everything before them), but maybe some field also contains a dot...
Code:
(.*) - (\d\d)x(\d\d) - (.*)(\..*){6}

Is there a simple way to retrieve my title in both the general case and the special case where the title contains dots?
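
The "6 dots from the end" idea can also be sketched without a regex. This is only an illustration in Python, under the assumption that there are always exactly six dot-separated fields between the title and the extension. `TRAILING_FIELDS` and `strip_trailing_fields` are made-up names.

```python
import os

# Assumption: always six dot-separated fields between title and extension.
TRAILING_FIELDS = 6

def strip_trailing_fields(filename):
    """Hypothetical helper: drop the last six dot-separated fields of the stem."""
    stem, ext = os.path.splitext(filename)
    # rsplit counts dots from the right, so dots inside the title survive.
    parts = stem.rsplit(".", TRAILING_FIELDS)
    if len(parts) <= TRAILING_FIELDS:
        return filename  # fewer fields than expected, leave the name alone
    return parts[0] + ext

print(strip_trailing_fields(
    "Person of Interest - 04x13 - M.I.A..DIMENSION.English.C.updated.Addic7ed.com.srt"
))
```

This only removes the trailing fields; turning "04x13" into "S04E13" would still need the regex part.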

Thanks in advance!
notused
 
Posts: 4
Joined: Sat May 16, 2015 7:25 pm

Re: Dots in title of tv series

Postby Stefan » Tue May 19, 2015 11:56 am

Please post a few before/after file names, like


BEFORE:
Pielikers - 01x01 - I Like Pie.DIMENSION.English.C.updated.Addic7ed.com.srt
Person of Interest - 04x13 - M.I.A..DIMENSION.English.C.updated.Addic7ed.com.srt

AFTER:
?
?


so we can guess what you are after.

Stefan
 
Posts: 736
Joined: Fri Mar 11, 2005 7:46 pm
Location: Germany, EU

Re: Dots in title of tv series

Postby notused » Thu May 21, 2015 8:20 pm

BEFORE:
Pielikers - 01x01 - I Like Pie.DIMENSION.English.C.updated.Addic7ed.com.srt
Person of Interest - 04x13 - M.I.A..DIMENSION.English.C.updated.Addic7ed.com.srt

AFTER:
Pielikers S01E01 I Like Pie.srt
Person of Interest S04E13 M.I.A..srt

Re: Dots in title of tv series difficult to extract wanted parts

Postby Stefan » Fri May 22, 2015 7:27 am

Thanks. Now I even understand your first post :D
 

BEFORE:
Pielikers - 01x01 - I Like Pie.DIMENSION.English.C.updated.Addic7ed.com.srt
Person of Interest - 04x13 - M.I.A..DIMENSION.English.C.updated.Addic7ed.com.srt

AFTER:
Pielikers S01E01 I Like Pie.srt
Person of Interest S04E13 M.I.A..srt



RULE:
Match everything up to the first " - ": "(.+) - "
Match two digits + "x" + two digits + " - ": "(\d\d)x(\d\d) - "
Match everything up to ".DIMENSION", then the rest: "(.+)\.DIMENSION.+"


USE:
RegEx(1)
Match: "^(.+) - (\d\d)x(\d\d) - (.+)\.DIMENSION.+$"
Repla: "\1 S\2E\3 \4"

(use without the quotes)
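
As a rough check outside the renamer, the same match and replacement can be tried in Python. This sketch assumes the expression is applied to the name without its extension. `rename_dimension` is a made-up helper name.

```python
import os
import re

# Match and replacement as given above (without the quotes).
PATTERN = re.compile(r"^(.+) - (\d\d)x(\d\d) - (.+)\.DIMENSION.+$")
REPLACEMENT = r"\1 S\2E\3 \4"

def rename_dimension(filename):
    """Hypothetical helper: apply the rule to the stem, keep the extension."""
    stem, ext = os.path.splitext(filename)
    # The pattern is anchored with ^ and $, so sub() replaces the whole stem.
    # A stem without the literal ".DIMENSION" is returned unchanged.
    return PATTERN.sub(REPLACEMENT, stem) + ext

print(rename_dimension("Pielikers - 01x01 - I Like Pie.DIMENSION.English.C.updated.Addic7ed.com.srt"))
print(rename_dimension("Person of Interest - 04x13 - M.I.A..DIMENSION.English.C.updated.Addic7ed.com.srt"))
```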



 

Re: Dots in title of tv series

Postby notused » Fri May 22, 2015 5:37 pm

Thanks a lot already. However, I think you did not fully understand my post, which of course is my fault :D
The part after the episode title can change; it does not contain the same word every time (so it is not always DIMENSION).
Some more examples:

BEFORE:
Pielikers - 01x01 - I Like Pie.DIMENSION.English.C.updated.Addic7ed.com.srt
Person of Interest - 04x13 - M.I.A..DIMENSION.English.C.updated.Addic7ed.com.srt
Series Name - 02x02 - F.B.I..something.somethingmore.C.status.Addic7ed.com.srt
Silicon Valley - 02x05 - Server Space.IMMERSE.English.C.updated.Addic7ed.com.srt
Silicon Valley - 02x04 - The Lady.ASAP.English.C.updated.Addic7ed.com.srt
Silicon Valley - 02x03 - Bad Money.KILLERS.English.C.updated.Addic7ed.com.srt
Silicon Valley - 02x02 - Runaway Devaluation.ASAP.French.C.updated.Addic7ed.com.srt


AFTER:
Pielikers S01E01 I Like Pie.srt
Person of Interest S04E13 M.I.A..srt
Series Name S02E02 F.B.I..srt
Silicon Valley S02E05 Server Space.srt
Silicon Valley S02E04 The Lady.srt
Silicon Valley S02E03 Bad Money.srt
Silicon Valley S02E02 Runaway Devaluation.srt

Re: Dots in title of tv series

Postby Stefan » Fri May 22, 2015 7:49 pm

 

Now you tell me.




Can you find a common pattern across all file names? Perhaps the number of dots counted from the right?

If not, you have to sort the file names by pattern and rename each group separately.





 

Re: Dots in title of tv series

Postby notused » Sat May 23, 2015 12:08 pm

BEFORE:
Pielikers - 01x01 - I Like Pie.DIMENSION.English.C.updated.Addic7ed.com.srt
Person of Interest - 04x13 - M.I.A..DIMENSION.English.C.updated.Addic7ed.com.srt
Series Name - 02x02 - F.B.I..something.somethingmore.C.status.Addic7ed.com.srt
Silicon Valley - 02x05 - Server Space.IMMERSE.English.C.updated.Addic7ed.com.srt
Silicon Valley - 02x04 - The Lady.ASAP.English.C.updated.Addic7ed.com.srt
Silicon Valley - 02x03 - Bad Money.KILLERS.English.C.updated.Addic7ed.com.srt
Silicon Valley - 02x02 - Runaway Devaluation.ASAP.French.C.updated.Addic7ed.com.srt


AFTER:
Pielikers S01E01 I Like Pie.srt
Person of Interest S04E13 M.I.A..srt
Series Name S02E02 F.B.I..srt
Silicon Valley S02E05 Server Space.srt
Silicon Valley S02E04 The Lady.srt
Silicon Valley S02E03 Bad Money.srt
Silicon Valley S02E02 Runaway Devaluation.srt

RULE:
Match everything up to the first " - ", requiring at least one character (one-or-more): "(.+) - "
Match two digits + "x" + two digits + " - ": "(\d\d)x(\d\d) - "
Match everything, requiring at least one character (one-or-more): "(.+)"
Match "." + anything (none-or-more), 6 times: "(\..*){6}"

USE:
RegEx(1):
Code:
(.+) - (\d\d)x(\d\d) - (.+)(\..*){6}
\1 S\2E\3 \4


The number of dots (6) counted from the right is the only common pattern I can find. I already found a regex that uses it; see the explanation above.
Maybe there is a more elegant way to do this, since my regex-fu isn't that good :p
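
To check the expression against all seven names at once, here is a small Python sketch. It assumes the renamer works on the name without its extension and builds the new name from the replacement only. `rename_by_dot_count` is a made-up helper name.

```python
import os
import re

# Match and replacement from the code block above.
PATTERN = re.compile(r"(.+) - (\d\d)x(\d\d) - (.+)(\..*){6}")
REPLACEMENT = r"\1 S\2E\3 \4"

BEFORE = [
    "Pielikers - 01x01 - I Like Pie.DIMENSION.English.C.updated.Addic7ed.com.srt",
    "Person of Interest - 04x13 - M.I.A..DIMENSION.English.C.updated.Addic7ed.com.srt",
    "Series Name - 02x02 - F.B.I..something.somethingmore.C.status.Addic7ed.com.srt",
    "Silicon Valley - 02x05 - Server Space.IMMERSE.English.C.updated.Addic7ed.com.srt",
    "Silicon Valley - 02x04 - The Lady.ASAP.English.C.updated.Addic7ed.com.srt",
    "Silicon Valley - 02x03 - Bad Money.KILLERS.English.C.updated.Addic7ed.com.srt",
    "Silicon Valley - 02x02 - Runaway Devaluation.ASAP.French.C.updated.Addic7ed.com.srt",
]

def rename_by_dot_count(filename):
    """Hypothetical helper: keep the title, drop the last six dotted fields."""
    stem, ext = os.path.splitext(filename)
    match = PATTERN.search(stem)
    if match is None:
        return filename  # leave non-matching names untouched
    # The greedy (.+) title gives back characters until six "." groups fit.
    return match.expand(REPLACEMENT) + ext

for name in BEFORE:
    print(rename_by_dot_count(name))
```

The approach relies on the title group being greedy: it keeps as many characters as it can while still leaving six dot-led groups at the end, so dots inside the title stay in the title.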
Last edited by notused on Sat May 23, 2015 1:49 pm, edited 1 time in total.

Re: Dots in title of tv series

Postby Stefan » Sat May 23, 2015 1:26 pm

No, looks fine.

But you use the star quantifier, which means none-or-more of the expression on its left.
I used the plus quantifier, which means one-or-more; that could make a difference at some point.
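
A small Python sketch of where the two quantifiers can differ. The stem below is made up for illustration and ends in an empty field (a trailing dot); it is not one of the real file names.

```python
import re

# Same expression twice; only the quantifier inside the repeated group differs.
STAR = re.compile(r"^(.+) - (\d\d)x(\d\d) - (.+)(\..*){6}$")
PLUS = re.compile(r"^(.+) - (\d\d)x(\d\d) - (.+)(\..+){6}$")

# Hypothetical stem whose last dotted field is empty.
stem = "Some Show - 01x01 - Title.a.b.c.d.e."

print(STAR.match(stem) is not None)  # True: the last group may be just "."
print(PLUS.match(stem) is not None)  # False: every "." needs a character after it
```

On the file names posted in this thread both variants capture the same title, because none of the six trailing fields is empty.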

Also, maybe you can edit your post, as you have BBCode [-b-] for bold face in your last code block.
That may confuse others trying to adapt your solution for their own use.

If I find some time (over the weekend or next week) I will see if I find a better solution.
But if it works, it is fine. And it's already nifty enough; one has to think of that in the first place.

Well done.
