Remove everything after the last instance of '

Postby vengab » Sun Dec 20, 2009 3:10 pm

Hi all,

I need some help, please, with a regex for the following situation.

I have filenames like:

0016'Broadway It's Not' February 10, 1981. (-) ERIN MORAN;SCOTT BAIO' February 10, 1981. ( Pho.jpg

I need all characters removed AFTER the LAST instance of '
which in this particular example is everything after ...BAIO'

so it becomes

0016'Broadway It's Not' February 10, 1981. (-) ERIN MORAN;SCOTT BAIO.jpg

(I don't really care if the last ' stays or not - preferably not but it's easy enough to remove the last character in a separate step)

The number of characters after the last ' varies


In another step, I then need the 0016 at the start to be replaced with the year (1981)

so it becomes

1981'Broadway It's Not' February 10, . (-) ERIN MORAN;SCOTT BAIO.jpg

The rest of the clean-up is relatively straightforward,
so the final result will be

1981 'Broadway It's Not' February 10 - Erin Moran, Scott Baio.jpg


TIA

David
vengab
 
Posts: 6
Joined: Wed Jan 14, 2009 8:37 am

Re: Remove everything after the last instance of '

Postby Stefan » Sun Dec 20, 2009 4:08 pm

Hi David, excellent presentation of a question, thanks :wink:

Here is my attempt, as far as I have understood it...


FROM:
0016'Broadway It's Not' February 10, 1981. (-) ERIN MORAN;SCOTT BAIO' February 10, 1981. ( Pho.jpg
TO:
0016'Broadway It's Not' February 10, 1981. (-) ERIN MORAN;SCOTT BAIO.jpg

RegEx(1)
Match: (.+)'
Repla: \1

Explanation:
match anything, greedily, up to the last '
. means match any single character
+ means one or more of the preceding item, taking as many as possible
() means group the match for a backreference
\1 means insert the contents of group 1
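
To try this step outside the renamer, here is a minimal Python sketch of the same pattern. It assumes the extension is split off first, so the pattern only sees the name part (as the FROM/TO example suggests). The function name cut_after_last_quote is made up for illustration.

```python
import os
import re

def cut_after_last_quote(filename):
    # Assumption: work on the name without its extension, then re-attach it.
    stem, ext = os.path.splitext(filename)
    # Greedy (.+) grabs as much as possible, so the ' that follows it
    # is the LAST ' in the name.
    m = re.match(r"(.+)'", stem)
    if m is None:
        return filename  # no usable ' in the name: leave it unchanged
    return m.group(1) + ext

name = ("0016'Broadway It's Not' February 10, 1981. (-) ERIN MORAN;SCOTT BAIO"
        "' February 10, 1981. ( Pho.jpg")
print(cut_after_last_quote(name))
```

Because only group 1 is kept, the last ' itself is dropped along with everything after it.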



------------------

FROM:
0016'Broadway It's Not' February 10, 1981. (-) ERIN MORAN;SCOTT BAIO.jpg
TO:
'Broadway It's Not' February 10, 1981. (-) ERIN MORAN;SCOTT BAIO.jpg

RegEx(1)
Match: (.+?)('.+)
Repla: \2

Explanation:
match anything, non-greedily, up to the first '
. means match any single character
+? means one or more of the preceding item, but as few as possible, stopping as soon as the next term can match
() means group the match for a backreference
('.+) means everything else, including the ' sign
\2 means insert the contents of group 2, while we drop group \1
Hint: the first group \1 is not really needed here, but it does no harm either. So Match: .+?('.+) with Repla: \1 should work too.
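
The same step as a small Python sketch, again assuming the extension is handled separately. The name strip_before_first_quote is hypothetical.

```python
import os
import re

def strip_before_first_quote(filename):
    # Assumption: the pattern is applied to the name without its extension.
    stem, ext = os.path.splitext(filename)
    # Lazy (.+?) takes as few characters as it can, so group 2
    # starts at the FIRST ' in the name.
    m = re.match(r"(.+?)('.+)", stem)
    if m is None:
        return filename  # pattern does not apply: leave the name unchanged
    return m.group(2) + ext

name = "0016'Broadway It's Not' February 10, 1981. (-) ERIN MORAN;SCOTT BAIO.jpg"
print(strip_before_first_quote(name))
```

Note that (.+?) still needs at least one character, so a name that already starts with ' is cut at its second ' instead.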

---------------

Bring the year to the beginning:
I hope there is exactly ONE 4-digit number in your string?

FROM:
'Broadway It's Not' February 10, 1981. (-) ERIN MORAN;SCOTT BAIO.jpg
TO:
1981'Broadway It's Not' February 10, . (-) ERIN MORAN;SCOTT BAIO.jpg

RegEx(1)
Match: (.+?)(\d{4})(.+)
Repla: \2\1\3

Explanation:
match anything, non-greedily, up to the first 4-digit number
(.+?) = match anything non-greedily up to 1981 ====> \1 will hold "'Broadway It's Not' February 10, "
(\d{4}) = a number like '1981'; \d means a digit, {4} means four of them ====> \2 will hold "1981"
(.+) = you should know this one already ====> \3 will hold the rest: ". (-) ERIN MORAN;SCOTT BAIO"

Now build the replacement from the backreference groups in any order you like,
e.g. as you asked for: \2\1\3
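
A minimal Python sketch of the reordering, under the same assumption that the extension is set aside first. The helper name year_to_front is invented for this example.

```python
import os
import re

def year_to_front(filename):
    # Assumption: the pattern only sees the name without its extension.
    stem, ext = os.path.splitext(filename)
    # Lazy (.+?) stops at the first run of four digits; (.+) takes the rest.
    m = re.match(r"(.+?)(\d{4})(.+)", stem)
    if m is None:
        return filename  # no 4-digit number with text around it
    # Re-emit the groups in the order \2\1\3: year, text before, text after.
    return m.expand(r"\2\1\3") + ext

name = "'Broadway It's Not' February 10, 1981. (-) ERIN MORAN;SCOTT BAIO.jpg"
print(year_to_front(name))
```

If the name contains more than one run of four digits, the first one is moved, which is why a single 4-digit number per name matters.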



I hope I have covered all of your question.
I think the first two RegExes can be combined into one... but first test whether they work the way you want.
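
One possible combination, sketched in Python. The pattern .+?('.+)' is an assumption and was not tested in the renamer itself. The function name trim_both_ends is made up.

```python
import os
import re

def trim_both_ends(filename):
    # Assumption: the pattern is applied to the name without its extension.
    stem, ext = os.path.splitext(filename)
    # Lazy .+? skips everything before the first ';
    # greedy ('.+) then runs up to, but not including, the last '.
    m = re.match(r".+?('.+)'", stem)
    if m is None:
        return filename  # fewer than two usable ' signs: leave unchanged
    return m.group(1) + ext

name = ("0016'Broadway It's Not' February 10, 1981. (-) ERIN MORAN;SCOTT BAIO"
        "' February 10, 1981. ( Pho.jpg")
print(trim_both_ends(name))
```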

See these older threads for a RegEx syntax overview:
=> Getting Started: http://www.bulkrenameutility.co.uk/forum/viewtopic.php?f=3&t=5
=> Go ahead: http://www.bulkrenameutility.co.uk/forum/viewtopic.php?f=3&t=27


HTH :lol:
If yes: please help two others too.
Stefan
 
Posts: 736
Joined: Fri Mar 11, 2005 7:46 pm
Location: Germany, EU

Re: Remove everything after the last instance of '

Postby vengab » Sun Dec 20, 2009 4:44 pm

Hi Stefan,

Thank you so very much for the super fast reply :D

and... all your regex's worked like a charm :D :D :D


David

Re: Remove everything after the last instance of '

Postby vengab » Sun Dec 27, 2009 6:00 pm

Hi all,

I have another dilemma...

I have
01-03-78'Grandpa's Visit' Henry Winkler, Danny Thomas 0558.jpg

I need to change the dates to yy-mm-dd so 01-03-78 at the start becomes 78-03-01

any help will be greatly appreciated

TIA

David

Re: Remove everything after the last instance of '

Postby Stefan » Sun Dec 27, 2009 10:34 pm

Hi David,

it's always the same story: split the file name into parts and reorder them as you need.

>01-03-78'Grandpa's Visit' Henry Winkler, Danny Thomas 0558.jpg

So catch: (two digits)dash(two digits)dash(two digits)(the rest...)
RegEx(1)
Match: (\d\d)-(\d{2})-(\d\d)(.+)
\1 holds: 01
\2 holds: 03
\3 holds: 78
\4 holds: 'Grandpa's Visit' Henry Winkler, Danny Thomas 0558


>I need to change the dates to yy-mm-dd so 01-03-78 at the start becomes 78-03-01
Repla: \3-\2-\1\4
Result: 78-03-01'Grandpa's Visit' Henry Winkler, Danny Thomas 0558
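
The date swap can be checked with a short Python sketch. It assumes the extension is split off before matching, and the name reorder_date is hypothetical.

```python
import os
import re

def reorder_date(filename):
    # Assumption: the pattern is applied to the name without its extension.
    stem, ext = os.path.splitext(filename)
    # Three two-digit groups separated by dashes, then the rest of the name.
    m = re.match(r"(\d\d)-(\d{2})-(\d\d)(.+)", stem)
    if m is None:
        return filename  # name does not start with dd-mm-yy: leave unchanged
    # Swap the first and third groups: dd-mm-yy becomes yy-mm-dd.
    return m.expand(r"\3-\2-\1\4") + ext

name = "01-03-78'Grandpa's Visit' Henry Winkler, Danny Thomas 0558.jpg"
print(reorder_date(name))
```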


untested, HTH :lol:
If yes: please help two others too.




Re: Remove everything after the last instance of '

Postby vengab » Mon Dec 28, 2009 8:00 am

Hi Stefan,

Thank you for another quick reply and easy to understand explanation of what is going on

Your solution works :D


David

