Tricky Regular Expression (Date Conversion)

A swapping-ground for Regular Expression syntax

Postby Leywalker » Tue Jun 13, 2006 9:20 am

Hi all. :D

I have a rather tricky task to do.

I have to rename many files (in different folders and subfolders) on our server from:

name dd.mm.yy
name ddmmyy

to:

yyyy-mm-dd name

I can't use the "Auto Date" feature, because many files are named with a date
that differs from their creation date.

One more tricky thing: the years 97 to 99 must become 1997 to 1999, not 2097 to 2099.

The files are on a network drive that is mapped to a drive letter on my desktop.

I look forward to your reply. Thanks in advance.
Leywalker
 
Posts: 1
Joined: Tue Jun 13, 2006 8:47 am

Postby spambait » Tue Jun 13, 2006 3:37 pm

Looks easy enough, and if you read this forum, you will find a lot of similar examples to draw from.

For "name dd.mm.yy" format, try this:

Match:
(.*) (..).(..).(..)

Replace:
19\4-\3-\2 \1

If the date is in 2000 or later, be sure to replace 19\4 with 20\4. If you have a mixture of dates, you will have to run the program twice, separating the 1900s files from the 2000s files by hand.

For "name ddmmyy", you need to drop the periods:

Match:
(.*) (..)(..)(..)

Replace:
19\4-\3-\2 \1
spambait
 
Posts: 16
Joined: Tue Jan 03, 2006 5:06 pm

Postby Admin » Tue Jun 13, 2006 5:22 pm

Many thanks for helping out! I've been very busy for the last day or two, and haven't had time to check the forums.

Cheers,



Jim
Admin
Site Admin
 
Posts: 2343
Joined: Tue Mar 08, 2005 8:39 pm

Postby Glenn » Thu Jun 15, 2006 10:16 am

One thing that is probably just an oversight (or typo) in the suggested pattern:
(.*) (..).(..).(..)
is that the periods outside the parentheses match any character except a line break (just like those inside the parentheses), because an unescaped period in a regex is a wildcard metacharacter. As it stands, the pattern will match any filename containing a space followed by 8 or more characters.
For example, it will match:
name 21.08.04

but it will also match:
name 21.08.04extracrap
name dd.mm.yy evenmore
name namename
 8letters
(the line above starts with a space before "8letters", which may not display in the post)
t 8Letters_extraletters

For any name with more than 8 characters after the space, the extra characters are dropped and only the captured ones are used in the replacement.
In the last case, "t 8Letters_extraletters", it will match "t 8Letters" and drop the "_extraletters".
With the replacement 20\4-\3-\2 \1, this would be renamed to "20rs-tt-8L t".
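To see the over-matching concretely, here is a minimal Python sketch. It is only an illustration, not the renaming program itself: the `rename` helper is hypothetical, and it assumes the whole name is replaced by the expanded template (which is why trailing characters disappear, as described above). Python's `re` module happens to accept the same `\1` to `\4` backreferences in the replacement.

```python
import re

# The pattern as first posted: every "." is a wildcard,
# both inside and outside the parentheses.
LOOSE = re.compile(r"(.*) (..).(..).(..)")
TEMPLATE = r"20\4-\3-\2 \1"

def rename(pattern, template, name):
    """Hypothetical helper: return the new name, or None if there is no match.

    Assumption: the whole name is replaced by the expanded template,
    so anything after the matched part is dropped.
    """
    m = pattern.match(name)
    return m.expand(template) if m else None

for name in ["name 21.08.04", "name namename",
             "t 8Letters_extraletters", "name 1234567"]:
    print(repr(name), "->", rename(LOOSE, TEMPLATE, name))
```

The first three names all match, including the two that contain no date at all; only the last one, with just 7 characters after the space, is left alone.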

Obviously spambait meant a literal period, which is written \.
The regex would then be:
(.*) (..)\.(..)\.(..)
Now it will only match:
name 21.08.04
name 21.08.04extracrap
name dd.mm.yy
name dd.mm.yy evenmore
To reject non-digits in the date positions, use \d instead of . inside the parentheses.
To prevent matching filenames that have anything after the six date digits, end the regex with $
i.e. use:
(.*) (\d\d)\.(\d\d)\.(\d\d)$
Finally, a simple way to allow it to match with or without the period separators is to make them optional with ?

The regex will now be:
(.*) (\d\d)\.?(\d\d)\.?(\d\d)$

This will match:
name 21.08.04
name 210804

and replace them both with:
2004-08-21 name
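A quick way to check this pattern outside the renamer is a small Python sketch. The `convert` helper is hypothetical and assumes the whole name is replaced by the expanded template; the 20\4 prefix is used here because the example date is in 2004.

```python
import re

# Final pattern: digits only, optional literal periods, anchored at the end.
STRICT = re.compile(r"(.*) (\d\d)\.?(\d\d)\.?(\d\d)$")

def convert(name):
    """Hypothetical helper: return the converted name, or None if no match."""
    m = STRICT.match(name)
    return m.expand(r"20\4-\3-\2 \1") if m else None

print(convert("name 21.08.04"))           # 2004-08-21 name
print(convert("name 210804"))             # 2004-08-21 name
print(convert("name 21.08.04extracrap"))  # None (rejected by $)
print(convert("name dd.mm.yy"))           # None (rejected by \d)
print(convert("name 21.0804"))            # 2004-08-21 name
```

The last call shows one side effect of making each period optional on its own: a name with only one of the two periods is accepted as well.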

If you really want the Regex to break a sweat and earn its keep, do the first pass with:
(.*) (\d\d)\.?(\d\d)\.?(?!9[7-9])(\d\d)$
(the negative lookahead (?!9[7-9]) tells it NOT to match when the year is 97-99)
and replace with:
20\4-\3-\2 \1

This will change all filenames whose year is NOT 97-99 to the yyyy-mm-dd name format, so they won't match on the second pass.
For the second pass, all that are left are the 97-99 files, so use:
(.*) (\d\d)\.?(\d\d)\.?(\d\d)$
and replace with:
19\4-\3-\2 \1
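The two passes can be simulated in Python as a rough sketch. The `two_pass` helper is hypothetical; it assumes each pass replaces the whole name with the expanded template and that the second pass then runs over whatever names the first pass produced.

```python
import re

# Pass 1: everything except years 97-99 gets the 20 prefix.
PASS_1 = (re.compile(r"(.*) (\d\d)\.?(\d\d)\.?(?!9[7-9])(\d\d)$"),
          r"20\4-\3-\2 \1")
# Pass 2: whatever still matches (years 97-99) gets the 19 prefix.
PASS_2 = (re.compile(r"(.*) (\d\d)\.?(\d\d)\.?(\d\d)$"),
          r"19\4-\3-\2 \1")

def two_pass(name):
    """Hypothetical helper: apply both passes in order to one name."""
    for pattern, template in (PASS_1, PASS_2):
        m = pattern.match(name)
        if m:
            name = m.expand(template)
    return name

print(two_pass("name 210804"))    # 2004-08-21 name
print(two_pass("name 15.03.98"))  # 1998-03-15 name
print(two_pass("name 311299"))    # 1999-12-31 name
print(two_pass("notes"))          # notes (no date, unchanged)
```

A name converted in the first pass normally no longer ends in a space plus six digits, so the second pass skips it. Note that any year outside 97-99, including 96 and earlier, receives the 20 prefix under this scheme.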

This is still not bulletproof, but I think it's closer to what was intended.
Sorry if I made it more confusing, but it's easy to match more than wanted with these darn regexes.

Cordially,
Glenn
Last edited by Glenn on Sat Jun 17, 2006 5:11 pm, edited 1 time in total.
Glenn
 
Posts: 28
Joined: Fri Apr 14, 2006 4:53 pm
Location: Winnipeg, Canada

Postby spambait » Thu Jun 15, 2006 3:08 pm

First of all, you are correct, I should have used (.*) (..)\.(..)\.(..)
That was an oversight. Secondly, I am in awe of your RegExp geekness <bows>.
spambait
 
Posts: 16
Joined: Tue Jan 03, 2006 5:06 pm

Postby bobkoure » Thu May 31, 2007 3:24 pm

Nice that you found a use for negative lookahead - kudos!
bobkoure
 
Posts: 16
Joined: Mon Jul 10, 2006 5:38 pm

