by Luuk » Fri Sep 09, 2022 10:50 am
Greetings Panchdara.
It's unfortunate, but this regex is probably not a good example to study from, so here are some descriptions for...
(?:(?<=^)|(?<=[- ]))(\d\d)\.(\d\d)\.([012]\d)(?=[- ]|$)(?X)(.*)(?<![- ])[- ]*:(20\d\d-\d\d-\d\d)[- ]*(?X)( +$| (?= ))
:20$3-$1-$2(?X)$2 $1 (?X)
It's 3 different Match/Replaces separated by (?X), using look-arounds, and they run in this order...
Match/Replace-1 converts mm.dd.yy ===> :20yy-mm-dd
(?:(?<=^)|(?<=[- ]))(\d\d)\.(\d\d)\.([012]\d)(?=[- ]|$)
:20$3-$1-$2
The first two look-behinds say that to the left of the first \d\d there must be either the very beginning, a '-', or a space.
The look-ahead at the end says that to the right of [012]\d there must be either a '-', a space, or the very end.
Look-behinds use < to point left, but they forbid "|" unless both sides have exactly the same number of characters.
A look-behind like (?<=abc|def) works properly, but (?<=abc|de) presents 'Regular Expression RegEx(1) is invalid' !!
So this is why I'm making one large group at the very beginning, to hold the 2 look-behinds inside, separated by the '|' character.
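If you want to see the same look-behind rule outside of the renamer, here is a small Python sketch. Python's re module is only a stand-in here (it is not the renamer's engine), but it happens to enforce the same fixed-width rule, so it is handy for quick experiments. The test strings are made up.

```python
import re

# Both alternatives are 3 characters long, so this compiles.
re.compile(r"(?<=abc|def)x")

# 3 characters versus 2 characters: Python's re rejects this as well.
try:
    re.compile(r"(?<=abc|de)x")
    unequal_ok = True
except re.error:
    unequal_ok = False

# The workaround from the post: one look-behind per alternative,
# joined by '|' inside a non-capturing group.
workaround = re.compile(r"(?:(?<=^)|(?<=[- ]))(\d\d)")

print(unequal_ok)                        # False
print(workaround.search("a 12") is not None)  # True, a space is to the left
print(workaround.search("a12") is not None)   # False, 'a' is to the left
```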
It would have been much easier to just use \b instead of look-arounds, but I'm only trying to match names like the ones in the samples.
I'm only putting ?: inside this first group so that it doesn't use up a group number, because I want $1 to be the first (\d\d).
Look-aheads don't care how many characters are on each side of the '|', so this is why the ending look-ahead is just one group.
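To try Match/Replace-1 by itself, here is a rough Python sketch. It is only an approximation: Python's re module stands in for the renamer's engine, the replace is written with \3 instead of $3, and the filenames are invented for illustration.

```python
import re

# Match-1 exactly as written in the post.
STEP1 = re.compile(r"(?:(?<=^)|(?<=[- ]))(\d\d)\.(\d\d)\.([012]\d)(?=[- ]|$)")

def step1(name):
    # \1 = mm, \2 = dd, \3 = yy  ->  :20yy-mm-dd
    return STEP1.sub(r":20\3-\1-\2", name)

print(step1("Holiday - 03.15.21"))  # Holiday - :2021-03-15
print(step1("v1.02.03.04"))         # unchanged, the digits follow a '.'
print(step1("03.15.35"))            # unchanged, [012]\d rejects '35'
```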
Match/Replace-2 looks for :20yy-mm-dd, moves the date to the front, and replaces its bordering hyphens or spaces with 1-space.
(.*)(?<![- ])[- ]*:(20\d\d-\d\d-\d\d)[- ]*
$2 $1\x20
The look-behind is just looking at (.*) to make sure that its last character is not a hyphen or a space.
This way, all of the bordering hyphens and spaces will get matched by [- ]* and can be replaced with 1-space.
The \x20 just means 'space', and it's written that way because the forum software will not keep any lines ending with spaces.
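Here is a rough Python sketch of Match/Replace-2 on its own, fed with names that look like the output of Match/Replace-1. Same caveats as before: Python's re is only a stand-in, the replace uses \2 and \1 instead of $2 and $1 (with a plain trailing space instead of \x20), and the names are made up.

```python
import re

# Match-2 exactly as written in the post.
STEP2 = re.compile(r"(.*)(?<![- ])[- ]*:(20\d\d-\d\d-\d\d)[- ]*")

def step2(name):
    # \2 = the date, \1 = the text before it, then one trailing space.
    return STEP2.sub(r"\2 \1 ", name)

print(repr(step2("Holiday - :2021-03-15 - Paris")))  # '2021-03-15 Holiday Paris'
print(repr(step2("Holiday - :2021-03-15")))          # '2021-03-15 Holiday '
print(repr(step2(":2021-03-15 - Paris")))            # '2021-03-15  Paris'
```

The last two results show the leftover spaces (one at the end, or two in a row when $1 is empty) that the next Match/Replace has to clean up.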
Match/Replace-3 deletes any extra spaces that Replace-2 might insert, like when there wasn't any text for $1 to match.
( +$| (?= ))
So it deletes 1-or-more spaces at the end, or 1-space when the look-ahead sees another space to its right.
Really you could just leave this part out, by using Remove(5) with "D/S" and "Trim" checked.
But usually I'm only trying to rename files like the ones in the samples, so this Match-3 was an exception.
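To watch all three Match/Replaces run in order, here is a rough Python sketch. It assumes each step is applied to the result of the previous one, and re.sub replaces every match it finds, which may not be exactly how the renamer behaves. The filenames are invented.

```python
import re

# (match, replace) pairs in the same order as the post.
# Python writes \1 where the renamer writes $1.
STEPS = [
    (r"(?:(?<=^)|(?<=[- ]))(\d\d)\.(\d\d)\.([012]\d)(?=[- ]|$)", r":20\3-\1-\2"),
    (r"(.*)(?<![- ])[- ]*:(20\d\d-\d\d-\d\d)[- ]*", r"\2 \1 "),
    (r"( +$| (?= ))", ""),
]

def rename(name):
    for pattern, repl in STEPS:
        name = re.sub(pattern, repl, name)
    return name

print(rename("Holiday - 03.15.21 - Paris"))  # 2021-03-15 Holiday Paris
print(rename("03.15.21 - Paris"))            # 2021-03-15 Paris
print(rename("Holiday-03.15.21"))            # 2021-03-15 Holiday
```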
==================================================================================================
If you want to be very, very careful about never editing anything unless it was already edited by your very first regex...
You can put an illegal character like ':' inside Replace-1, then just make sure that your other matches look for it.
Then when all of your Match/Replaces are finished, just add a final match to remove the illegal characters.
So if someone wanted to keep their extra spaces, I would instead do it more like...
(.*)(?<![- ])[- ]*(?:(?<=^)|(?<=[- ]))(\d\d)\.(\d\d)\.([012]\d)(?=[- ]|$)[- ]*(?X) +:(?!$)(?X) +:
20$4-$2-$3 $1 :(?X) (?X)
It's still 3 different Match/Replaces separated by (?X), using look-arounds, and they run in this order...
Match/Replace-1 converts mm.dd.yy ===> 20yy-mm-dd and moves it to the front.
(.*)(?<![- ])[- ]*(?:(?<=^)|(?<=[- ]))(\d\d)\.(\d\d)\.([012]\d)(?=[- ]|$)[- ]*
20$4-$2-$3 $1 :
This one is a combination of the other Match/Replace-1 and Match/Replace-2, so its "Replace" just adds ' :' to the end.
This way, any future Match/Replaces will look for : so they can only edit text that was first edited by Match/Replace-1.
In the other example, the Match/Replace-1 inserted : and then Match/Replace-2 looks for it, but also removes it.
So everything after that could possibly edit other names, if they had double-spaces or a space at the very end.
It's still always possible that $1 is empty, and this would still produce SpaceSpace: after the date.
And if a filename doesn't have any text after the date, then the Space: ends up at the very end of the filename.
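Here is a rough Python sketch of this combined Match/Replace-1, showing where the ':' marker lands in each case. As before, Python's re is only a stand-in, the replace uses \4 instead of $4, and the filenames are invented.

```python
import re

# Combined Match-1 exactly as written in the post.
STEP1 = re.compile(
    r"(.*)(?<![- ])[- ]*(?:(?<=^)|(?<=[- ]))(\d\d)\.(\d\d)\.([012]\d)(?=[- ]|$)[- ]*"
)

def step1(name):
    # \1 = text before the date, \2 = mm, \3 = dd, \4 = yy
    return STEP1.sub(r"20\4-\2-\3 \1 :", name)

print(repr(step1("Holiday - 03.15.21 - Paris")))  # '2021-03-15 Holiday :Paris'
print(repr(step1("03.15.21 - Paris")))            # '2021-03-15  :Paris'
print(repr(step1("Holiday - 03.15.21")))          # '2021-03-15 Holiday :'
```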
Match/Replace-2 converts 1-or-more spaces: ===> 1-space.
\x20+:(?!$)
\x20
The forum software is also prejudiced against lines that begin with a space, so just remember that '\x20' == 'space' in regex.
The look-ahead says to not match at the end of filenames, because those should be deleted, instead of converted into 1-space.
Match/Replace-3 deletes 1-or-more spaces: at the end of filenames.
\x20+:
When Match/Replace-2 finishes, the only remaining 1-or-more spaces: would have to be at the end of filenames, so this deletes them.
These last two Match/Replaces can never touch unrelated files, because they're always looking for the : inserted by Match/Replace-1.
If you use Remove(5) with "D/S" and "Trim", then you only need Match/Replace-1 without any (?X), and you can just remove the ':' from the replace.
I was really just trying to show how adding an illegal character can help another Match/Replace continue at the same place.
This way, you never have to depend on other settings that might change too many other filenames.
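To compare the two versions side by side, here is a rough Python sketch that runs both chains on a dated name and on an unrelated name with extra spaces. The chain names, the helper function, and the filenames are all invented for illustration, and Python's re is only a stand-in for the renamer.

```python
import re

# First version: step 2 removes the ':' marker, so step 3 matches any extra spaces.
CHAIN_A = [
    (r"(?:(?<=^)|(?<=[- ]))(\d\d)\.(\d\d)\.([012]\d)(?=[- ]|$)", r":20\3-\1-\2"),
    (r"(.*)(?<![- ])[- ]*:(20\d\d-\d\d-\d\d)[- ]*", r"\2 \1 "),
    (r"( +$| (?= ))", ""),
]

# Second version: the ':' marker survives step 1, so steps 2 and 3 only match next to it.
CHAIN_B = [
    (r"(.*)(?<![- ])[- ]*(?:(?<=^)|(?<=[- ]))(\d\d)\.(\d\d)\.([012]\d)(?=[- ]|$)[- ]*",
     r"20\4-\2-\3 \1 :"),
    (r"\x20+:(?!$)", " "),
    (r"\x20+:", ""),
]

def run(chain, name):
    for pattern, repl in chain:
        name = re.sub(pattern, repl, name)
    return name

print(repr(run(CHAIN_B, "Holiday - 03.15.21 - Paris")))  # '2021-03-15 Holiday Paris'
print(repr(run(CHAIN_A, "My  notes ")))                  # 'My notes'
print(repr(run(CHAIN_B, "My  notes ")))                  # 'My  notes '
```

The name without a date keeps its extra spaces in the second version, because nothing ever inserted a ':' for the later steps to find.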
==================================================================================================
The number of (?X) can also be very important, because if using something like...
Match--1(?X)Match--2(?X)Match--3(?X)Match--4(?X)Match--5
Replace1(?X)Replace2
Match/Replace--1 and Match/Replace--2 will work properly, but Match--3 thru Match--5 would all use Replace2 !!
If you add one (?X) after Replace2, then whatever Match--3 thru Match--5 find would be deleted, because the replace after that (?X) is empty.
This is why the replace in both of my long regexes ends with (?X), because otherwise the last match gets replaced by the previous replace, instead of being removed.
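The pairing rule can be modelled with a few lines of Python. This is only a toy model of the behaviour described in this post (split both boxes on (?X), and reuse the last replace when there are more matches than replaces), not the renamer's actual code. The function name and the single-letter patterns are invented.

```python
import re

def apply_chain(match_box, replace_box, name):
    matches = match_box.split("(?X)")
    replaces = replace_box.split("(?X)")
    for i, pattern in enumerate(matches):
        # Assumption taken from the post: when the replaces run out,
        # the last one is reused for every remaining match.
        repl = replaces[min(i, len(replaces) - 1)]
        name = re.sub(pattern, repl, name)
    return name

# Two replaces for three matches: 'c' also gets the second replace.
print(apply_chain("a(?X)b(?X)c", "1(?X)2", "abc"))      # 122

# A trailing (?X) adds an empty third replace, so 'c' is deleted.
print(apply_chain("a(?X)b(?X)c", "1(?X)2(?X)", "abc"))  # 12
```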
A good way to experiment is to type in just the very first Match/Replace-1, without actually renaming anything.
Then look at the new-name column for clues about how the next Match/Replace-2 should work.
For myself, I often use another application to change the text size and color, so it's much easier to read.