transforming long date suffix to yyyymmdd prefix

Postby Vn5eOJK » Sun Oct 02, 2022 6:16 am

I have files with various long date formats at the end, after a bullet symbol (ALT+0149 on the numpad), e.g. stuff like;

Some_title_text •September_2022.ext
Some_title_text •September-October_2022.ext
Some_title_text •Sept-Oct_2022.ext
Some_title_text •September_15_2022.ext
Some_title_text •15_September_2022.ext
Some_title_text •8_September_2022.ext
Some_title_text •Issue_15.ext
Some_title_text • more_text •Issue_15.ext

and my desired renaming is;

202209 Some_title_text.ext
202209 Some_title_text.ext
202209 Some_title_text.ext
20220915 Some_title_text.ext
20220915 Some_title_text.ext
20220908 Some_title_text.ext
Some_title_text •Issue_15.ext
Some_title_text • more_text •Issue_15.ext

Note;

- there may be multiple bullets but the date if it exists is always at the end
- the date may be in multiple formats with a leading or middle day if it exists
- there may be text after the bullet that uses a scheme other than calendar dating to identify a volume in series

Ideally I'm trying for a PCRE v1 compatible solution so I can use it in BRC in a bat file, but I may resort to buying a commercial license and using JavaScript code.

I've got code that gets year and month to the front, but am stuck on the regex to use to get the day. Specifically examples 4-6. I've gotten these to;

202209 Some_title_text •_15.ext
202209 Some_title_text •15_.ext
202209 Some_title_text •8_.ext

but am having trouble with the matching/cutting/pasting/replacing to get the day to the 7th position and ideally zero-filled.

My regex is rusty, but the bullet symbol seems to be causing me a problem, even if I'm not trying to parse/split on it and just try to look at the last 2-3 characters. I could be wrong.

I've been trying this in various ways with BRC 1.3.3 and BRU 3.4.4 on Win 7 x64 and Win 10 x64
Vn5eOJK
 
Posts: 11
Joined: Sat Jan 19, 2019 2:41 pm

Re: transforming long date suffix to yyyymmdd prefix

Postby Vn5eOJK » Mon Oct 03, 2022 4:30 pm

I ended up purchasing the license and using Javascript to do this. I don't see a way to delete or edit the original post.

Re: transforming long date suffix to yyyymmdd prefix

Postby therube » Mon Oct 03, 2022 4:45 pm

You could post the JS code you used to accomplish your task (which may help others) ;-).
therube
 
Posts: 1319
Joined: Mon Jan 18, 2016 6:23 pm

Convert •dates with month-words into prefix of yyyymmdd

Postby Luuk » Mon Oct 03, 2022 10:59 pm

For brc64.exe, the /RegExp: will forbid more than 9-groups, and also any groups like (?:), so it needs to be a very long command like...
Code: Select all
"C:\Path\To\Your\brc64.exe" /Dir:"C:\Path\To\MainFolder" /Pattern:*•* /Recursive /NoFolders /ReplaceCI:•:"|"  /RegExp:"(.+) (\|(\d{0,2}[-_])?(Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|Sept|September|Nov|November|Dec|December)([-_](Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|Sept|September|Nov|November|Dec|December))?([-_]\d{0,2})?[-_](19|20)\d\d)(.*):\2 \1\9" /RegExp:"^(\|)(\d)([-_])([^-_]+)(.*):\1\4\30\2\5" /RegExp:"^(\|)(\d\d)([-_])([^-_]+)(.*):\1\4\3\2\5" /Regexp:"^(\|[^-_]+[-_])(\d[-_].*):\10\2" /RegExp:"^(\|[a-zA-Z]+)[-_][^\d]+([-_].*):\1\2" /RegExp:"^(\|)Jan(uary)?[-_](.*):\101\3" /RegExp:"^(\|)Feb(ruary)?[-_](.*):\102\3" /RegExp:"^(\|)Mar(ch)?[-_](.*):\103\3" /RegExp:"^(\|)Apr(il)?[-_](.*):\104\3" /RegExp:"^(\|)May[-_](.*):\105\3" /RegExp:"^(\|)June?[-_](.*):\106\3" /RegExp:"^(\|)July?[-_](.*):\107\3" /RegExp:"^(\|)Aug(ust)?[-_](.*):\108\3" /RegExp:"^(\|)Sept(ember)?[-_](.*):\109\3" /RegExp:"^(\|)Oct(ober)?[-_](.*):\110\3" /RegExp:"^(\|)Nov(ember)?[-_](.*):\111\3" /RegExp:"^(\|)Dec(ember)?[-_](.*):\112\3"  /RegExp:"(^\|\d{4})[-_](\d{4}.*):\1\2"  /RegExp:"(^\|)(\d{2,4})(\d{4})(.*):\1\3\2\4"  /Regexp:"^\|(.+):\1"  /ReplaceCI:"|:•"

If adding /Execute to the end, it renames like...
Title1 •September_2022.txt ----------------> 202209 Title1.txt
Title2 •September-October_2022.txt ------> 202209 Title2.txt
Title3 •Sept-Oct_2022.jpg ------------------> 202209 Title3.jpg
Title4 •September_1_2022.txt -------------> 20220901 Title4.txt
Title5 •25_November_2022.txt ------------> 20221125 Title5.txt
Title6 •4_October_2022.txt ----------------> 20221004 Title6.txt
Title7 •8_September_2022 •Issue1.txt ---> 20220908 Title7 •Issue1.txt
Title8 •Issue_15.txt -------------------------> (no changes, because no •date)
Title9 • more_text •Issue_15.txt ----------> (no changes, because no •date)

But would 1st need to edit both /Dir: and /Pattern: and also the path at the very beginning, to conduct properly.
If not all of the month-words are getting matched, could just add some more different spellings to be matched.
Or if needing any help to modify, can just say some example-abbreviations or spellings for the month-words.

The command is a little longer than it needs to be, just because Im making it an easier command to experiment with.
The /RegExp: wont like any bullet-characters, so they must be converted into '|', before /RegExp: will conduct normally.
The rest just matches your different conditions, replaces month-words with numbers, and move them like the examples.

If your names have any other extended-ASCII chars, they would also need to be replaced like above, before using the /RegExp:.
So this is actually much easier to conduct with either the javascript or bru, because not having all these limitations.
If the preview on the command-line looks correct, you could add /Execute to the very end, to conduct any renames.

If needing help to modify anything, or explanations about the different /RegExp:, can always post back again.
If the command already has /Execute and its ready to put in a batch, the batch first needs a line like...
chcp 1252

This lets the batch handle extended-ASCII chars properly, just like the command-line handles them.
It's unfortunate, but modern text-editors don't default to an extended-ASCII encoding when creating batch-files.

========================== Javascript ==========================================

For javascript, it can be more like...
Code: Select all
step1=name.replace(/(.+) •((\d{0,2}[-_])?(Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|((Sept|Nov|Dec)(ember)?))([-_](Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|Sept|September|Nov|November|Dec|December))?([-_]\d{0,2})?[-_](19|20)\d\d)/, ':$2 $1')
step2=step1.replace(/(:)(\d)([-_])([^-_]+)/, '$1$4$30$2')
step3=step2.replace(/(:)(\d\d)([-_])([^-_]+)/, '$1$4$3$2')
step4=step3.replace(/(:[^-_]+[-_])(\d[-_])/, '$10$2')
step5=step4.replace(/(:[a-zA-Z]+)[-_][^\d]+([-_])/, '$1$2')
jan=step5.replace(/(:)Jan(uary)?[-_]/, '$1'+'01');  feb=jan.replace(/(:)Feb(ruary)?[-_]/, '$1'+'02')
mar=feb.replace(/(:)Mar(ch)?[-_]/, '$1'+'03');      apr=mar.replace(/(:)Apr(il)?[-_]/, '$1'+'04')
may=apr.replace(/(:)May[-_]/, '$1'+'05');          jun=may.replace(/(:)June?[-_]/, '$1'+'06')
jul=jun.replace(/(:)July?[-_]/, '$1'+'07');         aug=jul.replace(/(:)Aug(ust)?[-_]/, '$1'+'08')
sep=aug.replace(/(:)Sept(ember)?[-_]/, '$1'+'09');   oct=sep.replace(/(:)Oct(ober)?[-_]/, '$1'+'10')
nov=oct.replace(/(:)Nov(ember)?[-_]/, '$1'+'11');    dec=nov.replace(/(:)Dec(ember)?[-_]/, '$1'+'12')
noseps=dec.replace(/(:\d{4})[-_](\d{4})/, '$1$2')
newName=noseps.replace(/:(\d{2,4})(\d{4})/, '$2$1')


Or can also add comments like...
Code: Select all
// Match all possible •date-formats, and mark these names with : at the beginning...
step1=name.replace(/(.+) •((\d{0,2}[-_])?(Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|((Sept|Nov|Dec)(ember)?))([-_](Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|Sept|September|Nov|November|Dec|December))?([-_]\d{0,2})?[-_](19|20)\d\d)/, ':$2 $1')

// Now its easier searching for : instead of typing the very long •date-formats again!
// Pad and move 1-digit before month-words...
step2=step1.replace(/(:)(\d)([-_])([^-_]+)/, '$1$4$30$2')

// Convert dd_MonthWord --> MonthWord-dd...
step3=step2.replace(/(:)(\d\d)([-_])([^-_]+)/, '$1$4$3$2')

// Pad 1-digit after month-words...
step4=step3.replace(/(:[^-_]+[-_])(\d[-_])/, '$10$2')

// Destroy 2nd-month in MonthText-MonthText ranges...
step5=step4.replace(/(:[a-zA-Z]+)[-_][^\d]+([-_])/, '$1$2')

// Convert month-text into mm...
jan=step5.replace(/(:)Jan(uary)?[-_]/, '$1'+'01');  feb=jan.replace(/(:)Feb(ruary)?[-_]/, '$1'+'02')
mar=feb.replace(/(:)Mar(ch)?[-_]/, '$1'+'03');      apr=mar.replace(/(:)Apr(il)?[-_]/, '$1'+'04')
may=apr.replace(/(:)May[-_]/, '$1'+'05');          jun=may.replace(/(:)June?[-_]/, '$1'+'06')
jul=jun.replace(/(:)July?[-_]/, '$1'+'07');         aug=jul.replace(/(:)Aug(ust)?[-_]/, '$1'+'08')
sep=aug.replace(/(:)Sept(ember)?[-_]/, '$1'+'09');   oct=sep.replace(/(:)Oct(ober)?[-_]/, '$1'+'10')
nov=oct.replace(/(:)Nov(ember)?[-_]/, '$1'+'11');    dec=nov.replace(/(:)Dec(ember)?[-_]/, '$1'+'12')

// Remove dd_yyyy separators...
noseps=dec.replace(/(:\d{4})[-_](\d{4})/, '$1$2')

// Move yyyy to front and destroy : ...
newName=noseps.replace(/:(\d{2,4})(\d{4})/, '$2$1')



========================== BRU with RegEx(1) ==========================================

For users without a commercial license, or who don't like using the command-line, there is still another way.
RegEx(1) can be used with the checkmark for "v2", but then it needs a very long "Match" like...
Code: Select all
(.+) •((\d{0,2}[-_])?(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|June?|July?|Aug|August|Oct|October|((Sept|Nov|Dec)(ember)?))([-_](Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|June?|July?|Aug|August|Oct|October|((Sept|Nov|Dec)(ember)?)))?([-_]\d{0,2})?[-_](19|20)\d\d)(?X)(:)(\d)([-_])([^-_]+)(?X)(:)(\d\d)([-_])([^-_]+)(?X)(:[^-_]+[-_])(\d[-_])(?X)(:[a-zA-Z]+)[-_][^\d]+([-_])(?X)(:)Jan(uary)?[-_](?X)(:)Feb(ruary)?[-_](?X)(:)Mar(ch)?[-_](?X)(:)Apr(il)?[-_](?X)(:)May[-_](?X)(:)June?[-_](?X)(:)July?[-_](?X)(:)Aug(ust)?[-_](?X)(:)Sept(ember)?[-_](?X)(:)Oct(ober)?[-_](?X)(:)Nov(ember)?[-_](?X)(:)Dec(ember)?[-_](?X)(:\d{4})[-_](\d{4})(?X):(\d{2,4})(\d{4})

And a "Replace" like...
Code: Select all
:$2 $1(?X)$1$4${3}0$2(?X)$1$4$3$2(?X)${1}0$2(?X)$1$2(?X)${1}01(?X)${1}02(?X)${1}03(?X)${1}04(?X)${1}05(?X)${1}06(?X)${1}07(?X)${1}08(?X)${1}09(?X)${1}10(?X)${1}11(?X)${1}12(?X)$1$2(?X)$2$1


Like therube is saying, any javascript methods for this solution might also help others to learn.
Glad you found a solution?
Luuk
 
Posts: 705
Joined: Fri Feb 21, 2020 10:58 pm

Re: transforming long date suffix to yyyymmdd prefix

Postby Vn5eOJK » Thu Oct 06, 2022 12:20 am

Since it was requested, here is the JS I ended up coding. Note that;
- my use case is kind-of specific
- this is the Regex forum, not JS forum
- it's not commented and was intended to get a job done quickly, not in the most pretty or efficient way

Also, I have since learned that I should be able to use Date.parse() to significantly simplify the long date interpretation, but am having some issues with that that I'm going to ask about in the JS forum.

Code: Select all
// Left-pad a string to `size` characters (default 2) with `chr` (default "0").
String.prototype.pad = function(size,chr) {
  var s = String(this);
  while (s.length < (size || 2)) {s = (chr || "0") + s;}
  return s;
}

newName = name.replace('_-_',' •');
newName = newName.replace('__',' •');

if (newName.match(/^.*_2022$/)) {
  newName = '2022 ' + newName.replace('_2022','')
}

switch(true) {
  case newName.match(/^.*•January-February$/) != null:
    newName = newName.slice(0,4) + '01' + newName.replace('•January-February','').slice(4);
    break;
  case newName.match(/^.*•February-March$/) != null:
    newName = newName.slice(0,4) + '02' + newName.replace('•February-March','').slice(4);
    break;
  case newName.match(/^.*•March-April$/) != null:
    newName = newName.slice(0,4) + '03' + newName.replace('•March-April','').slice(4);
    break;
  case newName.match(/^.*•April-May$/) != null:
    newName = newName.slice(0,4) + '04' + newName.replace('•April-May','').slice(4);
    break;
  case newName.match(/^.*•May-June$/) != null:
    newName = newName.slice(0,4) + '05' + newName.replace('•May-June','').slice(4);
    break;
  case newName.match(/^.*•June-July$/) != null:
    newName = newName.slice(0,4) + '06' + newName.replace('•June-July','').slice(4);
    break;
  case newName.match(/^.*•July-August$/) != null:
    newName = newName.slice(0,4) + '07' + newName.replace('•July-August','').slice(4);
    break;
  case newName.match(/^.*•August-September$/) != null:
    newName = newName.slice(0,4) + '08' + newName.replace('•August-September','').slice(4);
    break;
  case newName.match(/^.*•September-October$/) != null:
    newName = newName.slice(0,4) + '09' + newName.replace('•September-October','').slice(4);
    break;
  case newName.match(/^.*•October-November$/) != null:
    newName = newName.slice(0,4) + '10' + newName.replace('•October-November','').slice(4);
    break;
  case newName.match(/^.*•November-December$/) != null:
    newName = newName.slice(0,4) + '11' + newName.replace('•November-December','').slice(4);
    break;
  case newName.match(/^.*•December-January$/) != null:
    newName = newName.slice(0,4) + '12' + newName.replace('•December-January','').slice(4);
    break;
}

lastMatch = newName.split('•').pop();
lastStart = newName.length - lastMatch.length - 1;

switch(true) {
  case lastMatch.match(/^.*January.*$/) != null:
    newName = newName.slice(0,4) + '01' + newName.slice(4,lastStart)
    day = lastMatch.replace('January','');
    break;
  case lastMatch.match(/^.*February.*$/) != null:
    newName = newName.slice(0,4) + '02' + newName.slice(4,lastStart)
    day = lastMatch.replace('February','');
    break;
  case lastMatch.match(/^.*March.*$/) != null:
    newName = newName.slice(0,4) + '03' + newName.slice(4,lastStart)
    day = lastMatch.replace('March','');
    break;
  case lastMatch.match(/^.*April.*$/) != null:
    newName = newName.slice(0,4) + '04' + newName.slice(4,lastStart)
    day = lastMatch.replace('April','');
    break;
  case lastMatch.match(/^.*May.*$/) != null:
    newName = newName.slice(0,4) + '05' + newName.slice(4,lastStart)
    day = lastMatch.replace('May','');
    break;
  case lastMatch.match(/^.*June.*$/) != null:
    newName = newName.slice(0,4) + '06' + newName.slice(4,lastStart)
    day = lastMatch.replace('June','');
    break;
  case lastMatch.match(/^.*July.*$/) != null:
    newName = newName.slice(0,4) + '07' + newName.slice(4,lastStart)
    day = lastMatch.replace('July','');
    break;
  case lastMatch.match(/^.*August.*$/) != null:
    newName = newName.slice(0,4) + '08' + newName.slice(4,lastStart)
    day = lastMatch.replace('August','');
    break;
  case lastMatch.match(/^.*September.*$/) != null:
    newName = newName.slice(0,4) + '09' + newName.slice(4,lastStart)
    day = lastMatch.replace('September','');
    break;
  case lastMatch.match(/^.*October.*$/) != null:
    newName = newName.slice(0,4) + '10' + newName.slice(4,lastStart)
    day = lastMatch.replace('October','');
    break;
  case lastMatch.match(/^.*November.*$/) != null:
    newName = newName.slice(0,4) + '11' + newName.slice(4,lastStart)
    day = lastMatch.replace('November','');
    break;
  case lastMatch.match(/^.*December.*$/) != null:
    newName = newName.slice(0,4) + '12' + newName.slice(4,lastStart)
    day = lastMatch.replace('December','');
    break;
  default:
     day = "";
}

if (day != "") {
  day = day.replace('_','').pad(2,'0');
  newName = newName.slice(0,6) + day + newName.slice(6)
}

newName = newName.trim();
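
For anyone adapting the script above: the two long switch blocks could probably be collapsed into a month lookup table. This is only a sketch, not the code that was actually run. `MONTHS` and `datePrefix` are made-up names, and it assumes `name` holds the filename without its extension and that the date, if there is one, follows the last • in the name.

```javascript
// Hypothetical helper, not from the thread: month lookup instead of a switch.
var MONTHS = {
  jan: '01', feb: '02', mar: '03', apr: '04', may: '05', jun: '06',
  jul: '07', aug: '08', sep: '09', oct: '10', nov: '11', dec: '12'
};

// Moves a trailing long date to a yyyymm or yyyymmdd prefix.
// Names without a recognisable date after the last bullet are returned unchanged.
function datePrefix(name) {
  var cut = name.lastIndexOf('•');
  if (cut < 0) return name;
  var title = name.slice(0, cut).trim();
  // Split the tail on the separators used in the examples, dropping empty pieces.
  var parts = name.slice(cut + 1).trim().split(/[-_ ]+/).filter(Boolean);
  var year = parts.pop();
  if (!/^(19|20)\d\d$/.test(year)) return name;
  var month = '', day = '';
  for (var i = 0; i < parts.length; i++) {
    var key = parts[i].slice(0, 3).toLowerCase();
    if (/^\d{1,2}$/.test(parts[i])) {
      day = ('0' + parts[i]).slice(-2);   // zero-fill the day
    } else if (MONTHS[key] && !month) {
      month = MONTHS[key];                // first month of a range wins
    } else if (!MONTHS[key]) {
      return name;                        // unknown token: leave the name alone
    }
  }
  if (!month) return name;
  return year + month + day + ' ' + title;
}
```

Matching on the first three letters is a design shortcut that accepts both "Sept" and "September"; it would also accept misspellings, so a stricter list may be safer for other collections.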

Re: Convert •dates with month-words into prefix of yyyymmdd

Postby Vn5eOJK » Thu Oct 06, 2022 12:41 am

Luuk wrote:For brc64.exe, the /RegExp: will forbid more than 9-groups, and also any groups like (?:), so it needs to be a very long command like...


Thanks for still posting. I'll study that for future reference. I wasn't trying to catch everything with one big command. I used the /pattern to process the files for each month name separately, and had variables set at the top for parts of the command that were the same for each month.

Ending date-formats with month-words --> yyyymmdd prefix

Postby Luuk » Thu Oct 06, 2022 1:23 am

I did make some typos on the brc64.exe solution, because May/June/July should be using \2 instead of \3 for their replacements.
So it wont convert any filenames having those 3-months inside of their •date-formats, but at least it wont destroy them either!
It should have looked like...
Code: Select all
"C:\Path\To\Your\brc64.exe" /Dir:"C:\Path\To\MainFolder" /Pattern:*•* /Recursive /NoFolders /ReplaceCI:•:"|"  /RegExp:"(.+) (\|(\d{0,2}[-_])?(Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|Sept|September|Nov|November|Dec|December)([-_](Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|Sept|September|Nov|November|Dec|December))?([-_]\d{0,2})?[-_](19|20)\d\d)(.*):\2 \1\9" /RegExp:"^(\|)(\d)([-_])([^-_]+)(.*):\1\4\30\2\5" /RegExp:"^(\|)(\d\d)([-_])([^-_]+)(.*):\1\4\3\2\5" /Regexp:"^(\|[^-_]+[-_])(\d[-_].*):\10\2" /RegExp:"^(\|[a-zA-Z]+)[-_][^\d]+([-_].*):\1\2" /RegExp:"^(\|)Jan(uary)?[-_](.*):\101\3" /RegExp:"^(\|)Feb(ruary)?[-_](.*):\102\3" /RegExp:"^(\|)Mar(ch)?[-_](.*):\103\3" /RegExp:"^(\|)Apr(il)?[-_](.*):\104\2" /RegExp:"^(\|)May[-_](.*):\105\2" /RegExp:"^(\|)June?[-_](.*):\106\2" /RegExp:"^(\|)July?[-_](.*):\107\3" /RegExp:"^(\|)Aug(ust)?[-_](.*):\108\3" /RegExp:"^(\|)Sept(ember)?[-_](.*):\109\3" /RegExp:"^(\|)Oct(ober)?[-_](.*):\110\3" /RegExp:"^(\|)Nov(ember)?[-_](.*):\111\3" /RegExp:"^(\|)Dec(ember)?[-_](.*):\112\3"  /RegExp:"(^\|\d{4})[-_](\d{4}.*):\1\2"  /RegExp:"(^\|)(\d{2,4})(\d{4})(.*):\1\3\2\4"  /Regexp:"^\|(.+):\1"  /ReplaceCI:"|:•"




========= Only at END of filenames ===========================================
Except now Im just realizing that date-formats dont always start with '•', and also should only be at the very end of filenames?
So if it matters to anyone, these next ones handle date-formats or •date-formats, but only at the very end of filenames...

For the command-line, can add /execute to...
Code: Select all
"C:\Path\To\brc64.exe" /Dir:"C:\Path\To\MainFolder" /Pattern:* /Recursive /NoFolders /ReplaceCI:•:"|" /RegExp:"(.+) (\|?(\d{0,2}[-_])?(Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|Sept|September|Nov|November|Dec|December)([-_](Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|Sept|September|Nov|November|Dec|December))?([-_]\d{0,2})?[-_](19|20)\d\d)$:|\2 \1" /RegExp:"^(\|+)(\d)([-_])([^-_]+)(.*):\1\4\30\2\5" /RegExp:"^(\|+)(\d\d)([-_])([^-_]+)(.*):\1\4\3\2\5" /Regexp:"^(\|+[^-_]+[-_])(\d[-_].*):\10\2" /RegExp:"^(\|+[a-zA-Z]+)[-_][^\d]+([-_].*):\1\2" /RegExp:"^(\|+)Jan(uary)?[-_](.*):\101\3" /RegExp:"^(\|+)Feb(ruary)?[-_](.*):\102\3" /RegExp:"^(\|+)Mar(ch)?[-_](.*):\103\3" /RegExp:"^(\|+)Apr(il)?[-_](.*):\104\3" /RegExp:"^(\|+)May[-_](.*):\105\2" /RegExp:"^(\|+)June?[-_](.*):\106\2" /RegExp:"^(\|+)July?[-_](.*):\107\2" /RegExp:"^(\|+)Aug(ust)?[-_](.*):\108\3" /RegExp:"^(\|+)Sept(ember)?[-_](.*):\109\3" /RegExp:"^(\|+)Oct(ober)?[-_](.*):\110\3" /RegExp:"^(\|+)Nov(ember)?[-_](.*):\111\3" /RegExp:"^(\|+)Dec(ember)?[-_](.*):\112\3"   /RegExp:"(^\|+\d{4})[-_](\d{4}.*):\1\2"  /RegExp:"(^\|+)(\d{2,4})(\d{4})(.*):\1\3\2\4"  /Regexp:"^\|+(.+):\1"  /ReplaceCI:"|:•"
But if needing to conduct this inside of a batch, the batch first needs a line like... chcp 1252



For javascript, it can be like...
Code: Select all
// Match all possible date-formats, and mark these names with : at the beginning...
step1=name.replace(/(.+) •?((\d{0,2}[-_])?(Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|((Sept|Nov|Dec)(ember)?))([-_](Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|Sept|September|Nov|November|Dec|December))?([-_]\d{0,2})?[-_](19|20)\d\d)$/, ':$2 $1')

// Now its easier searching for : instead of typing the very long •date-formats again!
// Pad and move 1-digit before month-words...
step2=step1.replace(/(:)(\d)([-_])([^-_]+)/, '$1$4$30$2')

// Convert dd_MonthWord --> MonthWord-dd...
step3=step2.replace(/(:)(\d\d)([-_])([^-_]+)/, '$1$4$3$2')

// Pad 1-digit after month-words...
step4=step3.replace(/(:[^-_]+[-_])(\d[-_])/, '$10$2')

// Destroy 2nd-month in MonthText-MonthText ranges...
step5=step4.replace(/(:[a-zA-Z]+)[-_][^\d]+([-_])/, '$1$2')

// Convert month-text into mm...
jan=step5.replace(/(:)Jan(uary)?[-_]/, '$1'+'01');   feb=jan.replace(/(:)Feb(ruary)?[-_]/, '$1'+'02')
mar=feb.replace(/(:)Mar(ch)?[-_]/, '$1'+'03');       apr=mar.replace(/(:)Apr(il)?[-_]/, '$1'+'04')
may=apr.replace(/(:)May[-_]/, '$1'+'05');            jun=may.replace(/(:)June?[-_]/, '$1'+'06')
jul=jun.replace(/(:)July?[-_]/, '$1'+'07');          aug=jul.replace(/(:)Aug(ust)?[-_]/, '$1'+'08')
sep=aug.replace(/(:)Sept(ember)?[-_]/, '$1'+'09');   oct=sep.replace(/(:)Oct(ober)?[-_]/, '$1'+'10')
nov=oct.replace(/(:)Nov(ember)?[-_]/, '$1'+'11');    dec=nov.replace(/(:)Dec(ember)?[-_]/, '$1'+'12')

// Remove dd_yyyy separators...
noseps=dec.replace(/(:\d{4})[-_](\d{4})/, '$1$2')

// Move yyyy to front and destroy : ...
newName=noseps.replace(/:(\d{2,4})(\d{4})/, '$2$1')



RegEx(1) first needs a checkmark inside for "v2", then a "Match" like...
Code: Select all
(.+)( | •)((\d{0,2}[-_])?(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|June?|July?|Aug|August|Oct|October|((Sept|Nov|Dec)(ember)?))([-_](Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|June?|July?|Aug|August|Oct|October|((Sept|Nov|Dec)(ember)?)))?([-_]\d{0,2})?[-_](19|20)\d\d)$(?X)(:)(\d)([-_])([^-_]+)(?X)(:)(\d\d)([-_])([^-_]+)(?X)(:[^-_]+[-_])(\d[-_])(?X)(:[a-zA-Z]+)[-_][^\d]+([-_])(?X)(:)Jan(uary)?[-_](?X)(:)Feb(ruary)?[-_](?X)(:)Mar(ch)?[-_](?X)(:)Apr(il)?[-_](?X)(:)May[-_](?X)(:)June?[-_](?X)(:)July?[-_](?X)(:)Aug(ust)?[-_](?X)(:)Sept(ember)?[-_](?X)(:)Oct(ober)?[-_](?X)(:)Nov(ember)?[-_](?X)(:)Dec(ember)?[-_](?X)(:\d{4})[-_](\d{4})(?X):(\d{2,4})(\d{4})
And a "Replace" like...
Code: Select all
:$3 $1(?X)$1$4${3}0$2(?X)$1$4$3$2(?X)${1}0$2(?X)$1$2(?X)${1}01(?X)${1}02(?X)${1}03(?X)${1}04(?X)${1}05(?X)${1}06(?X)${1}07(?X)${1}08(?X)${1}09(?X)${1}10(?X)${1}11(?X)${1}12(?X)$1$2(?X)$2$1




So these last 3-examples to rename like...
Title1 •September_2022.txt ----------------> 202209 Title1.txt
Title2 •September-October_2022.txt ------> 202209 Title2.txt
Title3 •Sept-Oct_2022.txt ------------------> 202209 Title3.txt
Title4 •September_1_2022.txt -------------> 20220901 Title4.txt
Title5 •25_November_2022.txt ------------> 20221125 Title5.txt
Title6 •4_October_2022.txt ----------------> 20221004 Title6.txt
Title7 •5_July_2022.txt ---------------------> 20220705 Title7.txt
Title8 5_July_2022.txt ----------------------> 20220705 Title8.txt
Title9 •8_September_2022 •Issue1.txt ---> (no changes, because date-format not at the very end)

Re: Ending date-formats with month-words --> yyyymmdd prefix

Postby Vn5eOJK » Thu Oct 06, 2022 5:11 pm

Luuk wrote:I did make some typos on the brc64.exe solution, because May/June/July should be using \2 instead of \3 for their replacements.
So it wont convert any filenames having those 3-months inside of their •date-formats, but at least it wont destroy them either!

Except now Im just realizing that date-formats dont always start with '•', and also should only be at the very end of filenames?


On your last comment, no, those were both my requirements. There would always be a delimiter before the date and the date, if it exists, would always be only at the end. What I did say was there may be more than one delimiter in the name, like;

Spacetime_•_the_untold_story_•_1_January_2001

So you would need to be sure you just capture "1_January_2001" and not "the_untold_story_•_1_January_2001" to interpret it as a date.
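
A small JavaScript sketch of that point, using the example name above. Two common ways to pick only the text after the last delimiter are shown; the variable names are just for illustration, and the leading underscore in the result would still need trimming before the date is interpreted.

```javascript
var name = 'Spacetime_•_the_untold_story_•_1_January_2001';

// Option 1: split on the delimiter and keep only the last piece.
var tail = name.split('•').pop();

// Option 2: a greedy (.+) consumes as much as it can, so the bullet
// that follows it is the last bullet in the name.
var m = name.match(/^(.+)•(.*)$/);
var head = m[1];   // everything before the last bullet
var last = m[2];   // everything after the last bullet

console.log(tail);  // _1_January_2001
console.log(head);  // Spacetime_•_the_untold_story_
console.log(last);  // _1_January_2001
```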

I am parsing my brain through your string of regex commands and have a few questions so I can learn from it;

1) Your update has;

/RegExp:"^(\|)Apr(il)?[-_](.*):\104\2"
/RegExp:"^(\|)May[-_](.*):\105\2"
/RegExp:"^(\|)June?[-_](.*):\106\2"
/RegExp:"^(\|)July?[-_](.*):\107\3"

Should the "\2" really be on July and not Apr(il) ?

2) Why does May not have a "?" and all the others do ?

3) On the big long regex with "Jan|January|Feb|February|Mar|March|Apr|April|May|June?|July?|Aug|August|Oct|October|Sept|September|Nov|November|Dec|December"
can that be simplified with the same syntax as "Apr(il)" with the additional optional full month names in parens, something like;
"Jan(uary)|Feb(ruary)|Mar(ch)|Apr(il)|May|June?|July?|Aug(ust)|Oct(ober)|Sep(tember)|Nov(ember)|Dec(ember)"

4) In #3, I have the same question about the "?". Is it needed on June and July? Should it be added to May ? I'm not quite sure what that is trying to account for and which month names really need it or not

Ending date-formats with month-words -> yyyymmdd prefix

Postby Luuk » Thu Oct 06, 2022 7:39 pm

1)
Yes! May/June/July should all be using \2 instead of \3.
I said I would change May/June/July, but changed April/May/June instead!!

2)
The ? is just a shortcut for {0,1} saying "zero or 1-times", its like + being a shortcut for {1,} saying "1 or more-times".
So like July? matches Jul or July, because y? matches the "y" character either "0 or 1-times".
But if you need to conduct ? against a longer-string, it should be grouped like Dec(ember)?.
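
A quick way to see the difference is to test a few patterns directly. This is only an illustrative JavaScript sketch (the variable names are made up); the anchors ^ and $ are added so that each test has to match the whole word.

```javascript
var july = /^July?$/;          // only the "y" is optional
var same = /^July{0,1}$/;      // the long form of the same quantifier
var dec  = /^Dec(ember)?$/;    // the whole group "ember" is optional
var bad  = /^December?$/;      // without the group, only the final "r" is optional

console.log(july.test('Jul'), july.test('July'));      // true true
console.log(same.test('Jul'), same.test('July'));      // true true
console.log(dec.test('Dec'), dec.test('December'));    // true true
console.log(bad.test('Dec'), bad.test('Decembe'));     // false true
```

This is also why May has no ? at all: there is no shorter spelling of it to allow for.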

3)
I do believe that 2) probably answers everything about 3) and 4), but if you have any more questions, please post them again.
For myself, I do sometimes even include different spellings like (May|Mai) and Feb(r?uary)? to also include Febuary.
But really, everybody will be more familiar with their own spellings and abbreviations.

If you'd like to re-post either of the brc64.exe solutions with April using \3 and July using \2, it would be much appreciated.
I don't really feel comfortable re-posting until I get onto another computer and make some test filenames with dates at the end.

My keyboard is dying, so I must keep a file with many characters that I cannot type anymore, so I can copy/paste them.
The number-pad is also completely dead, so I can only use the top-buttons for numbers, and also the other Enter button.
At least we know that \3 with July will never destroy anyone's date-formats, but the \2 with April would destroy them!

Also, many thanks for posting the javascript! Im not getting a chance to experiment with it yet, but looking forward to it.
There is another very helpful library for conducting the date-elements, except right now, Im not finding it anywhere.
So when I do find it, I will post the links on your last javascript question and answer...

If you ever need help with regex, I will try my best, but with javascript I'm very new and still learning the code.
Many thanks again, both for finding the brc64.exe typos, and also for posting the javascript code example!

Ending •date-formats with month-words -> yyyymmdd prefix

Postby Luuk » Fri Oct 07, 2022 2:40 am

Many apologies, I did believe that your •date-formats might not always have the '•' in front!
So in the last examples, just use the second brc64 example, because its not having any of the \3 typos.
But in the 1st /RegEx: just change \|? ===> \| so then it always looks for 1-delimiter, not 0-or-1 delimiters.

It's unfortunate, but I cannot type the characters '"+?=}]) or use the number-pad, so now I'm copy/pasting way too many regexes.
Usually I would just type each month-replacement separately, but instead I typed one, and then copy/pasted it 11 times.
So then I edited the month-text and month-numbers, but forgot that \3 was being used in all of the replacements!

I won't re-post anything, because at least the last 3 examples do work like their descriptions say.
But you might like to post the correction anyways, just in case coming back later to see the proper example.
Many apologies for any confusion, and thanks again for the javascript code!

Re: Ending •date-formats with month-words -> yyyymmdd prefix

Postby Vn5eOJK » Fri Oct 07, 2022 10:41 am

Luuk wrote:My keyboard is dying.
But you might like to post the correction anyways, just in case coming back later to see the proper example.


Luuk, do you have a way I can tip you? Paypal or venmo or something. Maybe order you a new keyboard from Amazon? This will actually save me quite a bit of time from doing manual steps.

I stepped through your example using the handy feature that nothing will be /execute-d without that parameter and added in each /regex one by one to understand what was going on--how the output name was being transformed in each step. That all looks great and I understand it 99.9%. The only one thing I don't quite get is the "\9" at the end of the long regex with all the months testing if the long date at the end is in an acceptable format to move to the front.

I did tweak three things that don't materially affect the algorithm;
1) I revised the downloads so that the filenames just use spaces and not underscores
2) The separator before the date would be a hyphen or long-hyphen character surrounded by spaces, not that bullet symbol (' - ' or ' – ')
3) I found enough cases where the long date was "month day, year", ie, it included a comma after the day number, so I added in handling for that

I used the bat file continuation character ("^") so I could see all the steps on separate lines.

Here's what I ended up with and so far is passing all my testing with hundreds of file names:

Code: Select all
@echo off
@chcp 65001

set src=D:\Document\ToDo\
set brc=C:\Program Files\Bulk Rename Utility\brc\brc64.exe
set doit=
@rem put "rem" in front of the next line to preview the changes without renaming anything
set doit=/execute
set args=%doit% /dir:"%src%" /nofolders /recursive /nodup

@rem note some files have a normal hyphen and others have the en-dash (Alt+0150)
"%brc%" %args% /pattern:"* 20??.*" ^
  /ReplaceCI:" - ":"|" ^
  /ReplaceCI:" – ":"|" ^
  /RegExp:"(.+)(\|(\d{0,2}[- ])?(Jan|January|Feb|February|Mar|March|Apr|April|May|June|July|Aug|August|Sept|September|Oct|October|Nov|November|Dec|December)([- ](Jan|January|Feb|February|Mar|March|Apr|April|May|June|July|Aug|August|Sept|September|Oct|October|Nov|November|Dec|December))?([- ]\d{0,2})?,?[- ](19|20)\d\d)(.*):\2 \1\9" ^
  /RegExp:"^(\|)(\d)([- ])([^- ]+)(.*):\1\4\30\2\5" ^
  /RegExp:"^(\|)(\d\d)([- ])([^- ]+)(.*):\1\4\3\2\5" ^
  /Regexp:"^(\|[^- ]+[- ])(\d[- ].*):\10\2" ^
  /RegExp:"^(\|[a-zA-Z]+)[- ][^\d]+([- ].*):\1\2" ^
  /RegExp:"^(\|)Jan(uary)?[- ](.*):\101\3" ^
  /RegExp:"^(\|)Feb(ruary)?[- ](.*):\102\3" ^
  /RegExp:"^(\|)Mar(ch)?[- ](.*):\103\3" ^
  /RegExp:"^(\|)Apr(il)?[- ](.*):\104\3" ^
  /RegExp:"^(\|)May[- ](.*):\105\2" ^
  /RegExp:"^(\|)June[- ](.*):\106\2" ^
  /RegExp:"^(\|)July[- ](.*):\107\2" ^
  /RegExp:"^(\|)Aug(ust)?[- ](.*):\108\3" ^
  /RegExp:"^(\|)Sept(ember)?[- ](.*):\109\3" ^
  /RegExp:"^(\|)Oct(ober)?[- ](.*):\110\3" ^
  /RegExp:"^(\|)Nov(ember)?[- ](.*):\111\3" ^
  /RegExp:"^(\|)Dec(ember)?[- ](.*):\112\3" ^
  /RegExp:"(^\|\d{4}),?[- ](\d{4}.*):\1\2" ^
  /RegExp:"(^\|)(\d{2,4})(\d{4})(.*):\1\3\2\4" ^
  /Regexp:"^\|(.+):\1" ^
  /ReplaceCI:"|":" - "

pause

Ending date-formats with month-words -> yyyymmdd prefix

Postby Luuk » Fri Oct 07, 2022 9:21 pm

Many apologies for all the confusion...
The 1st brc-example with typos, was guessing that you might also want to convert •date-formats in the middle of filenames.
The 2nd brc-example without typos, was for date-formats only at the very ends, but also guessing that "•" should be optional.
So in your 1st-regex at the very end, can just change (.*) ==> $ to say "at-the-end", and then also just remove the \9.

So the way it is right now, it could change your filenames like...
Title7 - 8-September-2022 - Issue1.ext -----> 20220908 Title7 - Issue1.ext
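
To make the role of that last group visible, here is a JavaScript sketch of only the first regex, with a shortened month list and the same group numbering as the batch file. It assumes the " - " delimiters have already been replaced by "|", as the /ReplaceCI: steps do; the variable names are made up for the illustration.

```javascript
var months = '(Sept|September|Oct|October)';   // shortened list for the sketch
var date = '(\\|(\\d{0,2}[- ])?' + months + '([- ]' + months + ')?' +
           '([- ]\\d{0,2})?,?[- ](19|20)\\d\\d)';

// Group 9 is the trailing (.*): whatever follows the date is carried along.
var anywhere = new RegExp('(.+)' + date + '(.*)');
// With $ instead of (.*), the date has to be the very end of the name.
var atEnd = new RegExp('(.+)' + date + '$');

var name = 'Title7|8-September-2022|Issue1';
var moved = name.replace(anywhere, '$2 $1$9');
var untouched = name.replace(atEnd, '$2 $1');

console.log(moved);      // |8-September-2022 Title7|Issue1
console.log(untouched);  // Title7|8-September-2022|Issue1
```

The later steps then work on the part that was moved to the front, so the choice between (.*) with \9 and a plain $ only decides whether dates in the middle of a name get picked up.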

But if you like moving the date-formats, either in the middle or at the ends, can just leave it alone, exactly like it is.
Many thanks for the offer, but we already ordered another keyboard, so I'm just trying to get brave enough to replace it!
One time everything except the number-pad did start conducting properly, but then after a few days, it dies again.

It does seem strange that only those characters would die, because they are not common, even though common in my regexs.
And there is no way that Im typing those characters, more than the regular alpha-characters, so it does seem very strange.

One thing I forgot to mention, in case you like to edit this in the future, is about using sets of characters inside of [here].
The regex uses - to say a range of characters, so [a-zA-Z] matches any English lowercase or uppercase letter.
If you need to match a literal - it should always be the very 1st character, especially if you might later need to include more characters.
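
A short JavaScript sketch of what can go wrong when the - ends up in the middle of a set (the variable names are only for illustration):

```javascript
var first = /[-_ ]/;   // "-" first: a literal hyphen, an underscore, or a space
var range = /[ -_]/;   // "-" in the middle: every character from space up to "_"

console.log(first.test('-'), range.test('-'));   // true true
console.log(first.test('A'), range.test('A'));   // false true
```

The second set silently matches uppercase letters, digits, and most punctuation, because they all sit between the space and the underscore in the character table.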

Nice solution!!

