Add leading zeroes in last part of file names

Postby georgedb » Sun Jul 24, 2016 2:27 pm

I unpack (using unzip.exe) a set of PDF files with the following structure:

nnnc.pdf --> pdf-nnn-000.pdf (I can solve this within a batch file, with FOR %%variable, REN and another REN)
nnns.pdf --> pdf-nnn-003.pdf (I can solve this within a batch file, with FOR %%variable, REN and another REN)
pdf-nnn-(p)(p)p.pdf --> pdf-nnn-ppp.pdf (with leading zeroes).

So:
pdf-123-7.pdf should become pdf-123-007.pdf
pdf-123-42.pdf should become pdf-123-042.pdf
pdf-123-179.pdf should become pdf-123-179.pdf

Here "nnn" is a number indicating which PDFs belong together.
The goal is to merge the files with the same nnn into one complete PDF (I'll use a nice command-line tool for that: "cpdf").
"p" is the page number, hence (p)(p)p: there is always a page number, but it can be 1, 2 or 3 digits long.
123c.pdf and 123s.pdf are a different story, but if they complicate things I can solve them in an alternative way.
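
The target renaming for the page files can be sketched outside BRC as follows. This is only an illustration in Python, not BRC syntax; the function name pad_page is hypothetical, and the sketch assumes "nnn" and the page number consist of digits only.

```python
import re

def pad_page(filename):
    """Zero-pad the trailing page number of pdf-nnn-p.pdf to three digits."""
    match = re.fullmatch(r"pdf-(\d+)-(\d{1,3})\.pdf", filename)
    if match is None:
        return filename  # not a page file (e.g. 123c.pdf); leave it untouched
    code, page = match.groups()
    return f"pdf-{code}-{page.zfill(3)}.pdf"

for name in ["pdf-123-7.pdf", "pdf-123-42.pdf", "pdf-123-179.pdf"]:
    print(name, "-->", pad_page(name))
```

Names that already have a three-digit page number come back unchanged, so running the function twice is harmless.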

It sounds like I need a regular expression, and it all has to work from the command line, since it needs to run among other things in a batch file.
I'm not familiar with RegEx and tried for hours, but the syntax is not completely clear to me.
I found a RegEx trainer (https://regex101.com/#pcre) and was able to set up a RegEx for a start:

pdf-(.*)-(.*)\.pdf

If the example is pdf-321-56.pdf, then my idea was to sort of strip out and store the code ("nnn") and the page number ("(p)(p)p").
This seems to work.
I'd now expect to end up with "321" in \1 and "56" in \2.
Therefore I tried the command:

BRC32 /REGEXP:pdf-(.*)-(.*)\.pdf:pdf-\1-\2.pdf

But BRC indicates that nothing needs to be renamed; that sounds like there is no match at all...
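
A small Python sketch of what the two capture groups hold for the example name. It uses Python's re module as a stand-in; BRC's regex engine may differ in details, so treat this as an assumption. It also shows that the replacement pdf-\1-\2.pdf rebuilds exactly the original name, so on its own it would not change anything even when the pattern does match.

```python
import re

pattern = re.compile(r"pdf-(.*)-(.*)\.pdf")
name = "pdf-321-56.pdf"

match = pattern.fullmatch(name)
code, page = match.group(1), match.group(2)
print(code, page)  # 321 56

# The replacement puts both groups back where they came from,
# so the result is identical to the input name.
renamed = pattern.sub(r"pdf-\1-\2.pdf", name)
print(renamed == name)  # True
```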

Can someone please help me with the RegEx and the syntax of BRC?
georgedb
 
Posts: 4
Joined: Sat Jul 23, 2016 8:56 am

Re: Add leading zeroes in file names

Postby Admin » Mon Jul 25, 2016 5:02 am

pdf-123-7.pdf should become pdf-123-007.pdf
pdf-123-42.pdf should become pdf-123-042.pdf
pdf-123-179.pdf should become pdf-123-179.pdf


This transformation can be done with a JavaScript function. If you have a commercial license for BRU, I can post the script. Thanks!
Admin
Site Admin
 
Posts: 2343
Joined: Tue Mar 08, 2005 8:39 pm

Re: Add leading zeroes in last part of file names

Postby georgedb » Mon Jul 25, 2016 4:43 pm

Well, I could also build some simple application, but actually I'd like to solve it with BRC.
Is JavaScript really needed, and would it even help, given that I need to run a batch file on the file system? Furthermore, I'd like to learn more about regex and about the BRC syntax, even for this one-time goal (gluing the pages of some old electronics magazine together in the right order; the actual gluing is done with a command-line tool). If I understood the right regex and the BRC syntax, I'd say it should be possible.

Can you help me with that?

Re: Add leading zeroes in last part of file names

Postby georgedb » Tue Jul 26, 2016 5:03 pm

Well, it takes four steps (actually two), but I don't mind; the idea was to solve it with BRC only, from a batch file:

Step 1:
(\d\d\d)c:pdf-\1-000
(nnnc.pdf --> pdf-nnn-000.pdf)

Step 2:
(\d\d\d)s:pdf-\1-005
(nnns.pdf --> pdf-nnn-005.pdf)

Step 3:
pdf-(.*)-(\d)$:pdf-\1-0\2
(When there is a single digit, a leading zero is added, so …3.pdf --> …03.pdf)

Step 4:
pdf-(.*)-(\d\d)$:pdf-\1-0\2
(When there are two digits at the end, a leading zero is added, so …53.pdf --> …053.pdf, and …03.pdf --> …003.pdf)

With …000.pdf, …003.pdf, …005.pdf and all the other, non-special page numbers, a page number is always 3 digits long, and therefore the sorting works as needed.
Problem seems solved ;-)
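
The chain of four steps can be traced with a short Python sketch. The patterns and replacements are copied from Steps 1 to 4; applying them with re.sub to the name without its extension is an assumption about how BRC behaves, and the names STEPS and rename_stem are hypothetical.

```python
import re

# (pattern, replacement) pairs from Steps 1-4, applied in order
# to the filename without its extension.
STEPS = [
    (r"(\d\d\d)c", r"pdf-\1-000"),
    (r"(\d\d\d)s", r"pdf-\1-005"),
    (r"pdf-(.*)-(\d)$", r"pdf-\1-0\2"),
    (r"pdf-(.*)-(\d\d)$", r"pdf-\1-0\2"),
]

def rename_stem(stem):
    """Run every step in sequence; steps that do not match leave the stem as is."""
    for pattern, replacement in STEPS:
        stem = re.sub(pattern, replacement, stem)
    return stem

for stem in ["123c", "123s", "pdf-123-7", "pdf-123-42", "pdf-123-179"]:
    print(stem, "-->", rename_stem(stem))
```

A single-digit page passes through both Step 3 and Step 4 (7 becomes 07, then 007), which is why the order of the steps matters.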
Last edited by georgedb on Wed Jul 27, 2016 11:57 am, edited 1 time in total.

Re: Add leading zeroes in last part of file names

Postby Admin » Wed Jul 27, 2016 1:16 am

8)

Re: Add leading zeroes in last part of file names

Postby georgedb » Wed Jul 27, 2016 10:28 am

Maybe an interesting addition for other users: BRC/BRU (that's where I learned it) only touches the part of the filename before the extension. I didn't know that, and that is why my previous regexes didn't match. For me a filename is the whole thing: it can contain one or more dots, and everything after the last dot is the extension, which is a handy feature for the OS... That's not how BRC/BRU looks at filenames ;-)
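
The effect can be reproduced in Python as a rough sketch. Here os.path.splitext stands in for the way BRC/BRU separates the extension, which is an assumption about the tool and not its actual implementation.

```python
import os
import re

with_extension = re.compile(r"pdf-(.*)-(.*)\.pdf")
without_extension = re.compile(r"pdf-(.*)-(.*)")

full_name = "pdf-321-56.pdf"
stem, extension = os.path.splitext(full_name)  # ("pdf-321-56", ".pdf")

# Against the whole filename the original pattern matches...
print(with_extension.search(full_name) is not None)  # True
# ...but against the part before the extension it cannot, because ".pdf" is gone.
print(with_extension.search(stem) is not None)       # False
# Dropping "\.pdf" from the pattern makes it match the stem again.
print(without_extension.fullmatch(stem).groups())    # ('321', '56')
```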

Re: Add leading zeroes in last part of file names

Postby therube » Wed Jul 27, 2016 9:24 pm

(I'll also throw in that if the replacement part contains a space, you have to enclose it in quotes - even though it is delimited by ":". I didn't expect that.

I started out simple:

> brc32 /regexp:(.*):xxx:

But then when I threw the \1 in there, it barfed, which wasn't making sense to me:

> brc32 /regexp:(.*):xxx \1:

> Unrecognised Parameter: \1:
> Unrecognised Argument: 1
> One or more arguments were invalid. Rename aborted.

Then, with your post today, I thought:

> brc32 /regexp:(.*):"xxx \1":

:-). )
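
Why the quotes matter can be illustrated with a deliberately simplified model of how a command line is split into arguments before the program sees them. The split_args function below is hypothetical and only handles spaces and double quotes; the real Windows rules (backslashes, escaped quotes) are more involved.

```python
def split_args(command_line):
    """Split on spaces, except inside double quotes (simplified model)."""
    args, current, in_quotes = [], "", False
    for ch in command_line:
        if ch == '"':
            in_quotes = not in_quotes  # quotes group text and are dropped
        elif ch == " " and not in_quotes:
            if current:
                args.append(current)
                current = ""
        else:
            current += ch
    if current:
        args.append(current)
    return args

# Unquoted: the space cuts the /regexp argument in two, leaving "\1:" on its own.
print(split_args(r'brc32 /regexp:(.*):xxx \1:'))
# Quoted: the replacement stays inside a single argument.
print(split_args(r'brc32 /regexp:(.*):"xxx \1":'))
```

In this model the stray second argument is exactly the "\1:" that the error message complains about.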
therube
 
Posts: 1314
Joined: Mon Jan 18, 2016 6:23 pm

