How to replace chars within a match group

Post by to64ku » Sat Apr 24, 2021 1:42 pm

Hello,

I am trying to rename subtitle filenames to match the video filename.

Currently the filenames are something like:
(Title.name.with.period.instead.of.spaces.S99E99.Episode.name.with.period.instead.of.spaces.ChannelName.language.ext)
E.g.:
The.Good.Place.S01E09....Someone.Like.Me.as.a.Member.ChannelName.en[cc].vtt

I want them to be like:
The Good Place - S01 - E09 - Someone Like Me as a Member - ChannelName.en[cc].vtt

I have tried to study the instructions, but only got this far:
Match: ^(.+)\.(S\d{2})(E\d{2})\.+(.*)\.ChannelName(\..+)
Replace: \1 - \2 - \3 - \4 - ChannelName\5

This creates a new filename:
The.Good.Place - S01 - E09 - Someone.Like.Me.as.a.Member - ChannelName.en[cc].vtt

(I think that if there are multiple periods after each other I'd replace them with one space, as there is no way of knowing if the original text had periods or spaces.)

I tried then replacing periods with spaces, but it also removes the period before the language and that causes subtitles not to work with my video player.
I do have to download subtitles with different languages as sometimes my native subtitles are not available.

How can I replace the periods with spaces only within groups 1 and 4, without affecting the period before the language?

Re: How to replace chars within a match group

Post by Luuk » Sun Apr 25, 2021 3:37 am

The only thing I don't understand is 'replacing multiple periods', but I'm guessing it means...
You can also have filenames with either The...Good...Place... or ...ChannelName...en[cc].vtt ?
I don't think it really matters though, because the regex should handle both cases together.

With RegEx(1), every replacement has to be spelled out inside the Replace box.
Everything inside Replace also has to be exact, because all of it goes into the new filename.
Your regex turns some of the periods into ' - ', but the other periods sit inside the captured groups and are not matched by a \. of their own, so they are copied through unchanged.
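
To see why those periods survive, here is a minimal Python sketch of the same Match and Replace. It uses Python's `re` module as a stand-in for the renamer's regex engine (an assumption; the two engines are not identical) and runs on the full filename rather than through RegEx(1).

```python
import re

name = "The.Good.Place.S01E09....Someone.Like.Me.as.a.Member.ChannelName.en[cc].vtt"

# Same Match and Replace as in the question, written for Python's re module.
pattern = r"^(.+)\.(S\d{2})(E\d{2})\.+(.*)\.ChannelName(\..+)"
replacement = r"\1 - \2 - \3 - \4 - ChannelName\5"

match = re.match(pattern, name)

# Groups 1 and 4 still contain their periods; the replacement copies them verbatim.
print(match.group(1))  # The.Good.Place
print(match.group(4))  # Someone.Like.Me.as.a.Member

result = re.sub(pattern, replacement, name)
print(result)
# The.Good.Place - S01 - E09 - Someone.Like.Me.as.a.Member - ChannelName.en[cc].vtt
```

Only the periods that the pattern matches outside a group (before S01, after E09, before ChannelName) are dropped or replaced; everything inside a group is reproduced as captured.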

The 'v2' regex supports '/g' for saying 'all matches' and also supports "(?X)" for saying 'next regex'.
So unless all of your filenames have the same periods, I'd suggest a 'v2' RegEx(1) and Replace(3) solution like..

\.+(S\d\d)(E\d\d)\.+(.+?)\.+(ChannelName.*)(?X)\.+/g
\ - $1 - $2 - $3 - $4(?X) \u
The first regex works like yours, but the next one replaces every run of one or more periods ==> 1 space.

Then use Replace(3) to turn the very last space back into a period, with a Replace and With like...
\last\ |
.
Used together, they rename the files like...
Some..Words..S01E09.....Some..More..Words...ChannelName...en[cc].vtt ==> Some Words - S01 - E09 - Some More Words - ChannelName.en[cc].vtt
The.Good.Place.S01E09...Someone..Like.Me..ChannelName...en[cc].vtt ====> The Good Place - S01 - E09 - Someone Like Me - ChannelName.en[cc].vtt
The.Good.Place.S01E09...Someone.Like.Me.ChannelName.en[cc].vtt =======> The Good Place - S01 - E09 - Someone Like Me - ChannelName.en[cc].vtt
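
For readers who want to try the same three steps outside the renamer, here is a rough Python sketch. The function name `rename_two_step` is made up for illustration, Python's `re` stands in for the 'v2' engine, and the code assumes the rules apply to the name without its extension, so the extension is split off first.

```python
import os
import re

def rename_two_step(filename: str) -> str:
    """Hypothetical helper mirroring RegEx(1) with two regexes, then Replace(3)."""
    # Assumption: the rules see the name without the extension.
    stem, ext = os.path.splitext(filename)

    # First regex: put ' - ' around the season, episode and channel parts.
    stem = re.sub(
        r"\.+(S\d\d)(E\d\d)\.+(.+?)\.+(ChannelName.*)",
        r" - \1 - \2 - \3 - \4",
        stem,
    )

    # Second regex (the part after (?X), with /g): every run of periods -> one space.
    stem = re.sub(r"\.+", " ", stem)

    # Replace(3): turn the very last space back into a period.
    head, sep, tail = stem.rpartition(" ")
    if sep:
        stem = head + "." + tail

    return stem + ext

print(rename_two_step("The.Good.Place.S01E09...Someone..Like.Me..ChannelName...en[cc].vtt"))
# The Good Place - S01 - E09 - Someone Like Me - ChannelName.en[cc].vtt
```

The last step relies on the language tag being the only thing after the final space, which holds for the example names in this thread.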

But if you want to do everything inside RegEx(1)...
\.+(S\d\d)(E\d\d)\.+(.*?)\.+(ChannelName.*)(?X)(?!\.+[^.]+$)\.+/g(?X)(\.)+
\ - $1 - $2 - $3 - $4(?X) (?X)$1
In this one, the second regex replaces every run of one or more periods (except the last run) ==> one space.
Then a third regex replaces the last run of one or more periods ==========> one period ( ...en[cc] ==> .en[cc] ).
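
The lookahead trick can be checked in isolation with a short Python sketch. As before, `rename_single_box` is a hypothetical name, Python's `re` is only a stand-in for the 'v2' engine, and the extension is assumed to be excluded from matching.

```python
import os
import re

def rename_single_box(filename: str) -> str:
    """Hypothetical helper chaining the three regexes that (?X) separates."""
    # Assumption: the rules see the name without the extension.
    stem, ext = os.path.splitext(filename)

    # Regex 1: put ' - ' around the season, episode and channel parts.
    stem = re.sub(
        r"\.+(S\d\d)(E\d\d)\.+(.*?)\.+(ChannelName.*)",
        r" - \1 - \2 - \3 - \4",
        stem,
    )

    # Regex 2: a run of periods becomes one space, unless only non-period
    # characters follow it up to the end (that is the last run).
    stem = re.sub(r"(?!\.+[^.]+$)\.+", " ", stem)

    # Regex 3: the run that is left collapses to a single period.
    stem = re.sub(r"(\.)+", r"\1", stem)

    return stem + ext

print(rename_single_box("The.Good.Place.S01E09...Someone..Like.Me..ChannelName...en[cc].vtt"))
# The Good Place - S01 - E09 - Someone Like Me - ChannelName.en[cc].vtt
```

The negative lookahead `(?!\.+[^.]+$)` is what protects the period before the language: it rejects any period run that is followed only by non-period characters until the end of the name.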

You CAN also replace periods only where groups 1 and 4 match, but that takes much longer expressions.
So for both solutions it's easier to just put a checkmark in "RegEx" for Filters(12) and use a "Mask" like...
\.+S\d\dE\d\d\.+.+ChannelName
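
Outside the renamer, a scripting language can do the "only inside groups 1 and 4" variant directly, because each captured group can be edited before the name is put back together. The sketch below is one possible way in Python, not the longer regex alluded to above; `rename_groups_only` and `periods_to_spaces` are made-up names, and the mask is reused as a plain pre-check.

```python
import re

# Same idea as the Filters(12) mask: only touch names that look like episodes.
MASK = re.compile(r"\.+S\d\dE\d\d\.+.+ChannelName")

# Title, season, episode, episode name, channel, then everything after the channel.
PATTERN = re.compile(r"^(.+?)\.+(S\d\d)(E\d\d)\.+(.+?)\.+(ChannelName)\.+(.+)$")

def periods_to_spaces(text: str) -> str:
    """Collapse every run of periods into one space."""
    return re.sub(r"\.+", " ", text)

def rename_groups_only(filename: str) -> str:
    """Hypothetical helper: change periods only in the title and episode name."""
    if not MASK.search(filename):
        return filename  # filtered out, left untouched
    m = PATTERN.match(filename)
    if m is None:
        return filename
    title, season, episode, episode_name, channel, rest = m.groups()
    # 'rest' (language tag and extension) is copied through unchanged.
    return (
        f"{periods_to_spaces(title)} - {season} - {episode} - "
        f"{periods_to_spaces(episode_name)} - {channel}.{rest}"
    )

print(rename_groups_only("The.Good.Place.S01E09...Someone..Like.Me..ChannelName...en[cc].vtt"))
# The Good Place - S01 - E09 - Someone Like Me - ChannelName.en[cc].vtt
```

Because the language and extension are never passed through `periods_to_spaces`, no later step is needed to restore the period before the language.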

Re: How to replace chars within a match group

Post by to64ku » Sun Apr 25, 2021 10:51 am

Hello, Luuk.

Thanks a lot! This will take me a while to understand ;)

The "multiple periods" can occur within groups 1 and 4 (title and episode names), but not between ChannelName and the language.

I have one issue that I have not yet figured out how to fix, but I'll try to understand what is causing it. I am using your first solution (RegEx(1) and Replace(3)).

Everything else works marvelously, but I am getting one extra space after the ChannelName.
I.e., this:
The.Good.Place.S01E09...Someone..Like.Me..ChannelName...en[cc].vtt
becomes this:
The Good Place - S01 - E09 - Someone Like Me - ChannelName .en[cc].vtt

BR. ToKu

Re: How to replace chars within a match group

Post by to64ku » Sun Apr 25, 2021 11:51 am

Hi again,

Your second proposal, doing everything inside RegEx(1), works perfectly, so for now I have stopped investigating the issue I have with the other alternative.

I am really grateful for your help.

If you have time, I would still like to know how to fix the remaining issue (as a learning experience), but I am extremely satisfied with the current solution.

BR. ToKu

Re: How to replace chars within a match group

Post by Luuk » Mon Apr 26, 2021 12:44 am

That problem looks to be inside the first regex, before the (?X), so it's probably one of these...
1) Match has (.*)\.+ that should be (.*?)\.+
2) Match has \. that should be \.+
3) Replace has an extra space
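
Item 1 can be illustrated with a small Python sketch (Python's `re` is assumed here as a stand-in for the renamer's engine). It only shows how a greedy group picks up a stray period that later becomes a space; it does not claim to reproduce the exact filename reported above.

```python
import re

text = "Someone..Like.Me..ChannelName"

# Greedy: (.*) grabs as much as possible, leaving only one period for \.+
greedy = re.match(r"(.*)\.+ChannelName", text).group(1)

# Lazy: (.*?) stops as early as possible, so \.+ takes the whole period run
lazy = re.match(r"(.*?)\.+ChannelName", text).group(1)

print(repr(greedy))  # 'Someone..Like.Me.'
print(repr(lazy))    # 'Someone..Like.Me'

# After the periods-to-space step, the greedy capture ends in a stray space.
greedy_spaced = re.sub(r"\.+", " ", greedy)
lazy_spaced = re.sub(r"\.+", " ", lazy)
print(repr(greedy_spaced))  # 'Someone Like Me '
print(repr(lazy_spaced))    # 'Someone Like Me'
```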

Also, I forgot to mention... I'm only using \ at the beginning of Replace and \u at the end because...
This forum does not allow lines to start with a space, and the \u makes the last space visible.
So I just chose characters that won't break the regex. That is also the reason for the '|' inside Replace(3).