Case(4) Title Enhanced treats Digits as Word Separators



Postby BogStandard » Fri Mar 19, 2021 9:30 pm

Hi Admin.

BRU 3.4.3.0

From the new manual Case(4):
A word is generally defined as a string of letters proceeded by a space or a bracket or a dash.

I have used Case(4) Title and it works as expected in that a digit is not a word separator. Therefore a character after a digit is made lower case. So...

Case(4) Title
Before:
title 2 TITLE 1ST 2ND 33RD 400TH.pdf
After:
Title 2 Title 1st 2nd 33rd 400th.pdf


However Case(4) Title Enhanced works differently. It is treating a digit as a word separator. Therefore a character following a digit is made Upper Case. So...

Case(4) Title Enhanced
Before:
title 3 TITLE ENHANCED 1ST 2ND 33RD 400TH.pdf
After:
Title 3 Title Enhanced 1St 2Nd 33Rd 400Th.pdf

I think Case(4) Title is the correct treatment.
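For anyone who wants to reproduce the plain Title result outside BRU, here is a minimal Python sketch of the manual's definition, where a letter only starts a word when it directly follows a space, a bracket or a dash. It is a model for discussion, not BRU's actual code: the name `plain_title`, the exact separator set, and the choice to work on the name without its extension are all assumptions.

```python
import re

# Assumed separator set, taken from the manual's wording:
# "a space or a bracket or a dash". Start of name also counts.
WORD_START = re.compile(r"(^|[ \-(\[{])([A-Za-z])")

def plain_title(stem: str) -> str:
    """Lower-case everything, then upper-case each letter that
    directly follows a separator (hypothetical model of Case(4) Title)."""
    lowered = stem.lower()
    return WORD_START.sub(lambda m: m.group(1) + m.group(2).upper(), lowered)

print(plain_title("title 2 TITLE 1ST 2ND 33RD 400TH"))
# Title 2 Title 1st 2nd 33rd 400th
```

Because a digit is not in the separator set, the letters after "1", "2", "33" and "400" stay lower case, which matches the Case(4) Title output shown above.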


Note on Multiple Spaces.
Multiple spaces are no longer stripped by any of the Case(4) options. I used double spaces in the examples above but they were stripped by the forum software.

Case(5)
D/S tick
substitutes one space wherever there are many.
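The effect of the D/S tick can be pictured with a one-line substitution. This is only a sketch of the behaviour described above; the function name `squeeze_spaces` is made up for the example, and it assumes that only space characters (not tabs) are involved.

```python
import re

def squeeze_spaces(stem: str) -> str:
    """Replace every run of two or more spaces with a single space."""
    return re.sub(r" {2,}", " ", stem)

print(squeeze_spaces("Title  2   Title"))
# Title 2 Title
```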

Regards...
BogStandard
 
Posts: 17
Joined: Sun Feb 07, 2021 11:25 pm

Re: Case(4) Title Enhanced treats Digits as Word Separators

Postby Luuk » Sat Mar 20, 2021 3:32 am

Greetings everyone, this is not a workaround, just an attempt to explain how Title-Enhanced chooses which letters to uppercase.
Rather than treating numbers as word-separators, Title-Enhanced is really skipping past any non-letters to find the next closest letter.
So after each word-separator like "!. _,", if the next character is not a letter, it skips ahead and uppercases the next closest letter!

Sorry if it's a poor explanation, but I think this experiment shows it better ...
Test111st2nd3rd ===> Test111st2nd3rd (same)
Test 11st2nd3rd ===> Test 11St2nd3rd
Test-11st2nd3rd ===> Test-11St2nd3rd
Test!11st2nd3rd ===> Test!11St2nd3rd
Test.11st2nd3rd ===> Test.11St2nd3rd
Test,11st2nd3rd ===> Test,11St2nd3rd
Test_11st2nd3rd ===> Test_11St2nd3rd
Test_1@;#1st2nd ===> Test_1@;#1St2nd
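The experiment above can be mimicked with a small Python sketch of the "skip ahead" idea: after the start of the name or a word-separator, skip any non-letters and uppercase the first letter found. This is a guess at the observable behaviour only, not BRU's real code; the name `enhanced_title_sketch` and the separator set (space ! . , _ -) are assumptions taken from the test lines.

```python
import re

# Start of name or one assumed separator, then any run of non-letters,
# then the first letter that follows.
SKIP_AHEAD = re.compile(r"(^|[ !.,_\-])([^A-Za-z]*)([A-Za-z])")

def enhanced_title_sketch(stem: str) -> str:
    """Lower-case everything, then upper-case the first letter found
    after each separator, skipping digits and other non-letters."""
    lowered = stem.lower()
    return SKIP_AHEAD.sub(
        lambda m: m.group(1) + m.group(2) + m.group(3).upper(), lowered
    )

for name in ["Test111st2nd3rd", "Test 11st2nd3rd", "Test_1@;#1st2nd"]:
    print(name, "===>", enhanced_title_sketch(name))
# Test111st2nd3rd ===> Test111st2nd3rd
# Test 11st2nd3rd ===> Test 11St2nd3rd
# Test_1@;#1st2nd ===> Test_1@;#1St2nd
```

With no separator before the digits nothing changes, and with one the "s" of "1st" is uppercased, so the digits only look like separators.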

I'm experimenting with a "v2" regex to do the job of Title-Enhanced, but I don't know what all of the NY-Times rules are.
If anybody has links that give a full explanation of the rules, it would be much appreciated.
Luuk
 
Posts: 323
Joined: Fri Feb 21, 2020 10:58 pm

Re: Case(4) Title Enhanced treats Digits as Word Separators

Postby BogStandard » Sat Mar 20, 2021 5:08 pm

Hi Luuk.

See:
https://titlecaseconverter.com/rules/

Case(4)
Title Enhanced
Excep. <clear>
clears all the default (NYT based) word case changes and allows you to create your own set. Use the $ prefix to force lower case words to Title Case only at the start or end of the file, book, document or publication name. The choice of case and words is now yours.

Excep. <rnup> or <rnlo>
converts Roman Numerals to UPPER or lower case respectively, but you can now specify case exceptions for words that are made up only of the letters used by Roman Numerals, i.e. CDILMVX.
The list that I use is :Civic :Civil :Did :Dill :Dim :Ill :Lid :Livid :Mild :Mill :Mimic :Mix :Vim :Vivid :
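To show why the exception words are needed, here is a rough Python sketch of an <rnup>-style pass. It assumes that any word made only of the letters CDILMVX is treated as a Roman Numeral, which is a simplification: it does not check that the numeral is valid, and how BRU really detects numerals is not stated in this thread. The names `roman_upper` and `KEEP` are invented for the example.

```python
import re

ROMAN_LETTERS = set("cdilmvx")

# Exception words quoted from the post above.
KEEP = ["Civic", "Civil", "Did", "Dill", "Dim", "Ill", "Lid", "Livid",
        "Mild", "Mill", "Mimic", "Mix", "Vim", "Vivid"]

def roman_upper(stem: str, exceptions=KEEP) -> str:
    """Upper-case words built only from Roman Numeral letters,
    except words listed as exceptions, which keep their listed case."""
    keep = {word.lower(): word for word in exceptions}

    def fix(match):
        word = match.group(0)
        low = word.lower()
        if low in keep:
            return keep[low]          # exception: use the listed spelling
        if set(low) <= ROMAN_LETTERS:
            return word.upper()       # looks like a Roman Numeral
        return word                   # ordinary word, left alone

    return re.sub(r"[A-Za-z]+", fix, stem)

print(roman_upper("Henry viii Did Mix Vol xiv"))
# Henry VIII Did Mix Vol XIV
```

Without the exception list, "Did" and "Mix" would come out as "DID" and "MIX" under this rule.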

This is a subset of my list of exceptions:

Case(4)
Title Enhanced
Excep.
<clear>:<rnup>:&c :$about:$above:$across:$ad:$against:$along:$among:$an:$and:$are:$around:$as:$at
:$before:$behind:$below:$beneath:$beside:$between:$beyond:$by:$de:$der:$des:$down:$du:$during:$ed
:$et:$etc:$except:$for:$from:$in:$inside:$into:$it:$its:$la:$le:$les:$like:$near:$of:$off:$on:$or:$out
:$since:$sur:$that:$the:$their:$this:$through:$to:$toward:$under:$until:$up:$upon:$von
:$which:$with:$within
: Civic: Civil: Did: Dill: Dim: Ill: Lid: Livid: Mid: Midi: Mild: Mill: Mimic: Mix: Vim: Vivid:

I have had to put line breaks in the above list for it to display correctly.
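The $ entries can be read as "lower case inside the name, Title Case at the start or end". A minimal Python sketch of that reading follows. It is an interpretation of the description above, not BRU's parser: the function name `apply_dollar_exceptions` is hypothetical, words are split on single spaces only, and entries without a $ (such as <clear>, <rnup> or the Roman Numeral words) are simply ignored.

```python
def apply_dollar_exceptions(stem: str, excep: str) -> str:
    """Apply "$word" entries from a colon-separated exception list:
    lower case in the middle of the name, capitalised at either end."""
    lower_words = {
        entry.strip()[1:].lower()
        for entry in excep.split(":")
        if entry.strip().startswith("$")
    }
    words = stem.split(" ")
    last = len(words) - 1
    out = []
    for i, word in enumerate(words):
        if word.lower() not in lower_words:
            out.append(word)                 # not an exception word
        elif i == 0 or i == last:
            out.append(word.capitalize())    # start or end of the name
        else:
            out.append(word.lower())         # interior position
    return " ".join(out)

print(apply_dollar_exceptions("The Lord Of The Rings", "<clear>:<rnup>:$of:$the"))
# The Lord of the Rings
```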

Regards...
BogStandard
 
Posts: 17
Joined: Sun Feb 07, 2021 11:25 pm

Re: Case(4) Title Enhanced treats Digits as Word Separators

Postby Luuk » Sun Mar 21, 2021 8:07 pm

Greetings BogStandard, and a million thanks for the link!!
I had no idea there could be so many rules for title casing, especially when there can be a whole book about each style!
So this is probably not entirely possible, even with the "v2" regex, so now I'm thinking of using your formats instead!
If there were a contest for "biting off more than you can chew", I think this would win the prize!

It also gives me great respect for all the programming code that must be needed to make Title-Enhanced work.
I had not considered it, but I think most people probably do prefer changing --12cars ===> --12Cars.
So it would be ideal (but maybe impossible) for the program to also allow something like :/RegexException/:
That would allow something like :/\d+(st|nd|rd|th)/: to forbid changing any 'numbered-occurrence' words.

For now, I'm thinking of just letting Title-Enhanced uppercase them, and then fixing them later with a "v2" regex like ...
(\d+(St|Nd|Rd|Th)(\b|_))/g
\L$1
So it lowercases words like "1St" or "4Th", but ignores words like "1Star" or "44Things".
I still want to invent a simple Title-Case "v2" regex that uses its own word-separators and exceptions.
But first, I'm still trying to get "Negative Look-Arounds" to work properly to allow some exceptions.
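For testing the ordinal clean-up outside BRU, the same pattern can be tried in Python. Python's `re` module has no `\L` in replacement strings, so this sketch lowercases the match in a callback instead; the function name `lower_ordinals` is made up for the example.

```python
import re

# Same pattern as the regex above: digits, a capitalised ordinal suffix,
# then a word boundary or an underscore.
ORDINAL = re.compile(r"(\d+(St|Nd|Rd|Th)(\b|_))")

def lower_ordinals(stem: str) -> str:
    """Lower-case ordinals such as 1St or 400Th, leaving 1Star alone."""
    return ORDINAL.sub(lambda m: m.group(1).lower(), stem)

print(lower_ordinals("Title 3 Title Enhanced 1St 2Nd 33Rd 400Th"))
# Title 3 Title Enhanced 1st 2nd 33rd 400th
print(lower_ordinals("1Star 44Things"))
# 1Star 44Things
```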

So far, this is my experiment...
(^|[][ !.,;_&\(\)-]+)([^][ !.,;_&\(\)-])([^][ !.,;_&\(\)-]*)/g
$1\U$2\L$3
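The experiment regex can also be tried in Python for quick testing. This is a translation, not the BRU "v2" engine: the brackets are escaped inside the character class, and the `\U` / `\L` case switches are replaced by a callback because Python's `re` does not support them. The names `SEP` and `v2_title` are invented for the sketch.

```python
import re

# The separators from the regex above, escaped for a Python character class.
SEP = r"\]\[ !.,;_&()\-"
WORD = re.compile(rf"(^|[{SEP}]+)([^{SEP}])([^{SEP}]*)")

def v2_title(stem: str) -> str:
    """Upper-case the first character of each word, lower-case the rest.
    A word is any run of non-separator characters."""
    return WORD.sub(
        lambda m: m.group(1) + m.group(2).upper() + m.group(3).lower(), stem
    )

print(v2_title("hello WORLD_foo-bar"))
# Hello World_Foo-Bar
print(v2_title("title 2 TITLE 1ST 2ND"))
# Title 2 Title 1st 2nd
print(v2_title("--44strings"))
# --44strings
```

Here a word that starts with a digit has the digit as its "first character", so the letters after it stay lower case. That keeps 1st as 1st, but it also leaves --44strings unchanged.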

It uses the word-separators [] !.,;_&()- but no exception words, because the "Negative Look-Arounds" still need to be solved.
Of course, I'm not yet even thinking about how to convert 1ststring ==> 1stString, because I never want 1string ==> 1stRing.
But I also like the preference --44strings ==> --44Strings, so this part would still need a lot of consideration!
I think it's a good project for learning regex, so I'll keep trying for solutions, but it's much harder than it seems.
Many thanks again!
Luuk
 
Posts: 323
Joined: Fri Feb 21, 2020 10:58 pm

