Remove Special Characters from Subdirectories

Post by msigal » Thu Nov 21, 2019 7:07 pm

Hi everyone,

I have never used BRU before and I am sorry for the amateurish question, but I've been notified by my IT department that a bunch of filenames in my "dropbox" are problematic for syncing on their platform and I need to rectify it as soon as possible (since things like lecture slide PDFs are failing to sync). Specifically, I need to sort through thousands of files in a bunch of nested subdirectories and remove all instances of: \ / < > : " | ? ! * & # ( ) +

Some of these have never been used; others, pretty regularly (looking specifically at & and ( ) in my PDF directory).

If someone could help me put together a proper regular expression that would find these, I would be extremely grateful! Then, as a quick solution, I'm hoping to use BRU to find all instances of these specific characters and maybe replace them with a - since that is still allowed. I imagine I will put the regex in the first box in the upper left corner of the app, and make sure "subfolders" is checked in the filters tab?

Any advice would be very appreciated!
msigal
 
Posts: 1
Joined: Thu Nov 21, 2019 7:00 pm

Re: Remove Special Characters from Subdirectories

Post by therube » Thu Nov 21, 2019 10:13 pm

I'm thinking the most feasible way is going to be JavaScript Renaming.

With that (I'm thinking) you could do a search & replace:
Match: [\\/<>:"|?!*&#()+]
Replace: -
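
To show roughly what that search & replace does, here is a minimal sketch in plain JavaScript. It is not BRU's own scripting interface, and the helper name sanitizeName is made up for the example. The character class is based on the Match pattern above, with the backslash written as \\ so that it is matched literally.

```javascript
// Sketch only: plain JavaScript, not BRU's JavaScript Renaming API.
// Matches any one of: \ / < > : " | ? ! * & # ( ) +
const FORBIDDEN = /[\\\/<>:"|?!*&#()+]/g;

// Hypothetical helper: replace every listed character with a dash.
function sanitizeName(name) {
  return name.replace(FORBIDDEN, "-");
}

console.log(sanitizeName("Lecture 3 (Intro & Review).pdf"));
// "Lecture 3 -Intro - Review-.pdf"
```

Each listed character becomes its own dash, so names with several of them in a row end up with runs of dashes.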


Otherwise:

3:Replace, may handle some items, though you'd have to run the Replace for each of your characters

5:Remove, Chars could do it, but that would outright remove the particular characters - without allowing you to demarcate their removal, say with a - (dash). Also in there, Sym., High, & Accents, could help.


PS:

Some of what you show are illegal filename characters (on Windows).
Though others are perfectly acceptable.
And I'd have to think that, for the acceptable characters, if your end does not like them, then whatever method they are using for their sync is pretty brain-dead.
Similarly, I would think that their end should automatically account for "illegal file name" issues, as in the "sync", should say, "hey, I see a ( and we can't have that so replace that with a -".
(IOW, they are putting the onus on the user to do things "correctly" instead of taking what is given & making it work with their brain-dead sync system.)


You could use Everything to aid in finding your "illegal" filenames. Various means to search.

For "legal" characters, like parens, just type them in, e.g. ).
Some chars may need escaping, which you could do with regex:, or with the provided functions, e.g. lt: (less-than symbol), or with a Unicode character code, e.g. #60: (less-than symbol), or with ranges, e.g. regex:[^\x{0000}-\x{007f}] ("non-ASCII" chars)...
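
For anyone who wants to check a list of names outside of Everything, the same two tests can be sketched in plain JavaScript. This is only an illustration, not Everything's search syntax; the helper name flagNames and the sample names are made up.

```javascript
// Sketch only: plain JavaScript, not Everything's search syntax.
const LISTED = /[\\\/<>:"|?!*&#()+]/;  // the characters from the original question
const NON_ASCII = /[^\u0000-\u007f]/;  // JavaScript spelling of [^\x{0000}-\x{007f}]

// Hypothetical helper: keep only the names that contain a flagged character.
function flagNames(names) {
  return names.filter((n) => LISTED.test(n) || NON_ASCII.test(n));
}

console.log(flagNames(["notes.txt", "Q&A (final).pdf", "résumé.docx"]));
// [ 'Q&A (final).pdf', 'résumé.docx' ]
```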
therube
 
Posts: 1314
Joined: Mon Jan 18, 2016 6:23 pm

Re: Remove Special Characters from Subdirectories

Post by Admin » Fri Nov 22, 2019 12:42 am

Hi,

The Character Translations could be helpful too: you can specify which character maps to which, and you can also map to multiple characters, e.g. map & to AND.

An example from Character Translations:

Replace symbols with an underscore (to replace with a space, just add a space instead of an underscore, or whatever you like):
!=_
-=_
#=_
$=_
%=_
'=_
(=_
)=_
[=_
]=_
{=_
}=_
~=_
+=_
2C=_
;=_
3D=_
@=_
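
To make the idea of a translation table concrete, here is a rough stand-in in plain JavaScript. It is not how BRU implements Character Translations; the helper name translate is made up, and the sketch assumes that 2C and 3D in the list above are the hex codes for the comma and the equals sign.

```javascript
// Illustration only: a plain-JavaScript stand-in for a translation table,
// not BRU's Character Translations feature itself.
// Assumption: "2C" and "3D" in the list above are hex codes for "," and "=".
const translations = new Map([
  ["&", "AND"], // one character mapped to several
  ["!", "_"], ["-", "_"], ["#", "_"], ["$", "_"], ["%", "_"], ["'", "_"],
  ["(", "_"], [")", "_"], ["[", "_"], ["]", "_"], ["{", "_"], ["}", "_"],
  ["~", "_"], ["+", "_"], [",", "_"], [";", "_"], ["=", "_"], ["@", "_"],
]);

// Hypothetical helper: translate a name character by character,
// leaving unlisted characters untouched.
function translate(name) {
  return Array.from(name, (ch) => translations.get(ch) ?? ch).join("");
}

console.log(translate("Slides (Week 2) & Notes.pdf"));
// "Slides _Week 2_ AND Notes.pdf"
```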
Admin

