Windows Illegal Character Help

Re: Windows Illegal Character Help

Postby RegexNinja » Wed Apr 22, 2020 7:03 am

Sorry about the typos earlier. I incorrectly referenced WinZip instead of WinRAR.
I've only heard of this working with 7-zip (& now WinRAR), but it probably works with most archivers.

Admin: I'm pretty sure you just compress your problem files/folders with the archiver (name & contents).
Then just extract them to overwrite. The extraction process will automatically remove any illegal characters.
It makes sense, as archivers are designed to move files, often across different OSes (with different sets of illegal chars).
Cheers!
RegexNinja
 
Posts: 134
Joined: Fri Feb 21, 2020 5:26 pm

Re: Windows Illegal Character Help

Postby RegexNinja » Wed Apr 22, 2020 7:32 am

Admin:
After a second thought, the overwrite method wouldn't work; it'd leave you with two versions of the same files/folders:
the good names (extracted by your archiver) and the original bad names with the illegal chars.
But archivers offer a 'delete files' option after adding them to an archive.

So you'd choose that option, and then extract everything to create only the good-name versions.
It's certainly a better option than using WinRAR/etc. to rename everything one by one.
Cheers!

Re: Windows Illegal Character Help

Postby Admin » Wed Apr 22, 2020 11:59 am

RegexNinja wrote: But archivers offer a 'delete files' option after adding them to an archive.

But can they delete the files with illegal names after adding them to the archive? If they can, then the same deleting function could be used in BRU in a future release.
Admin
Site Admin
 
Posts: 2354
Joined: Tue Mar 08, 2005 8:39 pm

Re: Windows Illegal Character Help

Postby therube » Wed Apr 22, 2020 2:26 pm

In the case of 7-zip and a filename that ends with a dot (., period), which is not an "illegal" character and not an "illegal" filename, then yes, it will delete the file after "moving" it into the archive.

Code:
c:\tmp> dir > \\?\c:\tmp\dotdot.

That creates a file named dotdot. (with a literal dot at the end).
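
A note on the \\?\ prefix used above: it asks Windows to skip its usual path normalization (which would otherwise strip a trailing dot or space), so the name reaches the file system as typed. Below is a minimal Python sketch of building such a path. The helper name to_extended is hypothetical and not part of any tool mentioned in this thread; it only manipulates strings, so nothing is created on disk.

```python
import ntpath

PREFIX = "\\\\?\\"  # the literal text \\?\


def to_extended(path):
    """Return `path` with the \\?\\ prefix (hypothetical helper).

    Expects an absolute Windows path such as C:\\dir\\name or
    \\\\server\\share\\name. The name itself is never normalized,
    so a trailing dot or space is kept exactly as given.
    """
    if path.startswith(PREFIX):
        return path  # already in extended form
    path = path.replace("/", "\\")
    drive, rest = ntpath.splitdrive(path)
    if not drive or not rest.startswith("\\"):
        raise ValueError("need an absolute path such as C:\\dir\\name")
    if drive.startswith("\\\\"):
        # UNC paths take the form \\?\UNC\server\share\...
        return PREFIX + "UNC\\" + path[2:]
    return PREFIX + path


print(to_extended("c:\\tmp\\dotdot."))
```

On Windows, the returned string could then be passed to open() or os.rename() to reach a name that Explorer refuses to touch; that part is untested here.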

Code:
c:\tmp> 7-zip  a  X.7z  -sdel

That creates an archive, X.7z containing the specified files within.
(In this case, simply defaulting to all files that happen to be there [& subdirectories too - SO BE CAREFUL].)

Once the files are archived, the -sdel switch kicks in & deletes said files.
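
The archive-then-extract round trip discussed so far can be sketched as a small Python wrapper. This is only an illustration under assumptions: the executable name 7z is my assumption (the commands in this thread are typed as "7-zip"), the function names are hypothetical, and only switches that appear in this thread (a, -sdel, x, -o) are used. It defaults to a dry run because -sdel deletes the source files.

```python
import subprocess


def round_trip_commands(archive, out_dir, exe="7z"):
    """Build the two command lines: archive-with-delete, then extract."""
    add = [exe, "a", archive, "-sdel"]             # add everything in the current dir, then delete it
    extract = [exe, "x", archive, "-o" + out_dir]  # extract with full paths into out_dir
    return add, extract


def round_trip(archive, out_dir, exe="7z", dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    commands = round_trip_commands(archive, out_dir, exe)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops before extraction if archiving reports an error
            subprocess.run(cmd, check=True)
    return commands


round_trip("X.7z", "Y")
```

Run it from inside the folder that holds the problem files, and only switch dry_run off on a copy of the data first.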

(I'm not really familiar with -sdel. Other archivers use the m command, i.e., move, to accomplish something similar.
ARJ, an old archiver, cannot deal with dotdot., instead saying, "Can't open dotdot."
ARJ m X.arj *
ARJ can also test an archive immediately upon creation to ensure its validity, & if the test fails the operation is aborted:
ARJ m X.arj * -jt fails because of dotdot., so the archive is not created & no files are deleted.
I don't know how 7-zip's -sdel or RAR's m options deal with errors, or what might or might not happen as far as archive creation & file deletion are concerned.)


RAR stores the file name, dotdot. literally within its archive.
On extraction, 7-zip replaces the trailing . with _, whereas RAR appears to simply drop the dot.

Not sure how either handles truly illegal characters (because I have none to play with).
therube
 
Posts: 1319
Joined: Mon Jan 18, 2016 6:23 pm

Re: Windows Illegal Character Help

Postby therube » Wed Apr 22, 2020 2:33 pm

So 7-zip, on extraction, replaces illegal characters with underscores.

Code:
7-Zip 19.00 (x86) : Copyright (c) 1999-2018 Igor Pavlov : 2019-02-21

Scanning the drive for archives:
1 file, 11959 bytes (12 KiB)

Listing archive: Example (containing file names with ILLEGAL characters).dmg

--
Path = Example (containing file names with ILLEGAL characters).dmg
Type = Dmg
Physical Size = 11959
Method = Zero2 ZLIB CRC
Blocks = 11
----
Path = 4.hfs
Size = 2056192
Packed Size = 2819
Comment = disk image (Apple_HFS : 4)
Method = Zero2 ZLIB CRC
--
Path = 4.hfs
Type = HFS
Physical Size = 2056192
Method = HFS+
Cluster Size = 4096
Free Space = 1888256
Created = 2018-07-06 16:01:56
Modified = 2018-07-06 10:02:00

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2018-07-06 09:57:19 D....                            SomeFolder
2018-07-06 09:57:44 .....         6148         8192  SomeFolder\.DS_Store
2018-07-06 10:01:56 D....                            SomeFolder\.HFS+ Private Directory Data
2018-07-06 09:59:12 D....                            SomeFolder\SubFolder
2018-07-06 10:01:56 D....                            SomeFolder\[HFS+ Private Data]
2018-07-06 09:58:19 .....         6148         8192  SomeFolder\SubFolder\.DS_Store
2018-07-06 09:32:10 .....          462         4096  SomeFolder\SubFolder\FilenameWithIlligalCharacters1_2|3_4<5>6"7*8?9.rtf
2018-07-06 09:32:10 .....          462         4096  SomeFolder\SubFolder\FilenameWithIlligalCharacters_|_<>"*?.rtf
------------------- ----- ------------ ------------  ------------------------
2018-07-06 10:01:56              13220        24576  4 files, 4 folders


7-zip x E* -oY
DIR Y\ /s

Code:
Directory of C:\TMP\SEA\xul\cache2\dotdot\x\y

04/22/2020  09:32 AM    <DIR>          .
04/22/2020  09:32 AM    <DIR>          ..
07/06/2018  09:57 AM    <DIR>          SomeFolder
               0 File(s)              0 bytes

Directory of C:\TMP\SEA\xul\cache2\dotdot\x\y\SomeFolder

07/06/2018  09:57 AM    <DIR>          .
07/06/2018  09:57 AM    <DIR>          ..
07/06/2018  09:57 AM             6,148 .DS_Store
07/06/2018  10:01 AM    <DIR>          .HFS+ Private Directory Data_
07/06/2018  09:59 AM    <DIR>          SubFolder
07/06/2018  10:01 AM    <DIR>          [HFS+ Private Data]
               1 File(s)          6,148 bytes

Directory of C:\TMP\SEA\xul\cache2\dotdot\x\y\SomeFolder\.HFS+ Private Directory Data_

07/06/2018  10:01 AM    <DIR>          .
07/06/2018  10:01 AM    <DIR>          ..
               0 File(s)              0 bytes

Directory of C:\TMP\SEA\xul\cache2\dotdot\x\y\SomeFolder\SubFolder

07/06/2018  09:59 AM    <DIR>          .
07/06/2018  09:59 AM    <DIR>          ..
07/06/2018  09:58 AM             6,148 .DS_Store
07/06/2018  09:32 AM               462 FilenameWithIlligalCharacters1_2_3_4_5_6_7_8_9.rtf
07/06/2018  09:32 AM               462 FilenameWithIlligalCharacters________.rtf
               3 File(s)          7,072 bytes

Directory of C:\TMP\SEA\xul\cache2\dotdot\x\y\SomeFolder\[HFS+ Private Data]

07/06/2018  10:01 AM    <DIR>          .
07/06/2018  10:01 AM    <DIR>          ..
               0 File(s)              0 bytes

     Total Files Listed:
               4 File(s)         13,220 bytes
              14 Dir(s)


(I happened to come across the .dmg file here: https://sourceforge.net/p/sevenzip/discussion/45797/thread/16f5253d/
Seems RAR doesn't know what to do with it, so I'm not sure how it deals with things.
I also don't know how to create an illegally named file to see how the archivers deal with adding it to their archive [& subsequent deletion].)
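
The underscore substitution visible in the listings above can be approximated in a few lines of Python. This is a sketch of the observed behaviour, not 7-zip's actual algorithm: the function name is hypothetical, the character set is the usual Windows reserved set, and replacing control characters is my guess at why ".HFS+ Private Directory Data" gained a trailing underscore.

```python
# Characters Windows does not allow inside a file or folder name.
ILLEGAL = set('<>:"/\\|?*')


def sanitize_component(name, replacement="_"):
    """Replace each illegal or control character in one path component.

    A trailing run of dots/spaces is replaced as well, mirroring the
    dotdot. -> dotdot_ result reported for 7-zip earlier in the thread.
    """
    cleaned = "".join(
        replacement if (ch in ILLEGAL or ord(ch) < 32) else ch
        for ch in name
    )
    stripped = cleaned.rstrip(". ")
    return stripped + replacement * (len(cleaned) - len(stripped))


print(sanitize_component('FilenameWithIlligalCharacters1_2|3_4<5>6"7*8?9.rtf'))
```

Apply it per component (folder or file name), never to a whole path, since the backslash separator is itself in the replaced set.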

Re: Windows Illegal Character Help

Postby RegexNinja » Wed Apr 22, 2020 2:56 pm

Admin:
Hopefully, CliffyBoy will post back to verify this.
I'm fairly certain that most archivers can either rename or delete such files,
since it would take the same file-handling ability to perform either of those actions.

But it's easier to place them all into one archive (with the delete option) and then extract all, versus renaming one by one.
I do know that when placing them into an archive, there's a 'copy full path' option,
so that during extraction there's no worry about anything going to a wrong location.

The archivers apparently have their own built-in file handlers, so they don't seem to rely on the usual Windows APIs.
I've no idea how hard it'd be to code your own file handler, but it'd be a great option!
Windows would still prevent creating files with illegal chars, but at least we could fix them.

Therube:
Thanks for all the testing! I don't suppose 7zip has an option to choose whether _ is used as the replacement?
I'm now wondering what the differences might be between the different archivers out there.
Looking forward to hearing from CliffyBoy whether there was a replacement or just removal.

Cheers!

Re: Windows Illegal Character Help

Postby therube » Wed Apr 22, 2020 7:51 pm

I forgot about (gnu) zip.
zip (as well as 7-zip & RAR) will store both dotdot & dotdot. (separately) into the archive.

On extraction, all see the two "dots", dotdot & dotdot. & ask what you want to do (with the "second").
All give you the option to rename. (7-zip gives you the option to 'A(u)to rename all', which it then does, automatically - without intervention, incrementing, if needed, which is most convenient).
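
The "auto rename" idea (pick a new name automatically when the target already exists) can be sketched like this in Python. The _1, _2 suffix scheme and the function name are assumptions for illustration; I have not checked which pattern 7-zip really uses.

```python
import os


def auto_rename(name, taken):
    """Return `name`, or an incremented variant if it is already taken.

    `taken` is a set of lower-cased names already used in the target
    folder (Windows compares names case-insensitively); it is updated.
    """
    candidate = name
    stem, ext = os.path.splitext(name)
    counter = 0
    while candidate.lower() in taken:
        counter += 1
        candidate = f"{stem}_{counter}{ext}"
    taken.add(candidate.lower())
    return candidate


used = set()
# dotdot and dotdot. can collapse to the same name once the trailing dot is dropped
print(auto_rename("dotdot", used))
print(auto_rename("dotdot", used))
```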

And lha. (How did I forget about that one, circa 1993?)
lha will also add both dotdot & dotdot.
(But after that it has trouble listing/extracting individual files. It can extract ALL, but does not give an option to rename, only overwrite options: yes/no/skip.)

Re: Windows Illegal Character Help

Postby Admin » Thu Apr 23, 2020 12:32 am

Regarding Bulk Rename Utility, we will investigate if it is possible to add the capability to rename files even if they have illegal Windows characters. :)

Re: Windows Illegal Character Help

Postby CliffyBoy » Sat Apr 25, 2020 3:29 am

Hey All,

I see there has been much discussion about this. Great! I went to verify a successful illegal-character (space) archive and extraction on another computer. At first it didn't work, and I had a panic attack because I had told the whole forum that it did. After troubleshooting, I realized I was using an older version of WinRAR, so I updated to the latest version, 5.90, and it works as expected. So this solution can work using WinRAR 5.71 and above. My process is:

WinRAR set up with default options:

1. Select the files/folders.
2. Right-click and choose "Add to archive".
3. Under Archive options, select "Delete files after archiving" (this instantly deletes the hard-to-delete files/folders with illegal characters).
4. Extract the files back into the folder.

The only issue I have found is a maximum of 8 files that can be converted at a time. The average size of my 8-file batches was 56 MB. I don't have larger folders/files to test whether size affects the number converted or whether WinRAR is just tracking the number of files with issues at a time. Maybe someone else with illegal characters (spaces) could check. I don't have any files/folders with other illegal characters (?, :, etc.) to test either.

My main issues have been with files/folders containing trailing spaces transferred from my seedbox to a Windows machine. I have a Perl script that fixes them prior to downloading, but I'm still looking for a perfect batch-file fix to address previous downloads. I still haven't tested using BRU to find the illegal characters because I have about 400 folders to delete and can only do 8 at a time.
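
For the batch fix asked about above, one possible approach is a short Python script that walks a folder tree bottom-up and renames every entry whose name ends in a space or dot. Treat it as an untested-on-Windows sketch: the function name is hypothetical, it relies on the \\?\ prefix so Windows does not normalize the names away, it skips a rename when the clean name already exists, and it does not handle UNC roots.

```python
import os


def fix_trailing(root, strip_chars=" ."):
    """Rename entries under `root` whose names end in a space or dot.

    Returns a list of (old_path, new_path) pairs. `root` itself should
    be a normally named folder; only its contents are examined.
    """
    if os.name == "nt" and not root.startswith("\\\\?\\"):
        # Prefix once; later joins keep the names exactly as listed.
        root = "\\\\?\\" + os.path.abspath(root)
    renamed = []
    # Bottom-up: a folder's contents are handled before the folder itself.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            fixed = name.rstrip(strip_chars)
            if fixed == name or not fixed:
                continue  # nothing to fix, or nothing would be left
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, fixed)
            if os.path.lexists(dst):
                continue  # never overwrite an existing entry
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed
```

Calling fix_trailing on a copy of one affected folder first would show whether it gets past the 8-at-a-time limit described above.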

It would be great if BRU could incorporate WinRAR-type procedures to deal with these types of files.
CliffyBoy
 
Posts: 24
Joined: Sun Apr 19, 2020 9:36 pm

Re: Windows Illegal Character Help

Postby CliffyBoy » Sat Apr 25, 2020 4:07 am

For those of you with deep pockets, the application Shareprep seems like the full solution (per their site). I used the trial and it batch-fixed 50 files (illegal space characters), but that is the free limit. It is pricey, but maybe someone who can afford it could give it a full test and see whether this is a complete solution that could perhaps be ported to BRU.

Re: Windows Illegal Character Help

Postby therube » Sat Apr 25, 2020 12:25 pm

@CliffyBoy, do you have an archive that you can post, that contains "illegal characters" - along with instructions as to how to extract said files such that they retain their "illegal-ness" (on the Windows end)?

RAR 2.70:

Code:
RAR 2.70     Copyright (c) 1993-2000 Eugene Roshal     11 May 2000
Shareware version         Type RAR -? for help

Archive XXX.rar

Name             Size   Packed Ratio  Date   Time     Attr      CRC   Meth Ver
-------------------------------------------------------------------------------
4<5>6"7*8?9.rtf      462      325  70% 06-07-18 14:32 -rw-r--r-- DC1AC3B3 m3b 2.9
SomeFolder\SubFolder\.DS_Store - the file header is corrupt
.DS_Store        6148      181   2% 06-07-18 14:58 -rw-r--r-- 6D88006A m3b 2.9
<>"*?.rtf         462      325  70% 06-07-18 14:32 -rw-r--r-- DC1AC3B3 m3b 2.9
.DS_Store        6148      420   6% 06-07-18 14:57 -rw-r--r-- 8C6320AB m3b 2.9
SomeFolder\SubFolder\FilenameWithIlligalCharacters1\2|3 - the file header is corrupt
2|3                 0        0   0% 25-04-20 12:31 drwx------ 00000000 m0  2.0
SomeFolder\SubFolder\FilenameWithIlligalCharacters\| - the file header is corrupt
|                   0        0   0% 25-04-20 12:31 drwx------ 00000000 m0  2.0
[HFS+ Private Data]        0        0   0% 06-07-18 15:01 drwx------ 00000000 m0  2.0
SubFolder           0        0   0% 06-07-18 14:59 drwx------ 00000000 m0  2.0
.HFS+ Private Directory Data
        0        0   0% 06-07-18 15:01 drwx------ 00000000 m0  2.0
SomeFolder - the file header is corrupt
SomeFolder          0        0   0% 06-07-18 14:57 drwx------ 00000000 m0  2.0
-------------------------------------------------------------------------------
   10            13220     1251   9%


Code:

RAR 2.70     Copyright (c) 1993-2000 Eugene Roshal     11 May 2000
Shareware version         Type RAR -? for help


Extracting from XXX.rar

Unknown method in SomeFolder\SubFolder\FilenameWithIlligalCharacters1\2|3\4<5>6"7*8?9.rtf
SomeFolder\SubFolder\.DS_Store - the file header is corrupt
Unknown method in SomeFolder\SubFolder\.DS_Store
Unknown method in SomeFolder\SubFolder\FilenameWithIlligalCharacters\|\<>"*?.rtf
Unknown method in SomeFolder\.DS_Store
SomeFolder\SubFolder\FilenameWithIlligalCharacters1\2|3 - the file header is corrupt
Cannot create directory SomeFolder\SubFolder\FilenameWithIlligalCharacters1\2|3
SomeFolder\SubFolder\FilenameWithIlligalCharacters\| - the file header is corrupt
Cannot create directory SomeFolder\SubFolder\FilenameWithIlligalCharacters\|
Cannot create directory SomeFolder\.HFS+ Private Directory Data

SomeFolder - the file header is corrupt
Total errors: 11



---------------------------------------------------------------------------------------


RAR 5.90:

Code:
RAR 5.90 x86   Copyright (c) 1993-2020 Alexander Roshal   26 Mar 2020
Trial version             Type 'rar -?' for help

Archive: XXX.rar
Details: RAR 4

Attributes      Size     Date    Time   Name
----------- ---------  ---------- -----  ----
-rw-r--r--       462  2018-07-06 14:32  SomeFolder\SubFolder\FilenameWithIlligalCharacters1\2|3\4<5>6"7*8?9.rtf
-rw-r--r--      6148  2018-07-06 14:58  SomeFolder\SubFolder\.DS_Store
-rw-r--r--       462  2018-07-06 14:32  SomeFolder\SubFolder\FilenameWithIlligalCharacters\|\<>"*?.rtf
-rw-r--r--      6148  2018-07-06 14:57  SomeFolder\.DS_Store
drwx------         0  2020-04-25 12:31  SomeFolder\SubFolder\FilenameWithIlligalCharacters1\2|3
drwx------         0  2020-04-25 12:31  SomeFolder\SubFolder\FilenameWithIlligalCharacters\|
drwx------         0  2018-07-06 15:01  SomeFolder\[HFS+ Private Data]
drwx------         0  2018-07-06 14:59  SomeFolder\SubFolder
drwx------         0  2018-07-06 15:01  SomeFolder\.HFS+ Private Directory Data

drwx------         0  2018-07-06 14:57  SomeFolder
----------- ---------  ---------- -----  ----
                13220                    10


Code:
Extracting from XXX.rar

Creating    SomeFolder                                                OK
Creating    SomeFolder\SubFolder                                      OK
Creating    SomeFolder\SubFolder\FilenameWithIlligalCharacters1       OK
Creating    SomeFolder\SubFolder\FilenameWithIlligalCharacters1\2_3   OK
Extracting  SomeFolder\SubFolder\FilenameWithIlligalCharacters1\2_3\4_5_6_7_8_9.rtf      17%  OK
Extracting  SomeFolder\SubFolder\.DS_Store                                31%  OK
Creating    SomeFolder\SubFolder\FilenameWithIlligalCharacters        OK
Creating    SomeFolder\SubFolder\FilenameWithIlligalCharacters\_      OK
Extracting  SomeFolder\SubFolder\FilenameWithIlligalCharacters\_\_____.rtf      50%  OK
Extracting  SomeFolder\.DS_Store                                          76%  OK
Creating    SomeFolder\[HFS+ Private Data]                            OK
Creating    SomeFolder\.HFS+ Private Directory Data_                  OK
Total errors: 4


Code:
C:.
\---SomeFolder
    +---.HFS+ Private Directory Data_
    +---SubFolder
    |   +---FilenameWithIlligalCharacters
    |   |   \---_
    |   \---FilenameWithIlligalCharacters1
    |       \---2_3
    \---[HFS+ Private Data]


Code:
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder
C:\TMP\SEA\du\cache2\ILLEGAL\X\XXX.rar
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder\.DS_Store
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder\.HFS+ Private Directory Data_
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder\SubFolder
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder\[HFS+ Private Data]
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder\SubFolder\.DS_Store
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder\SubFolder\FilenameWithIlligalCharacters
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder\SubFolder\FilenameWithIlligalCharacters1
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder\SubFolder\FilenameWithIlligalCharacters\_
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder\SubFolder\FilenameWithIlligalCharacters\_\_____.rtf
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder\SubFolder\FilenameWithIlligalCharacters1\2_3
C:\TMP\SEA\du\cache2\ILLEGAL\X\SomeFolder\SubFolder\FilenameWithIlligalCharacters1\2_3\4_5_6_7_8_9.rtf

Re: Windows Illegal Character Help

Postby RegexNinja » Sat Apr 25, 2020 2:07 pm

Therube:
I doubt Windows would allow anything to be extracted (created) with illegal chars.
Hopefully, I'm wrong about that. I recently bricked my laptop using HxD to edit the partition table.
My goal was to create a test environment, but I somehow managed to get illegals into a system folder.

I was all set to post a nice little tutorial on how to get it done. Then I needed to reboot...
Won't be making that mistake again. I'll just get another drive for that kind of stuff; thank goodness for boot CDs.
Anywho, I think that so long as Windows is running, you can't create such illegal names, just fix them.

I'd love to be wrong about that, but I'm not doing any more testing with HxD to find out.
7zip lets you create illegal names by renaming within an archive, but the chars get removed during extraction.
Thanks for all the testing with WinRAR. If you figure out a safe way to create any, please let us know!

CliffyBoy:
Could you please answer 2 questions:
Are the illegal names on the same drive that Windows is installed/running on?
Will WinRAR let you simply rename those folders one at a time (without compressing, then extracting)?

My theory is yes, since deleting/renaming should use the same file handler.
If true, then it's definitely possible to write BRU with its own file handler to do the same.
Not saying it'd be worthwhile, as I've no idea how hard it'd be to do such a thing.

As you can see, this is a fairly common issue, so it draws a lot of attention. Thanks for posting back!
And as external drives become even more common, I've got a feeling this issue will too.
Cheers!

Re: Windows Illegal Character Help

Postby CliffyBoy » Sat Apr 25, 2020 3:20 pm

RegexNinja,

Yes, the illegal characters (trailing spaces) are on a drive with Windows installed. They were transferred by FTP from my original seedbox download server, separately using both FileZilla and FileZilla Pro (I tested each to ensure the illegal-character transfer was consistent). Once found, the files/folders can be renamed easily on the remote seedbox server, but once transferred via FileZilla to the Windows 10 computer, the files/folders show up but cannot be deleted or renamed. Windows gives a "Cannot Find This Item" error. In my case, the folder has the illegal trailing space but the files within it are OK. I can delete the files but not the parent folder. The folder's properties also show 0 bytes.

Yes, WinRAR did let me rename the folders with the illegal character (trailing space) without compressing then extracting. I just tested and it worked fine. The folder just needs to be selected within WinRAR, then right-click and rename as usual. I'm assuming it would work identically for files, but I don't have any to test right now.

Ultimately, I think FileZilla bears some responsibility here, since it is the vehicle allowing the illegal characters (spaces) to be transferred to a Windows machine. FileZilla has options to replace illegal characters with _, and I have seen it fix other files previously; however, there doesn't seem to be an option for leading or trailing spaces, thus creating this big mess.

Therube,
I'm working on a way to make some test files accessible here. The few ways I've tried so far don't work; Google Drive and direct download through the seedbox interface keep zipping the files, which then get fixed on extraction.

Re: Windows Illegal Character Help

Postby therube » Sat Apr 25, 2020 4:52 pm

OK, so that was easy - once you read up on it a bit.

Code:
rename "\\?\C:\TMP\illegal\xxx" " XYZ "


So that renames "xxx" to " XYZ " with both a leading & trailing blank.
And then Everything finds it easy enough with: regex:\s$
(Can't delete it, though, nor rename it.)
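
The same regex:\s$ test that Everything runs can be applied to a plain list of names in Python, which may help when hunting for the affected folders from a script. A minimal sketch; the function name is hypothetical and the pattern is the one quoted above.

```python
import re

# Same pattern as the Everything search above: whitespace at the very end.
TRAILING_WS = re.compile(r"\s$")


def names_with_trailing_space(names):
    """Return the names that end in a whitespace character."""
    return [name for name in names if TRAILING_WS.search(name)]


print(names_with_trailing_space(["xxx", " XYZ ", "album ", "ok.txt"]))
```

Note that a leading blank alone (as in " XYZ") is not matched; only the trailing one makes the name unreachable through Explorer.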


(From the Windows end) only RAR & 7-zip would add illegally named files into their archives (& they were added "as is"):

[Image]

Re: Windows Illegal Character Help

Postby CliffyBoy » Sun Apr 26, 2020 12:05 am

For anyone who cares to try to manipulate a folder with a trailing space, you can FTP some examples directly.

Using FTP client:

<removed>

Trust the certificate, or else it will keep asking for permission as you transfer the files. Ignore the certificate mismatch error; it is self-signed and I didn't know to match the names when I created it. "Illegal" in the password starts with a capital I. There are 4 folders: 3 with no contents, just named with a trailing space, and 1 with a trailing-space name containing some music files. Total size is around 500 MB. I will check back later tonight if there are questions.
