RE: [PATCH] fs: fat: add check for dir size in fat_calc_dir_size

From: Anupam Aggarwal
Date: Fri Jul 03 2020 - 10:49:31 EST


Hi Ogawa,

>So what was the root cause of slowness on big directory?

The problem happened on a FAT32-formatted 32GB USB 3.0 pendrive holding 20GB of data, with a cluster size of 16KB.
It has one corrupted directory whose size, as calculated by fat_calc_dir_size(), is 1146896384 bytes, i.e. 1.06 GB.

When traversal of the corrupted directory starts, the directory entries look corrupted
and lookup fails for them.
Some directory entry names have the format abc/xyz;
the following are a few of the observed directory entry names:

eqk/hb*
*ÃÃ/ÃÂ7Ã.ÃBÃ
ty7@o/<`
-Ã%/Ã3{.9q
'Ãu/Ãy<Ã.^mÃ
PhâCfâ6g.Ã/k

Now when path lookup happens for the above directory entries, it searches the corrupted directory for the name before '/', e.g.

eqk
*ÃÃ
ty7@o
-Ã%
'Ãu
PhâCfâ6g.Ã
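
To illustrate why only the part before '/' becomes the search key: a path walk treats '/' as a component separator, so a corrupted entry name containing one is split before it reaches the FAT lookup. Below is a minimal user-space sketch of that split. The helper name first_component_len() is made up for this mail, and the real kernel path walk is more involved.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: length of the first path component of `name`,
 * i.e. the number of bytes before the first '/' (or the whole string
 * if there is no '/').
 */
static size_t first_component_len(const char *name)
{
    return strcspn(name, "/");
}
```

For "eqk/hb*" this returns 3, so the name actually looked up in the corrupted directory is "eqk".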

There are also directory entries with garbage names for which lookup fails, e.g.
Ã)YÂ&qÂ(.ÃÂ.
ÃââÃâârâ.âgÂ
4âh1âx0â.p3â

During the search for a single name in fat_search_long(), the whole corrupted directory of size 1.06GB is traversed,
which takes around 230 to 240 secs and finally ends up returning ENOENT.

Multiple such lookups in the corrupted directory make 'ls -lR' effectively never-ending, e.g. in an overnight test of running 'ls -lR'
on the USB having the corrupted directory, around 200 such lookups took 14hrs and 'ls -lR' was still running.
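
As a rough cross-check of these timings, the sketch below multiplies the per-lookup cost by the number of lookups and also derives the effective scan rate. The function names are made up for this mail, and the inputs are only the approximate figures quoted above (about 200 lookups at 230 to 240 secs each over a 1146896384-byte directory).

```c
#include <stdint.h>

/* Total wall-clock time in hours for `lookups` failed lookups. */
static double total_hours(unsigned int lookups, double secs_per_lookup)
{
    return (double)lookups * secs_per_lookup / 3600.0;
}

/* Effective directory scan rate in MiB per second for one full traversal. */
static double scan_rate_mib(uint64_t dir_size, double secs_per_lookup)
{
    return (double)dir_size / secs_per_lookup / (1024.0 * 1024.0);
}
```

200 lookups at 230 to 240 secs each come to roughly 12.8 to 13.3 hours, which is in line with the 14hrs observed, and one traversal works out to roughly 4.6 to 4.8 MiB per second.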

The total number of 32-byte directory entries in the corrupted directory of size 1146896384 bytes is 1146896384/32 = 35840512,
so each lookup has to scan up to 35840512 entries, which is very exhaustive. Therefore we have put a size check for the directory in fat_calc_dir_size()
and prevent the directory traversal by returning -EIO.
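
The arithmetic and the kind of check described above can be sketched in user-space C as follows. This is only an illustration, not the patch itself: the names are made up for this mail, and the limit of 65536 entries (2 MiB) is an assumption taken from the FAT specification's directory size ceiling, so the actual threshold should be checked against the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Size of one on-disk FAT directory entry, in bytes. */
#define DIR_ENTRY_SIZE   32u

/* Assumed ceiling (hypothetical names): 65536 entries per directory. */
#define MAX_DIR_ENTRIES  65536u
#define MAX_DIR_SIZE     ((uint64_t)MAX_DIR_ENTRIES * DIR_ENTRY_SIZE)

/* Number of 32-byte entry slots in a directory of `dir_size` bytes. */
static uint64_t dir_entry_slots(uint64_t dir_size)
{
    return dir_size / DIR_ENTRY_SIZE;
}

/* Number of whole clusters covered by a directory of `dir_size` bytes. */
static uint64_t dir_clusters(uint64_t dir_size, uint32_t cluster_size)
{
    assert(cluster_size != 0);
    return dir_size / cluster_size;
}

/* Returns 0 if the size is plausible, -EIO if it exceeds the assumed limit. */
static int check_dir_size(uint64_t dir_size)
{
    return dir_size > MAX_DIR_SIZE ? -EIO : 0;
}
```

With the numbers from this report, dir_entry_slots(1146896384) is 35840512, dir_clusters(1146896384, 16384) is 70001, and check_dir_size(1146896384) returns -EIO, so the traversal would be refused up front.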

While browsing the corrupted directory (\CorruptedDIR) on a Windows 10 PC,
2623 directory entries were listed and the timestamps were wrong.

Following is the read-only chkdsk output for the USB drive.

--------------------------------------------------------------------------------------
chkdsk I:
The type of the file system is FAT32.
Volume AAA created 12/28/2018 3:15 PM
Volume Serial Number is 1606-72DC
Windows is verifying files and folders...
Windows found errors on the disk, but will not fix them
because disk checking was run without the /F (fix) parameter.
The \$TXRAJNL.DAT entry contains a nonvalid link.
The size of the \$TXRAJNL.DAT entry is not valid.
Unrecoverable error in folder \CorruptedDIR.
Convert folder to file (Y/N)? n
The \BBB\file1.txt entry contains a nonvalid link.
The size of the \BBB\file1.txt entry is not valid.
The \CCC\file1.txt entry contains a nonvalid link.
The size of the \CCC\file1.txt entry is not valid.
File and folder verification is complete.
Convert lost chains to files (Y/N)? n
3531520 KB of free disk space would be added.

Windows has checked the file system and found problems.
Run CHKDSK with the /F (fix) option to correct these.
30,015,472 KB total disk space.
400 KB in 2 hidden files.
2,800 KB in 48 folders.
16,479,312 KB in 7,583 files.
9,999,392 KB are available.

16,384 bytes in each allocation unit.
1,875,967 total allocation units on disk.
624,962 allocation units available on disk.
--------------------------------------------------------------------------------------

Please let us know if you have any queries,
and please suggest if something better can be done.

Regards,
Anupam