How to do housekeeping with a lot of text files

If you save a lot of text files, e.g. historical versions of web pages, you do not want to keep empty files or duplicate files.

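Housekeeping can be done as follows ("$DIR" stands for the directory to clean up; do not use $PATH for this, because the shell already uses that variable to locate programs):
1. find empty files

find "$DIR" -type f -empty

if you want to delete them:

find "$DIR" -type f -empty -delete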
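
Since deleting cannot be undone, it may help to wrap the two commands in one small function that only lists the empty files by default and deletes them on request. The following is a minimal sketch, not a finished tool: the function name clean_empty_files and the "--delete" switch are made up for this example, and -delete is assumed to be available (it is an extension in GNU and BSD find, not part of POSIX).

```shell
#!/bin/sh
# Hypothetical helper (name and "--delete" switch are illustrative only).
# Lists empty regular files below a directory; deletes them only when
# the second argument is "--delete".
clean_empty_files() {
    dir=$1
    mode=${2:-}
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ "$mode" = "--delete" ]; then
        # -print before -delete shows each file as it is removed
        find "$dir" -type f -empty -print -delete
    else
        # dry run: only list what would be removed
        find "$dir" -type f -empty -print
    fi
}
```

Calling clean_empty_files /some/dir prints the candidates; calling clean_empty_files /some/dir --delete removes them. Empty directories are left alone because of -type f.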

2. find duplicate files below the directory "$DIR"

fdupes -r "$DIR"

In order to delete the duplicate files:

fdupes -rdN "$DIR"

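r - recurse into subdirectories
d - delete duplicates (prompts for the files to keep unless N is given)
N - no prompt: together with d, keep the first file of each set and delete the others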
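
If fdupes is not installed, or if you want to look at the duplicate sets before anything is deleted, the same idea can be approximated with checksums. The sketch below is a substitute technique, not what fdupes does internally: it assumes sha256sum from GNU coreutils is available, treats files with the same SHA-256 checksum as duplicates, and skips empty files because step 1 already handles them. The function name list_duplicates is made up for this example. File names containing newlines or backslashes are not handled cleanly.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name): print "checksum  path" for
# every non-empty file whose content checksum occurs more than once.
list_duplicates() {
    dir=$1
    find "$dir" -type f ! -empty -exec sha256sum {} + |
        sort |
        awk '{
            hash = $1
            if (hash == prev) {
                # second or later member of a duplicate set:
                # print the first member once, then this line
                if (!shown) { print prev_line; shown = 1 }
                print $0
            } else {
                shown = 0
            }
            prev = hash
            prev_line = $0
        }'
}
```

The output only lists candidates. Nothing is deleted, so the list can be reviewed before running fdupes -rdN or removing files by hand.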


That will result in fewer files and less duplicate content.