Anybody who makes regular backups, uses a package manager to check out software, or works in directories with several hundred files knows there comes a point where the directory structure is too vast to keep in your head. At that stage it is really easy to end up with duplicate, missing or unwanted files in your folders.
For instance, if you make a backup of a directory structure, you want to have an exact copy of the thing. Not an almost exact copy with some files missing, some files that turned corrupt and some extra files you put there by mistake or forgot to remove.
I often receive updates to (premium) WordPress plugins. Since these plugins come with no version control whatsoever, I like to keep them under version control myself. Most of the time the changes concern existing or new files, but sometimes an existing file is no longer needed, and keeping it in the repository is not a good idea. But how do you know which files are no longer needed?
There are several tools to compare the contents of two directories. The one I like best is diff. Normally it is used to compare the contents of files, but it can also make a recursive comparison (-r). And by including the quiet flag (-q) you get only a list of differences:
diff -rq original/ copy/
This will output something like this:
Files original/in-both.txt and copy/in-both.txt differ
Only in copy/: only-in-copy.txt
Only in original/: only-in-original.txt
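To try this yourself, here is a minimal sketch that builds two small throwaway directories and runs the same comparison. The file contents ("version 1", "version 2") and the temporary working directory are assumptions made up for illustration; only the directory and file names come from the example above.

```shell
# Create a scratch area so nothing in your real folders is touched.
workdir=$(mktemp -d)
mkdir "$workdir/original" "$workdir/copy"

# Same file name in both directories, but different content (assumed content).
printf 'version 1\n' > "$workdir/original/in-both.txt"
printf 'version 2\n' > "$workdir/copy/in-both.txt"

# One file that exists on each side only.
touch "$workdir/original/only-in-original.txt" "$workdir/copy/only-in-copy.txt"

# diff exits with status 1 when it finds differences, so "|| true" keeps
# that from being treated as a failure. LC_ALL=C keeps the messages in English.
result=$(cd "$workdir" && LC_ALL=C diff -rq original/ copy/ || true)
printf '%s\n' "$result"

# Remove the scratch area again.
rm -rf "$workdir"
```

With GNU diff this should report the same three differences as shown above. Other diff implementations may word the messages slightly differently.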
As you can see, all differences between the directories original and copy are shown. In this case, both directories have a file called in-both.txt, but the content is different. Also, both original and copy contain files that are not present in the other directory (