How to do a line-by-line comparison of files in Linux using the diff command

If you are a Linux user and work with different Linux distributions in your job, you may find yourself typing commands on a Linux system without a GUI. This means that you will no longer have access to your favorite GUI applications, for example Gedit, which you might normally use for file editing.

Regardless of whether you're a system administrator or a developer, file comparison is a task that almost everyone runs into at work. But what if you need to compare two files while working on a pure CLI Linux system? Your favorite GUI-based comparison tool won't be available, so you'll have to make do with a command line utility to get the job done.

On Linux, you can use the diff command to compare two files, but there is a bit of a learning curve with this utility. If you don't know how diff works and are looking for a quick tutorial to get you started, you've come to the right place. In this article, we'll cover the basics of this command along with some easy-to-understand examples.

Linux Diff Command

Before jumping straight to examples, it's good to know a bit about the command first. The man page of the diff command says that the tool compares files line by line. Its syntax is:

diff [OPTION]... FILES

While [OPTION] represents the various command line options the tool offers, FILES is usually a pair of file names. Although the diff man page contains useful information about the command, the full documentation for diff is maintained as a Texinfo manual. If the info and diff programs are properly installed on your system, the command

info diff

should give you access to the complete manual.

Diff Usage/Examples

Now let's discuss how diff is used. For this, let's begin with a basic example. Suppose the following are the two files that we want to compare:

file1:

test
test2
test3

file2:

test
test23
test3

Here's how you can use the diff command to compare these two files:

diff file1 file2

And here's the output the above command produces:

2c2
< test2
---
> test23

The output seems cryptic, right? We'll come to it in a bit. Let's first understand the basic structure of the output that diff produces in general.

The first thing to keep in mind is that the output represents the changes required to transform file1 (usually the original file) into file2 (the new or changed file). The output usually consists of lines that begin with a number (or a range), followed by a letter (a, d, or c) and another number (or range). For example, 2c2 (from the output above).

The first number represents the line (or range of lines) in file1 (the original file), while the last number represents the line (or range of lines) in file2 (the new file). As for the letter in between, a stands for added, d for deleted, and c for changed.

So, 2c2 means the second line in the original file has changed and needs to be replaced with the second line from the new file in order to make the files the same. If you manually compare the two files (file1 and file2) then you'll see that's exactly the case. 

As for the three lines that follow 2c2 in the example above, the one that starts with '<' is the second line from file1, and the one that begins with '>' is the corresponding line from file2. The three hyphens between them (---) simply separate the two.
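The three letters can be seen side by side in a small experiment. The sketch below recreates file1 and file2 from the example in a temporary directory, then compares two extra files, short and long, which are made up here purely for illustration. It assumes a POSIX shell, the mktemp utility, and a diff that produces the default (normal) output format.

```shell
# Work in a scratch directory so no real files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The two files from the example above.
printf 'test\ntest2\ntest3\n'  > file1
printf 'test\ntest23\ntest3\n' > file2

# Hypothetical files for the 'a' and 'd' cases: long has one extra line.
printf 'one\ntwo\n'        > short
printf 'one\ntwo\nthree\n' > long

# diff exits with status 1 when the files differ, so '|| true' keeps
# a script running under 'set -e' from stopping here.
changed=$(diff file1 file2) || true   # 2c2: line 2 changed
added=$(diff short long)    || true   # 2a3: line 3 of long added after line 2 of short
deleted=$(diff long short)  || true   # 3d2: line 3 of long deleted

printf '%s\n\n%s\n\n%s\n' "$changed" "$added" "$deleted"
```

With GNU diff, the three results begin with 2c2, 2a3, and 3d2, each followed by the affected lines. Note that for a and d, the number on the side where nothing is taken from (the 2 in 2a3 and in 3d2) names the line after which the other file's lines belong.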

Is there any need to revisit the output of the first example now? Most likely not, as the above explanation should have made it self-explanatory. Now, let's take another example.

So, here's file1:

Hi all,
This is a diff command tutorial
from HowtoForge.
Hope you'll benefit from it.
Thanks.

Here's file2:

Hi all,
Welcome to HowtoForge.
In this tutorial, we'll discuss the diff tool.
Hope you'll find it beneficial.
Thanks.

Here's the command (which remains the same):

diff file1 file2

And here's the output:

2,4c2,4
< This is a diff command tutorial
< from HowtoForge.
< Hope you'll benefit from it.
---
> Welcome to HowtoForge.
> In this tutorial, we'll discuss the diff tool.
> Hope you'll find it beneficial.

So you can see that in this case, the main output, 2,4c2,4, consists of a pair of numbers both before and after the letter. These are ranges: 2,4 represents lines 2, 3, and 4. So the output means that lines 2 to 4 in the original file (file1) have changed, and need to be replaced by lines 2 to 4 from file2 in order to make the files the same.
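The way such a change command splits into two ranges and a letter can be made concrete with a few lines of shell. The function below is a hypothetical helper written for this tutorial (explain_change is not part of diff); it is only a sketch that handles the plain a, c, and d commands of diff's default output.

```shell
# explain_change is a hypothetical helper, not part of diff.
# It splits a change command such as 2,4c2,4 into its three parts.
explain_change() {
    cmd=$1
    left=${cmd%%[acd]*}     # everything before the letter: file1 line(s)
    right=${cmd##*[acd]}    # everything after the letter: file2 line(s)
    op=${cmd#"$left"}       # drop the left part ...
    op=${op%"$right"}       # ... and the right part, leaving the letter

    case $op in
        a) echo "after line $left of file1, add line(s) $right of file2" ;;
        d) echo "delete line(s) $left of file1 (they would follow line $right of file2)" ;;
        c) echo "replace line(s) $left of file1 with line(s) $right of file2" ;;
        *) echo "not a change command: $cmd" ;;
    esac
}

explain_change 2c2
explain_change 2,4c2,4
```

For the two commands seen so far, this prints "replace line(s) 2 of file1 with line(s) 2 of file2" and "replace line(s) 2,4 of file1 with line(s) 2,4 of file2".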

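Moving on, let's change the contents of the files a bit. For the next two examples, file1 is the same as before minus its first line ("Hi all,"), so it holds just the four lines from "This is a diff command tutorial" to "Thanks.", while file2 now becomes: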

Welcome to HowtoForge.
In this tutorial, we'll discuss the diff tool.
Hope you'll find it beneficial.
Thanks.

This is a diff command tutorial
from HowtoForge.
Hope you'll benefit from it.
Thanks.

Now, if you run the diff command, the following output will be produced:

0a1,5
> Welcome to HowtoForge.
> In this tutorial, we'll discuss the diff tool.
> Hope you'll find it beneficial.
> Thanks.
>

So you can see that the tool immediately recognized that the second paragraph in file2 is identical to the whole of file1. The output therefore says that lines 1 to 5 from file2 should be added after line 0 of file1, that is, at the very beginning, to make the two files the same.

And if you delete the last line ("Thanks.") from file2, here's the output:

0a1,5
> Welcome to HowtoForge.
> In this tutorial, we'll discuss the diff tool.
> Hope you'll find it beneficial.
> Thanks.
>
4d8
< Thanks.

You can see that the output now also contains 4d8, which means that the fourth line in file1 should be deleted to bring the two files in sync. The number after the d (8) is the line in file2 after which the deleted line would have appeared had it not been removed. Of course, this is after you address the 0a1,5 change that's mentioned first.
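Both outputs can be checked with a short script. The sketch below assumes that file1 holds only the four lines from "This is a diff command tutorial" to "Thanks." (which is what the line numbers in 0a1,5 and 4d8 imply), and it stores the two versions of file2 under the made-up names file2_full and file2_cut. To keep the result short, grep keeps only the change commands, that is, the lines that start with a digit.

```shell
# Work in a scratch directory so no real files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed contents of file1: four lines, no greeting line.
cat > file1 <<'EOF'
This is a diff command tutorial
from HowtoForge.
Hope you'll benefit from it.
Thanks.
EOF

# file2 as shown above: a new paragraph, a blank line, then the text of file1.
cat > file2_full <<'EOF'
Welcome to HowtoForge.
In this tutorial, we'll discuss the diff tool.
Hope you'll find it beneficial.
Thanks.

This is a diff command tutorial
from HowtoForge.
Hope you'll benefit from it.
Thanks.
EOF

# The same file without its last line ("Thanks."); sed '$d' drops the last line.
sed '$d' file2_full > file2_cut

# diff exits with status 1 when the files differ, hence '|| true'.
first=$(diff file1 file2_full | grep '^[0-9]') || true
second=$(diff file1 file2_cut | grep '^[0-9]') || true

echo "$first"     # change commands of the first comparison
echo "$second"    # change commands of the second comparison
```

The first comparison yields the single command 0a1,5, and the second yields 0a1,5 followed by 4d8, matching the outputs shown above.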

You might also want to take a look at the sdiff command, which can show you the differences between files side by side.

Conclusion

Agreed, the output of the diff command isn't easy to comprehend, but the learning curve isn't that steep. Spend a couple of hours with the tool, and you'll surely get comfortable with it. As for the tutorial, we've just scratched the surface here. Take a look at the command's man page, and you'll realize that there's much more to learn about diff, something which we'll do in the next part of this tutorial series.
