How to create and extract cpio archives on Linux Examples

Although the cpio archiving utility is nowadays used less than other archiving tools like tar, it is still worth knowing how it works: it is used, for example, to create initramfs images on Linux and for rpm packages, which are used mainly in the Red Hat family of distributions. In this tutorial we will see how to create and extract cpio archives using the GNU cpio utility, and how to obtain a list of the files they contain.

In this tutorial you will learn:

  • The cpio utility basics
  • How to create a cpio archive and optionally compress it
  • How to extract a cpio archive
  • How to obtain a list of files contained in a cpio archive

Software requirements and conventions used

Software Requirements and Linux Command Line Conventions
Category | Requirements, Conventions or Software Version Used
System | Distribution-independent
Software | cpio, gzip, find
Other | None
Conventions | # – requires the given Linux commands to be executed with root privileges, either directly as the root user or by use of the sudo command
            | $ – requires the given Linux commands to be executed as a regular non-privileged user

Introducing cpio

Cpio stands for “Copy In and Out”. As we already said, it is an archiving utility which is normally available on Unix and Unix-like operating systems, Linux included. Cpio has two main modes of usage: “copy-out” and “copy-in”. In the former mode the application reads a list of file names from standard input and, by default, writes an archive to standard output; in the latter mode, instead, it copies files out of an archive. A third mode exists, “copy-pass”, but we will not talk about it in this tutorial.

Creating an archive (copy-out mode)

Cpio itself is not able to explore directory trees; therefore, unlike what we do with tar, we cannot pass a directory as an argument and expect cpio to create an archive with all its content recursively. Instead, in the Unix spirit of “do one thing and do it well”, we have to use another utility, like find, to create the list of files to be included in the archive. Let’s see an example.



Suppose we want to create a cpio archive with the content of our home directory. Here is the command we could launch:

$ find "$HOME" -depth -print0 | cpio -ocv0 > /tmp/archive.cpio

Let’s analyze what we did above. We used the find utility to obtain the list of files which should be part of the archive. As the first argument of the utility we passed the path of the directory whose content should be archived, and we used two options: -depth and -print0. The former modifies the behavior of find so that the content of each directory is processed before the directory itself. Why is this needed?

Suppose files and directories are processed normally (top first) by find and we have a read-only directory. If this directory is processed before the files it contains, it will be put into the archive before them, and extracted before them when requested. Since the directory is restored as read-only, and cpio may not be able to work around its permissions, it could become impossible to write the files inside it once it has been restored.
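To make the difference in ordering concrete, here is a minimal sketch which compares the output of find with and without -depth. The temporary directory and the file names are hypothetical, created only for this illustration:

```shell
# Build a small throwaway tree (hypothetical names) to compare the two orders
work=$(mktemp -d)
mkdir -p "$work/tree/docs"
touch "$work/tree/docs/readme.txt"
cd "$work"

# Default order: each directory is printed before its content
find tree -print
# tree
# tree/docs
# tree/docs/readme.txt

# With -depth: the content is printed first, the directory itself last
find tree -depth -print
# tree/docs/readme.txt
# tree/docs
# tree
```

Since cpio stores entries in the order it receives them, the second listing is the one which puts each directory into the archive after the files it contains.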

With the -print0 option, instead, we made find print full file names on standard output separated by a null character instead of the standard newline. This measure lets us include files which contain newlines in their names.
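The following sketch shows why the delimiter matters. It assumes GNU find and a hypothetical file whose name contains a newline, created in a temporary directory:

```shell
work=$(mktemp -d)

# Create one file whose name contains a newline character
name=$(printf 'two\nlines.txt')
touch "$work/$name"

# Newline-delimited list: the single name is split over two lines,
# so a consumer reading line by line would see two (wrong) entries
(cd "$work" && find . -type f -print | wc -l)

# Null-delimited list: counting the null bytes gives exactly one entry
(cd "$work" && find . -type f -print0 | tr -cd '\0' | wc -c)
```

The first count is 2 and the second is 1: only the null-delimited list describes the content of the directory unambiguously.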



We piped the output of find to cpio standard input, so that the files in the list are included in the archive. When running cpio we used the -o, -v, -c and -0 options. The first is the short form of --create and is needed to specify that we want to use cpio in “copy-out” mode. The -v option (--verbose) is used to list the files that are being processed by the application, and with -c we specified the cpio format to use. When running in copy-out mode to create an archive, by default, the very old “bin” format is used. In GNU cpio, -c is basically a shorthand for -H newc (the -H option lets us specify the cpio format), which makes cpio use the new SVR4 portable format. Finally, we used the -0 option, which is the short form of --null. This last option is used to specify that the file names in the list are delimited by a null character.

The last thing we did was to redirect the output of cpio to a file, the archive we unsurprisingly named /tmp/archive.cpio (the file extension is completely arbitrary). As an alternative to this last redirection we could have used the cpio -F option (--file) with the file name as argument, to instruct the application to write to it instead of standard output.

What if we need to compress the archive on creation? We could simply use another pipe to pass cpio standard output to another application specifically designed to compress files, gzip for example. We would write:

$ find "$HOME" -depth -print0 | cpio -ocv0 | gzip -9 > /tmp/archive.cpio.gz

Extracting an archive (copy-in mode)

We just saw how to create a cpio archive; now let’s see how to extract one. The first thing we should say is that while in copy-out mode we need to specify the archive format to use (if we want something different from the default “bin”), on extraction the format is automatically recognized.

To make cpio run in copy-in mode we launch the utility with the -i option, which is the short form of --extract. When working in this mode, we need to pass the archive as the cpio standard input. Here is how we could extract the archive we previously created:

$ cpio -iv < /tmp/archive.cpio

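When running this command, the files are extracted with the names stored in the archive: relative names are created under the current working directory, while absolute names, like the ones produced by find "$HOME" in our example, are restored to their original location (GNU cpio provides the --no-absolute-filenames option to change this behavior). If a newer or same-age version of a file already exists on the filesystem, cpio will refuse to overwrite it, and will report an error similar to the following: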

<file> not created: newer or same age version exists

If we want to switch to another location before performing the actual extraction, all we have to do is to specify it with the -D option (short for --directory).



Just like working in copy-out mode, we can instruct cpio to read from a file other than standard input, by using the -F option, with the file name as argument.

What if the archive we want to extract is compressed? Supposing we want to extract the archive we compressed with gzip, we need to decompress the data first, then pipe it to cpio. In the case of a gzip-compressed file we can use the zcat utility to perform this task:

$ zcat /tmp/archive.cpio.gz | cpio -iv

Listing files contained in a cpio archive

Obtaining a list of the files contained in a cpio archive without having to extract it is quite simple. It is enough to run the application with the -t option, which is the short form of --list. For example, to list all the files in the archive we created in the first section of this tutorial, we would run:

$ cpio -t < /tmp/archive.cpio

The command produces a list of the files as they are stored in the archive. If we add the -v option to it, we obtain an output similar to that of ls -l, which includes file and directory permissions.

Conclusions

In this article we learned how to use the cpio utility on Linux. Although nowadays it is less used than tar, it’s important to know how it works, since it is still used for specific purposes, for example, to create rpm software packages. We saw how to create an archive, how to extract it, and finally how to list its content.


