Question

What system call does tar use to get the content of files that it uses to create an archive? I tried using strace to see the call, but it never calls open on the file.

$ echo "HelloWorld" > my_test_file 
$ strace -s250 -f -F tar -cf /dev/null my_test_file 2>&1 | grep my_test_file
execve("/bin/tar", ["tar", "-cf", "/dev/null", "my_test_file"], [/* 20 vars */]) = 0
newfstatat(AT_FDCWD, "my_test_file", {st_mode=S_IFREG|0664, st_size=11, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "my_test_file", {st_mode=S_IFREG|0664, st_size=11, ...}, AT_SYMLINK_NOFOLLOW) = 0

I am guessing the newfstatat is pretty much the same thing as fstatat (which "operates in exactly the same way as stat" except for some minor differences), so that probably isn't opening the file.
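That guess can be checked outside of tar. The sketch below is not tar's code; it assumes a POSIX system and uses Python's `os.stat` with `follow_symlinks=False`, which corresponds to an lstat-style lookup like the `AT_SYMLINK_NOFOLLOW` call in the trace. The helper name `size_without_opening` is made up for illustration. It shows that the size is available without ever opening the file.

```python
import os
import tempfile

def size_without_opening(path):
    """Return the file size from stat metadata; the file is never opened."""
    # follow_symlinks=False mirrors the AT_SYMLINK_NOFOLLOW flag in the trace.
    info = os.stat(path, follow_symlinks=False)
    return info.st_size

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my_test_file")
    with open(path, "wb") as f:
        f.write(b"HelloWorld\n")  # the same 11 bytes that echo "HelloWorld" writes
    reported_size = size_without_opening(path)

print(reported_size)
```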

My version of tar:

$ tar --version
tar (GNU tar) 1.26
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by John Gilmore and Jay Fenlason.

My operating system:

$ uname -a 
Linux myhostname 3.11.0-14-generic #21-Ubuntu SMP Tue Nov 12 17:04:55 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=13.10
DISTRIB_CODENAME=saucy
DISTRIB_DESCRIPTION="Ubuntu 13.10"

Solution 2

It seems that tar does not read the source file when the archive is written to /dev/null, or when the source file has zero size.

cd /tmp; echo test > testinput; diff -u <(strace -s250 -f tar -cf /dev/null testinput 2>&1) <(strace -s250 -f tar -cf testoutput testinput 2>&1) | less +'/open\("testinput"'

open is called on the input file only when the output is not /dev/null and the input file is not empty. Tested with GNU tar 1.20 and strace 4.5.17.

Other tips

Obviously, when you tar a file, the process running tar must read it. This is exactly what happens on my system. I created a 512-byte file from /dev/urandom and ran tar -cf file.tar file.xyz. After filtering out the noise from loading libraries into the process's image, these are the relevant lines that strace reports:

creat("file.tar", 0666)                 = 3

The output file of the tar command is created with a requested mode of 0666, that is, read/write permission for owner, group, and world. The kernel clears any bits that are set in the process umask (inherited from your shell), so the permissions that end up on disk are usually narrower. The new file's descriptor inside this process is 3.
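The interaction between the requested mode and the umask can be reproduced in isolation. This is a minimal sketch, assuming a POSIX system and a temporary directory without default ACLs; `create_with_mode` is a hypothetical helper, and the umask value 022 is only a common default, not something taken from the trace.

```python
import os
import stat
import tempfile

def create_with_mode(path, requested_mode, umask):
    """Create path the way creat(2) does and return the resulting permission bits."""
    old_umask = os.umask(umask)  # temporarily install the umask under test
    try:
        # creat(path, mode) is defined as open with O_WRONLY|O_CREAT|O_TRUNC.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, requested_mode)
        os.close(fd)
    finally:
        os.umask(old_umask)  # restore the caller's umask
    return stat.S_IMODE(os.stat(path).st_mode)

with tempfile.TemporaryDirectory() as tmp:
    perms = create_with_mode(os.path.join(tmp, "file.tar"), 0o666, 0o022)

print(oct(perms))
```

With a umask of 022 the write bits for group and world are cleared, so a request for 0666 produces a file with mode 0644.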

openat(AT_FDCWD, "file.xyz", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_CLOEXEC) = 4

Here, the file to be archived is opened and assigned the file descriptor 4.

fstat(4, {st_mode=S_IFREG|0644, st_size=512, ...}) = 0

tar calls fstat on the open file descriptor, probably to get the file's size and the other metadata (mode, owner, timestamps) that go into the archive.

read(4, "\225\243\263uG\320-\354!%\337\3376\311\210&\377T=aiO\10\203\375|y\304\231\203x."..., 512) = 512

We can see the file being actually read.

close(4)                                = 0

And properly closed.
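The open, fstat, read, close sequence above can be replayed with the same flags. This is only a sketch of the pattern seen in the trace, not tar's actual implementation; it assumes a POSIX system where these `os.O_*` constants exist, and `read_like_tar` is a name invented for the example.

```python
import os
import tempfile

def read_like_tar(path, block_size=512):
    """Open, fstat, read and close a file using the flags seen in the trace."""
    flags = os.O_RDONLY | os.O_NOCTTY | os.O_NONBLOCK | os.O_NOFOLLOW | os.O_CLOEXEC
    fd = os.open(path, flags)
    try:
        size = os.fstat(fd).st_size  # size taken from the open descriptor
        chunks = []
        remaining = size
        while remaining > 0:
            chunk = os.read(fd, min(block_size, remaining))
            if not chunk:  # file shrank while reading
                break
            chunks.append(chunk)
            remaining -= len(chunk)
    finally:
        os.close(fd)
    return size, b"".join(chunks)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.xyz")
    payload = os.urandom(512)  # stands in for the 512 bytes from /dev/urandom
    with open(path, "wb") as f:
        f.write(payload)
    size, data = read_like_tar(path)

print(size, data == payload)
```

Running this script under strace should show a similar group of calls for file.xyz, although the exact syscall names depend on the Python build and the C library.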

write(3, "file.xyz\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 10240) = 10240

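The file referenced by descriptor 3, our output file, is being written to. We can't see the contents of file.xyz in the write call because each member of a tar archive starts with a 512-byte header whose first field is the file name, and strace prints only the beginning of the buffer (note the trailing ...). The file data follows the header. The length of 10240 bytes is one full tar record: 20 blocks of 512 bytes, which is GNU tar's default blocking factor.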
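The layout of that buffer can be inspected without strace. The sketch below uses Python's standard `tarfile` module rather than GNU tar, so it illustrates the archive format and not tar's own code path; the payload bytes are arbitrary stand-ins for file.xyz.

```python
import io
import tarfile

payload = bytes(range(256)) * 2  # 512 arbitrary bytes standing in for file.xyz

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
    member = tarfile.TarInfo(name="file.xyz")
    member.size = len(payload)
    tar.addfile(member, io.BytesIO(payload))

archive = buf.getvalue()
name_field = archive[:100]      # first header field: NUL-padded member name
data_block = archive[512:1024]  # file data starts right after the 512-byte header

print(len(archive))
print(name_field.rstrip(b"\0"))
print(data_block == payload)
```

The archive is padded with zero blocks up to a full 10240-byte record, which matches the single write of 10240 bytes in the trace, and the name at offset 0 matches the string that strace displays.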

close(3)                                = 0

Now the output file is closed, and the process then exits (not shown here).

Interestingly, I first created an empty file with touch and tried to tar it. It seems that tar checks whether the file is empty and, if so, skips reading it, since there is no data to put in the archive. newfstatat returns the size, which tar probably uses to make this decision.

However, you should really read the source to see what actually happens. It is possible, for example, that much larger files are mmap'ed into the process and read that way, while smaller files are simply read with read.

Licensed under: CC-BY-SA with attribution