Atomically rewrite the content of a file
Tested on:
- Debian (Etch, Lenny, Squeeze)
- Ubuntu (Lucid, Maverick, Natty, Precise, Trusty)
Objective
To atomically rewrite the content of a file
Background
An atomic operation is one that changes a system from one state to another without visibly passing through any intermediate states. Atomicity is desirable when altering the content of a file because:
- The process performing the alteration may fail or be stopped, leaving the file in an incomplete or inconsistent state.
- Another process may try to read the file while the alteration is in progress.
The method described here is applicable when the whole file is rewritten during each update. There is no fully equivalent method for making incremental changes to a file; however, it is possible to achieve a similar effect by using a database-like file format.
Scenario
Suppose that you are writing an editor for an XML-based data format such as SVG. XML is not amenable to incremental changes, so you have found it necessary to rewrite the whole file each time it is saved.
If the editor were to directly overwrite the copy on disc then there would be a window of risk during which the file could be left in an unusable state. You wish to avoid this by performing the update atomically.
Method
Overview
For an update to be atomic the new copy of the file must be written in full before the old copy is altered. This can be done using a temporary file. There are four steps to the process:
- Choose a name for the temporary file.
- Write the new content to a temporary file.
- Flush the new content to disc.
- Move the temporary file onto the original.
The example code below is written in C, but uses low-level POSIX functions to provide a degree of language-neutrality.
A mechanism is needed for handling errors. The code assumes that there is a function called die provided for this purpose, which takes the same arguments as printf and does not return.
The code also assumes that there are no signal handlers that could cause system calls to be interrupted. If this is not the case then you will need to handle EINTR by some means, for example using the TEMP_FAILURE_RETRY macro provided by the GNU C Library.
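If you would rather not depend on the GNU C Library, the same effect can be had with an explicit loop. The following is a minimal sketch for fsync; the name fsync_retry is hypothetical and is not used elsewhere in this article:

```c
#include <errno.h>
#include <unistd.h>

/* Call fsync, retrying for as long as it is interrupted by a signal.
 * Returns 0 on success, or -1 with errno set to something other
 * than EINTR. */
int fsync_retry(int fd)
{
    int result;
    do {
        result = fsync(fd);
    } while (result == -1 && errno == EINTR);
    return result;
}
```

With the GNU C Library the equivalent expression would be TEMP_FAILURE_RETRY(fsync(fd)), provided that _GNU_SOURCE is defined before unistd.h is included. Take care before applying either form to close: POSIX leaves the state of the file descriptor unspecified if close is interrupted, so retrying it is not necessarily safe.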
Choose a name for the temporary file
There is a standard library function for choosing temporary filenames called tmpnam, but it is not suitable for use in this case because it might place the file on a different filesystem from where it ultimately needs to reside. This matters because files can be moved about within a given filesystem very quickly and efficiently, whereas movement between filesystems requires copying.
The simplest way to ensure that the temporary file is placed on the appropriate filesystem is to create it in the same directory as the file it is intended to replace. Typically this is done by appending a tilde to the pathname, so foo.txt would become foo.txt~:
    char tmp_pathname[strlen(pathname)+2];
    snprintf(tmp_pathname,sizeof(tmp_pathname),"%s~",pathname);
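The fragment above uses a variable-length array, which is an optional feature as of C11 and places the buffer on the stack. If either of those is a concern, a heap-allocated variant is possible. This is a sketch only, and the helper name make_tmp_pathname is hypothetical:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of pathname with a tilde appended,
 * or NULL if memory could not be allocated. The caller must free
 * the result. */
char *make_tmp_pathname(const char *pathname)
{
    /* One extra byte for the tilde, one for the terminating null. */
    size_t size = strlen(pathname) + 2;
    char *tmp_pathname = malloc(size);
    if (tmp_pathname != NULL) {
        snprintf(tmp_pathname, size, "%s~", pathname);
    }
    return tmp_pathname;
}
```

For example, make_tmp_pathname("foo.txt") yields the string "foo.txt~".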
Write the new content to a temporary file
The temporary file may or may not exist prior to being opened. The best way to allow for it existing is to make a speculative call to unlink before calling open:
    if (unlink(tmp_pathname)==-1) {
        if (errno!=ENOENT) {
            die("failed to remove existing temporary file (errno=%d)",errno);
        }
    }
(An alternative would be to rely on passing O_TRUNC to open, however that would leave an existing temporary file with its previous ownership and access mode, which are not necessarily the appropriate ones.)
The temporary file can now be created:
    mode_t default_mode=S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH;
    int fd=open(tmp_pathname,O_RDWR|O_CREAT|O_TRUNC,default_mode);
    if (fd==-1) {
        die("failed to open new file for writing (errno=%d)",errno);
    }
You will need to decide how you want the access mode to be set:
- The value specified above (which corresponds to 666 in octal) is appropriate when a new file is created by an application program. It will be pared back by the umask (probably to 644 or 664), thereby giving the user control over the final value.
- For an existing file you may want to preserve the original access mode.
- For files written by daemons or administrative programs you may want to force a specific mode.
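If you do not want the mode to be influenced by the umask then make a separate call to chmod. It is not generally possible to preserve the ownership of the file, and for this reason, if you do preserve the file mode then you should take care to consider the security implications. For example, a mode that grants write access to the group has a different effect if the new file ends up belonging to a different group.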
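If you decide to preserve the access mode of an existing file, one way to do so is to read it with stat and apply it to the temporary file with fchmod, which is not affected by the umask. The following is a sketch under those assumptions; the function name copy_mode is hypothetical:

```c
#include <errno.h>
#include <sys/stat.h>

/* Copy the permission bits of the file at pathname onto the open
 * descriptor fd. If pathname does not exist yet, the mode already
 * chosen by open and the umask is left alone.
 * Returns 0 on success, -1 on error with errno set. */
int copy_mode(const char *pathname, int fd)
{
    struct stat st;
    if (stat(pathname, &st) == -1) {
        return (errno == ENOENT) ? 0 : -1;
    }
    /* Keep only the read/write/execute bits. The set-user-ID,
     * set-group-ID and sticky bits are deliberately dropped,
     * because the ownership of the new file may differ. */
    return fchmod(fd, st.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO));
}
```

It would be called after the temporary file has been opened and before it is moved into place, for example as copy_mode(pathname,fd).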
There is little to be said about actually writing the content, except to stress the need for error detection. Do not close the file handle yet, as it will need to be open for the next step.
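As an illustration of the error detection required, note that write may return -1, or may succeed but write fewer bytes than were requested. A minimal sketch that allows for both follows; the name write_all is hypothetical, and like the rest of the code it assumes that system calls are not interrupted by signals:

```c
#include <stddef.h>
#include <unistd.h>

/* Write exactly count bytes from buffer to fd, continuing after
 * any short writes. Returns 0 on success, or -1 with errno set by
 * the call to write that failed. */
int write_all(int fd, const void *buffer, size_t count)
{
    const char *next = buffer;
    while (count > 0) {
        ssize_t written = write(fd, next, count);
        if (written == -1) {
            return -1;
        }
        next += written;
        count -= (size_t)written;
    }
    return 0;
}
```

A caller following the conventions of this article would invoke die if write_all returned -1, leaving the original file untouched.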
Flush the new content to disc
Once the file content has been successfully handed over to a POSIX-compatible operating system then it is safe against failure of the process that wrote it, but not against failure of the system as a whole. This is because the operating system has broad discretion to decide when data is physically written to disc and in what order. One particular sequence of events that you will want to avoid is as follows:
- The temporary file is created, and its creation committed to disc.
- The content of the temporary file is written, but not immediately committed to disc.
- The temporary file is moved onto the original, and its movement committed to disc.
- There is a system crash or a power failure.
The result would be loss of both old and new content: the original name would now refer to the new file, whose content never reached the disc. It is unlikely that the old content would have been physically overwritten, but it would no longer exist as a file within the filesystem, so you would need to resort to low-level data recovery.
You can prevent this from happening by calling fsync immediately before closing the file:
    if (fsync(fd)==-1) {
        die("failed to flush new file content to disc (errno=%d)",errno);
    }
Once the content has been flushed, the file can be closed:
    if (close(fd)==-1) {
        die("failed to close new file (errno=%d)",errno);
    }
A common misconception is that calling close will necessarily perform an implicit fsync. This is not a requirement of ISO C, nor of POSIX. Systems have a strong incentive not to synchronise unnecessarily because of the effect it would have on performance.
It is true that some implementations of fsync perform poorly. This may be a consideration when deciding whether the cost of performing an atomic update is justified, but does not change the fact that an fsync is required if you wish to ensure atomicity.
Move the temporary file onto the original
One way to move the file into place would be to unlink the old copy then rename the new copy. This would work, but would not be atomic. The file content would not be lost if there were a failure, but it could be left in a file with the wrong name. Even if there were no failure, from the perspective of other processes the file would briefly disappear from the filesystem.
Fortunately there is no need to make an explicit call to unlink because the POSIX rename function will happily unlink any file that stands in its way. Better still, the specification promises that the operation will occur atomically so far as other processes are concerned. This behaviour is ideal for the task at hand:
    if (rename(tmp_pathname,pathname)==-1) {
        die("failed to move new file to final location (errno=%d)",errno);
    }
Note that ISO C does not guarantee that rename will atomically replace an existing file. If you cannot rely on POSIX then the best you can do is issue a speculative call to rename, in the hope that it has POSIX-compatible behaviour, but fall back to a non-atomic remove then rename if it does not.
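Putting the four steps together, the whole method might be sketched as a single function. This version differs from the fragments above in that it returns -1 instead of calling die, so that the caller can decide how to report the failure, and in that it tries to remove the temporary file when something goes wrong. The name atomic_rewrite is hypothetical, and as before the code assumes a POSIX system on which system calls are not interrupted by signals:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Atomically replace the content of pathname with size bytes taken
 * from data. Returns 0 on success, or -1 with errno describing the
 * first failure. */
int atomic_rewrite(const char *pathname, const void *data, size_t size)
{
    /* Step 1: choose a temporary name in the same directory. */
    char tmp_pathname[strlen(pathname) + 2];
    snprintf(tmp_pathname, sizeof(tmp_pathname), "%s~", pathname);

    /* Step 2: remove any stale temporary file, then write a new one. */
    if (unlink(tmp_pathname) == -1 && errno != ENOENT) {
        return -1;
    }
    mode_t default_mode =
        S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH;
    int fd = open(tmp_pathname, O_RDWR | O_CREAT | O_TRUNC, default_mode);
    if (fd == -1) {
        return -1;
    }
    const char *next = data;
    size_t remaining = size;
    while (remaining > 0) {
        ssize_t written = write(fd, next, remaining);
        if (written == -1) {
            goto fail;
        }
        next += written;
        remaining -= (size_t)written;
    }

    /* Step 3: flush the content to disc before the file is moved. */
    if (fsync(fd) == -1) {
        goto fail;
    }
    if (close(fd) == -1) {
        fd = -1;   /* do not attempt to close it a second time */
        goto fail;
    }
    fd = -1;

    /* Step 4: move the temporary file onto the original. */
    if (rename(tmp_pathname, pathname) == -1) {
        goto fail;
    }
    return 0;

fail:
    {
        int saved_errno = errno;   /* cleanup must not hide the cause */
        if (fd != -1) {
            close(fd);
        }
        unlink(tmp_pathname);
        errno = saved_errno;
    }
    return -1;
}
```

An editor saving a document held in memory could then call atomic_rewrite(pathname,buffer,length) and report an error to the user if the result is -1, knowing that the previous copy of the file is still intact in that case.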
Further reading
- The Open Group, fsync, Base Specifications Issue 6
- The Open Group, close, Base Specifications Issue 6
- Stewart Smith, Eat My Data: How everybody gets file I/O wrong
- Theodore Ts'o, Don't fear the fsync!, 15th March 2009
- Valerie Aurora, Don’t Panic – fsync(), ext3/4, and your data, 16th April 2009
Tags: posix