Temporary files

Here’s a little tip if you’re writing PHP code for a Linux-only environment and need a temp file to write something to.

You can unlink() the file right after you fopen() it and keep using the handle to your heart’s content. Linux will keep the “file” in existence for you, so you can churn through it with ftruncate(), rewind(), fflush() and the like. When you close the file pointer, Linux reclaims the space you used. Working this way, you can be sure your code won’t accidentally leave these files lying around if, for example, the process is terminated.
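
In code, the pattern looks roughly like this (a minimal sketch; the prefix and the data written are just for illustration):

<code>
<?php
// Create a real file in the temp dir, then immediately remove its
// directory entry. The open handle keeps the data alive until fclose()
// (or until the process dies), so nothing is left lying around on disk.
$path = tempnam(sys_get_temp_dir(), 'scratch');
$fp   = fopen($path, 'r+');
unlink($path);

fwrite($fp, "some intermediate data\n");
rewind($fp);                 // back to the start to read it again
echo fgets($fp);             // "some intermediate data"

ftruncate($fp, 0);           // wipe it and start over
rewind($fp);
fflush($fp);

fclose($fp);                 // the space is reclaimed here at the latest
</code>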

Just recently I used this trick on two temp files in parallel, moving data back and forth between them and filtering things in and out on each pass. Handy trick…
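
That ping-pong looks something like the sketch below (the filters and sample input are made up; only the file handling mirrors the trick above, and the arrow functions assume PHP 7.4+):

<code>
<?php
// One unlinked scratch file per side of the ping-pong.
function scratch_file() {
    $path = tempnam(sys_get_temp_dir(), 'pass');
    $fp = fopen($path, 'r+');
    unlink($path);           // same trick: no directory entry left behind
    return $fp;
}

// Hypothetical per-pass line filters.
$filters = [
    fn($line) => trim($line) !== '',                  // drop blank lines
    fn($line) => stripos($line, 'error') === false,   // drop error lines
];

$src = scratch_file();
$dst = scratch_file();
fwrite($src, "one\n\nerror: two\nthree\n");           // seed the first pass

foreach ($filters as $keep) {
    rewind($src);
    ftruncate($dst, 0);      // start each pass with an empty destination
    rewind($dst);
    while (($line = fgets($src)) !== false) {
        if ($keep($line)) {
            fwrite($dst, $line);
        }
    }
    [$src, $dst] = [$dst, $src];   // swap roles for the next pass
}

rewind($src);                // $src now holds the surviving lines
echo stream_get_contents($src);
</code>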

5 thoughts on “Temporary files”

  1. Cute 🙂 Do you know why this works? Is this because of some underlying Linux behavior/spec?

    If you just need a file pointer (rather than a file path), you can also use <code>fopen( 'php://memory', 'r+' )</code> or <code>fopen( 'php://temp', 'r+' )</code> (see http://php.net/manual/en/wrappers.php.php).

    That'd reduce I/O (at the obvious expense of memory) and make it harder for other processes to read your temp files (if that's a concern).

    • The reason this works is that *everything* on a Linux server is a file, and Linux doesn't actually destroy a file (or rather, mark it as overwritable) until all references to it are gone. So if you make a file, open it in two places, and then remove the file's entry from the filesystem, the operating system still sees the two open handles and will continue to treat the file as if it still existed on the filesystem. You can see similar behavior with hard links: you can hard link a file to as many different locations on the same volume as you want, and the file isn't actually deleted until the LAST reference has been removed from the filesystem.

      The reason I don't want to use memory in some cases is that I might be working with hundreds of megabytes of data or more that I don't want my processes actually allocating. I'm not sure about the behavior of php://temp, so I can't yet comment on it.

      Another nice thing is that by opening and then unlinking the files, I can be sure that when my process terminates (by accident or on purpose) the space will be relinquished to the filesystem again (an advantage over using still-linked temp files).

  2. Huh. So if I have a file open (say I'm tailing a text file), then delete it, the file is still on disk until I stop the tail?

    Ah – big files. Yeah – don't use php://memory 🙂

    php://temp is neat because you can set a threshold: up to a certain size it will stay in memory, and after that it will transfer to disk. So if you don't yet know whether your file is going to be small or big, it might be useful; but if you know your file is going to be big, it's a waste.

    I don't know how the unlinking works. I suspect it uses tmpfile.

  3. In one terminal do:

    <code>touch delme; while [ true ]; do echo 'ah'; sleep 1; done > delme</code>

    In a second do:

    <code>tail -f delme</code>

    Later, in a third do:

    <code>rm delme</code>

    The tail keeps on outputting data even after the file has been deleted, and you can still see it open with lsof.
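
For the php://temp threshold mentioned in the comments, here's a minimal sketch (the 5 MB limit is arbitrary; if you leave the maxmemory part off, the PHP manual says the switch to disk happens at 2 MB):

<code>
<?php
// Data stays in memory until it passes the limit, then PHP quietly moves
// it to a real temp file. Handy when you don't yet know how big it will get.
$limit = 5 * 1024 * 1024;
$fp = fopen("php://temp/maxmemory:$limit", 'r+');

fwrite($fp, str_repeat('x', 1024));            // still comfortably in memory
rewind($fp);
echo strlen(stream_get_contents($fp)), "\n";   // 1024

fclose($fp);                                   // memory or temp file, gone either way
</code>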
