Errno::EMFILE with Paperclip

In a commercial project, I recently came across an unusual bug which I had never seen before. What better way to document it for future reference, and to record what not to do in case it ever crops up again, than to write it up here?

The issue lies in a Rails application which uses Paperclip to store attachments in the database. The model in question uses Paperclip as documented, without any special configuration.

One fine day, the application suddenly stopped working after a bulk upload of attachments. The error logs kept reporting the following error:

Errno::EMFILE: Too many open files

After digging through the source code, I still could not work out where the issue lay. The stack trace did not point to where exactly the error originated. The only clue I had was Too many open files. With that, I started to inspect the model which handled file uploads more closely.

There are several helper methods within this model which deserialize the binary blob from the database and call Paperclip.io_adapters.for to extract metadata about the attachment, such as width, height and content type.

Paperclip.io_adapters.for in turn builds an adapter, which calls its cache_current_values method, which in turn invokes copy_to_tempfile with the target file. The method is shown below:

def copy_to_tempfile(source)
  if source.staged?
    FileUtils.cp(source.staged_path(@style), destination.path)
  else
    source.copy_to_local_file(@style, destination.path)
  end
  destination
end

The method actually takes the binary blob deserialized from the database and saves it as a tempfile in the /tmp directory. The issue here is that nothing closes or deletes these temp files once processing is completed. In the above case, the model creates a new tempfile object every time it calls Paperclip.io_adapters.for. During a bulk upload, the temp directory fills up and the process keeps accumulating open file handles until it hits its limit, hence Too many open files.
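To illustrate the effect, here is a minimal sketch that uses only Ruby's standard Tempfile class rather than Paperclip itself, and assumes a Unix-like system. The leaked array is a hypothetical stand-in for tempfiles created by repeated adapter calls and never closed. Each one stays open, and stays on disk, until it is closed explicitly or garbage collected.

```ruby
require "tempfile"

# Soft limit on open file descriptors for this process (Unix-like systems).
# Going past it is what raises Errno::EMFILE.
soft_limit, _hard_limit = Process.getrlimit(Process::RLIMIT_NOFILE)
puts "open file limit: #{soft_limit}"

# Create a handful of tempfiles without closing them.
leaked = Array.new(5) { Tempfile.new("attachment") }

still_open = leaked.count { |f| !f.closed? }
on_disk = leaked.count { |f| File.exist?(f.path) }
puts "still open: #{still_open}, on disk: #{on_disk}"

# Passing true to close releases the descriptor and deletes the file.
paths = leaked.map(&:path)
leaked.each { |f| f.close(true) }
remaining = paths.count { |p| File.exist?(p) }
puts "remaining after cleanup: #{remaining}"
```

Five files is harmless, but a bulk upload that repeats this thousands of times within one process can plausibly reach the limit printed on the first line.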

To resolve this issue, we need to be able to unlink or delete the tempfile after each call to the adapter. The problem here is that the @tempfile instance variable is not in the adapter’s public API.

I tried to use refinements on the Paperclip::AbstractAdapter class in order to make @tempfile readable, but it did not work in the context of a Rails app due to scope issues. With a little bit of metaprogramming within an initializer, I came up with the following:

# Reopen the adapter base class and expose @tempfile through a reader
Paperclip::AbstractAdapter.class_eval do
  attr_reader :tempfile
end
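The same reopening technique can be tried in isolation, without Rails or Paperclip loaded. The sketch below uses a hypothetical FakeAdapter class as a stand-in for the real adapter; the only assumption is that the class stores its tempfile in an instance variable named @tempfile.

```ruby
require "tempfile"

# Hypothetical stand-in for an adapter that keeps its tempfile private.
class FakeAdapter
  def initialize
    @tempfile = Tempfile.new("fake-adapter")
  end
end

# Reopen the class and expose the instance variable through a reader.
FakeAdapter.class_eval do
  attr_reader :tempfile
end

adapter = FakeAdapter.new
exposed = adapter.respond_to?(:tempfile) && adapter.tempfile.is_a?(Tempfile)
puts "tempfile exposed: #{exposed}"

# Clean up the stand-in's tempfile as well.
adapter.tempfile.close(true)
```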

Now, I am able to access the tempfile and close it once processing is done like so:

adapter = Paperclip.io_adapters.for(file)
# get the image width, height etc.
geometry = Paperclip::Geometry.from_file(adapter)
width = geometry.width.to_i
height = geometry.height.to_i
# close the adapter's tempfile and remove it from disk
adapter.tempfile.close(true) if adapter.tempfile

We first close and then unlink (delete) the tempfile. Tempfile#close does both in one call when you pass true to it.
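One thing the snippet above does not cover is an exception raised between creating the tempfile and closing it. A possible way to guard against that is an ensure block. The sketch below uses a plain Tempfile and a hypothetical with_tempfile helper (not part of Paperclip) to show the pattern; it is an illustration under those assumptions, not the fix that went into the project.

```ruby
require "tempfile"

# Hypothetical helper: yields a tempfile and always closes and unlinks it,
# even when the block raises.
def with_tempfile(basename = "attachment")
  tempfile = Tempfile.new(basename)
  yield tempfile
ensure
  tempfile.close(true) if tempfile
end

saved_path = nil
size = with_tempfile do |tf|
  saved_path = tf.path
  tf.write("binary blob")
  tf.flush
  File.size(tf.path)
end

puts "bytes written: #{size}"
puts "file still exists: #{File.exist?(saved_path)}"
```

Ruby's standard library offers something similar out of the box: Tempfile.create, when given a block, closes and removes the file once the block returns.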

This has been a really interesting bug to track down and resolve, and I learnt a lot about how Paperclip works under the hood. It also overturned my assumption that the gem would undertake all the cleanup for me automatically. If anything, I learnt not to take file IO in Ruby for granted, and to always make sure that any file handles I open are closed properly every time.

Happy Hacking!!