CarrierWave Better Storage

Written by Rashmi Yadav on

I use the CarrierWave gem for file uploads, and I ran into a storage problem once my application reached a certain number of uploaded files.

I have a User model with an avatar field for uploading a user's photo.

As my application grew, the number of users in the system kept increasing.

When the user count in the database exceeded 32768, I started seeing a "Too many links" error in the logs.

Something like this:

Errno::EMLINK - Too many links - /home/sites/apps/binaryfunction/releases/20140110162610/public/uploads/user/avatar/42435

When I counted the entries in that directory by running the following commands:

$ cd /home/sites/apps/binaryfunction/releases/20140110162610/public/uploads/user/avatar

$ ls | wc -l

I found that the number of entries inside my avatar directory had reached 32768.

What is this error, what does this number mean, and how can we fix it?

I searched the internet and found that a filesystem limits how many subdirectories a single directory can hold. Each subdirectory counts as a link to its parent directory, which is why the error says "Too many links".

The limit depends on which filesystem your distribution uses. A newer desktop distribution (such as a recent version of Fedora) probably uses ext4. Something older most likely uses ext3.

Each filesystem has its own limit on the number of subdirectories per directory, like

ext2: -> 32768

ext3: -> 31998

ext4: -> 64000

Once a directory reaches this number, the filesystem refuses to create more subdirectories in it and starts complaining about "Too many links".
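To see how close a directory is to its limit, a small Ruby check can count its immediate subdirectories. This is only a sketch: `subdirectory_count` is a hypothetical helper name, and the demo runs against a temporary directory rather than a real uploads folder. `File.stat(dir).nlink` reports the link count the filesystem tracks. On ext filesystems this is typically the number of subdirectories plus 2, but other filesystems report it differently, so treat that value as a hint only.

```ruby
require "tmpdir"
require "fileutils"

# Count only the immediate subdirectories of +dir+ (regular files are ignored).
def subdirectory_count(dir)
  Dir.children(dir).count { |entry| File.directory?(File.join(dir, entry)) }
end

Dir.mktmpdir do |root|
  # Three subdirectories and one regular file, mimicking a tiny uploads folder.
  3.times { |i| FileUtils.mkdir_p(File.join(root, i.to_s)) }
  FileUtils.touch(File.join(root, "note.txt"))

  puts subdirectory_count(root) # prints 3
  puts File.stat(root).nlink    # link count as reported by the filesystem
end
```

Pointing `subdirectory_count` at the real avatar directory gives the same number as `ls | wc -l` when that directory contains only subdirectories.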

CarrierWave gives us a default storage path when we generate an uploader.

A typical AvatarUploader looks like this:

class AvatarUploader < CarrierWave::Uploader::Base
  def store_dir
    "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
  end
end

According to this, every user's avatar gets its own directory under uploads/user/avatar/.

So for the user with id 1, the files go into uploads/user/avatar/1/.

This works fine until you reach the maximum number of subdirectories (e.g. 32768 for ext2).

After that, new users' avatars cannot be saved, because the filesystem doesn't allow any more directories inside uploads/user/avatar/.

No worries, we have a solution here... yeahhhhh :-)

Paperclip has solved this problem by giving your files a better storage path by default.

You may have noticed that Paperclip generates a different directory structure for your files.

An example of a file storage path in Paperclip:

users/avatar/000/000/013/small/1

You can see the nested directories “000/000/013” here.

They call this id_partition.

But CarrierWave does not have this nested directory structure by default.

Let's implement this in a CarrierWave uploader.

class AvatarUploader < CarrierWave::Uploader::Base
  def store_dir
    "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{id_partition}/#{model.id}"
  end

  private

  def id_partition
    case id = model.id
    when Integer
      ("%09d" % id).scan(/\d{3}/).join("/")
    # can add more checks if you have another id type (e.g. a string for MongoDB)
    else
      nil
    end
  end
end

Using the id_partition above, our directories will be nested like

"000/000/001"

When the id reaches 1000, it will create another nested directory like

"000/001/000"
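The effect is easier to see outside of CarrierWave. The sketch below copies the integer branch of id_partition into a standalone method and prints the nested path for a few sample ids. `max_children` is a hypothetical helper added only for this illustration: it groups ids by the parent of their partition directory to show how many subdirectories the fullest directory ends up with.

```ruby
# Standalone copy of the integer branch of id_partition.
def id_partition(id)
  ("%09d" % id).scan(/\d{3}/).join("/")
end

# Largest number of partition directories that share one parent directory.
def max_children(ids)
  ids.group_by { |id| File.dirname(id_partition(id)) }.values.map(&:size).max
end

[1, 13, 1000, 42435].each do |id|
  puts "#{id} -> #{id_partition(id)}/#{id}"
end
# 1 -> 000/000/001/1
# 13 -> 000/000/013/13
# 1000 -> 000/001/000/1000
# 42435 -> 000/042/435/42435

puts max_children(1..50_000) # prints 1000
```

Even with 50,000 ids, no directory holds more than 1000 subdirectories, which is far below the limits listed earlier.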

So we have solved this problem for our AvatarUploader. But what if we have many uploaders in the system and want this in all of them?

I would suggest creating a BaseUploader that contains this method, so that all other uploaders can simply inherit from BaseUploader.

class BaseUploader < CarrierWave::Uploader::Base
  def store_dir
    "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{id_partition}/#{model.id}"
  end

  private

  def id_partition
    case id = model.id
    when Integer
      ("%09d" % id).scan(/\d{3}/).join("/")
    else
      nil
    end
  end
end

And AvatarUploader will become

class AvatarUploader < BaseUploader
end
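If some of your models use string ids (for example MongoDB ids converted with to_s), the integer-only partition returns nil for them. One possible extension, shown here as a standalone sketch, takes the first nine characters of the string in three groups of three. The method name `id_partition_for` and the sample id are made up for this example, and you should check that the leading characters of your ids vary enough to spread records across directories.

```ruby
# Partition for integer and string ids; any other id type yields nil.
def id_partition_for(id)
  case id
  when Integer
    ("%09d" % id).scan(/\d{3}/).join("/")
  when String
    # First three groups of three characters, e.g. "507/f1f/77b".
    id.scan(/.{3}/).first(3).join("/")
  end
end

puts id_partition_for(13)                         # prints 000/000/013
puts id_partition_for("507f1f77bcf86cd799439011") # prints 507/f1f/77b
```

Strings shorter than three characters produce an empty partition, so such ids would need their own handling.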

To verify that id_partition works, you can write a simple test, something like:

require 'spec_helper'

describe BaseUploader do
  describe '#store_dir' do
    # Build an uploader whose model is a double with the given id.
    def uploader_for(model_id)
      uploader = BaseUploader.new
      allow(uploader).to receive(:model).and_return(double(id: model_id))
      uploader
    end

    it 'includes the id_partition' do
      model_id = 123

      expect(uploader_for(model_id).store_dir).to match(/000\/000\/#{model_id}\/#{model_id}/)
    end

    it 'uses another id_partition when the id is higher' do
      expect(uploader_for(1001).store_dir).to match(/000\/001\/001\//)
    end
  end
end

By doing this you can easily solve the "Too many links" problem with CarrierWave and avoid having too many entries inside one folder.

I know it is a bit hard to migrate all your uploads once you have reached this number of files, so it is better to have this in place from the start and avoid problems in the future.
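If you do have to migrate existing uploads, the move itself can be scripted. The following is a rough sketch, not a tested migration tool: `migrate_to_partitions` is a hypothetical helper, it assumes the old layout has one directory per integer id directly under the given root, and it assumes ids stay below 100,000,000 so that an old id directory can never be confused with a three-digit partition directory. The demo runs against a temporary directory.

```ruby
require "fileutils"
require "tmpdir"

def id_partition(id)
  ("%09d" % id).scan(/\d{3}/).join("/")
end

# Move every "<root>/<id>" directory to "<root>/<partition>/<id>".
# Returns the sorted list of ids that were moved.
def migrate_to_partitions(root)
  moved = []
  Dir.children(root).each do |entry|
    old_path = File.join(root, entry)
    # Old id directories have no leading zero; partition directories do.
    next unless entry.match?(/\A[1-9]\d*\z/) && File.directory?(old_path)

    new_parent = File.join(root, id_partition(entry.to_i))
    FileUtils.mkdir_p(new_parent)
    FileUtils.mv(old_path, File.join(new_parent, entry))
    moved << entry.to_i
  end
  moved.sort
end

Dir.mktmpdir do |root|
  [1, 13, 1001].each do |id|
    FileUtils.mkdir_p(File.join(root, id.to_s))
    FileUtils.touch(File.join(root, id.to_s, "photo.png"))
  end

  p migrate_to_partitions(root)                                      # prints [1, 13, 1001]
  puts File.exist?(File.join(root, "000/001/001/1001/photo.png"))    # prints true
end
```

Try it on a copy of your uploads first. If the directory is already at its limit, creating the first partition directory will fail until one of the old directories has been moved out.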

I hope this article will be helpful to you.

There are some other articles where you can find more about this.

paperclip id_partition

mkdir(): Too Many Links – What It Is and How to Fix