I’ve been profiling some of my background jobs, and I discovered that an S3 download step is taking up a large percentage of the time (between 20% and 50%). The same files are being used repeatedly, so I would like to start caching them locally.
It would be easy to set up an ephemeral cache inside the Docker containers, but that would be blown away every time a container restarts. Also, each container would need to maintain its own cache, so this would not be an efficient use of disk space.
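For context, the caching I have in mind is roughly the following (a minimal sketch; `fetch`, `cache_dir`, and the key-to-filename mapping are hypothetical placeholders, with `fetch` standing in for the real S3 download call, e.g. boto3's `download_file`):

```python
import os

def cached_download(key, cache_dir, fetch):
    """Return a local path for `key`, calling `fetch(key, dest)` only on a miss.

    `fetch` is a stand-in for the actual S3 download; `cache_dir` would point
    at whatever volume ends up backing the cache.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Flatten the S3 key into a single filename (illustrative, not robust).
    dest = os.path.join(cache_dir, key.replace("/", "_"))
    if not os.path.exists(dest):
        fetch(key, dest)  # cache miss: download once, reuse afterwards
    return dest
```

On a warm cache this skips the download entirely, which is where the 20–50% would come back.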
I’m reading the EFS FAQ:
> Q: When should I use Amazon EFS vs. Amazon S3 vs. Amazon Elastic Block Store (EBS)?
>
> Amazon EBS can deliver performance for workloads that require the lowest-latency access to data from a single EC2 instance.
As far as I understand, Convox uses EFS for any volumes defined in convox.yml. (At least, for version 2 racks.)
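For reference, this is the kind of volume definition I mean (a sketch of the v2 convox.yml syntax as I understand it; the service name and path are made up):

```yaml
services:
  worker:
    volumes:
      - /var/cache/s3   # in a v2 rack this becomes an EFS-backed mount, as I understand it
```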
I think it might be better if I could mount a local directory from each EC2 instance (backed by its EBS volume), because the performance would be much better. All of the containers on a single instance would then share the same volume. For my case, it’s totally fine if a container on instance A doesn’t see the same files as a container on instance B.
One advantage of an EFS volume is that the file cache would be shared across all of my EC2 instances. I might need to be careful about lock contention, since every process would be accessing and updating the same cache. (But I probably won’t need to worry about that for a long time.)
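If I do end up with a shared cache, I imagine sidestepping most of the contention by writing each file via a temp file plus an atomic rename, something like this sketch (function and path names are hypothetical):

```python
import os
import tempfile

def atomic_write(dest, data):
    """Write `data` to `dest` so readers never see a partially written file.

    Writing to a temp file in the same directory and then os.replace()-ing it
    into place is atomic on POSIX filesystems, so concurrent readers see
    either the old file, the complete new file, or no file -- never a torn
    write. (Two writers may still race, but the loser just overwrites with
    identical content, which is harmless for an immutable S3 cache.)
    """
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(dest) or ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, dest)  # atomic within one filesystem
    except Exception:
        os.unlink(tmp)  # clean up the temp file on failure
        raise
```

Since the cached objects are immutable downloads, this avoids needing explicit locks for the common read path.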
Is there something I can modify in convox.yml, or maybe in a custom AMI?