Recently, I needed a simple, secure and elegant solution to resize more than a million images to several dimensions.
Nothing more than that: resize to different sizes, securely and fast.
Those images are meant to serve as static files for an e-commerce application.
So what options do we have?
- pre-process the images periodically with a cron job (because we are getting new images every day)
- resize the images on the fly on first request
Because of its flexibility, the decision was made to use the second option.
This means the first user who requests a resized version actually triggers the transformation job.
All subsequent requests get the resized version directly from disk, delivered by the HTTP server.
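The request flow just described can be sketched in a few lines of Python. This is only an illustration under assumptions, not the setup used here: `serve_resized`, `cache_root` and the `render` callback are hypothetical names, and the actual resizing is left to whatever `render` does.

```python
import os
from pathlib import Path
from typing import Callable, Tuple


def serve_resized(cache_root: Path, key: str,
                  render: Callable[[], bytes]) -> Tuple[bytes, bool]:
    """Return (image_bytes, was_cached) for the resized image stored under key."""
    target = cache_root / key
    if target.is_file():
        # Every request after the first one ends up here: a plain file read.
        return target.read_bytes(), True

    # First request: do the expensive resize exactly once ...
    data = render()
    # ... and persist it. Writing to a temp file and renaming it avoids
    # exposing a half-written image to a concurrent request.
    target.parent.mkdir(parents=True, exist_ok=True)
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_bytes(data)
    os.replace(tmp, target)
    return data, False
```

In an nginx-based setup, the web server would take over the "file already exists" branch and the resize process would play the part of `render`.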
I researched a lot on Google, Stack Overflow and GitHub to find a useful solution.
The list of names was long.
There is also a pretty long article from the authors of imgproxy with comparisons of different solutions.
However, none of those solutions did what I wanted:
- resize an image only once and then persist it to disk (or, later, to something like S3)
- serve the results with correct http caching headers
Most solutions either do full on-the-fly conversions without any persistence to disk or, like most nginx-based solutions, work with a limited cache size.
Basically, it's one external-facing proxy server in nginx and an internal one which does the actual resize job. The result from the internal server is persisted to disk.
Replace all words in CAPITALS with the values from your own configuration.
Add this to the enabled sites in nginx, restart, and you can use the following URL structure:
This results in the following directory structure
Why is it secure and fast?
- Only certain dimensions are allowed (via regex), so an attacker cannot trigger numerous resize variations (they get a 404)
- Query parameters appended to the image URL cannot be used to trigger the transformation again (they get a 404)
- Serves only local files (or if you like S3 via http call), no remote loading
- You can apply all nginx security features (SSL, rate limiting etc.)
- You can apply all nginx caching header features
- You can apply all nginx scaling, load-balancing and reverse proxy features
- The image processing server itself is bound to localhost, so there is no direct interaction with it from outside
- You can put this server behind a CDN and add an HTTP header check in nginx (and set the auth header in the CDN config) for scaling and even more security
- The code size is VERY small :)
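To make the first two points concrete, here is a minimal Python sketch of the same whitelist idea. Everything specific in it is an assumption for illustration: the `/resize/<width>x<height>/<path>` URL layout, the `ALLOWED_SIZES` values and the function name `parse_resize_url` are hypothetical and not taken from the nginx configuration.

```python
import re
from typing import Optional, Tuple

# Hypothetical whitelist; only these exact dimensions may be requested.
ALLOWED_SIZES = ("100x100", "300x300", "800x600")

# Hypothetical URL layout: /resize/<width>x<height>/<relative path>.<ext>
# The path part allows no dots, so "../" traversal cannot match.
RESIZE_URL = re.compile(
    r"/resize/(?P<size>" + "|".join(re.escape(s) for s in ALLOWED_SIZES) + r")"
    r"/(?P<path>[A-Za-z0-9_\-/]+\.(?:jpe?g|png))"
)


def parse_resize_url(raw_url: str) -> Optional[Tuple[int, int, str]]:
    """Return (width, height, path) for an allowed URL, or None (answer 404)."""
    if "?" in raw_url:
        # Query strings would otherwise create unlimited "new" URLs
        # for the same image and re-trigger the transformation.
        return None
    match = RESIZE_URL.fullmatch(raw_url)
    if match is None:
        return None
    width, height = (int(part) for part in match.group("size").split("x"))
    return width, height, match.group("path")
```

Because the regex enumerates the allowed sizes, the number of files that can ever be generated per source image is bounded by the length of the whitelist.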
I decided against any remote loading capabilities in nginx (you will find a lot of examples in the nginx solution links above) due to risk, stability and maintainability concerns. In the end, those are 2 separate processes.
Thanks to the nice solution from Pawel Miech, this can easily be done in parallel, asynchronously and securely in Python 3 code (see "Making 1 million requests with python-aiohttp").
The only problem may be the first transformation (cache stampede), which can bring down the system. You could easily mitigate this with a lock and a placeholder file.
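A lock file along those lines could look like the following sketch. It is not part of the setup described here; `render_once` and the `.lock` / `.tmp` suffixes are hypothetical, and the sketch assumes that all workers share one local file system.

```python
import os
from pathlib import Path
from typing import Callable, Optional


def render_once(target: Path, render: Callable[[], bytes]) -> Optional[bytes]:
    """Render target at most once at a time.

    Returns the image bytes, or None if another worker is currently rendering
    the same file (the caller can then answer with a placeholder image or a
    "retry later" response).
    """
    if target.is_file():
        return target.read_bytes()

    target.parent.mkdir(parents=True, exist_ok=True)
    lock = target.with_name(target.name + ".lock")
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # worker can win the lock.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    try:
        os.close(fd)
        if target.is_file():  # finished by someone else in the meantime
            return target.read_bytes()
        data = render()
        tmp = target.with_name(target.name + ".tmp")
        tmp.write_bytes(data)
        os.replace(tmp, target)  # atomic rename: no half-written images
        return data
    finally:
        # Note: a crashed worker would leave a stale lock behind; a real
        # setup would need a timeout or cleanup job for that case.
        os.unlink(lock)
```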
Change the regex to allow more dimensions.
With a current nginx version you will also get WebP transformation, which can be pretty nice for your Chrome-based users.