SETCOOKI


WordPress in the cloud (Docker + Minio S3)

Recently we decided to port a WordPress project to run inside a Docker swarm environment. The requirements were basically:

  • Deploy workflow with GitLab CI and Docker Compose
  • Scalable service with Docker swarm and Traefik load balancer
  • Move every “service” into the “cloud” (WordPress core, Database, Mail, …)
  • Move and have all media assets in the “cloud” with Minio private cloud server
  • Use a shared media library across multiple WordPress installs

In this post I will focus on moving WordPress media assets into the “cloud”. The “cloud” in this context means away from /wp-content/uploads and assets on a physical volume – instead, all files are hosted on an S3-compatible storage service like Minio. To be clear from the beginning: this is not the only viable way in a Docker-powered environment. For most, the most prominent approach will still be some sort of shared volume at the file-system level (e.g. NFS) that can even be kept in sync across different Docker nodes. With the following requirements, though, that approach fell off the table for us:

  • Upload and manage files without WordPress as a frontend
  • Access restrictions to assets with private/public folder access rights
  • Use the same assets (media library) across multiple WordPress instances with independent databases

The last point is, by WordPress standards, “ground breaking”. Think of the possibilities – one media library for multiple WordPress installs. E.g. one client with multiple WordPress sites, or different environments (e.g. real content staging, etc.). If you think further, it is even possible to have a global cloud media library and a local one, per WordPress install, running together. Is this something or what?

When assessing the viability of and options for going S3 with WordPress, one inevitably stumbles on the following issues:

  1. Does the WordPress media library support S3 as a storage engine?
    The answer is NO – but you can make WordPress use S3, either by custom programming or via plugins (e.g. https://wordpress.org/plugins/ilab-media-tools)
  2. Which S3 Product/Vendor to use?
    If you want to go with an open-source private cloud, there are several options, and Minio is definitely a front runner in the field.
  3. How to manage multiple WordPress installs with independent databases and a shared media library?
    This is the tricky part, and we will go into it in detail.

So WordPress can go S3 – the good news. The bad news: WordPress also stores all assets and their metadata in its own database. An asset that does not have a post entry (type attachment) does not exist for WordPress! This makes sense, of course, since WordPress handles thumbnail generation, metadata aggregation and so on. But when WordPress runs as a service in a cloud/distributed environment where multiple instances should share the same S3 media library, there is a problem: when a user uploads a file, it is stored on the S3 storage, but only the uploader's WordPress instance knows that a new file has been added – all the other instances are unaware of any changes! The distributed WordPress instances need to be told that the S3 media library has changed. None of the tested WordPress S3 plugins do that, of course. In fact, those plugins are meant as single-install solutions: one WordPress, one media library.

We decided on the Ilab Media Cloud plugin (https://wordpress.org/plugins/ilab-media-tools) even though we were aware of these shortcomings. The fact is that the Media Cloud plugin works quite well and is built on a very stable and clean code base. To overcome the need to sync the WordPress installs, we wrote a helper plugin that keeps every connected WordPress install in sync. The trick is Minio's webhook capability: once a file is added or deleted, an HTTP webhook fires, and its payload contains all the file (meta)data needed to run through the WordPress insert-attachment process. In short: on receiving a webhook, the plugin downloads the file from the S3 server into the WordPress upload folder and triggers a manual insert-attachment process without actually uploading anything. We do this to extract the metadata and create the post and post-meta entries. Should the webhook fail for whatever reason, the WordPress installs would drift out of sync. For this case we can use Minio's client interface (https://docs.minio.io/docs/minio-client-complete-guide) and trigger a resync (e.g. via cron) with a helper script that iterates through all media assets and checks whether a file is missing on a specific WordPress install.
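Our helper plugin itself is PHP, but the two core ideas – extracting bucket/key pairs from the webhook payload, and diffing bucket contents against what an install already knows for the resync – can be sketched language-neutrally. A minimal sketch in Python; the function names are illustrative, and the payload shape follows the S3-style event-notification format that Minio posts to webhook targets (treat the exact fields as an assumption to verify against your Minio version):

```python
import json
from urllib.parse import unquote_plus

def parse_minio_events(payload: str):
    """Extract (event_name, bucket, key) tuples from a Minio/S3-style
    bucket-notification webhook body."""
    data = json.loads(payload)
    events = []
    for record in data.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name", "")
        # object keys arrive URL-encoded in S3-style notifications
        key = unquote_plus(s3.get("object", {}).get("key", ""))
        events.append((record.get("eventName", ""), bucket, key))
    return events

def missing_on_instance(s3_keys, known_keys):
    """Resync helper: keys present in the bucket but unknown to a given
    WordPress install, i.e. attachments that still need to be imported."""
    return set(s3_keys) - set(known_keys)

# A trimmed-down example body, shaped like a Minio upload notification:
body = json.dumps({
    "EventName": "s3:ObjectCreated:Put",
    "Records": [{
        "eventName": "s3:ObjectCreated:Put",
        "s3": {"bucket": {"name": "media"},
               "object": {"key": "2018/04/logo.png", "size": 1024}},
    }],
})

print(parse_minio_events(body))
# -> [('s3:ObjectCreated:Put', 'media', '2018/04/logo.png')]
print(missing_on_instance({"a.jpg", "b.jpg"}, {"a.jpg"}))
# -> {'b.jpg'}
```

In the real plugin, each parsed key is then fed through the WordPress insert-attachment process described above, and the diff from the resync helper drives the cron-based repair.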

Since we have strong dependencies on the Ilab Media Cloud plugin, I consider this solution experimental, but it works satisfactorily for our needs. The fact is, this is not something WordPress is meant to do out of the box, nor is it a common scenario. But for me it elevates WordPress to the level of bigger enterprise (cloud) CMS systems and opens up a whole new set of use cases, e.g.:

  • With Docker swarm we can run any number of replicated instances of our WordPress app, all sharing the same media library
  • We can protect private files (with S3 canned policies on folders) so that they are only reachable by authenticated users (via a proxy and S3 signed requests)
  • We can use another S3-compatible frontend (e.g. Seafile, ownCloud, etc.) to upload files to our shared S3 media library (no need for asset managers to use WordPress as a frontend anymore!)
  • We can now have encrypted files, replication, multi-level authentication, and much more …
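To illustrate the “S3 signed requests” point in the list above: instead of exposing the bucket, a proxy can hand an authenticated user a short-lived presigned URL for a private object. A minimal, self-contained AWS Signature V4 sketch in Python – the hostname, bucket and keys are placeholders, and in practice the Minio/AWS SDKs generate these URLs for you:

```python
import hashlib
import hmac
import time
from urllib.parse import quote

def presign_get(endpoint, bucket, key, access_key, secret_key,
                region="us-east-1", expires=600):
    """Build an AWS Signature V4 presigned GET URL for an S3/Minio object."""
    now = time.gmtime()
    amz_date = time.strftime("%Y%m%dT%H%M%SZ", now)
    datestamp = time.strftime("%Y%m%d", now)
    scope = f"{datestamp}/{region}/s3/aws4_request"
    host = endpoint.split("//", 1)[-1]
    path = quote(f"/{bucket}/{key}")

    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    query = "&".join(f"{k}={quote(v, safe='')}" for k, v in sorted(params.items()))

    # canonical request: method, URI, query, headers, signed headers, payload
    canonical = "\n".join([
        "GET", path, query,
        f"host:{host}\n",       # canonical headers (note trailing newline)
        "host",                 # signed headers
        "UNSIGNED-PAYLOAD",     # payload hash used for presigned URLs
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical.encode()).hexdigest(),
    ])

    def _hmac(k, msg):
        return hmac.new(k, msg.encode(), hashlib.sha256).digest()

    # derive the signing key via the SigV4 HMAC chain
    signing_key = _hmac(_hmac(_hmac(_hmac(
        ("AWS4" + secret_key).encode(), datestamp), region), "s3"), "aws4_request")
    signature = hmac.new(signing_key, string_to_sign.encode(),
                         hashlib.sha256).hexdigest()
    return f"{endpoint}{path}?{query}&X-Amz-Signature={signature}"

url = presign_get("https://minio.example.com", "media", "private/report.pdf",
                  "MYACCESSKEY", "MYSECRETKEY")
print(url)
```

Anyone holding the URL can fetch the object until it expires, so the proxy only issues it after checking the user's WordPress authentication.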

If you are interested in harnessing the power of WordPress in the cloud – contact me!