Deploying a Hugo website to Amazon S3 using Bitbucket Pipelines

Atlassian recently released a new feature for their hosted Bitbucket product called “Pipelines”. It’s basically their version of Travis CI: a hosted service that can do simple building, testing and deployment.

In this blog post I’ll show you how I use Pipelines to deploy my Hugo site to AWS S3. It’s short and to the point: if you know AWS, this should tell you enough to set up your own deployment in about 5 minutes.

Create an AWS user for Pipelines

You need an AWS user that can deploy to your bucket. Do NOT use your admin user for this! Simply create a new user called “pipelines” and give it access to your blog bucket only.

This inline policy should be enough access to do these deployments (replace BUCKETNAME with the name of your bucket):

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AllowPipelinesDeployAccess",
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:GetObject",
            "s3:DeleteObject",
            "s3:ListBucket"
          ],
          "Resource": [
            "arn:aws:s3:::BUCKETNAME",
            "arn:aws:s3:::BUCKETNAME/*"
          ]
        }
      ]
    }

Configure Pipelines with your AWS credentials

Generate an access key and secret key for this new user and add these 3 variables on the environment variables settings page in Bitbucket:

AWS Variables:

AWS_ACCESS_KEY_ID: (the user's access key)
AWS_SECRET_ACCESS_KEY: (the user's secret key)
AWS_DEFAULT_REGION: (your bucket's region)

Bitbucket Pipelines environment variables settings page

Create the Pipelines build config

I’m assuming your Hugo site lives in the root of your git repository. In my case the repository looks like this:

karel:Hostile ~/KarelBemelmans/karelbemelmans-hugo$ tree -L 2
├── bitbucket-pipelines.yml
├── config.toml
├── content
│   ├──
│   └── post
├── public
│   ├── 2015
│   ├── 2016
│   ├── 404.html
│   ├── CNAME
│   ├── about-me
│   ├── categories
│   ├── css
│   ├── favicon.png
│   ├── goals
│   ├── images
│   ├── index.html
│   ├── index.xml
│   ├── js
│   ├── page
│   ├── post
│   ├── sitemap.xml
│   ├── touch-icon-144-precomposed.png
│   └── wp-content
├── static
│   ├── CNAME
│   ├── css
│   ├── images
│   └── wp-content
└── themes
    └── hyde-x

Then create the file bitbucket-pipelines.yml in the root of your repository, replace BUCKETNAME with the name of your blog’s bucket:

image: karelbemelmans/pipelines-hugo

pipelines:
  default:
    - step:
        script:
          - hugo
          - aws s3 sync --delete public s3://BUCKETNAME

Docker Hub and Github links for this Docker image, feel free to fork and modify:

That’s all.

One single remark though

As you can see I use aws s3 sync to upload to S3. When I deploy from my laptop, where files persist between deployments, that actually makes sense and saves me some upload traffic.

Doing this on Pipelines, where the Hugo site is always completely re-generated from scratch inside a Docker container, is actually pointless: every file is “new”, so the entire site gets uploaded every time.

CloudFormation YAML support

CloudFormation recently added support for YAML, so I’ve updated my Drupal 7 stack with a YAML version. Check the github repository for the new stack:

Running Drupal 7 on AWS with EFS

In two previous blog posts I talked about running Drupal 7 on AWS:

Since writing part 2 of this topic AWS has finally released Elastic File System (EFS), so I had to write an update for the stack that uses EFS instead of S3.

Elastic File System (EFS)

EFS is a shared NFS filesystem you can attach to one or more EC2 instances. While we can store our user-uploaded content in S3 using the Drupal s3fs module, getting the CSS and JS aggregation cache to work across multiple servers was still an issue with S3.

If we use EFS instead of S3 and share the sites/default/files directory across every EC2 instance, that problem goes away.
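As a sketch of what that shared mount could look like, here is a hypothetical /etc/fstab entry; the filesystem ID, region and mount options are placeholders, check the EFS documentation for the recommended options:

```
# Hypothetical /etc/fstab entry: mount EFS on Drupal's shared files directory.
# fs-12345678 and eu-west-1 are placeholders for your own filesystem ID and region.
fs-12345678.efs.eu-west-1.amazonaws.com:/  /var/www/html/sites/default/files  nfs4  nfsvers=4.1,hard,timeo=600,retrans=2  0 0
```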

The source code for this stack is on Github:

  • drupal7-efs.json: A very minimal Drupal 7 stack setup
  • drupal7-efs-realistic: A more realistic Drupal 7 site with a lot of contrib modules. This also uses a Docker hub container image instead of building an image in the Launch Configuration.

I will continue to work on the second one, so you probably want to take that stack.

A short note about this stack and Docker

While this stack uses Docker, it is not a complete container management system like ECS is intended to be. Rolling out a new version of a Docker image with this stack is pretty much a manual job: you scale the Auto Scaling Group down to 0 nodes, then scale it up again to the required number. All the new instances created that way will run the new version of your Docker image. (Alternatively, scale up to double the normal size and then scale down again to remove the old instances.)
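That manual roll-out could be sketched with the AWS CLI like this; the ASG name and the sizes are placeholders for your own stack's values, and this obviously needs valid AWS credentials:

```shell
# Hypothetical roll-out of a new image version by cycling the ASG.
# "drupal7-asg" and the sizes are placeholders.
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name drupal7-asg \
  --min-size 0 --max-size 0 --desired-capacity 0

# Wait until the old instances are terminated, then scale back up:
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name drupal7-asg \
  --min-size 2 --max-size 4 --desired-capacity 2
```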

Docker cleanup commands

Running Docker containers also involves a little housekeeping to keep your Docker hosts running optimally and not wasting resources. This blog post provides an overview of the commands you can use.

Currently there are a lot of blog posts and Stack Overflow questions with cleanup commands for old Docker versions that are no longer very useful. In this blog post I will try to keep the commands updated for newer versions of Docker.

Current Docker version as of 2016/07/20: 1.11 (stable), 1.12 (beta)

Clean up old containers

Originally copied from this blog post: source

These commands can be dangerous! So don’t just copy/paste them without at least having a clue what they do.

# Kill all running containers:
docker kill $(docker ps -q)

# Delete all stopped containers (including data-only containers):
docker rm $(docker ps -a -q)

# Delete all exited containers
docker rm $(docker ps -q -f status=exited)

# Delete ALL images:
docker rmi $(docker images -q)

# Delete all 'untagged/dangling' (<none>) images:
docker rmi $(docker images -q -f dangling=true)

Clean up old volumes

When a container defines a VOLUME, Docker will not automatically delete that volume when the container is removed. Some manual cleanup is needed to get rid of these “dangling” volumes.

Originally found on Stackoverflow: source

# List all orphaned volumes:
docker volume ls -qf dangling=true

# Eliminate all of them with:
docker volume rm $(docker volume ls -qf dangling=true)

Running Drupal 7 on AWS - part 2

Update 2016/07/11: AWS has released EFS, which is a better choice for our Drupal 7 setup than using S3. Check this blog post for a newer stack that replaces S3 with EFS.

This blog post follows up on Running Drupal 7 on AWS - part 1 with an actual code example: a full CloudFormation setup that gets a complete Drupal 7 stack running on AWS.

This is the stack we are creating (click the image for a larger version):

Drupal 7 stack on AWS

All the code referenced here is available in this Github repository:

AWS CloudFormation

When you are creating large software stacks, building them by hand is not an option anymore: it takes too long and is too error-prone. For this reason AWS created AWS CloudFormation, their infrastructure-as-code service. Check the video on that page for a short introduction.

Sidenote: AWS currently has its own container service called Elastic Container Service (ECS), which we could use since our Drupal 7 site comes in a Docker container. We are however doing it the old school way and will manage our own EC2 instances.

Creating our Drupal 7 stack with CloudFormation

Creating the stack from the drupal7.json file is quite simple:

  • Go to the CloudFormation page on your AWS account
  • Create a new stack, give it a name and select the drupal7.json file
  • Review some of the settings you can change, they should be pretty straight-forward
  • Create the stack and after about 10-15 minutes everything should be up and running

When the stack has been created you will get a value in the Outputs for the WebsiteURL parameter, which is the hostname of the Elastic Load Balancer. The last step to add here would be to create a Route 53 ALIAS record to this name to map it to your real website url.

Surfing to the URL will give you an error though, as we have a valid settings file but an empty database. You can either copy your own database to the RDS server now (see the Q&A section for how to do that) or simply browse to /install.php and install a fresh copy of Drupal.

Structure of the stack

The biggest piece is the Launch Configuration resource “LaunchConfigurationForDrupalWebServer”. This contains the setup script that will be used on the web servers. It installs Docker, generates a Drupal settings.php and builds a new Drupal container that contains this settings.php file.

All the rest is pretty straight-forward AWS stuff: a VPC with 2 subnets, NAT instances for the private subnets, Internet Gateways for the public subnets, a MySQL DB, a memcached instance and an EC2 setup with LC, ASG and ELB.

Some Q&A

How do you ssh into this instance now?

You can’t. You will need to create a bastion (relay) host in a public subnet and assign it a public IP. The web servers run inside the private subnet, which allows no direct connections from the outside (because the subnet’s routing table does not use an Internet Gateway). You then ssh to the bastion and from there ssh to the instances in the private subnet (or configure ssh forwarding in your local ssh config).
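The ssh forwarding mentioned above could be sketched in ~/.ssh/config like this; the IP, user and subnet range are placeholders:

```
# Hypothetical ~/.ssh/config: reach private-subnet instances through the bastion.
Host bastion
    HostName 52.18.0.1          # placeholder: the bastion's public IP
    User ec2-user

Host 10.0.*.*                   # placeholder: your private subnet range
    User ec2-user
    ProxyCommand ssh -W %h:%p bastion
```

With this in place, `ssh 10.0.1.23` transparently hops through the bastion.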

How can I copy my existing database to the RDS db?

Use the bastion host to set up an ssh tunnel; SequelPro for Mac can do this. Or just ssh to the bastion and pipe your SQL file into the RDS MySQL instance with the mysql client, using the RDS hostname, username and password.

How do I get the logs from the Docker containers in a central location?

Use a remote syslog service, like Papertrail, via the syslog log driver in your docker-compose.yml file:

web:
  build: .
  ports:
    - "80:80"
  log_driver: syslog
  log_opt:
    syslog-address: "hostname:port"
    tag: "drupal"

I can’t seem to send emails, do I need to configure an SMTP server?

Yes. You should configure Amazon Simple Email Service (SES) for Drupal in your settings.php file. You can script this in the Launch Configuration too, as the settings.php is built there.

But what about drush in this setup?

Drush is not used here. We don’t want to install it inside the Dockerfile, to keep the container as clean as possible. So simply use curl and ADD to download Drupal modules and themes.

In an actual Drupal production site you would also not use the base FROM drupal:7-apache in your Dockerfile. You would use your own Drupal Docker image that contains your full Drupal stack (core, modules, themes, config…) and just overwrite the settings.php file in the Launch Configuration (as is already done right now).

Todo list

There are still a few things missing for this CloudFormation stack:

  • Use 2 CloudFront distributions:
    • One for the S3 static content
    • One for the ELB so anonymous users also get a cached page
  • Add Papertrail logging to the Docker containers
  • Use more CloudWatch metrics for the Auto Scaling Group adjustments
  • Configure SES so Drupal can send emails

I might add those in the future, but right now these are left as an exercise for you to implement.

Problems with this stack

There are still some problems with running this setup on AWS though:

  • CSS and JS aggregation does not work with the s3fs module
  • Question: Is the session fixation on the ELB the right way to go?

This stack is still a theoretical one; I don’t really use it in production. I’m sure more problems will show up when you actually start using it for a production setup, so feel free to use the comments section to point them out and I’ll see if I can find a decent solution for them.

Further reading

While writing this blog post I did a lot of research on writing CloudFormation stacks and, as it usually goes, I found a lot of better examples than the ones I was writing. Looking back now, most of the code in my CF templates comes from the official AWS examples below, so make sure to check them out too. They have examples for many common stacks, with or without Multi-AZ support, and you can pretty much copy/paste entire stacks as a starting point for your own.

Running Drupal 7 on AWS - part 1

The last 5 months I’ve been doing a lot of work on Amazon Web Services (AWS) for my new job as a Cloud Architect at Nordcloud Sweden. Learning how to build applications that take full advantage of The Cloud has made me eager to redo some of my previous projects and rebuild them for AWS. In this blog post I’ll start off with the best way to run a Drupal 7 website on AWS.

While this blog post is written with Drupal 7 as an example, it could easily be adapted for any other PHP based application.

1. The current Drupal server setup

If you are a Drupal builder, you are most likely using a combination of two typical web server setups for your production sites:

  • A Shared Hosting server, where multiple websites run on the same server
  • A dedicated Virtual Private Server (VPS) per website

How you deploy your code, and whether you use Docker or not, is currently not relevant; the main thing is that you have dedicated (virtual or physical) servers that run 24-7 with the exact same hardware configuration.

On these servers you probably have this software stack installed:

  • nginx or apache
  • MySQL/MariaDB database
  • A local disk where your user content gets uploaded to
  • Shell access via ssh so you can run drush and cronjobs
  • Maybe an Apache Solr server for search indexing
  • Maybe a varnish cache in front of the web server
  • Maybe a memcached bin to offload your database

All of this is managed by you, or maybe a hosting company that does it for you, using some kind of provisioning tool like chef or puppet. Making changes to this setup is hard and keeping the setup in sync with your development stack is probably even harder (even when you use Docker).

If you use a managed hosting provider you already got rid of being responsible for the hardware, but you still run the same kind of static server setup that you would have if you did it yourself.

Problems with this setup

Problem 1: There are a lot of single points of failure in this setup: many non-redundant single-instance services run on the same server. If any component crashes, your entire site is offline.

Problem 2: The CPU/RAM of this server does not scale up or down automatically depending on the server load; a manual intervention is always required to change the hardware configuration. If you get an unexpected traffic boost, this might cause your server to go down.

Problem 3: The whole setup runs constantly at full power, no matter what load it’s currently handling. This is a waste of resources and, even worse, your money.

2. Moving things to AWS

So let’s see now how we can move this setup to AWS, and while doing so, get rid of the problems from the previous paragraph.

When you move this web server setup to the cloud you can basically do it two ways: the wrong way and the right way.

The wrong way: lift and shift

If you just see AWS as another managed hosting provider you could go for the lift and shift solution. In this scenario you re-create your entire server just like you did in the old setup. You run a single EC2 instance (= the AWS equivalent of a virtual server) with your full stack inside of it.

This works of course, but it does not scale, it’s not redundant and it will probably cost you more than running your old setup. So it doesn’t fix any of the problems we’ve described in the previous chapter.

AWS has a tool to calculate the cost of such a move, called the TCO Calculator. Just keep in mind that if you just compare the cloud cost to your own datacenter cost using the same hardware setup you are not using the cloud the right way and you will pay a lot more than you should.

The right way: build your application for AWS

Before we continue to optimize our setup for AWS I have to explain a few AWS concepts that will be important to understand: Managed Services and High-Availability.

High-Availability
High-Availability (HA) is a concept that you will see pop up everywhere when using AWS. It’s about not having a single point of failure in your setup, by building in redundancy using the tools AWS offers you.

An important part of HA setups is the concept of regions and availability zones (AZ). Each region has several availability zones: independent data centers that can communicate with each other as if they were on a local network.


  • Region: eu-west-1 (Ireland)
  • Availability Zones (AZ): eu-west-1a, eu-west-1b, eu-west-1c

Certain things are automatically replicated across the AZs of a region (e.g. all the managed services we’ll see in the next topic), but you’re also required to use them intelligently yourself. For a web server setup using EC2 instances, for example, you would create 2 servers, each in a different AZ, and put an Elastic Load Balancer (which is also HA, since it’s a managed service) in front of them. If one of the servers goes down, or even the whole AZ, the load balancer keeps working and only sends traffic to the server in the AZ that is still up.

In a Lucid Chart diagram this HA setup would look like this:

An example of a High-Availability setup on AWS

AWS Services

AWS Services are, simply put, the usual services from your software stack, but managed by Amazon. They are offered as highly available software-as-a-service, where you don’t have to worry about anything other than using them.

For our Drupal 7 setup we’ll be using these AWS Services:

  • Web servers: Amazon EC2 (EC2 instances, Elastic Load Balancer, Auto Scaling Groups)
  • Database: Amazon RDS (MySQL, MariaDB or even Aurora if you want)
  • Configuration files and User uploaded content: Amazon S3
  • Key/value caching server: Amazon Elasticache (memcached)
  • Reverse proxy content cache: Amazon CloudFront

Now that we have all the AWS tools explained, let’s go build our Drupal 7 site using them.

3. Building Drupal 7 on AWS

To deal with the problems we had when running on a Shared Hosting or VPS server we have to make sure we cover these two items:

  • Our setup needs to have High-Availability: no single point of failure
  • It has to have automatic scaling: scale in and out when needed

Scaling up and down means increasing or decreasing the amount of RAM or CPU cores in a system, while scaling in and out means adding similar servers to a setup or removing some of them. Scaling in and out obviously only works if you have a load balancer that distributes traffic among the available servers.

Look at this Lucid Chart diagram to get an idea of what the final stack will look like (click for a larger version):

Drupal 7 stack on AWS

Database: AWS RDS MySQL

The database is probably the easiest component to configure in our setup: we simply use an Amazon RDS MySQL instance. We connect to it using the hostname and the username and password we supply.

We can make this part HA by using the Multi-AZ option. This is not a master-master setup, but a standby instance in a different AZ that AWS will promote automatically in the event the main one goes down. You do not need to configure anything for this; AWS updates the IP address behind the hostname automatically.

Backups of the RDS instance are taken by using daily snapshots, which will be enabled by default for any RDS database you create.

Upload content: Amazon S3

Update 2016/07/11: AWS has released EFS, which is a better choice for our Drupal 7 setup than using S3. Check this blog post for a newer stack that replaces S3 with EFS.

Since our setup will include web servers running on AWS EC2 that will scale in and out depending on the usage, we cannot have any permanent data inside of them. All the content that gets uploaded by Drupal will have to be stored in a central file storage that is accessible by all web servers: Amazon S3.

Drupal can not use S3 out of the box, but there are contributed modules available to achieve this. When writing this blog post I was still experimenting with which one was best suited for the task; I’ll update this post later on with my findings.

While S3 has versioning support, it’s not a bad idea to have a second AWS account copy all the files from S3 every day, every hour, or even as they get created.

Besides the user uploaded content we will store another type of files in S3: configuration files used by instances and load balancers. More about this later.

Memcached: AWS Elasticache

There’s not much to say about Elasticache. Simply create a memcached server and configure your instances to use it.

Caching: AWS CloudFront

CloudFront is Amazon’s CDN service, with edge locations all over the world. The most important thing to know here is that invalidating cached objects is not easy; you should pretty much rely on your Drupal site setting the correct cache headers for each request it serves. If you need to clear your entire cache, it might be easier (and cheaper) to just create a new CloudFront distribution and delete the old one.

We use CloudFront like you would use any other cache: just put it in front of the web server. In this case it will be put in front of the Elastic Load Balancer (see next topic) with the DNS record for our site pointing to the CloudFront distribution.

Web servers and AutoScaling: Amazon EC2

Now we get to the core of the setup: the actual web servers. We will be using a set of AWS EC2 services to accomplish that task.

Let’s start by pointing out that our Drupal code is in a Docker container, pushed to a (public or private) repository. The EC2 instances can reach the registry and can pull the images without authentication.

Configuring our EC2 instances is done by something called a Launch Configuration. A Launch Configuration can best be seen as a configuration file that will be used by an Auto Scaling Group to create servers. The Launch Configuration contains the base server image to be used, the type of EC2 instance, some other things I won’t go into detail about here, and most importantly: the user-data script.

The user-data script is simply a bash shell script that we will use to install the required software on the web servers:

  • Install certain OS packages we need (e.g. aws-cli, docker)
  • Install extra packages using simple curl commands (e.g. docker-compose)
  • Configure rsyslog monitoring (if we don’t use it via docker-compose)
  • More things as you like
  • And as the last step: start the Drupal Docker container.

The user-data script will also handle the creation of a custom settings.php file for Drupal. It will overwrite the default one inside the Drupal Docker container with our values for the database, the memcached server, etc…
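As an illustration, here is a minimal sketch of that part of the user-data script. Every value is a placeholder; the real stack injects them via CloudFormation parameters:

```shell
#!/bin/bash
# Hypothetical user-data fragment: generate the settings.php overrides.
# All values below are placeholders.
DB_HOST="mydb.xxxxxx.eu-west-1.rds.amazonaws.com"
DB_NAME="drupal"
DB_USER="drupal"
DB_PASS="secret"

# Write the Drupal database settings; \$ keeps the PHP variable literal.
cat > /tmp/settings.php <<EOF
<?php
\$databases['default']['default'] = array(
  'driver'   => 'mysql',
  'host'     => '${DB_HOST}',
  'database' => '${DB_NAME}',
  'username' => '${DB_USER}',
  'password' => '${DB_PASS}',
);
EOF

# The generated file then replaces the default settings.php inside the Drupal image.
```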

This Launch Configuration is then used by an Auto Scaling Group (ASG) to fire up a set of instances. This ASG can become intelligent if you connect it to AWS CloudWatch, where it will create or remove instances based on certain metrics (server load, RAM usage,…), but it can also be quite simple and just keep a single web server running in each available AZ at all times.

The third component in our web server setup is an Elastic Load Balancer (ELB). The ASG creates servers, and the ELB distributes traffic between them and performs the health checks. If a server becomes unhealthy, the ELB removes it from the rotation; the ASG then replaces it with a new one, which gets picked up by the ELB again and put into the load balancing rotation.

Together these 3 services - LC, ASG and ELB - create a setup that scales in and out when needed, exactly what we wanted for our Drupal 7 setup.

If this all sounds a bit difficult to visualize, check the AWS Auto Scaling article for a longer explanation with some examples.

Route 53

Route 53 is AWS’s DNS service. While you can use any DNS service you want and just point CNAME records to AWS hostnames, I strongly recommend using Route 53. Because AWS internally updates IP addresses all the time, using a CNAME record might give you situations where DNS lookups go to the wrong IP.

To deal with this issue, AWS created the ALIAS record, which points to an internal AWS resource (ELB, CloudFront distribution, S3 location, …) and is not affected by any downtime when IP addresses change.
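For illustration, a hypothetical Route 53 change batch that creates such an ALIAS record for an ELB; the domain, zone ID and ELB hostname are all placeholders:

```
{
  "Comment": "Point the site at the ELB via an ALIAS record",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "www.example.com.",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z0000000000000",
          "DNSName": "my-elb-123456789.eu-west-1.elb.amazonaws.com.",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}
```

Note that the HostedZoneId in an AliasTarget is the ELB's own hosted zone ID, not your domain's.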

4. Did we solve all our problems?

Now that we’ve listed all the services we will be using to build our Drupal site, did we actually meet all the requirements we set out to achieve?

High Availability

Do we have a highly available setup with no single points of failure? Yes. We either use Amazon’s HA managed services or we create our own services in 2 AZs at all times.

Auto Scaling

Is this a setup that automatically scales in and out without manual interaction? Yes. The combination of Launch Configuration, Auto Scaling Groups and Elastic Load Balancers takes care of that.

5. Cost

We have managed to turn our Drupal stack into a highly available, auto scaling setup, but what will it cost us to run? To get that number, we use the Simple Monthly Calculator that Amazon provides.

Before we start calculating we have to make some decisions about which instance types and how much data usage we will be talking about.

This is a very basic setup for now. I’m not going into detail about instance types, snapshot storage space, CloudWatch monitoring, etc… Adding these will of course increase the total cost of running your site on AWS.

  • We use a db.t2.medium 10GB MySQL RDS database, with the Multi-AZ option
  • An S3 bucket that contains 500GB of files
  • A cache.t2.micro memcached instance, which is more than enough for our setup
  • A CloudFront distribution that has worldwide edge location coverage

Our EC2 setup is as follows:

  • We use 2 Availability Zones
  • In each AZ we create a t2.small (1 CPU core, 2GB RAM) web server with a 30GB EBS root disk
  • One Launch Configuration that handles creating the EC2 instances
  • One Auto Scaling Group that scales out instances in pairs, one per AZ
  • One Elastic Load balancer, created in both AZ’s

We expect about 100GB traffic per month to our site.

As you can see I’m using only small instance types for this calculation. Don’t go too big too fast: our scaling setup adds more capacity when needed, unlike a setup where you have only one big instance.

For the price calculation I’m taking the EU West-1 region (Ireland). Prices vary between regions, so this is not a complete picture. Still, you should go for the region that is closest to your customers and has all the services you need (e.g. in Europe the Frankfurt region does not have all the services Ireland currently offers).

Adding all of this into AWS’s Simple Monthly Calculator gives you this number: $75,74 (€66,70) a month (see calculation details).

6. Conclusion

I hope this blog post was a good example to show you how to optimize your Drupal site for AWS. It can easily be applied to Drupal 8 or any other PHP application, as long as you focus on the important goals of this setup: High Availability and Auto Scaling.

7. Next steps

Even though this is a lengthy blog post, there are still a lot of topics I haven’t covered yet. There are many more AWS services you can use to monitor, scale and build your application. I also haven’t addressed how you should run cron jobs or nightly import/sync tasks in a setup like this. This is all stuff for upcoming blog posts.

Drupal 7 on AWS Part 2: CloudFormation

This blog post focused on the “how?” and “why?” of running Drupal on AWS. Part 2 is an actual example of such a setup, with a complete infrastructure provided as a CloudFormation stack. CloudFormation is AWS’s infrastructure-as-code tool, something you definitely should be using for any large software stack.

Getting started with Amazon Web Services (AWS)

Getting started with AWS is actually quite simple. The best way is to start learning for these 5 AWS Certification exams:

Associate (beginner) level:

Professional level:

Don’t get blinded by the names: no matter what your job title is, you should learn about all 3 facets of AWS: architect, developer and sysops. So just do all the exams.

The associate (beginner) ones you can start learning for straight away if you have some basic experience with server setups. The professional ones you should take after at least one or two years of experience working with AWS.

A good place to learn for these exams is A Cloud Guru. They have a bundle that gives you lifetime access to training material for all 5 exams and it only costs you $189.

A Docker Drupal 8 deployment container

Update 2016/07/06: I’ve restarted working on this project. The README on Github should have all the needed information.

I wrote a small example project that creates a deployable Drupal 8 container:

This can of course be used by any PHP project, so this is just an example using Drupal 8.

Datadog php-fpm monitoring via nginx

It took me some time to get this set up properly, but here are the configs that finally got php-fpm monitoring working with Datadog.

1. nginx vhost config

First, make sure you override your site’s hostname to localhost. For my site this makes sure connections don’t go out to CloudFlare but stay local on the server; /etc/hosts needs to contain a line like 127.0.0.1 www.yoursite.com (with your own hostname).

I use my port 80 vhost config for the status page. Cloudflare enforces SSL so this vhost never gets used for anything non-local on my server.

server {
  listen 80;
  listen [::]:80;

  access_log /var/log/nginx/access.log main;
  error_log /var/log/nginx/error.log error;

  location ~ ^/(status|ping) {
    access_log off;
    # Only the local Datadog agent needs these endpoints.
    allow 127.0.0.1;
    deny all;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    # Must match the upstream definition below.
    fastcgi_pass phpfpm;
  }
}

I use the upstream “phpfpm” as the fastcgi_pass value. This upstream is defined in another file in the conf.d dir and matches the socket definition in the php-fpm pool config file.


upstream phpfpm {
  server unix:/var/run/php-fpm/php-fpm.sock;
}

(This setup is inspired by Mattias’ config for the Nucleus customers)

2. php-fpm pool config

Make sure these 3 lines are present in your php-fpm pool config:

pm.status_path = /status
ping.path = /ping
ping.response = pong

Normally your ACL should be fine, as requests will come from localhost.

3. Datadog config file

So nginx and php-fpm are configured, all we have left is the datadog config file:


init_config:

instances:
  # Get metrics from your FPM pool with this URL
  - status_url: http://localhost/status
    # Get a reliable service check of your FPM pool with that one
    ping_url: http://localhost/ping
    # Set the expected reply to the ping.
    ping_reply: pong

Reload nginx, php-fpm and the datadog-agent after that, and your php-fpm tracking should now work. This tracks only 1 pool; it’s up to you to figure out how to track multiple pools now :)

Useful bash one liners

Disclaimer: These are extremely simplified one liners that do not perform any form of input validation or character escaping. You should not use this in a ‘hostile’ environment where you have no idea what the input might be. You have been warned.

This is a list of some bash one liners I use on a daily basis during development and problem debugging. I made this list as a “you might not know this one yet” and will continue to update it every now and then.

Latest update: 2016/06/18.

Running a command on multiple files at once

This is a basic structure we will be re-using for the other examples. Run a command on all files in a directory:

for FILE in $(ls *); do command $FILE; done

Run a command for all lines in a file:

for LINE in $(cat file.txt); do command $LINE; done

Warning: As noted in the comments, this assumes there are no spaces in the lines of your file. If lines can contain spaces, a for loop over $(cat …) will split them; use a while read loop instead.
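A space-safe variant of that loop, using while read (printf stands in for a hypothetical real command):

```shell
# Create a sample input file whose lines contain spaces.
printf 'first line\nsecond line\n' > /tmp/lines.txt

# Read the file line by line; IFS= and -r preserve spaces and backslashes.
while IFS= read -r LINE; do
  printf 'got: %s\n' "$LINE"   # substitute your real command here
done < /tmp/lines.txt
# Prints:
# got: first line
# got: second line
```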

Make certain things easier to read

Format a local XML file with proper indenting:

xmllint --format input.xml > output.xml

Run this script on all XML files in a directory:

for FILE in $(ls *.xml); do xmllint --format $FILE -o $FILE; done

Monitor new lines at the end of a log file and colorize the output (requires the package ccze):

tail -f /var/log/syslog | ccze

Find a specific text in a lot of files

Find a text inside a list of files and output the filename when a match occurs, recursively and case-insensitively:

grep -irl "foo" .

Count the number of files in a directory:

cd dir; ls -1 | wc -l

Find a filename that contains the string “foo”:

find ./ -name "*foo*"

Find all files modified in the last 7 days:

find ./ -mtime -7

And similarly, all files that have been modified more than 7 days ago:

find ./ -mtime +7

Modify files

chmod 644 all files in the current directory and below:

find . -type f -exec chmod 644 {} \;

chmod 755 all directories in the current directory and below:

find . -type d -exec chmod 755 {} \;

Commandline JSON formatting and parsing

The JQ command is a must-have for anything that returns JSON output on the command line:

curl url | jq '.'

I use this for finding the latest snapshot in a snapshot repository for elasticsearch:

curl -s -XGET "localhost:9200/_snapshot/my_backup/_all" | jq -r '.snapshots[-1:][].snapshot'

Actions against botnets and spammers

Find a list of all bots using a guestbook script to spam a site (that sadly has no captcha). I run this on the apache access_log file:

cat access_log | grep POST | grep guestbook | awk '{print $1}' | sort | uniq > ips.txt

The ips.txt file will now contain a list of unique ip addresses I want to ban with iptables:

for IP in $(cat ips.txt ); do iptables -I INPUT -s $IP -j REJECT; done

Cleanup stuff

Delete all but the 5 most recent items in a directory. I use this in Bamboo build scripts to clean up old releases during a deployment:

ls -1 --sort=time | tail -n +6 | xargs rm -rf --

That’s all for now!