# Backup your Postgres database like a pro

Nowadays, our applications are hosted in the cloud using services like AWS and DigitalOcean. Instead of going for a `database-as-a-service`, where everything is managed and controlled by the service provider, we might choose to do the opposite i.e. using our own cloud server instance due to cost or several other constraints, since everything is done our way, let's not forget to do the most crucial part as well i.e periodic database backups of your client data. So in this post, we will discuss how we could backup our Postgres database hosted in `AWS EC2` and store the database dumps in the `AWS S3`.

Note: I am assuming you already have a working `AWS account` with access permission to `EC2 Instance` & `S3 Bucket`. I'm also assuming that you have hosted your Postgres database in one of your EC2 Instances.

## Creating S3 bucket and policy

To create the bucket and policy, log into the AWS console and go to S3.

- In the `Create Bucket` section of the page, add a name for your database backup. You can use something simple like `my-db-backup` or something more descriptive like `mydatabasebackup`.

- Under `Permissions`, make sure that both of these are checked: `Access logs` (turned on by default) and `Website endpoint` (also turned on by default).

- Click `Create` at the bottom of this page to finish creating your bucket!

## Setting up the roles and permissions

Create a new user with the permission to only upload to the bucket we just created in the step above, for that follow the instructions given below:

- Inside the `AWS console`, search for `IAM` and click on `Roles`
- Next, click on the `create role` button on the top right corner of the console and select `S3` under `Use case` from the list of dropdowns.
- Again, click on the `create policy` button and then click on the `JSON` tab and paste the code given below:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::<bucket>/<object>/*"
    }
  ]
}
```
NOTE: Replace `<bucket>` with the bucket you created for your database backups, example: `my-db-backup`, and then replace `<object>` with the folder you would like those dumps to be uploaded on, example: `backups`

- Skip to the next part by clicking on the `Next` buttons until you reach a page named `Review Policy` and give your policy a name. If you messed up the process in between you could [follow this article](https://medium.com/analytics-vidhya/how-to-create-iam-roles-in-aws-311940e6ebe3) otherwise, skip to the next step.
- Save the `ACCESS KEY` & `SECRET KEY` provided for the user you just created in the step above

## Installing dependencies

This article is based on a [repository](https://github.com/sidbelbase/backup-my-postgres) hosted on github, please clone and check out the script and requirements in the `requirements.txt` file.

Since we are using python for backing up our database, we will be needing to install dependencies like `boto` (AWS SDK for python) inside our `EC2 Instance`. For that follow the steps given below:

- [SSH to your EC2 Instance](https://99robots.com/how-to-ssh-to-ec2-instance-on-aws/) using this command:
`ssh -i path/to/AccessKey.pem ubuntu@ec2-12-34-567-890.compute-1.amazonaws.com`
- Install python dependencies using `pip3 install -r requirements.txt`
- Rename `.sample.env` to `.env` and add the following credentials

```bash
S3_BUCKET=<bucket>
DATABASE_USERNAME=<user>
AWS_ACCESS_KEY_ID=<access-key>
AWS_SECRET_ACCESS_KEY=<secret-key>
```

## Running script manually

After completing all the steps defined above, you finally be able to run the actual script.
Go ahead and run the script by going inside the root of this cloned project:
```bash
python3 run.py
``` 

## Running script like a pro

The full purpose to write this well-guided article (lol) is to do this thing like a professional like it was supposed to. There are numerous ways to do it, it's all fine if you want to go the extra mile, like using services like `Airflow`, `Github Actions`, `PipeDream`, etc but we will be using `cronjobs` this time.

- Run the following command to set up the cron job:

  ```bash
  sudo service cron start (if not running)
  crontab -e
  ```

- Now paste this cron to run the script every night at 02:00 AM NEPALI TIME (UTC+5:45)

  ```bash
  0 22 * * * python3 /home/ubuntu/backup-my-postgres/run.py
  ```

NOTE: I'm using `my native timezone`, please [change that to your respective timezone](https://crontab.guru/) if you care about when and at what time it runs a backup. 

That's it. No further steps. That is how you run your backup script. 

<hr>

The main motivation behind writing this article is to help another internet fellow who might get overwhelmed by libraries like `minio`, `s3cmd`, and `awscli`. Those tools could deliver greater results but needs various bells & whistles to configure. I also tried to pull this guide using `bash` but it didn't feel like home, hence the python. 

I hope you guys found this article useful. Please don't share or recommend this article to your parents or relatives who might go `wtf, why me?` but to those who might actually need it & will remember you for this article their whole life. [Drop a star in my repo](https://github.com/sidbelbase/backup-my-postgres) (it's free) if you feel like it ;). 

<center><h3>Written with ❤️ in Nepal</h3></center>