How to automate the right tasks so you can focus on development
Interviewed by Christophe Limpalair on 10/05/2015
From simplifying your life to improving your onboarding process, automation can make a big difference. If there are tasks you do over and over again and haven't automated yet, this episode is for you. If you have sticky notes with commands or files with sequences of commands, this episode is going to make your life better.
Speaking of making your life better...have you signed up for your free account with Rollbar yet?
Links and Resources
We, as developers, repeat many of the same tasks over and over again. How often do you see people (or yourself) doing this?
All the time. "I think that's the story of being a software engineer."
Kate goes on to say that most of us have lists somewhere-- on sticky notes or text files, we like to write down commands and sequences of commands. You'll go in those files to copy/paste, and that's your process. But at a certain point, you start realizing that this process is terrible.
I've been there and I've done that, so I agree.
Before we move on, can you tell us a little bit more about yourself and what you do?
Kate Heddleston is a Software Engineer in San Francisco, where she does web application development. She calls herself a product or feature engineer because she likes to build things that touch users & customers. This has resulted in her having to build infrastructure environments to support her work.
Kate is currently working on her own startup. She's building everything herself which requires an investment of time upfront, but then gives her the flexibility she is looking for.
You don't have time to worry about deploying when you need to ship features. For this reason, Kate makes sure she has great tools at her disposal, and the automation example we will talk about later in the episode will show you how she's been able to create One-Click deploys and One-Click test environments.
Are you working on your own product, or other companies' products?
Her own product. It's not out on the market yet. She should have it open for a private beta soon, and it has to do with makeup.
Did you start out with an interest in infrastructure, or did you start working on it out of necessity?
Kate got a masters in Computer Science, or Human Computer Interaction to be more specific, but she was never interested in computing for computing's sake.
She likes to build things that have a human element to them. DevOps wasn't interesting to her at first. It only became interesting as she started running into problems. Kate goes on to say that this ironically turns DevOps into an area where you are dealing with human problems.
You gave a talk on One-Click Deploys. We're going to take it a step further, but can you explain how it works?
The One-Click deploy system is part of that process of making deployments as easy as possible so Kate can focus on writing features. The system she demonstrated uses Docker and AWS. Of course, you can build these kinds of systems with other technologies, the implementation will just be different.
Here's how the process goes:
1) Push to GitHub
2) A container build is triggered for Docker
3) The containers are deployed to servers
Docker containers really reduce the amount of things that you are doing out on servers and in production to deploy code, because the build happens ahead of time. You can take the container, put it on the server, kill the old container, and run the new container. It also makes it really easy to roll back.
The AWS Boto library SDK for Python lets you interact with Docker and servers via Fabric. "It was actually pretty simple to automate this whole process, so that I could then not just deploy code to servers, but I could also create servers and load balancers, and spin them up and configure them. All of this through web applications."
Here are more Python related tools and docs for AWS.
Of course, there are libraries for other languages if you're not working in Python.
How is the One-Click Test Environment different?
The deployment is just the process of getting code out to your servers. You could have one server or many different servers. One server is a very small part of the system. If you go up a layer of abstraction, she has a service oriented architecture with an instance inside of a service. If you want to test code, you don't usually want to test a single instance but instead the overall service.
In order to do that, you need to spin up the entire service. Instead of having this environment running constantly, she developed a way of quickly and easily deploying a new environment.
If you go out even another layer, you have these internal services. But what if you want to test all of these services together? You can spin them all up in one environment.
How do you even decide what to automate in the first place? When does it make sense?
For her, it's usually a natural progression. Like we talked about earlier, a lot of people start out with sticky notes or files to store common commands or sequences. She'll start by writing scripts if there's something she's doing a lot.
Eventually it might become cumbersome because you have more and more scripts. This is the point where you can ask yourself: "is there a better way to automate this?"
Tooling today is incredible. Automation is much easier.
Another great point Kate made is regarding context switching. This is costly to a developer, having to switch back and forth between infrastructure and code. Instead of doing that, can you not automate whatever it is you that is forcing you to context switch?
One of Kate's blog posts "Onboarding and the Cost of Team Debt" talks about training new employees. Having an automated and standardized process can really help with this process. Instead of having many complicated steps that could easily overwhelm a new employee, having an automated system can reduce the fear of breaking things.
So then you need to answer the important question: "how do I automate this?"
What were your first steps in automating this?
She tackled this one as a web application first. She jokes that it is a slightly more powerful version of Heroku because you can control each of the services and how they interact with each other. With Heroku you have single apps, and that's too much of an abstraction for her.
She wanted something that specifically allowed her to create services.
So she started mapping it out as a web app. She's built a lot of apps with Flask and using AWS's Boto made it relatively simple.
So you think of the final product then reverse engineer and break it down in smaller steps?
Thanks to Rollbar for sponsoring this episode
At the beginning of the episode, I mentioned my sponsor Rollbar. And I'm really glad to give them a shout-out, because my mission with ScaleYourCode is to help make developer's lives better and easier. That's exactly what Rollbar does. By tracking errors for you, it let's you focus on what you love--creating new and better features instead of wasting hours debugging. This is not just some sales pitch that's going to lead you to disappointment. This is me telling you about a great product that I personally use. Thank you for sponsoring the show, and thank you for helping realize my mission.
Go to http://try.rollbar.com/syc for a free trial
Do you create a new subnet for your test environment, or an entirely new VPC?
This is totally up to you, but she uses the same VPC and subnets to make it easier.
Her to-do list includes adding more flexibility with this, especially with different regions, but that hasn't been a major concern up-front.
So in AWS you have a Virtual Private Cloud (VPC).
VPCs have a number of benefits:
- Assign static private IP addresses to instances that persist even when you start/stop them
- Define and attach network interfaces to instances.
- Change security groups on the fly
- Control inbound/outbound traffic to instances
You have a range of IP addresses and inside of that you can create subnets. Subnets can allow traffic from the outside world, or only allow internal communication.
This enables you to have public-facing load balancers that pass on requests to your private subnets, which includes webservers, app servers, data stores, etc... Or you can set it up in another way if you need to.
If you're interested in learning more about the difference between public/private subnets, here's a great explanation. Because you can have secure public subnets.
How do you chain events? With creating a new VPC, for example, how do your other commands wait for AWS's response and store the needed information that gets returned?
There are a couple of ways you can do it. One way is to send tasks to an asynchronous queue.
You want to create a new VPC for example. You send the command to create the VPC and hold up all of the other commands until AWS returns a response. You know when AWS returns a response by having something check after a certain amount of time. If it's not yet ready, tell your process to check again in 30 seconds.
Once it returns a ready response, you can go forward with the other tasks.
One problem with this approach is that Amazon sometimes tells you that an instance is ready to go before it's truly ready. You have to account for that.
What do you use for queues?
Python-rq. It's a Python library that uses Redis in the background.
RabbitMQ and Celery are also popular options in the Python world.
When does it make sense to create VPCs instead of just regular EC2?
Normally you just create a VPC once. For test environments you probably don't need to create a new VPC, you can just use your existing one.
Same thing with subnets. You don't really think about them once they are created. When you create a service, you just specify whether it is internal or external, and most of the heavy lifting is done for you.
Once you have it set up it's pretty easy to manage and automate.
What if you want to upload something to an instance in a private subnet?
Since you can't just access an instance in a private subnet (it doesn't have a public IP, as explained here), you have to set up a NAT instance.
This NAT instance sits in your public subnet with a public IP address. Any time you send/receive from/to a private instance, it goes through the NAT.
Once our VPC is created, how do our scripts know which instance to upload our different configurations to?
Configuring instances with load balancers in AWS is really easy.
Instances have a lot of information in AWS...they don't just have IP addresses, they also have instance IDs and a ton of other information. So instead of using IP addresses to identify instances, you use instance IDs.
Kate stores a list of all her instances in a data store, and feeds them to her load balancers.
So if we have a service which has a load balancer and multiple instances, and you create a new instance, it's not automatically updated in that load balancer. There's an extra step where you tell it to activate a certain instance.
The load balancer will do a health check on each instance. You can specify how often each health check happens and where they hit.
This is also nice because if there's something wrong with the instance you don't have to kill it, you can simply deactivate it and look at what's going on.
You've used Chef in the past. How different is it with Docker?
Chef is a tool that uses Ruby scripts, and these scripts are deployed to a production service to perform their purpose. They'll install things, configure things, they'll pull down your code, etc...
Docker on the other hand is a container. It's set up before it even goes to production by using a Dockerfile. This Dockerfile has a list of Linux commands that install certain things or perform certain actions. This results in you having a container with all of your files and services to the configuration you requested. So now you can put the container on your instance and all you need to use it is to run it.
(If you'd like to learn more, I interviewed James Turnbull who advises Docker)
Kate points out that this results in fewer actions performed in production, reducing the chance of mistakes.
You can still use Chef with Docker, though.
If you're interested in learning more about Chef, check out my episodes with Nathen Harvey, who is on the Chef team. (First, second)
Do you use Chef for your containers?
Not currently, but that's because her configuration is pretty simple. She sticks to Fabric commands because she only has a handful of them.
How do you get code to your containers?
Kate mentions that she is not a Docker expert, and so she may not follow best practices for Docker, but she currently bakes the code in her container. That means she just has to ship a new container with every deploy.
That's not the only way to do it. You could also mount files as you start the containers.
Can we see examples of your code for these apps?
Yes, Kate has an example of a Python/Flask app with a Dockerfile on GitHub.
How did you test this?
By spinning up the system and trying it out.
You can also, of course, have Python tests (or whatever language you're using). You can also use bug tracking services like Rollbar.
Really it works just like a web application, so you test it just like a web app.
What do you use to monitor?
Sentry for bug tracking.
How can people get in touch?
How did this interview help you?
If you learned anything from this interview, please thank our guest for their time.
Oh, and help your followers learn by clicking the Tweet button below :)