Docker – building images

Everything you do with Docker concerns containers, and containers are built from images. While there are plenty of cases where prebuilt images are useful (MySQL, for instance), most of the time you’ll want to create your own image. You can start from scratch if you wish, but most of the time you’ll want to build on top of an existing image. For example, most of the work I do starts with Microsoft’s ASP.NET Core images.

Pulling images

Images come from Docker registries, and Docker will pull them. Once an image has been pulled it is stored locally on your machine. If you are building a new container and the image is already on your machine, Docker will use it. If the image isn’t present, Docker will locate it in a registry (Docker Hub by default) and download it.

You can manually pull images using the docker pull command. This causes Docker to download an image from Docker Hub (or another configured registry) and store it locally.

docker pull jakewatkins/ancbuildenv

This will cause Docker to download my customized ASP.NET Core build image. If I update the image and you want the old one, you can add a tag to the image name:

docker pull jakewatkins/ancbuildenv:1.0

If you don’t add the tag (the stuff after the colon), Docker assumes you want the latest version (who wouldn’t want the latest and greatest? COBOL and FORTRAN programmers, that’s who).
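In other words, these two commands pull exactly the same image:

docker pull jakewatkins/ancbuildenv

docker pull jakewatkins/ancbuildenv:latest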

Building images

Building your custom image is a little more involved. You first have to create a Dockerfile, which is a script with instructions that tell Docker how to build the image. Once you have your Dockerfile you have to actually build it.

Building from your Dockerfile is done as follows:

docker build . -t jakewatkins/example1 -f Dockerfile

I jumped us forward over some boring stuff, so let me explain. The -t flag means “tag”, which is the name we are giving our image. The tag I’m using starts with my Docker account name, and after the slash is the actual name for the image. I could have just tagged the image as “example1”, but then if I wanted to push it to a registry (Docker Hub) I would have to re-tag the image with my account name anyway. I’m lazy, so I just go ahead and tag images the way I will push them in case I decide to push them. Less work, laziness preserved. The -f flag isn’t absolutely necessary if you name your Dockerfile “Dockerfile”, but I occasionally use different names. Later I’ll explain how to do multistage builds, and I will give those Dockerfiles a name like “Dockerfile-selfcontained”. If you don’t provide the -f flag, Docker looks for a file called “Dockerfile”.
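So when I get to those multistage builds later, the command will simply point at the differently named file (the image tag here is just illustrative):

docker build . -t jakewatkins/example1-selfcontained -f Dockerfile-selfcontained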

Now for the fun part: how to write your Dockerfile. Below is an example of a typical Dockerfile.

#
# ExampleApp
#
FROM microsoft/aspnetcore:1.1.2

COPY ./out /app

WORKDIR /app

EXPOSE 80/tcp

ENTRYPOINT ["dotnet", "ExampleApp.dll"]

This is easy and you generally won’t get much more complicated than this. What does it all mean?

#

The hash or pound sign is for leaving comments and telling people lies.

FROM

The FROM directive tells Docker which image you are starting from to build your image. Every Dockerfile starts with a FROM; if you build FROM scratch you start with a completely empty filesystem and should assume you will have to install everything yourself. Save yourself the time and start with a base image.

WORKDIR

WORKDIR tells Docker where you are working. If the directory you specify doesn’t exist, it will be created. You can think of WORKDIR as doing a mkdir and cd into the directory you want. That directory becomes the working directory for everything that follows.

COPY

COPY copies files from your local file system into the image’s filesystem. If the destination directory doesn’t exist, it will be created for you.
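One detail worth knowing: a relative destination resolves against the current WORKDIR. As a quick sketch:

# absolute destination - WORKDIR doesn't matter
COPY ./out /app

# relative destination - resolves against WORKDIR, so this pair is equivalent
WORKDIR /app
COPY ./out .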

RUN

RUN executes commands inside the image while it is being built. A common RUN line in a Dockerfile is

RUN apt-get update

This refreshes the package index inside the image so that anything you install afterwards comes from the latest package lists.
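In practice you will almost always chain the update with an install in the same RUN instruction, so both execute in a single layer and a cached update can’t go stale; this is exactly what my build image later in this post does:

RUN apt-get update && apt-get install -y git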

VOLUME

VOLUME lets you specify mount points where Docker can attach persistent storage to your container. A container’s writable layer is throwaway: when the container is removed, or if it crashes, any changes made inside it are lost. If you are running a database server in a container and you delete the container, any data in the database goes with it. Using persistent volumes gives the container a durable place to store data.
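As a sketch, a database image might declare its data directory as a volume, and you would attach a named volume when running the container (the names here are illustrative):

# in the Dockerfile
VOLUME /var/lib/mysql

# at run time, attach the named volume "mydata" to that mount point
docker run -v mydata:/var/lib/mysql mysql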

EXPOSE

EXPOSE tells Docker which ports in the container should be exposed so network traffic from the host computer can be routed to the container. The -p flag of the docker run command (on the command line, not in the Dockerfile) lets you specify how to route the traffic.
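For example, to route traffic from port 3000 on the host to port 80 in a container built from the image above:

docker run -p 3000:80 jakewatkins/example1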

ENTRYPOINT

ENTRYPOINT tells Docker what should be run when the container is started. In the example, we are running dotnet and telling it to use ExampleApp.dll.
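One note: the square-bracket syntax in the example is the “exec form”, which runs dotnet directly as the container’s main process; there is also a “shell form” that runs the command under /bin/sh -c. The exec form is generally preferred because the process receives shutdown signals directly:

# exec form (preferred)
ENTRYPOINT ["dotnet", "ExampleApp.dll"]

# shell form
ENTRYPOINT dotnet ExampleApp.dll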

There is more you can do in a Dockerfile, but this will get you started and covers 80% of what you need. You can ship really useful images using just the information above. Keep going, though, because a little more knowledge will help make your life easier (i.e. help you be as lazy as possible too).

All of that said, I do recommend spending some quality time reading the Dockerfile reference and Best practices for writing Dockerfiles. It is time well spent.

Multistage builds

The previous example demonstrated a typical Docker build. You start with an image, copy some files in, set a few configuration options, and call it a day. What I don’t like about this is that you have to first compile your application on your workstation and then copy the output into the image. What if your machine was updated recently but the base image you are working with is using an older version? What if one of the people on your team does it a little differently? You’ll end up working harder trying to figure out why the application works in one environment but not another. We’re back to “it works on my machine”. Multistage builds let us do away with that, and this is one of the reasons why I have my custom build image. I added the git client to Microsoft’s image, so when I build an image my Dockerfile actually pulls the source code from GitHub, builds it inside the Microsoft ASP.NET Core build image, and then copies the output into the ASP.NET Core image, which serves as the actual runtime image we’ll push to other environments. Here is an example where I’m building a sample application whose source code is pulled from GitHub during the build process.

#
# ancSample
#
# stage 1 - build the solution
FROM jakewatkins/ancbuildenv AS builder

WORKDIR /source

# Pull source code from the Git repository
RUN git clone https://github.com/jakewatkins/ancSample.git /source

# restore the solution's packages
RUN dotnet restore

# build the solution
RUN dotnet publish --output /app/ --configuration Release

# stage 2 - build the container image
FROM microsoft/aspnetcore:1.1.2

WORKDIR /app
COPY --from=builder /app .

EXPOSE 80/tcp
# Set the image entry point
ENTRYPOINT ["dotnet", "/app/ancSample.dll"]

You can download the entire project from my GitHub here: https://github.com/jakewatkins/ancSample

Notice in stage 1 that the FROM statement has an AS at the end. That name, builder, is used in the COPY --from statement in stage 2, telling Docker where to find the files we want to copy. The other thing to notice is that there are a lot more RUN statements here. There are a few tricks that could be added to help optimize our image size; for example, it would probably be good to chain the two dotnet statements together using the shell “&&” operator. However, I’m also learning this stuff, so bear with me.
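For what it’s worth, that chained version would be a one-line change:

RUN dotnet restore && dotnet publish --output /app/ --configuration Release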

There are other tricks you can use in your Dockerfile. For example, you can parameterize Dockerfiles with build arguments. I plan to refactor the above Dockerfile so I can pass in the version of ASP.NET Core that I want to use and the URL of the GitHub repository. That way I won’t have to write a new Dockerfile for each project I start.
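As a rough sketch of what I have in mind, the ARG instruction gives a Dockerfile build-time parameters with overridable defaults (the argument names below are hypothetical, and the ENTRYPOINT would still need to match the project being built):

# arguments declared before the first FROM are available to FROM lines
ARG ASPNET_VERSION=1.1.2

FROM jakewatkins/ancbuildenv AS builder
# arguments used inside a stage must be (re)declared in that stage
ARG REPO=https://github.com/jakewatkins/ancSample.git
WORKDIR /source
RUN git clone $REPO /source
RUN dotnet restore && dotnet publish --output /app/ --configuration Release

FROM microsoft/aspnetcore:${ASPNET_VERSION}
WORKDIR /app
COPY --from=builder /app .
EXPOSE 80/tcp
ENTRYPOINT ["dotnet", "/app/ancSample.dll"]

The build command would then look something like:

docker build . -t jakewatkins/ancsample --build-arg REPO=https://github.com/jakewatkins/someOtherProject.git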

Pushing images

Now that we have our image, we will want to push it to a registry. I recommend that you create an account on Docker Hub to store images. The only downside of the free Docker Hub account is that you can only have one private repository.

To push an image to Docker Hub, you first create the repository on their website. The cleverly hidden blue button on the top right will get the job done for you. Name the repository to match the tag you used to create your image: your account name, followed by a slash, and then the image name. Like this:

jakewatkins/ancsample

Image names must be all lowercase, but you can use dashes and underscores to keep them readable. If you didn’t tag your image during the build process, you’ll have to do it now. If you gave your image the name testimage, your account name is ‘spacecommando’, and you want to name the image in the registry ‘mysuperimage’, the command looks like this:

docker tag testimage spacecommando/mysuperimage

Once you have your image tagged correctly (and you’ve logged in with docker login), you can push it:

docker push spacecommando/mysuperimage

You can hit refresh and see your image on Docker Hub.

How do I set up my own registry for images?

You can set up your own registry; Docker provides a container image to do it, and all you do is run it! However, you will want to do some configuration to set up persistent storage.
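The gist from Docker’s documentation looks like this: run the registry image with a host directory mounted where the registry keeps its data (the host path here is just an example):

docker run -d -p 5000:5000 --restart=always --name registry -v /mnt/registry:/var/lib/registry registry:2

Images are then tagged and pushed against localhost:5000 instead of a Docker Hub account name:

docker tag testimage localhost:5000/mysuperimage
docker push localhost:5000/mysuperimage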

You can read about it here: Deploy a registry server

Their instructions are good and I’m too lazy to write a different version of them.

Setting up my own base images

As I’ve already stated, I’ve started creating my own base images to make it easier to get work done. You should do the same. In my case, all I did was take Microsoft’s image and add the git client to it. I can see adding other packages down the road (npm and bower, to name a few), but for now the image does what I want.

The Dockerfile looks like this:

#
# Jake's ASP.NET Core Build image
# This image starts with Microsoft's ASP.NET Core Build image, which already
# has the .NET tools pre-loaded so projects can be built inside the image
# build process.  This creates an environment where everybody builds the
# project the same way.  On top of this, git has been added so the build
# process can pull the project source code from a git repository, further
# decoupling developer workstations from the build process.
# This also allows the Dockerfile to be used directly by OpenShift in a
# CI/CD pipeline.
#

FROM microsoft/aspnetcore-build

# Add git to the image

RUN apt-get update && apt-get install -y git

#

That’s it. The comment header is longer than the actual build script! If you’re as lazy as me, you can grab this from my GitHub here: https://github.com/jakewatkins/ancbuildenv
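Building and tagging it is a one-liner, run from the directory containing the Dockerfile:

docker build . -t jakewatkins/ancbuildenv

Once built (or pulled), the image can be named in a FROM line exactly as the multistage example above does.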

With this image you can set up your development workflow so that once you’re satisfied with your code, you push it to Git and then kick off a build and run tests. In a future post we’ll use this to set up a CI/CD pipeline in different environments (I want to do OpenShift first).

Conclusion

I’ve covered the barest sliver of what you can do with a Dockerfile and Docker images. In my next post I’ll take this a step further, start building an actual application, and introduce docker-compose to orchestrate containers so they can work together.

Containerization 101 – OpenShift

What is OpenShift?

OpenShift builds on top of Docker by providing tools for orchestrating containers, scaling applications, and managing containers. Web-scale applications become very complex, and even with the efficiencies of containers, additional hardware will be needed for scaling. OpenShift helps make it possible to scale containers across multiple hosts. OpenShift also provides a nice CI/CD system whereby each time you commit code to a git repository, OpenShift performs a build and deployment cycle for your application.

What makes OpenShift important?

Docker provides a great tool for an individual developer working in isolation. OpenShift provides additional capabilities that make it easier for a team to work together. Additionally, Docker doesn’t provide much in the way of management tooling to make it ready for a production environment. OpenShift fills that gap.

Over the next few weeks I have several posts about Docker and OpenShift planned. My goal in writing these is to lock in what I’ve learned. Along the way I’m building a lot of sample applications, POCs, and demos.

Containerization 101 – Docker

What are containers?

In very simple terms, containers are a mechanism for deploying applications. A container holds an application’s files, libraries, and the other dependencies it needs to execute. Containers isolate applications so that they cannot interfere with other applications, and cannot be interfered with in turn.

Containers can be thought of as lightweight virtual machines. A virtual machine is a complete deployment of a computer: it sits on a hypervisor or virtualization layer that simulates a real computer, runs a complete operating system, and then runs its applications on top. A container, on the other hand, only has the resources necessary to run the application it is hosting; it uses the host system’s operating system and other resources. By sharing the operating system, containers are much less resource intensive and start more quickly.

Docker is the main container service in use. It works on all the major operating systems and supports both Linux and Windows containers. However, Linux containers are more mature and have broader community support. Docker containers are also supported by major cloud services such as Azure and AWS.

In addition to isolating applications running on a host computer, Docker also provides software defined networks so containers running on a host can communicate without being exposed to the wider network. Docker also provides persistent storage for data created by containers. This makes it possible to host applications like database servers in containers.

What is important about containers?

Containers are important because they are less resource intensive than virtual machines. As such, they start up faster and more of them can be hosted on a single host. It is not uncommon for Docker itself to be run inside a virtual machine.

Because containers use fewer resources it is easier to have the same execution environment at all stages. The containers that are run in development are the same containers that are used for testing and then released to staging and finally production. It is realistic to create an environment where “it works on my machine” means it works everywhere.

Containers also make it possible to migrate an application to the cloud without significant effort. As previously discussed, a developer creates and works with container images in the development environment. At some point the container images are ‘lifted’ to QA, where testing is performed to identify defects. Later the images are lifted again to staging and eventually production. An organization could initially choose to run the application in its own data center and later decide to lift the application to the cloud. If the application is self-contained in its own group, or swarm, of containers, then it won’t even require configuration changes when it’s lifted to the cloud. If the application uses external resources, the organization’s SAP system for instance, then things like network access will have to be configured.

What is docker?

Docker is an open source project that has standardized containers. The idea of containers is not new; it goes back to how mainframes worked. Linux introduced containers, but didn’t offer an easy way to create them. That is where Docker came in.

In Docker, containers are created from images. An image is a representation of a file system (like a zip file) containing only the files specific to the application that will run in the container. If the application needs files in /bin/ and /tmp/wwwroot, the image holds just those files. The container is a running instance of the image. A single Docker host can run as many containers as it has memory and CPU to handle.

Going beyond containers, Docker offers stacks, which are groups of related containers. For instance, you can have one container running a web-site and another running a database server. The stack gives Docker the information it needs to deploy the containers together so that they work together; in addition to the image information, it provides networking and volume information.
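To make that concrete, here is a hypothetical docker-compose.yml sketch for the web-site-plus-database stack just described (image names, ports, and the password are all illustrative):

version: "3"
services:
  web:
    image: jakewatkins/ancsample
    ports:
      - "3000:80"
    depends_on:
      - db
  db:
    image: mysql
    environment:
      MYSQL_ROOT_PASSWORD: example
    volumes:
      - dbdata:/var/lib/mysql
volumes:
  dbdata: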

Docker also provides the tools needed to build container images from scratch or from other container images. As an example, as a Microsoft stack developer I use ASP.NET Core to build applications. Microsoft provides an image for compiling ASP.NET Core applications inside a container, and I’ve built my own custom version of that image which adds a few additional tools I use as part of my build process.

Failing to mention Docker Hub would be a mistake. As mentioned, I have created my own image based on Microsoft’s image; the Microsoft image is distributed through Docker Hub. You can search the hub and find thousands of images contributed by companies and individual developers. You can sign up for your own account free of charge and contribute images to the community as well.


Investing in yourself

My focus, from a technology perspective, is on containerization (Docker & OpenShift). I believe the best opportunities come from combining a technology with something else. It’s like the start-up pitch formula “Uber for X”, where you get “Uber for pets” or something like that (hopefully more useful). In my case, the other thing I’m looking at is blockchain.

I realize “the blockchain” is a technology, but it really plays into the business side of things. How the blazes are we going to do this stuff, and what direction should I go? If this were 2009 it would be all about Bitcoin mining, but that ship has sailed; unless you want to invest a million dollars in a mining operation, I think you’re wasting your time. The applications of the blockchain outside of money are where the action is going to be. What new businesses can we enable because of the blockchain? Can I change my own business as a free-lance software developer because of the blockchain?

One thing about this does give me pause. I read Satoshi Nakamoto’s paper “Bitcoin: A Peer-to-Peer Electronic Cash System” and I’m a bit dizzy. How the blazes did that unintelligible bit of writing create all of this? The paper is far from clear and leaves out a lot of critical information.

This is why I call it investing: it is not without risk. I’m investing my time and effort into understanding this with the expectation that I’ll profit. A clearer, more easily understood paper would mean less effort on my part. Instead I’ll have to do more research. That increases risk and eats up more time…

Docker Cheat Sheet

The Docker cheat sheet is just a list of commands that I’ve found useful when working with Docker. It is provided with little in the way of explanation or instruction; I’ll provide the details in articles that follow.


Build an image

docker build . -t [accountname/imagename] -f [dockerfile name]

Example

docker build . -t jakewatkins/exampleapp0831 -f dockerfile.standalone


Create a container

docker create -p [port mapping] --name [container name] [image name]

Example

docker create -p 3000:80 --name testcontainer jakewatkins/exampleapp0831


Start a container

docker start [container name]

Example

docker start testcontainer


Just run the container

docker run -p [port mapping] --name [container name] [image name]

Example

docker run -p 3000:80 --name testcontainer jakewatkins/exampleapp0831


Get the logs from a running container

docker logs [container name]

Example

docker logs testcontainer


Run an image, give me a shell and then remove the container when I exit:

docker run -it --rm [image name]

Example

docker run -it --rm jakewatkins/example0831

Containerization

I’m working on a series of articles about Containerization. For the past few months I’ve been having a blast playing with Docker and OpenShift. They’re very cool technologies but the documentation around them is rather mixed. With this series I hope to provide a clear direction for other people adopting this technology and flatten out the learning curve as much as possible.

Everything I write will be from my point of view and hands-on experience. This means that all work is from the point of view of a Microsoft-centric developer. My workstation runs Windows 10 Pro, I use Visual Studio 2017, and when coding I target .NET (.NET Core in this case).

I think this will be valuable because most of the voices in this space come from the Open Source community. However, today the distinction between being a Microsoft developer and an Open Source developer is nearly meaningless. I write Node.js and use a lot of Open Source tooling. I even run Linux (RHEL). So perhaps my earlier Microsoft warning is meaningless.

Regardless – if you have any questions, please let me know and I’ll do my best to get them answered.

Improving my blog

I’m making an effort to post more regularly. However, before I really get going, I need to clean up a few things in the posts I’m creating. In particular, I want to stop posting pictures of source code. It drives me nuts: on one hand it looks good, but you, my reader, can’t do anything with it. To work with the code you have to download it from my GitHub repository.

To fix this I’m playing around to see what I need to do. The first thing I found is that I can wrap code in [code] … [/code] tags, which does most of the work. However, I use Microsoft Word to compose my posts, and if I paste source code into Word and use the [code] tags you get something like this:


<span style="color:blue;font-family:Consolas;font-size:9pt;">public<span style="color:black;">
				<span style="color:blue;">static<span style="color:black;">
						<span style="color:blue;">void<span style="color:black;"> Main(<span style="color:blue;">string<span style="color:black;">[] args)
</span></span></span></span></span></span></span></span>

<span style="color:black;font-family:Consolas;font-size:9pt;">        {
</span>

<span style="color:black;font-family:Consolas;font-size:9pt;">
			<span style="color:blue;">var<span style="color:black;"> host = <span style="color:blue;">new<span style="color:black;">
							<span style="color:#2b91af;">WebHostBuilder<span style="color:black;">()
</span></span></span></span></span></span></span>

<span style="color:black;font-family:Consolas;font-size:9pt;">                .UseKestrel()
</span>

<span style="color:black;font-family:Consolas;font-size:9pt;">                .UseContentRoot(<span style="color:#2b91af;">Directory<span style="color:black;">.GetCurrentDirectory())
</span></span></span>

<span style="color:black;font-family:Consolas;font-size:9pt;">                .UseStartup<<span style="color:#2b91af;">Startup<span style="color:black;">>()
</span></span></span>

<span style="color:black;font-family:Consolas;font-size:9pt;">                .UseApplicationInsights()
</span>

<span style="color:black;font-family:Consolas;font-size:9pt;">                .Build();
</span>

<span style="color:black;font-family:Consolas;font-size:9pt;">            host.Run();
</span>

<span style="color:black;font-family:Consolas;font-size:9pt;">        }</span>

You’re better off if I just post a picture. So this means I’ll have a slightly more complex workflow than I want, but it will produce a better-quality product. Eventually I’ll figure out a way to automate it. What I plan to do is just leave annotations in my posts. They’ll look like this:

[code language=”csharp”]

CoreTestApp/program.cs:12-22

[/code]

I’ll post the article to the drafts folder on WordPress and then add the source code manually in WordPress’s editor. Done that way, the code looks like this:

public static void Main(string[] args)
{
    var host = new WebHostBuilder()
        .UseKestrel()
        .UseContentRoot(Directory.GetCurrentDirectory())
        .UseStartup<Startup>()
        .UseApplicationInsights()
        .Build();

    host.Run();
}

Hopefully this will immediately yield a better article for my readers. Later I’ll figure out how to automate the process so I can publish in one step without any manual intervention.