Zero downtime deployment for Java apps

https://stackoverflow.com/questions/4462042

10-10-2019

Question

I am trying to build a very lightweight solution for zero-downtime deployment of Java apps. For the sake of simplicity, let's assume we have two servers. My solution is to use:

On the "front": a software load balancer. I am thinking of HAProxy here.
On the "back": two servers, both running Tomcat with the application deployed.

When we are about to deploy a new release:

We disable one of the servers in HAProxy, so only one server (let's call it server A, which is running the old release) is available.
Deploy the new release on the other server (let's call it server B), run production unit tests (in case we have them :-)), and enable server B in HAProxy, disabling server A at the same time.
Now we again have only one active server (server B, with the new release). Deploy the new release on server A and re-enable it.
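
The steps above can be sketched as a small plan generator. This is only a sketch, not a ready-made tool: the backend name app_backend, the server names, and the "deploy" and "smoke-test" step labels are hypothetical placeholders. The only real commands are "disable server" and "enable server", which are HAProxy Runtime API commands that would be sent over its admin socket. Note that this loop re-enables one server before disabling the next, so both releases briefly serve traffic together, whereas the steps above switch A and B at the same time.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: builds the ordered steps of a rolling deploy, one server at a time.
// Nothing is executed here; the strings describe what a script would do.
class RollingDeployPlan {

    static List<String> plan(String backend, List<String> servers, String release) {
        List<String> steps = new ArrayList<>();
        for (String server : servers) {
            // Take the server out of rotation before touching it.
            steps.add("haproxy: disable server " + backend + "/" + server);
            // Hypothetical labels for the deploy and verification steps.
            steps.add("deploy " + release + " to " + server);
            steps.add("smoke-test " + server);
            // Put the upgraded server back into rotation.
            steps.add("haproxy: enable server " + backend + "/" + server);
        }
        return steps;
    }

    public static void main(String[] args) {
        // Server B is upgraded first, as in the question.
        plan("app_backend", List.of("serverB", "serverA"), "release-2")
                .forEach(System.out::println);
    }
}
```

Generating the plan separately from running it makes the sequence easy to review and to dry-run before it is wired to a real HAProxy socket.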

Any advice on how to improve this? How can it be automated?

Are there any ready-made solutions, or will I have to end up writing my own custom scripts?

Thanks!

Solution

Rolling upgrade is indeed a good solution, provided your load balancer supports this option (server starvation). Another solution is to use OSGi-enabled application servers to hot-replace part or all of your application.

I would recommend the first one. SpringSource's AMS supervision console can take down a cluster of tcServer instances (a custom Tomcat on steroids) and, IIRC, do the rolling upgrade automatically (but check the docs).

Other tips

I have found some interesting solutions in this article about zero downtime. I would like to highlight only a few of them.

1. A/B switch (rolling upgrade + fallback mechanism):

We should have a set of nodes in standby mode. We deploy the new version to those nodes and switch the traffic to them instantly. If we keep the old nodes in their original state, we can do an instant rollback as well. A load balancer fronts the application and is responsible for this switch upon request.

Cons: if you need X servers to run your application, you need 2X servers with this approach.
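
The switch and the instant rollback can be illustrated with a toy model. This is a minimal sketch under stated assumptions: the class name AbSwitch and the node names are hypothetical, and in practice the load balancer itself holds this state rather than application code.

```java
import java.util.List;

// Toy model of an A/B switch: two node pools, only one of which receives traffic.
class AbSwitch {
    private List<String> active;
    private List<String> standby;

    AbSwitch(List<String> active, List<String> standby) {
        this.active = active;
        this.standby = standby;
    }

    // The pool that currently receives traffic.
    synchronized List<String> activeNodes() {
        return active;
    }

    // Swap the pools. The old pool is kept untouched as standby,
    // so calling flip() a second time is the instant rollback.
    synchronized void flip() {
        List<String> previous = active;
        active = standby;
        standby = previous;
    }

    public static void main(String[] args) {
        AbSwitch lb = new AbSwitch(List.of("old-1", "old-2"), List.of("new-1", "new-2"));
        lb.flip();
        System.out.println("after switch:   " + lb.activeNodes());
        lb.flip();
        System.out.println("after rollback: " + lb.activeNodes());
    }
}
```

Keeping both pools alive is what makes the rollback instant, and it is also the reason for the 2X server cost noted above.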

2. Zero downtime

With this approach, we don’t keep a set of machines; rather, we delay the port binding. Shared resource acquisition is delayed until the application starts up. The ports are switched after the application starts, and the old version is also kept running (without an access point) to roll back instantly if needed.

3. Parallel deployment with Apache Tomcat (for web applications only):

Apache Tomcat added the parallel deployment feature in version 7. It lets two versions of the application run at the same time and treats the latest version as the default.
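
Tomcat picks up the version from the WAR file name, using the form context##version.war, and orders versions by string comparison. A minimal sketch of a naming helper follows; the context name "shop" and the three-digit padding are assumptions made for illustration.

```java
import java.util.Locale;

// Sketch: builds WAR file names for Tomcat parallel deployment.
class ParallelDeployNames {

    // "shop##002.war" is served at context path /shop as version "002".
    static String warName(String context, int version) {
        // Zero-pad so that string ordering matches numeric ordering ("002" < "010").
        return String.format(Locale.ROOT, "%s##%03d.war", context, version);
    }

    public static void main(String[] args) {
        System.out.println(warName("shop", 1)); // shop##001.war
        System.out.println(warName("shop", 2)); // shop##002.war
    }
}
```

Without the padding, version "10" would sort before version "9" as a string, so the older release could be treated as the latest.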

4. Delayed port binding:

What is proposed here is the ability to start the server without binding the port, essentially without starting the connector. Later, a separate command starts and binds the connector. Version 2 of the software can be deployed while version 1 is running and already bound. When version 2 is started later, we can unbind version 1 and bind version 2. With this approach, the node is effectively offline for only a few seconds.
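
In plain Java, the idea can be sketched with an unbound ServerSocket that is bound only once startup work is done. This is an illustration under stated assumptions, not how Tomcat implements its connector: the class name LateBindingServer and the warmUp placeholder are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Sketch of delayed port binding: create the socket first, bind it later.
class LateBindingServer implements AutoCloseable {
    private final ServerSocket socket;

    LateBindingServer() {
        try {
            // The no-argument constructor creates an unbound socket: no port is taken yet.
            this.socket = new ServerSocket();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Placeholder for slow startup work (caches, connection pools, and so on).
    void warmUp() {
    }

    // Bind only when this version is ready to serve; port 0 asks the OS for a free port.
    int bind(int port) {
        try {
            socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isBound() {
        return socket.isBound();
    }

    @Override
    public void close() {
        try {
            socket.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        try (LateBindingServer server = new LateBindingServer()) {
            server.warmUp();
            System.out.println("bound before bind(): " + server.isBound());
            System.out.println("listening on port " + server.bind(0));
        }
    }
}
```

In a real switch-over, the old version would release the port just before the new version calls bind, which is where the few seconds of downtime come from.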

5. Advanced port binding:

This breaks the 'Address already in use' myth: both the old process and the new process bind to the same port. With the SO_REUSEPORT option turned on, two (or more) processes can bind to the same port. Once the new process has bound to the port, kill the old process.

The SO_REUSEPORT option addresses two issues:

The small glitch between the application version switching: The node can serve traffic all the time, effectively giving us zero downtime.
Improved scheduling:

In summary:

By combining both late binding and port reuse, we can effectively achieve zero downtime. And if we keep the standby process around, we will be able to do an instant rollback as well.
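
A minimal sketch of port reuse in Java, assuming Java 9 or later and a platform that exposes SO_REUSEPORT (Linux does; the code checks rather than assumes). For brevity both listeners live in one process here, whereas in a real deployment they would belong to the old and the new process. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.ServerSocketChannel;

// Sketch: two listeners bound to the same port via SO_REUSEPORT.
class ReusePortDemo {

    // True when this JVM and OS expose SO_REUSEPORT for server sockets.
    static boolean supported() {
        try (ServerSocketChannel probe = ServerSocketChannel.open()) {
            return probe.supportedOptions().contains(StandardSocketOptions.SO_REUSEPORT);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens a listener with SO_REUSEPORT set before bind; port 0 picks a free port.
    static ServerSocketChannel listen(int port) {
        try {
            ServerSocketChannel channel = ServerSocketChannel.open();
            channel.setOption(StandardSocketOptions.SO_REUSEPORT, true);
            channel.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            return channel;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The local port a listener is bound to.
    static int port(ServerSocketChannel channel) {
        try {
            return ((InetSocketAddress) channel.getLocalAddress()).getPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (!supported()) {
            System.out.println("SO_REUSEPORT is not available on this platform");
            return;
        }
        ServerSocketChannel oldVersion = listen(0);
        int port = port(oldVersion);
        // Second bind to the same port: no "Address already in use".
        ServerSocketChannel newVersion = listen(port);
        System.out.println("two listeners on port " + port);
        oldVersion.close(); // the "kill the old process" step
        newVersion.close();
    }
}
```

Every listener has to set the option before binding. If the first one did not, the second bind would still fail with "Address already in use".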

Have a look at OSGi technology if you can accommodate an OSGi container, since it provides good isolation and hot deployment for OSGi bundles. If you are using the Spring framework, you could use Spring OSGi.

LiveRebel provides rolling restarts, along with a CLI API and a Hudson/Jenkins plugin for automation.

There is easy-deploy, which does exactly that with Docker containers.

Deploy version 1:

easy-deploy -p 80:80 -v some/path:other/path my-image:1

To deploy a new version, just run the command with the updated tag name:

easy-deploy -p 80:80 -v some/path:other/path my-image:2

Disclosure: I built this tool. I built it exactly because I couldn't find a simple solution for this problem.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow