Reliable task scheduling on Google Compute Engine

Many systems require regularly scheduled tasks, but getting them to run reliably in a distributed environment can be surprisingly hard.

Imagine trying to run the standard UNIX cron service in a fleet of virtual machines. Individual machines come and go due to autoscaling and network partitioning. A critical task might never run because the instance it was scheduled on became unavailable. Or a task meant to run only once might be duplicated by many servers as your autoscaler brings them online.

Using Google App Engine’s Cron service for scheduling and Google Cloud Pub/Sub for messaging, you can build a distributed and fault-tolerant scheduler for your virtual machines. We’ll teach you how in our Reliable Task Scheduling for Google Compute Engine article, which includes code for a sample implementation on GitHub.

In this design pattern, a lightweight App Engine application schedules events in the Cron service. When the Cron service calls this application’s event handlers, the App Engine application uses Cloud Pub/Sub to relay the events to a utility running on each Compute Engine instance.
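The relay step can be sketched roughly as follows. This is a minimal illustration, not the sample's actual code: the `/events/` URL prefix, the function names, and the injected `publish` callable are all assumptions made for this example. In a real deployment, `publish` would wrap a Cloud Pub/Sub client call.

```python
from typing import Callable

# Hypothetical URL prefix that the Cron service is configured to call.
EVENT_PREFIX = "/events/"


def topic_for_path(path: str) -> str:
    """Map a cron-invoked path such as /events/hourly-cleanup to a topic name."""
    if not path.startswith(EVENT_PREFIX):
        raise ValueError(f"not an event path: {path!r}")
    topic = path[len(EVENT_PREFIX):]
    if not topic or "/" in topic:
        raise ValueError(f"cannot derive a topic from: {path!r}")
    return topic


def handle_cron_request(path: str, publish: Callable[[str, bytes], None]) -> str:
    """Relay one cron event by publishing a small message to the matching topic.

    `publish` is injected so the handler stays independent of any
    particular messaging client (an assumption of this sketch).
    """
    topic = topic_for_path(path)
    publish(topic, b"run")
    return topic
```

Keeping the handler this thin means the App Engine application holds no task logic of its own. It only translates a scheduled HTTP call into a message.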

When the subscribing utility receives a message, it runs a script corresponding to the Cloud Pub/Sub topic. The scripts run locally on the instance just as if they were run by Cron. In fact, you can reuse existing Cron scripts with this design pattern.
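On the instance side, the topic-to-script dispatch might look something like the sketch below. The mapping, the topic name, and the function name are hypothetical. The code that actually receives messages from Cloud Pub/Sub is omitted, so this shows only what happens once a message for a given topic has arrived.

```python
import subprocess
import sys
from typing import Mapping, Sequence

# Hypothetical mapping from topic name to the command to run locally.
# A real setup would point at existing cron scripts instead.
EXAMPLE_COMMANDS = {
    "hourly-cleanup": [sys.executable, "-c", "print('cleanup ran')"],
}


def run_task_for_topic(topic: str, commands: Mapping[str, Sequence[str]]) -> int:
    """Run the command registered for `topic` and return its exit code."""
    command = commands.get(topic)
    if command is None:
        raise KeyError(f"no script registered for topic {topic!r}")
    # check=False: report the exit code to the caller instead of raising.
    completed = subprocess.run(list(command), check=False)
    return completed.returncode
```

Returning the exit code lets the caller decide whether to acknowledge the message or leave it for redelivery. That policy is a design choice this sketch leaves open.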

Using Cloud Pub/Sub for distributed messaging means that you can schedule a task to run on only one of many servers, or on several servers concurrently. The topic and subscriber model (shown in the diagram below) gives you fine-grained control over which instances receive a given task.
Figure 1 - Using App Engine Cron from Compute Engine
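The difference between "run on one server" and "run on several" comes down to how instances attach to a topic. In Cloud Pub/Sub, every subscription on a topic receives each message, while subscribers that share one subscription split its messages between them. The toy in-memory model below illustrates that behavior only. The class and its names are invented for this example, the round-robin choice is a simplification, and real delivery is at-least-once with no guaranteed ordering between subscribers.

```python
from typing import Dict, List


class Topic:
    """Toy model of a topic: each subscription gets every message once."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, List[str]] = {}
        self._next_index: Dict[str, int] = {}

    def subscribe(self, subscription: str, instance: str) -> None:
        """Attach an instance to a (possibly shared) subscription."""
        self._subscriptions.setdefault(subscription, []).append(instance)

    def publish(self, message: str) -> List[str]:
        """Return the instances that would receive `message`."""
        receivers = []
        for name, instances in self._subscriptions.items():
            # One instance per subscription handles the message;
            # round-robin stands in for the service's load balancing.
            index = self._next_index.get(name, 0) % len(instances)
            self._next_index[name] = index + 1
            receivers.append(instances[index])
        return receivers


# Shared subscription: the task runs on one instance per event.
run_once = Topic()
for vm in ("vm-1", "vm-2", "vm-3"):
    run_once.subscribe("cleanup-shared", vm)

# One subscription per instance: the task runs on every instance.
run_everywhere = Topic()
for vm in ("vm-1", "vm-2", "vm-3"):
    run_everywhere.subscribe(f"cleanup-{vm}", vm)
```

In this model, publishing to `run_once` reaches a single instance per message, while publishing to `run_everywhere` reaches all three.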

For a detailed explanation of this design pattern, check out our Reliable Task Scheduling for Google Compute Engine article, which includes a sample implementation on GitHub. Feel free to make pull requests or open issues directly on the open-source sample, or ping me at @ptone on Twitter if you find this useful.

- Posted by Preston Holmes, Solutions Architect


from Google Cloud Platform Blog http://googlecloudplatform.blogspot.com/2015/06/Reliable-task-scheduling-on-Google-Compute-Engine.html
