Scheduled downtime of foss.heptapod.net for upgrade

Posted on Sat 18 April 2020 in announcements

Exceptionally, foss.heptapod.net will be offline next Monday, April 20th 2020, from 10:00 UTC+2 until the evening. Read on to learn why, and why it won't happen again.

foss.heptapod.net is the Heptapod service hosting Free and Open Source Software offered free of charge by Octobus and Clever Cloud.

Catching up on two major GitLab versions

So, next Monday, we're going to upgrade foss.heptapod.net to Heptapod 0.12.0.

That means jumping from GitLab 10.5 to 12.2, crossing a major version line twice. Therein lies the issue.

GitLab's standard upgrade process

Reference: GitLab upgrade recommendations

GitLab has a lot of data migrations to perform on upgrades, some of which are deferred and actually run in the background after the upgrade. Forward porting these migrations can't be done forever: at some point, the current software must be able to assume that the data structure is recent enough. So they have to draw a line: the policy states that to cross a major version, one must be sure that the previous background migrations have fully completed.

Here's an example of what it should have looked like for an installation going from GitLab 11.10 to 12.0:

  1. Upgrade to GitLab 11.11
  2. Keep running at least until the background migrations are done. On the largest installations, that can amount to almost a week, but it doesn't matter: users can work normally in the meantime.
  3. Upgrade to GitLab 12.0

For an installation that is always upgraded shortly after the GitLab release, the waiting part happens naturally: there's a whole month between 11.11 and 12.0 anyway.
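The stepwise rule above can be sketched as a small Python function. This is only an illustration, not GitLab's or Heptapod's tooling: the `upgrade_stops` helper and the `LAST_MINOR` table are hypothetical names, and the table values are assumptions chosen to match the versions mentioned in this post. It captures only the "stop at the last minor release before each major line" rule; real upgrade paths may require additional stops.

```python
# Assumed last minor release of each major series (illustrative values,
# consistent with the 10.8 and 11.11 versions mentioned in this post).
LAST_MINOR = {10: 8, 11: 11}


def upgrade_stops(current, target):
    """Return the (major, minor) versions to install, in order.

    Before crossing each major version line, stop at the last minor
    release of the series being left. At each such stop, background
    migrations must be fully completed before moving on.
    """
    stops = []
    for major in range(current[0], target[0]):
        last = (major, LAST_MINOR[major])
        # Skip the stop if the installation is already at that version.
        if last > current:
            stops.append(last)
    stops.append(target)
    return stops


# The example from the list above: 11.10 -> 11.11 -> 12.0
print(upgrade_stops((11, 10), (12, 0)))
```

Applied to a jump from 10.5 to 12.2, the same rule yields two waiting points, one before each major version line.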

Heptapod 0.12.0 migration

Reference: how to migrate to 0.12.0

In the case of Heptapod, we don't have fully functional versions based on GitLab 10.8, 11.0 and 11.11, as it would have been a disproportionate effort to do so, and that would have impaired our ability to really catch up. Instead, we have intermediate versions that can't be used for anything but performing the data migrations.

That means we can't let users enjoy the service while it silently finishes upgrading in the background: we have to wait for it to finish.

The case of foss.heptapod.net

GitLab background migrations are geared towards limiting load and concurrency on the largest instances. This is achieved by cutting some tasks into chunks and waiting a long time between chunks.

While not at the same scale as gitlab.com, our foss.heptapod.net is large enough that the cutting and waiting do happen, with disproportionately long delays.

As a consequence, our test run for the upgrade of foss.heptapod.net took about 7 hours. We've lowered some of the delays since then, but there's only so much we can do without taking excessive risks.
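To see why chunking and waiting adds up, here is a back-of-the-envelope sketch in Python. The `estimated_wait_hours` function is a hypothetical helper, and the row count, batch size and delays are made-up figures, not measurements from foss.heptapod.net or actual GitLab settings. It assumes the total time is dominated by the fixed delay scheduled for each chunk.

```python
import math


def estimated_wait_hours(total_rows, batch_size, delay_seconds):
    """Rough duration of a chunked migration, in hours.

    Assumes one fixed delay per chunk and ignores the time spent
    actually processing each chunk.
    """
    batches = math.ceil(total_rows / batch_size)
    return batches * delay_seconds / 3600


# Hypothetical figures: 1,000,000 rows in chunks of 10,000.
print(estimated_wait_hours(1_000_000, 10_000, 120))  # with a 2-minute delay
print(estimated_wait_hours(1_000_000, 10_000, 30))   # with a 30-second delay
```

Under these assumptions the duration scales linearly with the delay, which is why lowering delays helps, and also why the waiting dominates even when each chunk is cheap to process.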

We know that long downtimes are bad, and we aren't pleased with that situation. But we found it better for everyone if our users can enjoy this major improvement in a few days and if we can focus on improving Heptapod rather than delay the upgrade and spend time watching test runs for an operation that will happen just once.

Why it won't happen next time

GitLab 13.0 is due on 2020-05-22

Fortunately, we're now much closer to current GitLab versions: instead of being more than two major versions behind, we're now less than one. That's the whole point of Heptapod 0.12.0.

In a nutshell, the transition of Heptapod to GitLab 13 will happen exactly as explained above for upstream GitLab: we'll make a fully functional version of Heptapod based on GitLab 12.10. The next one, based on 13.0, will follow several weeks later.