At version 18.5, Heptapod development enters a new era
Several long-standing goals of the Heptapod project have been reached over the past 12 months, culminating in the Heptapod 18.5 release in early November. This looks like a good time for a retrospective, especially since such groundwork is often not visible to end users, except as performance improvements.
In many respects, this article is a follow-up to The road to fully native Mercurial in Heptapod.
Native Mercurial
To many of our users, native Mercurial in Heptapod may seem like old news or even rather obscure: in a recently deployed instance, all Mercurial projects are native, so much so that the word "native" is not even mentioned.
But this was far from being the case with older installations, so a word of explanation is in order.
In the very early days, Mercurial repositories were implemented in Heptapod by converting them under the hood to an auxiliary Git repository that the GitLab web application used as a kind of side view. This approach had many drawbacks, but it got us up to speed very quickly.
For repository handling, GitLab uses an internal backend called Gitaly, with a well specified protocol based on the gRPC framework. Our next natural move was thus to implement the Gitaly protocol in our own backend – specially tailored for Mercurial, of course. This was the birth of the HGitaly project.
Unsurprisingly, it was not possible to switch the entire Mercurial support to HGitaly in one stroke: it was too big and too risky. For perspective, Gitaly has more than 150 methods, and gains at least some new parameters every month. Also, we already had active users at this point, including ourselves.
So we had to adopt an incremental approach instead:
- treat "legacy" and "native" Mercurial projects as being of different version control system (VCS) types, in order to represent and contain their data incompatibility, even though that is conceptually a bit of a stretch.
- define several milestones for the native Mercurial development effort: HGitaly1, the first to go into production, introduced the legacy vs native dichotomy; HGitaly2 had all read-only needs covered by HGitaly; finally, HGitaly3 represented the distant goal of all Mercurial repository interactions being handled by HGitaly.
- prepare migration tasks to make existing projects native.
- switch over to the new handling progressively by means of feature flags.
This is what The road to fully native Mercurial in Heptapod was about, so we won't repeat the details here.
Now the classical problem with such incremental transitions is that it is hard to unplug the last exceptional cases, hence they tend never to be really finalized, leaving behind much technical debt. Things get really ugly when the legacy code base is still around by the time a new full redesign is due.
In the case of Heptapod, it never got that bad, but it's true that we had much scaffolding around. Most of it was not visible to end users, being exposed to administrators only as GitLab feature flags. For instance, at some point we had "fully native Mercurial projects" but they still kept the auxiliary Git conversion for safety (to be honest, also because we had forgotten to implement a couple of things, such as programming languages analysis). So of course, the next round gave rise to "fully native Mercurial without Git". The reader can only imagine the number of booleans involved in the various code layers, and the combinatorial explosion in tests and CI…
But that is over, and we're inclined to celebrate.
Code name: refgeddon
In late 2024, GitLab announced plans to make use of the newer reftable Git storage format and to migrate existing repositories to this format in a couple of months.
A quick test, however, showed that hg-git, the tool we're using for all conversions to Git, could not handle repositories with reftable. Moreover, upstream's plans were that Gitaly would automatically migrate all Git repositories.
In concrete terms, we were facing a planned failure of all legacy Mercurial projects, as well as all mirrors pushing to external Git repos. Given the scale of implications, and following geeky developer traditions, we informally referred to the issue as the refgeddon.
Of course we could have implemented reftable support in hg-git, or rather, in the library it uses to access Git repositories (Dulwich), but this meant much effort to maintain something that we wanted to get rid of. Also it lacked a fallback plan.
We thus decided that it would be better to just hasten the migration of Mercurial projects to native. It also turned out that there were just enough intermediate versions left to make the migration automatic and guaranteed to be fully done before the announced GitLab milestone. In GitLab technical terms, this was done as a batched background migration and its finalization. These have to be spread over several releases.
As mentioned above, there was still the question of Git-push mirrors: these were also based on a local hg-git conversion and pushed to the remote by… Gitaly. Since we needed to strictly prevent Gitaly from touching them, the solution was to reimplement the Git push itself within HGitaly, as well as a very limited number of basic utilities, such as backup and removal. It is simpler than it sounds, because it all boils down to invoking git subprocesses. Still, there is the ironic fact that HGitaly is now also a Git repository handling tool.
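To give an idea of how thin such a wrapper around git subprocesses can be, here is a minimal Python sketch. It is not taken from HGitaly: the function names, the default refspecs and the example paths are assumptions made for illustration only.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence

# Hypothetical default: push all branches and tags under the same names.
DEFAULT_REFSPECS = ("refs/heads/*:refs/heads/*", "refs/tags/*:refs/tags/*")


def build_push_command(git_dir: Path, remote_url: str,
                       refspecs: Sequence[str] = DEFAULT_REFSPECS) -> List[str]:
    """Return the argument list for pushing from a bare Git repository."""
    return ["git", "--git-dir", str(git_dir), "push", remote_url, *refspecs]


def push_mirror(git_dir: Path, remote_url: str) -> subprocess.CompletedProcess:
    """Run the push, capturing output so that the caller can report errors."""
    return subprocess.run(
        build_push_command(git_dir, remote_url),
        capture_output=True,
        text=True,
        check=False,  # the caller inspects returncode and stderr itself
    )
```

Keeping the command construction separate from its execution makes the former easy to test without any Git repository at hand.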
At the end of the road, we were able to follow the plan to its conclusion. By Heptapod 17.9, we had no more legacy Mercurial projects and it was guaranteed that Gitaly would not be needed to handle the repositories of Mercurial projects.
Repository storages and Gitaly servers
The whole point of the Gitaly protocol is to externalize Git operations out of the main Web application, allowing them to be scaled independently and avoiding the need for file system sharing (typically NFS), which is notoriously delicate to operate.
In small traditional setups, there is a single Gitaly server, living within the same system as the Web application, but that is not the case with larger or simply cloud-native setups. The GitLab configuration can accommodate as many Gitaly servers as needed and does not care whether they are remote or not.
In short, the configuration defines several repository "storages", each with a name and the needed details to connect to the relevant Gitaly server. Here is how it looks with one storage in the main configuration file (config/gitlab.yml):
## Repositories settings
repositories:
  storages: # You must have at least a `default` storage path.
    default:
      gitaly_address: unix:/home/git/gitlab/tmp/sockets/private/gitaly.socket # TCP connections are supported too (e.g. tcp://host:port). TLS connections are also supported using the system certificate pool (eg: tls://host:port).
      # gitaly_token: 'special token' # Optional: override global gitaly.token for this storage.
A striking feature here is that this configuration does not specify any path to repositories: from the point of view of the Web application, the location of the repositories is the Gitaly server.
In Omnibus and Docker installations, the gitlab.yml configuration file is actually generated from the main configuration file, /etc/gitlab/gitlab.rb, which is the one that end administrators are supposed to tweak. Thanks to this, the changes in configuration that we are about to explain here have been completely transparent for most instances. Actually, for all of them but installations from source.
When we introduced HGitaly a long time ago, it was obviously tempting to define "Mercurial storages" and let the upstream dispatching logic play its role, by putting Mercurial projects on Mercurial storages only.
But it soon became clear that it couldn't work this way, because our new "native" Mercurial projects would be using both Gitaly and HGitaly during the transition time.
So instead, we recorded the HGitaly address as an additional property of the default storage, and we had to patch the Gitaly client subsystem of the Rails application with the needed dispatch logic. Here is the resulting configuration example that shipped with Heptapod 0.17:
## Repositories settings
repositories:
  # Paths where repositories can be stored. Give the canonicalized absolute pathname.
  # IMPORTANT: None of the path components may be symlink, because
  # gitlab-shell invokes Dir.pwd inside the repository path and that results
  # real path not the symlink.
  storages: # You must have at least a `default` storage path.
    default:
      path: /home/git/repositories/
      gitaly_address: unix:/home/git/gitlab/tmp/sockets/private/gitaly.socket # TCP connections are supported too (e.g. tcp://host:port). TLS connections are also supported using the system certificate pool (eg: tls://host:port).
      # gitaly_token: 'special token' # Optional: override global gitaly.token for this storage.
      hgitaly_address: unix:/home/git/gitlab/tmp/sockets/private/hgitaly.socket # TCP connections are supported too (e.g. tcp://host:port). TLS connections are *not* at this point (tracking issue is hgitaly#3)
At that time, the path setting was still supported by GitLab, and Heptapod was using it for all repository interactions that did not go through HGitaly or Gitaly. After a deprecation period, upstream forbade the use of path. We therefore had to introduce a separate hg_path setting.
Later on, when we introduced RHGitaly, a sped-up and very robust partial implementation of HGitaly in Rust, we complemented this with the additional rhgitaly_address. When RHGitaly in turn became the unique visible entry point, we removed the hgitaly_address property altogether. Here is the example as of Heptapod 18.0.0:
repositories:
  storages: # You must have at least a `default` storage path.
    default:
      hg_path: /home/git/repositories/
      gitaly_address: unix:/home/git/gitlab/tmp/sockets/private/gitaly.socket # TCP connections are supported too (e.g. tcp://host:port). TLS connections are also supported using the system certificate pool (eg: tls://host:port).
      # gitaly_token: 'special token' # Optional: override global gitaly.token for this storage.
      rhgitaly_address: unix:/home/git/gitlab/tmp/sockets/private/rhgitaly.socket # TCP connections are supported too (e.g. tcp://host:port). TLS connections are *not* at this point (tracking issue is hgitaly#3)
Before the refgeddon, we thought that it would stay this way for a very long time, perhaps forever, as there was not much incentive to make something more natural. But avoiding the refgeddon forced us to stop using Gitaly for Mercurial projects, even for their Git mirrors, so…
In Heptapod 18.2, we introduced a new VCS type field in the storages configuration, and registered HGitaly as a separate storage:
repositories:
  storages: # You must have at least a `default` and a `hg:default` storage.
    default:
      vcs_type: git # this is also the default value, displayed here for completeness
      gitaly_address: unix:/home/git/gitlab/tmp/sockets/private/gitaly.socket # TCP connections are supported too (e.g. tcp://host:port). TLS connections are also supported using the system certificate pool (eg: tls://host:port).
      # gitaly_token: 'special token' # Optional: override global gitaly.token for this storage.
    hg:default:
      vcs_type: hg
      gitaly_address: unix:/home/git/gitlab/tmp/sockets/private/rhgitaly.socket # TCP connections are supported too (e.g. tcp://host:port). TLS connections are *not* at this point (tracking issue is hgitaly#3)
      hg_path: /home/git/repositories/ # Still used for some operations, hopefully will be ignored soon
Internally, this allowed us to drop a few hundred lines of low level code, in favor of just a high level hook to choose only Mercurial storages for Mercurial projects. Of course, at this point, the HGitaly server must still be local, but once that limitation is lifted in the future, we will be ready for Heptapod with multiple HGitaly servers as well.
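The idea of choosing only Mercurial storages for Mercurial projects can be sketched in a few lines. The actual hook lives in the Rails application and is written in Ruby; the Python below is only an illustration, and the Storage class and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Storage:
    name: str
    vcs_type: str  # "git" or "hg"
    gitaly_address: str


def load_storages(config: Dict[str, dict]) -> List[Storage]:
    """Build Storage objects from the `storages` mapping of the configuration."""
    return [
        Storage(
            name=name,
            vcs_type=settings.get("vcs_type", "git"),  # "git" is the default value
            gitaly_address=settings["gitaly_address"],
        )
        for name, settings in config.items()
    ]


def storages_for(vcs_type: str, storages: List[Storage]) -> List[Storage]:
    """Keep only the storages that can host projects of the given VCS type."""
    return [s for s in storages if s.vcs_type == vcs_type]


# Same values as in the configuration example above.
STORAGES = load_storages({
    "default": {
        "gitaly_address": "unix:/home/git/gitlab/tmp/sockets/private/gitaly.socket",
    },
    "hg:default": {
        "vcs_type": "hg",
        "gitaly_address": "unix:/home/git/gitlab/tmp/sockets/private/rhgitaly.socket",
    },
})
```

With this in place, `storages_for("hg", STORAGES)` yields only the `hg:default` storage, and everything downstream can use the regular Gitaly client code unchanged.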
Most of this is invisible to end users, and even the configuration files are, since in the most frequent kind of installation (Docker) they are actually generated from the gitlab.rb central configuration file.
HGitaly3: Removing filesystem access by the Rails application
Removing all file system access of the Rails application to the repositories in favor of HGitaly calls has always been one of the goals of the HGitaly project, much as it had earlier been one of the goals of Gitaly itself, and indeed we identified it as the HGitaly3 milestone.
At this point there were still a couple of cases of the Rails application performing repository operations using direct file system access, typically by spawning hg processes.
We removed them over the next couple of releases, using GitLab feature flags for switchable live testing (opt-in if the flag's default value is false, opt-out if it is true):
- The UserCommitFiles Gitaly method. This is the one that is meant to take care of server-side content creation. Its implementation had been pending for a long time. The HGitaly implementation was introduced as an opt-in in 18.2, became the default in 18.3 and replaced the previous code in 18.4.
- The Rebase HGitaly method was introduced in 18.4 as opt-out, and became the only implementation in 18.5.
- The FetchBundle and CreateRepositoryFromBundle methods were not used by the Rails application, despite being long implemented and used by gitaly-backup. This was fixed, again as opt-out, in 18.4, and became the only implementation in 18.5.
- Although not handling repositories, the backup and restoration of Group and Namespaces Mercurial configuration files was also using the path to repositories on the filesystem. New dedicated HGitaly methods have been introduced for this and are in use as of 18.5.
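The opt-in and opt-out terminology used above comes down to the default value of the flag. Here is a minimal sketch, assuming a simple in-memory flag store; the FeatureFlags class, the dispatch function and the flag names are hypothetical and do not reflect GitLab's actual feature flag API.

```python
from typing import Callable, Dict


class FeatureFlags:
    """Tiny in-memory stand-in for a feature flag store (hypothetical)."""

    def __init__(self, defaults: Dict[str, bool]):
        self._defaults = dict(defaults)
        self._overrides: Dict[str, bool] = {}

    def set(self, name: str, value: bool) -> None:
        """Record an explicit administrator choice for the flag."""
        self._overrides[name] = value

    def enabled(self, name: str) -> bool:
        """An explicit override wins; otherwise the default value applies."""
        return self._overrides.get(name, self._defaults[name])


def dispatch(flags: FeatureFlags, flag_name: str,
             new_impl: Callable[[], str],
             legacy_impl: Callable[[], str]) -> Callable[[], str]:
    """Select the new implementation only when the flag is enabled."""
    return new_impl if flags.enabled(flag_name) else legacy_impl


# Hypothetical flag names: the first is opt-in (default false),
# the second is opt-out (default true).
flags = FeatureFlags({"hgitaly_commit_files": False, "hgitaly_rebase": True})
```

Once a release makes the new code path the only one, both the flag and the dispatch disappear, which is where the reduction in test combinations comes from.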
The end result is that in Heptapod 18.5, the Web application does not have any knowledge of the path to the repositories root: the hg_path value is simply no longer read.
This obviously brings us much closer to cloud-native Heptapod, but there is still some way to go (more on this in the next section).
In the meantime, immediate benefits are expected, as HGitaly is generally more mature and powerful than the Ruby Mercurial support code we just got rid of. Let us take the example of UserCommitFiles:
- no need to pay the latency price of spawning hg processes (yes, plural), for at least 0.1 second each. This will be better felt on small repositories where the actual work being done is almost instantaneous in comparison.
- a working copy is needed to perform a commit, but of course Heptapod maintains working copies separately from the repositories to avoid concurrency problems. HGitaly has a much more mature subsystem for this. It is notably able to reuse previously used working directories, and thus usually only needs to perform a small update to bring them to the wanted changeset. This is expected to bring a huge improvement on average for larger repositories.
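The reuse of working directories can be pictured with the following toy model. It is not HGitaly's implementation: the class names and the selection policy are assumptions, the update itself is reduced to a bookkeeping assignment, and the locking a real server needs across processes is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkingDir:
    path: str
    changeset: Optional[str] = None  # changeset currently checked out, if any
    busy: bool = False


class WorkingDirPool:
    """In-process pool of working directories for a single repository."""

    def __init__(self, root: str):
        self.root = root
        self.workdirs: List[WorkingDir] = []

    def acquire(self, changeset: str) -> WorkingDir:
        """Reserve a working directory and bring it to the wanted changeset."""
        free = [wd for wd in self.workdirs if not wd.busy]
        # Prefer a directory already at the target changeset (key False sorts first).
        free.sort(key=lambda wd: wd.changeset != changeset)
        if free:
            wd = free[0]
        else:
            # Nothing to reuse: allocate a new directory (a full checkout in real life).
            wd = WorkingDir(path=f"{self.root}/{len(self.workdirs)}")
            self.workdirs.append(wd)
        wd.busy = True
        # A real implementation would run the Mercurial update here.
        wd.changeset = changeset
        return wd

    def release(self, wd: WorkingDir) -> None:
        """Make the directory available again, keeping its checkout for reuse."""
        wd.busy = False
```

The point of the model is that a released directory keeps its checkout, so the next request usually starts from a nearby changeset rather than from an empty directory.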
Towards cloud-native Heptapod
Wow, thanks for staying with us so far! So, what exciting prospects do we have for releases after 18.5?
Why Cloud Native?
Getting back again to the original goal of the Gitaly project, this is all about scalability and flexibility.
By enclosing repository management in its own service, treating it mostly like a special database, it becomes possible to scale components independently, to perform upgrades with less downtime, to update just one component to fix a bug. GitLab actually went further than that with High Availability Gitaly clusters.
Of course, this is most tangible with infrastructure systems that are themselves designed with such a philosophy in mind. The prime target here is naturally Kubernetes, as it is widely used at Cloudcrane and in fact is our production standard. But it can be very useful in other kinds of setups, even PaaS platforms that could just treat Gitaly and HGitaly as they do PostgreSQL, Redis or Elasticsearch.
One may think that this is very oriented towards the largest instances, but even smaller ones would benefit. Taking the example of an organization hosting their Heptapod instance in a Kubernetes cluster, it will be much more comfortable to have it spread over all the nodes rather than the current huge monolithic container which needs lots of resources at once.
Steps towards Cloud Native Heptapod
So, what is still missing?
This is summarized in heptapod#1647:
- Heptapod Shell, the component handling push/pull over SSH, is currently spawning the same kind of hg subprocess as the vanilla Mercurial SSH support would. In particular, it needs to run on a system where the repositories are. It needs instead to call HGitaly, which in turn needs to know what to do with it, just like it is done with upstream GitLab and Gitaly.
- we need to tighten all the inter-service communications, currently very oriented towards HGitaly sitting right beside the Rails application anyway.
- lots of testing!
- The final step should be to provide a Cloud Native Heptapod Helm chart.
Can we make it by the end of 2025, or by next summer?
This is still rather unclear at this point, but if you have the need and the means, please consider our sponsored issues program, or just drop us an email.