UX of distributing multiple binaries [take 2]

The possibility of distributing multiple separate binaries to validators has already been touched on at UX implications of PVF executor environment versioning. But that thread kind of went off-topic, and we have new requirements now due to recent security work. So we (me, @dmitry.sinyavin, and @eskimor) want to restart the conversation from a different angle (security + future-proofing).

Background

The immediate motivator for this change is splitting the PVF worker binaries out of polkadot. There are several reasons we want to do that, mostly having to do with security:

  1. We can statically link the worker binaries with musl. This lets us better sandbox the expected syscalls as they do not depend on the system libc, plus it's a win for determinism.
  2. @koute wrote a syscall detection script which to my knowledge only works with separate binaries.
  3. Having smaller binaries to run untrusted code helps protect against possible ROP gadget attacks.
  4. It may eventually be needed for virtualization.
  5. We are considering using a global allocator wrapper which @dmitry.sinyavin wants to enable on only the workers, without suffering the performance hit on the rest of polkadot.

(Side note: work on multiple binaries has already started, but what we already tried was splitting out the binaries while preserving the existing single-binary UX. The idea was: build the binaries during polkadot compilation using something similar to wasm-builder, embed them in the polkadot binary, and extract the binaries at runtime. In addition to being rather hacky/unorthodox, multiple issues with this approach were discovered during development. One is that there's no good way to do the extraction step on MacOS.)

Proposal

So due to this security work, we want to start distributing multiple binaries soon. One way is with a simple zip file. But since we already need to change the validator UX, we should do it properly. We should provide an installer binary/script and/or distribute through package managers. There are several benefits to this:

  1. It is future-proof, as we wouldn't have to change the validator UX again whenever new requirements surface later.
  2. We could do additional configuration that requires root.
    a. For example, as part of securing the workers @eskimor suggested setting up a new separate user that would run the PVF binaries. [I still need to research and write the issue for this.]
    b. Also, I believe @dmitry.sinyavin mentioned that another feature we are eyeing, cgroups, requires some configuration.
  3. We can do some security checks on the generated binaries with e.g. this Perl script.
  4. Installers and package managers are common practice, and the UX is arguably more familiar (and simpler) than manually managing multiple binaries from a zip file.

Some possible downsides are:

  1. Validators may not trust package repos to be secure, though they can always inspect and run the install script themselves.
  2. It would be more work on our side to support multiple package managers, but it is probably good to bite the bullet and do this sooner rather than later.
  3. Others? Let's discuss!

Would we represent the node version and the PVF worker version separately in libp2p info, and later in the slashing-resistant session key certificates? Or would they always be aligned?

Or should this question be asked in UX implications of PVF executor environment versioning?

Good question. Unless I'm missing something, the versions should always be aligned; it seems safest to make this a requirement. If they differ (e.g. some bug when upgrading) an error would be raised due to https://github.com/paritytech/polkadot/pull/6861 (already in production). It also makes sense to have some post-install checks for version alignment.
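
As a rough illustration of what such a post-install check could look like, here is a minimal Python sketch. The worker binary names, the `--version` flag on the workers, and the idea that the version is the last whitespace-separated token of the output are all assumptions made for the example, not something settled in this thread.

```python
import subprocess

# Hypothetical binary names; the final names are not fixed in this discussion.
BINARIES = ["polkadot", "polkadot-prepare-worker", "polkadot-execute-worker"]


def read_version(binary: str) -> str:
    """Return the last token printed by `<binary> --version`.

    Assumes each binary accepts `--version` and prints the version last.
    """
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip().split()[-1]


def misaligned(versions: dict, reference: str = "polkadot") -> list:
    """Return the names of binaries whose version differs from the reference."""
    expected = versions[reference]
    return sorted(name for name, v in versions.items() if v != expected)
```

An installer or package post-install hook could call `misaligned({b: read_version(b) for b in BINARIES})` and abort with a clear message if the list is not empty.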


If we do an interactive installer, it should be automation-friendly so that people can easily integrate it into their workflows (docker, ansible, etc.).

Shipping deb/rpm is probably the easiest approach from the user's point of view, but it's a bit painful to maintain. Are there any statistics on what kind of OS/distribution the validators are using at the moment?


Good to know, I've no experience there. As long as it is not prohibitively painful for us, it seems good to prioritize the user experience.

I was curious about this myself. From telemetry.polkadot.io:

Unfortunately there are big "Other" and "Unknown" brackets, which doesn't help. I wonder if @will can shed some more light.


Some info from @chevdor :

Releases are currently distributed via:

debs, binary on github and docker image

Concerns:

Deterministic builds might become harder, but there should be no showstoppers.

Artifacts are currently hosted on:

github
some docker registry (hub but moving I think)
s3 for the debs

We, as the developers, should provide a proper installation guide. No script! Someone then can take this guide to write the Deb package (which already exists), the nix package (which also already exists) or for whatever operating system/distribution they like. Having a good guide will be much more valuable than any hacked script (which is always the case for installer scripts :wink:).

Wasn't the issue just about MacOS not having memfd? And is this really such a huge deal? MacOS isn't our main target. However, in the entire process we should not forget about people playing around with Polkadot locally. This means I would like to see some kind of extra parameter --insecure-pvf-workers that disables some of the security settings and may just keep using the old method of running the Polkadot binary as the worker. But back to the individual worker binaries, what is the problem with unpacking them? You didn't go into more detail on GitHub. I think that unpacking them to the base path and running them from there should be doable? We could add a check for whether the base path allows executing binaries and, if not, show a message and exit the node. Then we could add some CLI option to provide a custom path for unpacking the binaries.
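
To make the "check if the base path allows executing binaries" idea concrete, here is a minimal Python sketch of one way to probe a directory. It is only an illustration under the assumption of a POSIX system with `/bin/sh`; the function name `can_execute_from` is hypothetical, and the node itself would do something equivalent in its own code.

```python
import os
import stat
import subprocess
import tempfile


def can_execute_from(directory: str) -> bool:
    """Return True if a file placed in `directory` can be executed.

    Writes a tiny shell script there, marks it executable and runs it.
    On a filesystem mounted with `noexec` the run fails with an OSError.
    """
    fd, path = tempfile.mkstemp(dir=directory, suffix=".sh")
    try:
        with os.fdopen(fd, "w") as probe:
            probe.write("#!/bin/sh\nexit 0\n")
        os.chmod(path, stat.S_IRWXU)
        try:
            return subprocess.run([path]).returncode == 0
        except OSError:
            return False
    finally:
        # Always remove the probe file, whether or not it could run.
        os.unlink(path)
```

If the probe returns False, the node could print a message pointing at the custom-path option and exit.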


I was thinking we would have a single canonical script that did everything required. I'm not sure if it makes sense to have validators do manual stuff (or maybe it does?) like security configuration etc., which it seems we will need soon. But we can discuss how exactly to proceed; my high-level goal is just to get multiple files to validators somehow.

We'd have to support two very different architectures to do that, which seems like a big complexity and maintainability burden. And a different UX on different platforms will confuse people. It would be much better to just have a single way to run the workers that works on Mac and Linux, secure mode or not.

I added some more details in the PR so we don't get off-topic here. I mean, we could maybe pile on more and more hacks to get that idea to work, but at some point we need to course-correct.

It will be impossible to write this script. There are so many different systems out there that you cannot adapt to all of them in your script. Your job as a developer is to write a guide explaining how to use this stuff and how to set up the system in the best way to make it secure.

First, the people running validators are not your or my mother :wink: These people are able to set up and manage a server. Second, just because you are writing the guide doesn't mean that you can also change the deb packages or dive into the nix packages :stuck_out_tongue: However, there are already people maintaining these packages, and they could use your guide to adapt the packages and update the installation instructions for the respective systems.

Handling three different binaries for local testing sounds like a pain. Yeah, not a huge pain, but also not that nice :stuck_out_tongue: But yeah, if there isn't anything better, we have to do it...


But people do that. The rustup installation is a single script, as are cpanm and many others. I'm not saying we should go that way, but it's not impossible.

There is something in between the zip file and the OS package: AppImage and Flatpak (also Snap, but it's much more on the package manager side). I wonder if we can make use of any of them. AppImage is very close to what we want to achieve: extracting binaries and support files to a FUSE-mounted filesystem with known properties on every startup. But I have no experience with it for server applications; I have only used it with desktop ones (and it worked really well).

Also, it may be a question of religion for many. If I were running a validator, I'd prefer having an apt repository to stick into my unattended-upgrade cron job and forget about it. Some other people may want to have more control over things.


https://telemetry.w3f.community/ is a more accurate representation of what versions and whatnot validators are running. Validators that are part of the 1kv programme have their info sent here, so nearly all nodes are validators. The polkadot.io-run telemetry has a lot of random full nodes (and other things), so some of its stats might not be the best representation.

Definitely this :point_up_2: If we separate things out, we will need to provide people with a lot of detailed instructions (including something like a 'migration guide'). Having something that does things for people won't really be as helpful, because everyone runs their infrastructure differently.

Yeah, nobody runs any production infrastructure on MacOS, so I don't think that should be a consideration for running/distributing software. We should have some way for people to run and develop things locally, but the target environment is generally Linux.

Most validators will always set things up manually anyway. We generally just give them the binary at the moment; everything else related to running it and setting it up is not done for them.

I would say so long as there's a development environment for Mac (however hacky it would be to maintain or run), that should be sufficient.


I would say in general having multiple binaries to run and distribute will be annoying and not ideal. If there was a way to have it where we still distribute a single binary that instead runs multiple workers, that would be great. If the reason to not do this is because of MacOS, I would not consider MacOS a major decision making factor.

My concerns with multiple binaries would be:

  • What happens when one of the binaries is running but the other one isn't? Whether it's due to running out of CPU/memory, or one crashing, how do we ensure that things otherwise still run as normal?
  • How would people handle monitoring / metrics / observability of the additional binary running? Given that they are critical processes, people need to have awareness and alerting in place to ensure that they are running right.
  • Splitting things up into multiple binaries makes for multiple different streams of logs. People will need to either aggregate these themselves, or check them as well (and be aware of what 'normal' logs look like).
  • Multiple binaries add a bit of complexity when trying to diagnose why things are not working properly.

If splitting things up is a must despite the tradeoffs, then that can be fine, but we will need to be able to address the additional operational complexity. This shouldn't just be a 'configure it all for you' script; instead we should provide a lot of additional resources on best practices for how to run things, how to set things up, how to aggregate logs/metrics, etc., and how to debug things when running multiple binaries.

In terms of what it looks like currently for people running things, it's a pretty big mix of stuff: some people compile themselves, some people use the binaries we distribute on GitHub, some people use docker images, some people use the deb package. And for how they run things, some people will use systemd, some people will use kubernetes, etc. We would need to provide additional support for all the various ways that people run things.


I believe all your concerns are much more about running separate processes, not separate binaries. But we already run them; the only difference is we spawn those processes from the same binary. So we've already been handling the listed problems properly (more or less). The only thing that is changing here is the process binary path.

But if these separate processes are run with separate binaries, people will need to manage them additionally as well. Like they will then need to run this other binary with systemd, set up some alerting to make sure this other binary is running and whether it restarts, collect logs from this separate binary, etc. Right now, with a single binary, that is not a concern because it's all in one.

No, they needn't! That's the point. We're doing everything ourselves. The main binary (polkadot) spawns new processes from those auxiliary binaries by itself, kills them by itself, checks if versions match between binaries, gets state changes from them and logs them, etc. We just need them to be separate files on an executable filesystem, and we handle the rest.

Err, I guess I'm a little confused then. Do you mean that this is the current state of what is done, or what it will look like when we separate things out into separate binaries? Per @m-cat's original proposal, what would this look like if someone wants to run things?

Yeah, and some people also jump from a plane in a flying suit and then fly through small holes. It is possible. Would I do it? For sure no.

I can tell you that most of these scripts assume some kind of environment, and this breaks most of the time when they are run on NixOS. NixOS taught me a lot of things, and one of the most fundamental is that a guide is worth much more than any hacked script that only works on Ubuntu.


It is already being done. The pvf validation is running in its own process. Currently this functionality is encapsulated in the Polkadot binary itself and we just create a new process with this binary. In the future there will be 3 different binaries. The operator executes the node binary as before and the node internally then starts one of the other two binaries as needed.

So from a node operator perspective, nothing in this proposal will change how they operationally run things? Or how does this affect them then? Is it just that they need to have 3 binaries available (say, just that they exist in /usr/local/bin), and their systemd config remains the same?

Yes, exactly! The UX of running the node/validator doesn't change at all. It's still just running a single polkadot binary, and that's it. What changes is the installation/upgrade UX. It won't be just "copy the single binary to a server and run it" anymore. You'll have to deploy additional auxiliary files into their proper locations and after that run the polkadot binary. And when you upgrade, you are supposed to properly upgrade the polkadot binary AND the auxiliary files.
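
For operators who script their deployments, a pre-start sanity check along these lines might be useful. This is only a sketch: the binary names are placeholders I made up for the example, and /usr/local/bin is just the location mentioned above, not a required path.

```python
import os

# Placeholder names for the node and its two auxiliary worker binaries.
EXPECTED = ["polkadot", "polkadot-prepare-worker", "polkadot-execute-worker"]


def missing_binaries(directory: str, names=EXPECTED) -> list:
    """Return the names that are absent or not executable in `directory`."""
    missing = []
    for name in names:
        path = os.path.join(directory, name)
        # A usable binary must be a regular file with execute permission.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing
```

For example, a deployment script could call `missing_binaries("/usr/local/bin")` after an upgrade and refuse to restart the service if anything is reported.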
