Publish Substrate to crates.io

We used to be able to publish Substrate to crates.io, but at some stage the process broke and it remains broken today.

I don’t think I need to reiterate the benefits of being able to publish Substrate to a package registry.

Someone needs to re-investigate the current status of the publish process: what’s broken and what needs to be done, so we can have a good idea of how much work is required to get this done.

Later, Cumulus should be published as well.

What would be the versioning scheme? Following polkadot version branches or syncing to something else?

Eventually we want semver, but at this stage anything will do. Following the Polkadot version is a good starting point.

We are very well aware that it’s a mess at the moment. We are coming back to this problem in the coming months.

Expect detailed plans on the roadmap before the end of November.

As a high level summary for the current state of this (as far as I know), the main blocker here has been the multiple separate repos which compose the Substrate / Polkadot ecosystem.

There has been talk of turning these repos into a monorepo for at least a short while in order to unblock all these publishing problems, and potentially over time spinning them out again to appropriately separate repos.

The tool used for publishing will probably be paritytech/subpub on GitHub (“Playing with automating the process of publishing crates from Substrate”).

Beyond that, I think we also talked about not following strict adherence to semver, and just having all crates release in lockstep with one another with specific version numbers.

I agree, the importance of releasing these crates is very high, and we should really prioritize getting this working.

Substrate crates are now being published to crates.io, more specifically to the parity-crate-owner user.

Publishing happens automatically for new commits on Substrate’s master. The steps are described in the “Crates publishing automation” page of the paritytech/releng-scripts wiki on GitHub. That page also provides context on how the publishing automation is set up and what to do in case it fails. After crates are published, a PR is submitted with all the version bumps, e.g. “[AUTOMATED] Update crate versions after publish” by paritytech-ci (Pull Request #12946 in paritytech/substrate). Merging those PRs is not required for using the published crates: the PR is useful for syncing the new versions to the source code, but it shouldn’t affect development workflows in any way.

There’s no manual intervention within that whole publishing pipeline, so you can count on automatically having new crate versions at most a few hours after a new commit appears on master. Only if some unexpected problem comes up will we need to intervene manually. On this note, automatic publishing has been enabled since last Friday (Dec 15) and the pipelines so far went well, save for one whose problem is already fixed. Overall the implementation has proven to be stable enough that I can actually vouch for consuming the crates through crates.io.

One crucial detail is that currently any crate change is considered a breaking change. This means that minor and patch releases have to be performed manually because that use-case is not yet covered by the automation. As far as I’m aware there’s not even a plan for how to automate minor and patch releases, so having that level of sophistication is potentially a ways off.
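To make the “every change is breaking” rule concrete, here is a minimal sketch of what such a bump could look like. This is not the code used by the publishing automation; the `Version` type and both function names are hypothetical, and it assumes plain `major.minor.patch` versions of 1.0.0 or higher, with no pre-release suffix.

```rust
/// A parsed `major.minor.patch` version (no pre-release or build metadata).
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Parses a plain `major.minor.patch` string; returns `None` for anything else.
fn parse_version(s: &str) -> Option<Version> {
    let mut parts = s.split('.');
    let major = parts.next()?.parse::<u64>().ok()?;
    let minor = parts.next()?.parse::<u64>().ok()?;
    let patch = parts.next()?.parse::<u64>().ok()?;
    // Reject trailing components such as "1.2.3.4".
    if parts.next().is_some() {
        return None;
    }
    Some(Version { major, minor, patch })
}

/// Assumed policy: every change counts as breaking, so the next release
/// always bumps the major and resets minor and patch to zero.
fn next_release(current: Version) -> Version {
    Version { major: current.major + 1, minor: 0, patch: 0 }
}
```

Under this policy a crate at 7.0.0 would go to 8.0.0 on its next publish, regardless of how small the change was. A minor or patch release would need a different code path, which is the part that is described as manual above.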

For any concerns, reports or suggestions, please file a ticket in the paritytech/releng-scripts issue tracker on GitHub.

To address another point

No effort has gone into implementing publishing automation for Polkadot or Cumulus so far. That effort would turn into a roadmap item and/or ticket, the same way it was done for Substrate, but I’m not aware of any development on that front so far. For lack of a plan I also can’t say when it’ll start. I’ll ping the people responsible for that and ask them to engage in this thread with more information later.

Good to see some progress here, but the work is useless for parachain teams if Polkadot/Cumulus are not updated so that we can include Substrate from crates.io alongside Cumulus/Polkadot without duplicated deps.

Shouldn’t the versions be based on Polkadot releases, not just any commit to master?

Some updates

At the ask of @bkchr we’ll be moving to weekly publishing instead of per-commit. That change takes effect immediately.

This is still true, but I can provide a guess for when it could start: next year, in January. That’s when the Release Engineering team’s lead (Mara) should be back from holidays, and then we’ll be able to plan out how the crates.io publishing for Polkadot and Cumulus would work. I can draft a proposal in the meantime, but I’ll remain mostly inactive until January.

What does “based on” mean in this case, exactly? Does it mean that we should only trigger the publishing of Substrate crates before the releases of Polkadot? Or that the crates’ versions should somehow be influenced by the Polkadot version? Or something else?

I’m no Substrate/Polkadot expert but I infer that these projects are by design supposed to exist separately. This is corroborated by what’s mentioned in https://substrate.io/vision/substrate-and-polkadot:

However, although they’re synergistic, Polkadot and Substrate are not dependent on each other. Polkadot parachains can be built and maintained without ever touching Substrate (though alternative software options for doing so are currently limited) and chains built with Substrate do not need to be connected to Polkadot or Kusama. Substrate-based chains can exist as ‘solo-chains’ on an independent basis.

That being my understanding, I argue that Substrate crates’ versions shouldn’t be coupled in any way to what’s happening in the Polkadot repository. In fact, this is how it works currently. For instance, a while ago, before the automated publishing pipeline even existed, version 7.0.0 of sp-std was published ad-hoc; “7.0.0” doesn’t match any Polkadot version, and the PR even mentions that the publishing was motivated by subxt, not Polkadot. “7.0.0” also isn’t used uniformly for all crates in the repository, since each crate has its own history and development lifecycle, e.g. some are newer or more/less mature than others.

I think the current versioning scheme for Substrate, where each crate has its own version and each crate version is managed separately, makes sense since crates have different histories and purposes. Not all crates are related to one another. I don’t see a reason to go for “one version for all” because that would mean bumping the versions of everything even if we actually only want to publish a new version of a single crate (example). The current strategy of only publishing new versions for crates which have changed is more efficient than having to republish everything, every time, because of a sporadic version bump in one crate.
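As a rough illustration of the difference between the two strategies, here is a small sketch. It is a toy model, not how the publishing tooling works: it tracks only a major version per crate, both function names are made up, and it ignores that dependents of a bumped crate may need a new release too.

```rust
use std::collections::BTreeMap;

/// Per-crate strategy (sketch): bump the major of each changed crate only.
/// Returns the names of the crates that would be republished.
fn bump_changed(majors: &mut BTreeMap<String, u64>, changed: &[&str]) -> Vec<String> {
    let mut republished = Vec::new();
    for name in changed {
        // Unknown crate names are silently skipped in this sketch.
        if let Some(major) = majors.get_mut(*name) {
            *major += 1;
            republished.push(name.to_string());
        }
    }
    republished
}

/// Lockstep strategy (sketch): move every crate to one shared major, one
/// above the highest major currently in use. Every crate is republished.
fn bump_lockstep(majors: &mut BTreeMap<String, u64>) -> Vec<String> {
    let next = majors.values().copied().max().unwrap_or(0) + 1;
    for major in majors.values_mut() {
        *major = next;
    }
    majors.keys().cloned().collect()
}
```

With the per-crate function, a change to one crate republishes one crate. With the lockstep function, the same change republishes the whole set, in exchange for a single version number that identifies a compatible set of crates.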

You are mixing up the Polkadot project and the Parity Polkadot implementation there. The Parity Polkadot implementation is built on Substrate and there is a strong connection. Currently there are also no real Substrate releases and everything we do is “Polkadot centric”. This means people are building their stuff on our Polkadot branches in Substrate/Polkadot/Cumulus.

I see the switch to the weekly crates release just as an intermediate step, as releasing after every commit is too much. Releasing every week would also be too much, as long as it is done automatically. In a perfect world we would only release crate versions when a human decides that a certain crate requires a new release, maybe always enforced on a Polkadot release. But that would be too complicated for now and we should do this step by step.

The next step after the weekly crates release is that we release the crates when we do a Polkadot release. This would mean when there is a Polkadot release, we release the Substrate crates, we set the Substrate crates in Polkadot as deps, we release the Polkadot crates and then we do the same game again for Cumulus. Then we would have all crates in a coordinated way on crates.io.
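The ordering described here (Substrate first, then Polkadot, then Cumulus) is a dependency-ordered release. A minimal sketch of computing such an order follows. The function name is hypothetical and the dependency map is supplied by the caller; nothing here is taken from the actual release tooling.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns a publish order in which every project appears after all of its
/// dependencies, or `None` if the graph has a cycle or names a dependency
/// that is not itself a key of the map.
fn publish_order<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Option<Vec<&'a str>> {
    let mut published: BTreeSet<&'a str> = BTreeSet::new();
    let mut order: Vec<&'a str> = Vec::new();
    while order.len() < deps.len() {
        // Collect every unpublished project whose dependencies are all published.
        let mut ready: Vec<&'a str> = Vec::new();
        for (name, reqs) in deps {
            let unpublished = !published.contains(name);
            let deps_ready = reqs.iter().all(|r| published.contains(r));
            if unpublished && deps_ready {
                ready.push(*name);
            }
        }
        // No progress is possible: give up instead of looping forever.
        if ready.is_empty() {
            return None;
        }
        for name in ready {
            published.insert(name);
            order.push(name);
        }
    }
    Some(order)
}
```

Feeding it a map where `polkadot` depends on `substrate`, and `cumulus` depends on both, gives the order from the post: release Substrate, point Polkadot at the published crates, release Polkadot, then repeat for Cumulus.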

All Substrate crates come from the same monorepo. We already discussed this, I think, and we agreed to just treat them as one blob for now. Let us sync all those versions. It makes everyone’s life much easier. We can just say “Substrate version xxx” instead of going to hunt for a compatible set of crates. This “only bump what you need” is a nice fantasy. But it is just that: a fantasy.

Cumulus, Polkadot and Substrate are already a monorepo. Any change in a dependency repo (Polkadot, Substrate) requires a developer to make the dependees (Cumulus, Polkadot) compatible. This is a monorepo with extra steps. If we depended on each other via crates.io and it were the job of the dependees to update their code, it would be a whole different story.

Do we do this in addition to the weekly crates.io releases or instead of?

Does this mean the repos will be permanently linked via crates.io? Or do we keep git dependencies on master?

Instead of. I don’t see any reason for weekly releases then. We may require patch releases for certain crates in between, but this is a different story.

Good question, probably master uses git and releases switch to crates.io versions. That is something that we then can revisit in the future IMO.

Is it just me, or are the version discrepancies right now a bit confusing? When we publish, we bump the major on each PR, but we don’t update the Cargo.toml files in Substrate. For example, “[AUTOMATED] Update crate versions after publish” by paritytech-ci (Pull Request #12946 in paritytech/substrate) has not been merged.

Shouldn’t “bot merge” push one last commit for you that updates the version, among a few other optional things like generating a changelog?

We don’t release to crates.io per PR. In general I would like us to remove version numbers from master altogether and have the tooling put in the version numbers when it releases things to crates.io.

Then the tooling would just bump the major?
Is that intended as a long-term solution or just to get it started?

We want to start with just bumping the major anyway. Later we can refine this process and see what kind of process we need/want.

Wen crates.io guys, we’re going to unionize

Another benefit here would be much more discoverable and up-to-date docs on docs.rs, lib.rs and all the other Rust doc clone sites. Imagine being able to just type something Substrate-related into Google and get an up-to-date result without limiting your search to site:paritytech.github.io, for example.

Having proper semver also opens up the possibility of back-porting fixes to prior major versions when necessary. Instead of having one giant linear tree, you could back-port non-breaking security fixes to the 1.x series even if you are already on 2.x, for example, so you get a non-linear tree. This is more complex, for sure. But imagine if, once Ubuntu 22.04 came out, you could no longer make security patches for 20.04. For large, complex projects with security-sensitive components, this eventually becomes a necessary feature of any sane publishing strategy.
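To show why back-ports only work with real semver, here is a small sketch of the compatibility rule Cargo applies to default (caret) version requirements, written from scratch rather than using Cargo’s own code. The `SemVer` alias and both function names are made up for the example, and pre-release versions are ignored.

```rust
/// (major, minor, patch); tuples compare lexicographically, which matches
/// version precedence when there is no pre-release part.
type SemVer = (u64, u64, u64);

/// Caret-style compatibility: a candidate satisfies `req` if it is at least
/// `req` and does not change the left-most non-zero component.
fn caret_compatible(req: SemVer, candidate: SemVer) -> bool {
    if candidate < req {
        return false;
    }
    match req {
        // ^0.0.z matches only exactly 0.0.z.
        (0, 0, _) => candidate == req,
        // ^0.y.z matches 0.y.* from z upwards.
        (0, minor, _) => candidate.0 == 0 && candidate.1 == minor,
        // ^x.y.z matches x.*.* from y.z upwards.
        (major, _, _) => candidate.0 == major,
    }
}

/// Picks the highest published version that satisfies the requirement.
fn best_match(req: SemVer, published: &[SemVer]) -> Option<SemVer> {
    published
        .iter()
        .copied()
        .filter(|v| caret_compatible(req, *v))
        .max()
}
```

If a security fix is back-ported and published as 1.4.3 while 2.1.0 already exists, a project requiring 1.4.0 resolves to 1.4.3 and never jumps to 2.x. If every release bumps the major, no requirement on an older major can pick up a fix automatically.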

It is also worth noting this becomes a lot easier with a monorepo :wink:
