Stabilizing Polkadot

I believe that we do have very good tech, and I think that we should prove it. Just claiming this without data to back it up is the same thing that everybody else does.

Sure, anybody could do this… but Parity & Fellowship are almost always the ones who wrote the buggy code, investigate the issue, fix it, and write the post-mortem. Yes, in theory anybody else could also do that.
But by doing this ourselves, we are not preventing anyone else from also doing it. So I don't think it's an argument for us not to do it.

Anyway, regardless of who does this, it could still help to define some metrics that allow anybody to build helpful monitoring. Maybe that should be its own topic, but for now I will put it here:

SLIs (Service-Level-Indicators) & SLOs (Service-Level-Objectives)

These are the metrics and objectives that can help paint a picture of how well our tech stack is working. I think there is some low-hanging fruit that should be easy to define.

Polkadot Relay Chain

Authoring Rate

SLI: The number of authored blocks per time window.
SLO: Within [95, 100] per 10 min window.
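A minimal sketch of how this SLI could be computed, assuming the block timestamps are already available as a list of seconds (how they are fetched from a node is left out, and the function names are hypothetical). It also assumes the input covers whole windows, since a partially covered first or last window would show up as a violation.

```python
from collections import Counter

WINDOW_SECS = 600           # 10 min window from the SLO
SLO_MIN, SLO_MAX = 95, 100  # allowed number of authored blocks per window

def authoring_rate(timestamps, window_secs=WINDOW_SECS):
    """Return {window_index: authored_block_count} for timestamps in seconds."""
    return dict(Counter(int(ts // window_secs) for ts in timestamps))

def violated_windows(timestamps, window_secs=WINDOW_SECS):
    """Return the window indices whose block count is outside [SLO_MIN, SLO_MAX].

    Windows between the first and last observed window that contain no
    blocks at all count as 0 and are therefore reported as violations.
    """
    counts = authoring_rate(timestamps, window_secs)
    if not counts:
        return []
    first, last = min(counts), max(counts)
    return [
        w for w in range(first, last + 1)
        if not SLO_MIN <= counts.get(w, 0) <= SLO_MAX
    ]
```

Bucketing by fixed windows keeps the check simple. A sliding window would catch violations that straddle a window boundary, at the cost of more bookkeeping.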

Slow/Fast Block Rate

SLI: The number of blocks per time window whose timestamp differs from their parent's by more than 10 sec.
SLO: Not more than 10 per 1 hour window.
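As a rough illustration, the slow-block count could be derived from consecutive block timestamps like this. The sketch assumes the timestamps are in seconds and ordered by block number, so that each entry's parent is the previous entry, and it attributes a slow block to the window its own timestamp falls into. Both are assumptions, as are the function names.

```python
SLOW_DIFF_SECS = 10  # threshold from the SLI
WINDOW_SECS = 3600   # 1 hour window from the SLO
SLO_MAX_SLOW = 10    # "not more than 10" per window

def slow_blocks_per_window(timestamps, window_secs=WINDOW_SECS):
    """Return {window_index: slow_block_count}.

    A block is slow if its timestamp is more than SLOW_DIFF_SECS
    after the timestamp of its parent (the previous list entry).
    """
    counts = {}
    for parent_ts, ts in zip(timestamps, timestamps[1:]):
        if ts - parent_ts > SLOW_DIFF_SECS:
            window = int(ts // window_secs)
            counts[window] = counts.get(window, 0) + 1
    return counts

def slo_met(timestamps):
    """True if no window contains more than SLO_MAX_SLOW slow blocks."""
    return all(n <= SLO_MAX_SLOW for n in slow_blocks_per_window(timestamps).values())
```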

Finalization Rate

SLI: The number of finalized blocks per time window.
SLO: Within [95, 100] per 10 min window.

Slow Finalization Rate

SLI: The number of blocks that take more than 20 seconds to finalize per time window.
SLO: Less than 10 per 1 hour window.
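Both finalization SLIs could be computed in one pass. This is only a sketch: it assumes each block is given as an (authored, finalized) pair of timestamps in seconds, and it buckets blocks by the time they were finalized, which the definitions above leave open. The function name is hypothetical.

```python
from collections import Counter

def finalization_slis(blocks, rate_window=600, slow_window=3600, slow_secs=20):
    """Return (finalized_per_window, slow_per_window) as two dicts.

    `blocks` is an iterable of (authored_ts, finalized_ts) pairs in seconds.
    The first dict counts finalized blocks per 10 min window, the second
    counts blocks that took more than `slow_secs` to finalize per 1 hour
    window. Both are keyed by the window index of the finalization time.
    """
    finalized = Counter()
    slow = Counter()
    for authored_ts, finalized_ts in blocks:
        finalized[int(finalized_ts // rate_window)] += 1
        if finalized_ts - authored_ts > slow_secs:
            slow[int(finalized_ts // slow_window)] += 1
    return dict(finalized), dict(slow)
```

The SLOs would then be checked as "every value in the first dict is within [95, 100]" and "every value in the second dict is less than 10".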

Relevance

The relevance of the metrics is difficult to assess. But they should give us a one-way implication of the form "SLO not reached" => "bad end-user experience".
The other direction of the implication is more difficult to define, but I think this is a good start.

(This approach is a bit similar to the sTPS (standard-TPS) idea, where we quantify some exactly defined metrics to prove our point.)


I think you mean like [95, 100] per 10 minutes? We do 10 blocks per minute :). And we can’t go over 100 because that would imply two blocks in one authoring slot.

Oh yea :man_facepalming: (edited it)

Okay, having 100 as a strict upper limit is even better, I guess.

Another approach to configuring the initial state could be a runtime genesis config preset (introduced in #2714).

With this, the chopsticks step would not be necessary, and the 2nd step (generate chainspec) could look like this:

chain-spec-builder -c output/chainspec.json create -n poc-runtime -i poc-runtime -r ./output/poc_runtime.compact.compressed.wasm -s named-preset name-of-the-config

where name-of-the-config could be the name of the preset used for local development, staging, a local testnet, etc.

Would that be useful for developing runtime scenarios with omni-node?


It could be useful, but to be honest, it is just easier to modify the chopsticks YAML config than to set up a preset and use it.

Enabling The New Release Process

The time has finally come to move on from our current way of releasing the Polkadot SDK. Release 1.14 will be the last of its kind. From then on, we move to the :sparkles: New Release Process.
This will deliver more stability and ease of use for parachain developers.

There will be a stable release every three months and bug/security fixes during its lifetime. The idea is that parachain teams can just use cargo upgrade during these three months to stay up-to-date with the latest fixes without having to integrate any changes.

Implications for Parachain Developers

Future adjustments to this process rely largely on your feedback. We want to make the lives of parachain developers easy. If the first iteration does not achieve this, then we ought to change it.

Please report any SemVer violation in a stable release during its lifetime so that we can backtrack and improve the process.

Implications for SDK Developers

Developers need to be aware of this shift in mentality when designing new things. Polkadot is already a platform with tons of good features; we now need to focus on making them more reliable, easy to use and stable. Stable in this sense means not changing something that parachain teams are already using.

More concretely, the PrDoc system will be used extensively. The bump declared for each crate will determine that crate's next version at release time. The R0-silent label needs to be used with more care, so as not to accidentally break downstream projects.
Getting an API right the first time is difficult. One thing that can help here is to introduce new features as “experimental”. This allows for future changes to the API until it’s properly stabilized, since experimental code can change at any time.

Please refer to RELEASE.md for info on how to merge bug fixes into a stable release.

Implications for SDK Auditors

The tedious task of arguing about audit priorities with stakeholders should be mostly rectified. The order in which audits must happen is exactly the order in which changes got merged into master. One exception exists for backports; as these are critical fixes, they need to be audited out-of-order to be released swiftly.

The head of auditing is responsible for advancing the audited label in lockstep with delivered audits. This label must always point to a commit on the master branch and thereby indicate the latest change that was completely audited. See AUDIT.md.

Implications for SDK Project Managers

New features can only be shipped every three months. This should be taken into account when drafting project timelines.


The first stable release is out. The new tag naming schema is polkadot-stableYYMM with the latest release being called polkadot-stable2407.
There are backwards-compatible tags, as if it were called the 1.15 release, but please still adjust your release process for the future.


Should we expect this to change as well? And if so, when?

./polkadot --version
polkadot 1.15.0-743dc632fd6

Very good comment.

As we are packaging polkadot as a snap (https://snapcraft.io/polkadot), we would like to be able to name the version in line with whatever the expected versioning of “stable” releases will be.

What would be the naming scheme for “hotfixes” etc? polkadot-stable2407-1 ?

What polkadot --version would report is a relevant question.


And the release analysis report for this release is out as well :tada:

The first hotfix will be branched off tomorrow.

It will be stable2407-1. I don't expect the naming convention of the binary version to change with that, so it will be 1.15.1.
For the next stable release (planned for September) we will adapt it. We have not quite decided yet what to choose.
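For tooling that has to deal with the new tags (packaging, for example), a parser could look roughly like the sketch below. It assumes that the tag is `polkadot-stableYYMM` with an optional `-N` hotfix suffix as in stable2407-1, that the `polkadot-` prefix may be omitted, and that YY refers to the years 20YY. The function name is hypothetical and not part of any release tooling.

```python
import re

# stableYYMM with an optional "polkadot-" prefix and an optional "-N" hotfix suffix.
TAG_RE = re.compile(r"^(?:polkadot-)?stable(\d{2})(\d{2})(?:-(\d+))?$")

def parse_stable_tag(tag):
    """Split a stable release tag into year, month and hotfix number.

    A tag without a hotfix suffix is reported with patch 0.
    Raises ValueError if the tag does not follow the assumed schema.
    """
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a stable release tag: {tag!r}")
    yy, mm, patch = match.groups()
    month = int(mm)
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in tag: {tag!r}")
    return {
        "year": 2000 + int(yy),  # assumption: YY means 20YY
        "month": month,
        "patch": int(patch) if patch else 0,
    }
```

Sorting releases by the (year, month, patch) triple then gives their chronological order without relying on the 1.15.x binary version.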


Stable2409 is now cut off and planned for release on the 25th. It will undergo testing procedures until then.
Past and future releases are tracked in the Release Registry to improve communications. Feel free to open issues there if you think that some info is missing or in an unsuitable format. The registry is at an early stage, and we are still trying to find a good fit.


Status Update for stable2409

(This is an experiment to try and coordinate comms inside & outside of Parity)

The release is now cut-off and being tested. The release draft can be found here. It is planned to be published on the 25th. This follows our new process of having a QA period before pushing out a release to ensure higher quality.
Future updates will be posted in the tracking issue.


Status Update for stable2409

Release stable2409 is now published. The Fellowship can now pick it up and integrate it into the runtimes.

Westend and Rococo runtimes + nodes will be updated soon. The final changelog is here.