I believe that we do have very good tech, and I think that we should prove it. Just claiming this without data to back it up is what everybody else does.
Sure, anybody could do this… but Parity & the Fellowship are almost always the ones who wrote the buggy code, investigate the issue, fix it, and write the post-mortem. Yes, in theory anybody else could also do that.
But by doing this ourselves, we are not preventing anyone else from also doing it. So I don't think that is an argument against us doing it.
Anyway, regardless of who ends up doing this, it would still help to define some metrics that allow anybody to build useful monitoring. Maybe that should be its own topic, but for now I will put it here:
SLIs (Service Level Indicators) & SLOs (Service Level Objectives)
These are metrics and objectives that help paint a picture of how well our tech stack is working. I think there is some low-hanging fruit here that should be easy to define.
Polkadot Relay Chain
Authoring Rate
SLI: The number of authored blocks per time window.
SLO: Within [95, 100] per 10-minute window.
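As a rough sketch of how this SLI could be computed, assuming we already collected the timestamps (in milliseconds) of all authored blocks; the function name and input format are made up for illustration, not an existing API:

```rust
/// Counts the blocks whose timestamp falls into the given window.
/// `block_timestamps_ms` is a hypothetical input: the timestamps (ms) of all
/// blocks we observed being authored.
fn authoring_rate(block_timestamps_ms: &[u64], window_start_ms: u64, window_len_ms: u64) -> usize {
    block_timestamps_ms
        .iter()
        .filter(|&&ts| ts >= window_start_ms && ts < window_start_ms + window_len_ms)
        .count()
}

fn main() {
    // With a 6-second target block time, a 10-minute window ideally holds 100 blocks.
    let timestamps: Vec<u64> = (0u64..100).map(|i| i * 6_000).collect();
    let count = authoring_rate(&timestamps, 0, 10 * 60 * 1_000);
    println!("authored blocks: {count}, SLO met: {}", (95..=100).contains(&count));
}
```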
Slow/Fast Block Rate
SLI: The number of blocks per time window whose timestamp differs from their parent's by more than 10 seconds.
SLO: No more than 10 per 1-hour window.
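A sketch for this one, under the assumption that we have the on-chain timestamps (in milliseconds) of consecutive blocks in parent-to-child order; the helper is hypothetical:

```rust
/// Counts blocks whose timestamp is more than `threshold_ms` ahead of their parent's.
/// `timestamps_ms` is assumed to contain consecutive block timestamps in chain order.
fn slow_blocks(timestamps_ms: &[u64], threshold_ms: u64) -> usize {
    timestamps_ms
        .windows(2)
        .filter(|w| w[1].saturating_sub(w[0]) > threshold_ms)
        .count()
}

fn main() {
    // Mostly 6 s blocks, with two slow gaps of 12 s and 18 s.
    let timestamps = vec![0u64, 6_000, 18_000, 24_000, 42_000, 48_000];
    let slow = slow_blocks(&timestamps, 10_000);
    println!("slow blocks: {slow}, SLO met: {}", slow <= 10);
}
```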
Finalization Rate
SLI: The number of finalized blocks per time window.
SLO: Within [95, 100] per 10-minute window.
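The finalization rate could be sampled in the same spirit, e.g. by comparing the finalized head height at the start and end of a window (again a hypothetical helper, not an existing API):

```rust
/// Number of blocks finalized during the window, derived from the finalized
/// head height sampled at the window boundaries (hypothetical inputs).
fn finalized_in_window(height_at_start: u64, height_at_end: u64) -> u64 {
    height_at_end.saturating_sub(height_at_start)
}

fn main() {
    let finalized = finalized_in_window(1_000, 1_098);
    println!("finalized blocks: {finalized}, SLO met: {}", (95..=100).contains(&finalized));
}
```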
Slow Finalization Rate
SLI: The number of blocks per time window that take more than 20 seconds to finalize.
SLO: Fewer than 10 per 1-hour window.
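And a sketch for the slow-finalization SLI, assuming we record for every block when it was first imported and when it was observed as finalized (both made-up observability inputs, in milliseconds):

```rust
/// Counts blocks whose observed finalization lag exceeds `threshold_ms`.
/// Each tuple is (imported_at_ms, finalized_at_ms) for one block; both values
/// are assumed to come from our own monitoring, not from the chain itself.
fn slow_finalizations(observations: &[(u64, u64)], threshold_ms: u64) -> usize {
    observations
        .iter()
        .filter(|&&(imported, finalized)| finalized.saturating_sub(imported) > threshold_ms)
        .count()
}

fn main() {
    // One of three blocks took 25 s from import to finalization.
    let observations = vec![(0u64, 18_000u64), (6_000, 24_000), (12_000, 37_000)];
    let slow = slow_finalizations(&observations, 20_000);
    println!("slowly finalized blocks: {slow}, SLO met: {}", slow < 10);
}
```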
Relevance
The relevance of these metrics is difficult to assess, but they should give us a one-way implication of the form "SLO not reached" => "bad end-user experience". The other direction of the implication is more difficult to establish, but I think this is a good start.
(This approach is a bit similar to the sTPS (standard TPS) idea, where we quantify some exactly defined metrics to prove our point.)