Bread, pastries, and lots of other delicious foods are baked in an oven. If you are baking bricks or other building blocks, you need a kiln. Presently, Obsidian Systems' Kiln is an application to monitor Tezos nodes. Future releases will also monitor bakers, and in its final form, Kiln will run the node and the baker, baking the blocks that build Tezos.
A missed baking opportunity can often be traced to an issue with the node on which a baker relies. Nodes can fall behind the network, lose their internet connection, run out of storage, or experience an array of other issues. Any of these could cause a baker to miss their opportunity, and if the only way of knowing a node's health is to check in manually, the risk of missing opportunities is much greater. Kiln identifies these issues the moment they happen and sends you an alert.
Monitoring your own nodes is good, but not quite enough. Their status (i.e., the head block level and the fitness of that block) is only relevant within the context of the network. Only then can you answer questions like 'is my node on a branch?' or 'has my node fallen behind?'
For that reason, Kiln has multiple Public Nodes for you to watch alongside your monitored nodes. Tezos is a distributed network with no canonical source of truth. Relying on one source for the context of the network would introduce a single point of failure, cause false alarms, and promote centralization. We encourage you to connect to all three Public Node options: a node caching service offered by Obsidian Systems, a load-balanced collection of nodes from the Tezos Foundation, and the tzscan API from OCamlPro. We have plans to add more options in the future. We'd love to hear from you if you run a public node that is suitable for Kiln!
Kiln sends a notification when a monitored node (not a public node) has an issue. There are four such notifications:
- When Kiln can't connect to your node - this is most likely because it is no longer online.
- When your node has fallen 2 blocks behind the head block level - The head block level is the highest known block level of the fittest branch of any node (public nodes included) to which Kiln is connected. If your node is behind, your baker will miss its opportunity to bake or endorse.
- When your node is not on the fittest branch - There are no rewards for baking or endorsing blocks that are not on the main chain!
- When your node is on the wrong network - This is indicative of a setup issue. If you configured Kiln to monitor mainnet but your node is connected to alphanet, you'll see this error.
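The four checks above can be sketched in a few lines. This is only an illustration of the logic as described in this post, not Kiln's actual implementation: the `NodeStatus` type, the `check_node` function, and the alert names are hypothetical, fitness is simplified to a single integer, and the branch check assumes that a node at the same level as the fittest head but with lower fitness must be on a different branch.

```python
from dataclasses import dataclass
from typing import List, Optional

# Blocks behind the best known head before the "behind" alert fires.
BEHIND_THRESHOLD = 2


@dataclass
class NodeStatus:
    """Hypothetical snapshot of a node's head (field names are illustrative).

    Level and fitness are assumed to be populated whenever the node is reachable.
    """
    name: str
    reachable: bool
    network: Optional[str] = None       # e.g. "mainnet" or "alphanet"
    head_level: Optional[int] = None
    head_fitness: Optional[int] = None  # simplified: fitness as one integer


def check_node(monitored: NodeStatus,
               public_nodes: List[NodeStatus],
               expected_network: str) -> List[str]:
    """Return the alerts that apply to a monitored node."""
    if not monitored.reachable:
        return ["unreachable"]
    if monitored.network != expected_network:
        return ["wrong_network"]

    # Network context: every reachable node on the expected network,
    # the monitored node included.
    peers = [n for n in public_nodes
             if n.reachable and n.network == expected_network]
    peers.append(monitored)

    # The head block level is the level of the fittest known head.
    best = max(peers, key=lambda n: (n.head_fitness, n.head_level))

    if best.head_level - monitored.head_level >= BEHIND_THRESHOLD:
        return ["behind"]
    # Simplification: same (or higher) level with lower fitness means
    # the node is following a less fit branch.
    if (monitored.head_level >= best.head_level
            and monitored.head_fitness < best.head_fitness):
        return ["wrong_branch"]
    return []
```

Because the best head is taken across all connected nodes, adding more public nodes to `public_nodes` widens the context without changing the logic, which mirrors the reason for watching several Public Nodes at once.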
In addition to in-app notifications, Kiln can send you notifications via Telegram or through an SMTP mail server that you configure. For most users, Telegram will be the most convenient option.
It is easy to start using Kiln. Running the Docker image takes only two commands.
You can find the project here. It's entirely open source!
We're now shifting our focus to monitoring bakers. This covers information on a baker's history and rights, plus some other cool data bakers can use to measure performance and uptime. This also includes baker notifications relating to missed bakes and endorsements, sufficient baker balances, double operation accusations and more. You can expect to start seeing this functionality next month!
We've standardized the logging in Kiln (documented here) and will add notifications to those logs. If you are a sophisticated baker who has already found solutions for monitoring but would like to take advantage of some of these alerts, this should make it fairly easy to integrate them with services like PagerDuty.
Odds and Ends
We believe the alerts currently in place for nodes cover nearly every issue that could cause a baker to miss their rights due to node error. However, here are two ideas for additional notifications:
- Minimum Connection Threshold - If a baker is using Private Mode on one of their nodes, losing connection to their other nodes will jeopardize their participation in consensus. This setting would notify you if your node's connections drop below a number you specify.
- Tezos Update Available - Alphanet, zeronet, and mainnet periodically receive updates. This feature would compare a user-provided git hash of their node to the latest git hash of the respective Tezos branch on GitLab. Should they no longer match, the user would receive a notification.
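Both ideas boil down to simple comparisons. The sketch below is only an illustration of how they might work, since neither feature exists yet: the function names are hypothetical, and accepting an abbreviated git hash as a match is an assumption on our part rather than a decided behavior. Fetching the connection count and the latest branch hash is left out.

```python
from typing import Optional


def connection_alert(active_connections: int, minimum: int) -> Optional[str]:
    """Return an alert message if connections drop below the chosen minimum."""
    if active_connections < minimum:
        return (f"connections dropped to {active_connections}, "
                f"below the minimum of {minimum}")
    return None


def update_alert(node_hash: str, branch_hash: str) -> Optional[str]:
    """Return an alert message if the node's git hash no longer matches the branch.

    Assumption: an abbreviated hash counts as a match when it is a prefix
    of the other hash, so "8a1b2c3" matches a full 40-character hash.
    """
    node = node_hash.strip().lower()
    branch = branch_hash.strip().lower()
    if not node or not branch:
        raise ValueError("both git hashes must be non-empty")
    if node.startswith(branch) or branch.startswith(node):
        return None
    return f"update available: node is at {node}, branch is at {branch}"
```

For example, `connection_alert(3, 5)` returns a message while `connection_alert(5, 5)` returns `None`, since the alert is meant to fire only once the count falls below the threshold.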
We'd also like to add more notification pathways. Some options we're considering include Slack and Twilio (SMS).
We want to sincerely thank the bakers who have provided their input up until this point, whether by filling out questionnaires, participating in interviews, or just talking about baking. We'd also like to thank the Tezos Foundation for their grant, which allowed us to build these tools.
Using Kiln and have a question? Want to give feedback? We'd love to hear from you! Shoot us an email to join our Slack: email@example.com