What's new in Salt 3002 Magnesium
This is an unofficial summary of new features in the Salt Magnesium release. The release is relatively modest in new features and mostly focused on bugfixes (which is actually good because we had enough regressions in past releases! 🤞). If you want to know about other changes and deprecations, go read the official release notes and the changelog. The packages are available at repo.saltproject.io.
And if you haven’t heard the shocking big news: VMware acquired SaltStack on Oct 13, 2020 (Press release, Early announcement 1, Early announcement 2, The New Stack publication, Q/A video with Tom Hatch).
- Notable bugfixes
- Development
- Idem support
- Cloud
- FreeBSD
- macOS
- Windows
- Networking
- Performance
- SaltSSH
- Certificates
- Files
- Disks and volumes
- MySQL
- GitFS and Pillar
- Bootstrap
- Beta Tiamat packages
- Nifty tricks
- Other notable changes
Want to read about the upcoming Argon release?
I’m always hesitant to commit to writing another post like this one (it takes a lot of time!). However, I get bits of motivation to do so when people subscribe to the mailing list.
Notable bugfixes
Usually, I do not cover bugfixes, but this time I’ll make an exception and highlight a couple of important ones (the choice of bugs to mention is totally subjective!). Unfortunately, the new Salt release policy means that these fixes won’t be backported to older supported versions 😞
ZMQ hang
It is a really nasty bug that manifested itself as a salt-call command hang. The garbage collector was involved, and a couple of initial attempts (1, 2, 3) to find the root cause were unsuccessful. Finally, after three long months, the problem was solved!
Fixed in PR #58364 by Charles McMarrow
It looks like the multithreading/multiprocessing code in Salt is convoluted. Maybe that is the reason why it can’t be upgraded to Tornado 6 yet…
Loader race condition
Apparently, this multithreading bug affected Salt for years, and the first attempts to fix it were made in 2015. You can take a look at issue #58369 and all the linked ones to see how it manifested itself and the troubles it caused. The initial fix was merged into the old develop branch and was supposed to be shipped in Salt Neon, but it wasn’t backported to the new master branch. The final fix was implemented by @dwoz in Salt Magnesium.
Fixed in PRs #56513 by Daniel Wozniak and #58630 by Pedro Algarvio
Msgpack buffer size
Another bug had also lingered for almost two years (at least since 2019) and resulted in an unresponsive Salt Master when a minion payload exceeded a certain size limit (with specific msgpack versions).
salt-ssh reverse DNS lookup
It looked more like a feature, but I’m really happy that it was finally removed because it affected my workflow quite often. The gist is that salt-ssh did a reverse DNS lookup when an IP address was specified, and then used the resulting DNS name to connect to a host (or failed horribly when there was no reverse DNS record). I also believe that this feature (now removed) had potential security implications.
Fixed in PR #58163 by Megan Wilhite
Development
Welcome Bot
The Stale Bot is dead; long live the Welcome Bot!
PR #58414 by Kirill Ponomarev
Pyupgrade
A pre-commit hook that uses pyupgrade to upgrade code syntax to Py3 incrementally. While the automated Py3 transition is really cool, the incremental part is debatable because it adds a lot of noise to each PR, and it is impossible to add it to .git-blame-ignore-revs. I would prefer a single giant commit like with Black.
PR #57936 by Pedro Algarvio
Rstcheck
Another cool pre-commit addition checks the documentation *.rst files using the rstcheck tool. It can validate both the rst syntax and any embedded code blocks.
PR #58675 by Pedro Algarvio
Pytest
There were tons of Pytest-related changes made by Pedro Algarvio, but the switch to Pytest hasn’t happened yet.
Idem support
Idem is a new standalone state/execution engine that is similar to Salt states. This topic is quite extensive, and I’m not going to cover it here (see references for more details on that). Instead, I’ll focus on Salt’s actual changes, provide a few examples, and add a couple of thoughts at the end.
There are three things the change brings:
- Sub state returns
- idem execution module
- idem state module
Sub state returns
Below is the current state data return format:
def succeed(name):
    return {
        'name': name,
        'result': True,
        'comment': 'Success!',
        'changes': {
            'old': 'Old thing',
            'new': 'New thing'
        }
    }
And here is the same state return with the new sub_state_run key:
def succeed(name):
    return {
        'name': 'Salt_Idem_State',
        'result': True,
        'comment': 'Success!',
        'changes': {
            'old': 'Old thing',
            'new': 'New thing'
        },
        'sub_state_run': [
            {
                'name': 'Idem_Sub_State',
                'result': True,
                'comment': 'Sub state success!',
                'changes': {
                    'testing': {
                        'old': 'Old sub thing',
                        'new': 'New sub thing'
                    }
                },
                'low': {
                    'name': 'Idem_Sub_State',
                    'state': 'test',
                    '__id__': 'test_state',
                    'fun': 'succeed_with_changes'
                }
            },
        ]
    }
These sub_state_run items will be formatted and printed alongside other Salt states, as long as they look like normal states with low and high data. This is useful for states that can return multiple state runs from an external engine. For example, it could be utilized by state modules that extend tools like Puppet, Chef, Ansible, and Idem. So far, the feature is only used by Idem state in Salt.
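To make the structure a bit more tangible, here is a minimal Python sketch that splits such a return into the parent entry plus one entry per sub state, and builds a tag shaped like the ones in the Salt output. The flatten_state_return helper is hypothetical (it is not part of Salt, whose own handling is more involved), and the tag layout is an assumption based on the tags Salt prints.

```python
def flatten_state_return(ret):
    """Split a state return into the parent entry plus one entry per sub state.

    Hypothetical helper for illustration only, not Salt's implementation.
    """
    parent = {key: value for key, value in ret.items() if key != "sub_state_run"}
    flattened = [parent]
    for sub in ret.get("sub_state_run", []):
        low = sub.get("low", {})
        # Assumed tag layout: <state>_|-<id>_|-<name>_|-<fun>
        tag = "_|-".join(
            [low.get("state", ""), low.get("__id__", ""), sub["name"], low.get("fun", "")]
        )
        entry = {key: value for key, value in sub.items() if key != "low"}
        entry["tag"] = tag
        flattened.append(entry)
    return flattened


example = {
    "name": "Salt_Idem_State",
    "result": True,
    "comment": "Success!",
    "changes": {"old": "Old thing", "new": "New thing"},
    "sub_state_run": [
        {
            "name": "Idem_Sub_State",
            "result": True,
            "comment": "Sub state success!",
            "changes": {"testing": {"old": "Old sub thing", "new": "New sub thing"}},
            "low": {
                "name": "Idem_Sub_State",
                "state": "test",
                "__id__": "test_state",
                "fun": "succeed_with_changes",
            },
        },
    ],
}

results = flatten_state_return(example)
for item in results:
    print(item.get("tag", "(parent)"), "->", item["result"])
```

The parent entry keeps its own result and changes, so a failure in any sub state can be reported separately from the wrapper state.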
idem execution module
First, we need to install Idem on a minion:
% sudo salt minion1 pip.install idem
minion1:
----------
pid:
17444
retcode:
0
stderr:
stdout:
Collecting idem
...
Installing collected packages: aiofiles, aiologger, pop-config, dict-toolbox, pop, colored, toml, rend, acct, idem
Successfully installed acct-2.3 aiofiles-0.4.0 aiologger-0.6.0 colored-1.4.2 dict-toolbox-1.12 idem-7.5 pop-14 pop-config-6.10 rend-4.2 toml-0.10.1
Now we can try running an Idem execution module:
% sudo salt minion1 idem.exec test.ping
minion1:
True
Success! This is an Idem ping, not a Salt one, although it is proxied through Salt. The call chain looks like this:
[Salt CLI] -> [salt-master] -> [salt-minion] -> [Salt Idem module] -> [POP] -> [Idem] -> [Idem test.ping]
The same is possible on the salt-master:
% sudo pip3 install idem
Collecting idem
...
Installing collected packages: toml, dict-toolbox, aiofiles, aiologger, pop-config, pop, acct, colored, rend, idem
Successfully installed acct-2.3 aiofiles-0.4.0 aiologger-0.6.0 colored-1.4.2 dict-toolbox-1.12 idem-7.5 pop-14 pop-config-6.10 rend-4.2 toml-0.10.1
% sudo salt-run salt.cmd idem.exec test.ping
[INFO ] Module /usr/local/lib/python3.8/dist-packages/pop/mods/pop/testing returned virtual FALSE: Async pop testing libs are not available
True
[INFO ] Runner completed: 20201018154413570141
We got some logging garbage, but the ping has succeeded! The call chain looks different:
[Salt Runner CLI] -> [salt-master] -> [salt.cmd runner] -> [Salt Idem module] -> [POP] -> [Idem] -> [Idem test.ping]
I guess it would be more straightforward if an Idem runner existed:
% sudo salt-run idem.exec test.ping
Because there are various idem-* cloud drivers being developed, running them directly from the Salt master (through this hypothetical runner or maybe a Salt cloud driver) would be more convenient.
idem state module
Now let’s try to run some Idem states through Salt states. First, we need to write an Idem state (it looks exactly like a Salt one):
test_sub_state:
  test.succeed_with_changes:
    - name: Idem_Sub_State
Because Idem doesn’t have any fileserver components yet (it is pretty much like the salt-call --local command), the state file needs to reside on the target machine. However, we can distribute it using Salt! Let’s save this state into salt://magnesium/idem/idem_state.sls.
Next, we need to write a Salt state that distributes the file to a target machine and then calls the Idem state module:
# magnesium/salt_idem_state.sls
cache_idem_state_file:
  file.cached:
    - name: salt://magnesium/idem/idem_state.sls
idem_state:
  idem.state:
    - runtime: parallel
    - sls: idem_state
    # The dir where Idem will look for idem_state.sls:
    - source_dir: /var/cache/salt/minion/files/base/magnesium/idem
Let’s try to run it:
% sudo salt minion1 state.apply magnesium.salt_idem_state
minion1:
----------
file_|-cache_idem_state_file_|-salt://magnesium/idem/idem_state.sls_|-cached:
----------
__id__:
cache_idem_state_file
__run_num__:
0
__sls__:
magnesium.salt_idem_state
changes:
----------
hash:
----------
new:
0c35e83383dbc4310e9763f6c1c92af38d087769386cd44f6c130633be0f61d7
old:
55a744653d314fbf9075d0e676fe03f97aaa5c0d287e50b7dcf0cb61fb7b38b3
comment:
File is cached to /var/cache/salt/minion/files/base/magnesium/idem/idem_state.sls
duration:
55.144 ms
name:
salt://magnesium/idem/idem_state.sls
result:
True
start_time:
23:29:42.799900
idem_|-idem_state_|-idem_state_|-state:
----------
__id__:
idem_state
__run_num__:
1
__sls__:
magnesium.salt_idem_state
changes:
----------
comment:
Ran 1 idem states
duration:
86.894 ms
name:
idem_state
result:
True
start_time:
23:29:42.866037
test_|-idem_sub_state__|-Idem_Sub_State_|-succeed_with_changes:
----------
__run_num__:
3
__sls__:
magnesium.salt_idem_state
__state_ran__:
True
changes:
----------
testing:
----------
new:
Something pretended to change
old:
Unchanged
comment:
Success!
duration:
None
name:
Idem_Sub_State
result:
True
start_time:
None
The output looks weird (unlike the normal highstate output), but I guess it is ok for an experimental feature. I haven’t tried running any cloud drivers (idem-vagrant, idem-virtualbox, idem-aws, etc.), but I plan to do so when I have more free time. So far, I have spent about 6 hours experimenting with this Salt/Idem integration. The docs are quite sparse, and I encountered a couple of issues which I plan to report (the examples above are intentionally crafted to avoid them).
Overall, there are a couple of other gaps that aren’t addressed yet.
No reliable ways to install/upgrade
The recommended method to install Idem is to use pip. Unfortunately, the dependencies (and any transitive ones) aren’t pinned to exact package versions. This means installs aren’t repeatable, and something can break unexpectedly in a new environment.
It is harder to manage the upgrade process, as compared to apt/yum, for example.
Some components aren’t regularly uploaded to PyPI, so you have to install them directly from Git.
Atomized dependencies
The stack is split into idem, pop, pop-config, acct, rend, takara, and the various idem-* components. It is good for developers, but for new users, it can be hard to understand how these components interact, how to troubleshoot any problems, and where to file bugs. The documentation is focused on developers, is quite spotty, and is dispersed across multiple projects, which also doesn’t help.
No compelling reasons to switch
So far, POP and Idem are mostly marketed to developers. The overall messaging is that these components:
- Are a new programming paradigm and a dataflow language
- Are easy to compose/merge, support pipelining
- Have typed arguments, contracts, etc.
- Use asynchronous Python 3 code
- Are easy to write
- Are easy to teach/handoff to a new developer
There is much less information on why end-users should use these projects instead of (or in addition to) other tools:
- What can Idem do for me that I can’t achieve with Terraform, Pulumi, Ansible, Salt, etc?
- What is the feature parity of the various idem-* cloud providers with other tools? Are there any killer use-cases that are only achievable with Idem?
- “Takara is the standalone manager for keeping track of secrets at rest”. How is it different from the (encrypted) pillar or Vault?
Summary
- POP, Idem, etc., are still in the early stages of development
- If you are a developer and want to understand and play with these projects - definitely do that!
- For end-users - try if you are adventurous, but do not expect any super practical use-cases yet. The potential is there, but it needs a lot of development effort to realize it.
- This distributed (plugin-based) development model could result in three big changes (as compared to Salt): a) some components will be iterated on and released much faster, and some of them can become unmaintained; b) the overall quality variation will be higher (because the components will be maintained by different people instead of SaltStack team); c) it would be harder to install/upgrade the resulting product (atomized and incompatible components, unsynchronized releases, feature disparity, etc.)
- Overall it makes sense to keep an eye on these projects. They are in active development, and it was said that VMware is going to continue investing in this area.
References
PRs #58119 and #57993 by Akmod
Cloud
Linode APIv4
The Linode APIv4 went to the Early Access stage sometime in 2017, then switched to the General Release stage in May 2018. The SaltStack cloud driver transition was initiated about a year ago (I’ve seen a couple of Slack discussions during the summer of 2019), and the actual PR was merged in Sep 2020. The previous version (APIv3) is marked as deprecated, but I was unable to find a scheduled removal date.
To switch the driver to v4, use the api_version salt-cloud provider option:
my-linode-provider:
  driver: linode
  api_version: v4
  apikey: f4ZsmwtB1c7f85Jdu43RgXVDFlNjuJaeIYV8QMftTqKScEB2vSosFSr...
  password: F00barbaz
PRs #58093 and #58415 by Charlie Kenney
Azure
Enable rudimentary support for Azure MSI-style authentication. This is similar to assigning roles via IAM profiles in Amazon EC2 and removes the need to put secure credentials into config files inside of VMs.
AWS
- Make the list_availability_zones AWS cloud function public. PR #58422 by Mark Ferrell
- Allow looking up the default AWS VPC, which is automatically created at account creation time. PR #58628 by Mark Ferrell
- Major refactoring of boto modules to switch from boto to boto3. It is a work in progress by Mark Ferrell. Unfortunately, not all of his PRs made it into the Salt Magnesium release (it would be cool if SaltStack did some release planning/coordination with major contributions like this). PRs #58660, #58622. WIP (unmerged) PRs: #58624, #58709, #58713, #58715, #58723
Other cloud changes
- Allow using a custom port for the Proxmox connection. PR #58103 by @mahalel
- Add libvirt memory tuning options to allow much greater control of memory allocation. Available options: hard_limit, soft_limit, swap_hard_limit, and min_guarantee. PR #57636 by Guoqing Li
- Modify the existing virt.migrate functionality to use the Python bindings of libvirt instead of the virsh command-line tool. This approach makes the libvirt migration options available to users via a generic interface that doesn’t require an increased number of functions. Although backward compatibility is preserved, the ssh parameter of virt.migrate, virt.migrate_non_shared, and virt.migrate_non_shared_inc has been marked as deprecated. PR #57947 by Radostin Stoyanov
- Storage pool and node device events are now reported as Salt events in the libvirt events engine. PR #57747 by Cedric Bosdonnat
- Allow setting VM boot devices order in virt.running and virt.defined states through the boot_dev parameter. PR #57545 by Cedric Bosdonnat
- Leave boot parameters untouched if boot parameter is set to None in virt.update. PR #58332 by Cedric Bosdonnat
- Convert the disks that are defined using libvirt volumes into file- or block-based disks where possible. PR #58400 by Cedric Bosdonnat
FreeBSD
FreeBSD is now officially supported!
- New CI pipeline
- Improved salt-bootstrap support. PRs #1462 by Christer Edwards and #1487 by Kirill Ponomarev
- Tons of fixed tests and other improvements by Kirill Ponomarev and Pedro Algarvio: PRs #57427, #57457, #57615, #57616, #57620, #57622, #57623, #57643, #58088, #58152, #58161, #58200, #58236, #58240, #58268, #58275, #58290, #58292, #58293, #58295, #58527, #58697, #58756
macOS
The majority of macOS-related changes are bugfixes, but as a whole they mean that macOS Big Sur is now supported!
- Various bugfixes in macOS and Windows service management modules. PR #57942 by Wesley Whetstone
- Include Big Sur version “11” in mac_softwareupdate (apparently Apple is using both “10.16” and “11” for versioning Big Sur, depending on where you look). PR #58247 by Shea Craig
- Fix mac_service failures on Big Sur. PR #58144 by Wesley Whetstone
- If you specify a service on macOS using the naming convention of other platforms (like salt-minion), it is now converted to its proper name, com.saltstack.salt.minion. The same applies to salt-master, salt-api, and salt-syndic. PR #57646 by Wesley Whetstone
- Prevent the GUI prompt to install the Developer Command Line Tools from appearing at GitPython import time. PR #58581 by Shea Craig
- Also, there is some research going on to support Code Signing and Notarization for Salt packages: https://gist.github.com/twangboy/064936b14e3c5ce4b58771d3e4534c9a
Windows
- Experimental support for salt-api on Windows. The only supported backend is rest_cherrypy, without PAM and SSL features. PR #58049 by @rares-pop
- Add optional execution_timeout parameter to chocolatey.installed state. PR #58053 by @rmustard and Shane Lee
- New chocolatey.list_sources function to get an overview of the repositories present on the minions. New chocolatey.source_present state to add a custom repository with optional authentication. PR #58590 by @TGuimbert
- Add restart delay of 60 seconds when minion fails to start (to prevent a CPU-hungry restart loop). PR #58523 by Shane Lee
Networking
Delta Proxy supporting code
These changes prepare some ground for the closed source Delta Proxy feature (that is NOT going to be open-sourced yet because of the implementation complexity). They are mostly focused on passing configuration and context options, subproxy process management, and netmiko proxy module improvements. There is no point in trying to run this code because the proprietary bits are not there, and the minion will likely crash if you set the metaproxy: deltaproxy config option.
PR #58403 by Gareth J. Greenaway
Other networking changes
- Make proxy_config read the proxy specific configuration which is typically found in /etc/salt/proxy.d/minionid/. PR #58307 by Gareth J. Greenaway
- Add network teaming support (and clean up bond support) for RHEL/CentOS. Network teaming support was added via two new interface types: team and teamport. Interface templates for EL5 and EL6 were removed, as Salt is no longer supported on these platforms. PR #57775 by Erik Johnson and Andreas Thienemann
- Add accept_ra 2 (Accept Router Advertisements) option support to modules.debian_ip (in addition to currently supported 0 and 1 modes). PR #58097 by @Natrinicle
- The nftables execution and state modules are now much closer to the iptables modules and work on nftables versions tested with CentOS 8. PR #56259 by Nicholas Hughes
- Make DNS optional in the nilrt_ip.set_static_all execution function. PR #58479 by Cristian Hotea
- Configure network interfaces without cable with nilrt_ip, plus a bit of refactoring. PR #56893 by @alexvasiu
- Adjust CIMC proxy module to handle HTTP errors and timeouts properly. PR #58050 by @spenceation
Performance
- Fix pillar caching for multiple pillar environments. PR #58274 by Gareth J. Greenaway
- Reduce proxy minion startup time by changing the enable_fqdns_grains setting to default to False. PR #57676 by Gareth J. Greenaway
- Add 2-second timeout on DNS socket check to utils.network.dns_check. This avoids holding up the minion start for several minutes and helps detect unreachable masters much more quickly. PR #58046 by @gdavis33
- Do not fall back to 127.0.0.1 for un-resolvable masters in failover mode. It also avoids unnecessary delays on minion start and salt-call operations when their primary master is down. PR #57699 by Serge Dubrouski
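To illustrate why a short socket timeout matters for startup time, here is a generic Python sketch of a bounded connection check. It is not the code from utils.network.dns_check: the port_reachable helper is hypothetical, and the demo uses a throwaway local listener instead of a real master.

```python
import socket


def port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper; without a timeout, a connect to an unreachable host
    can block for as long as the OS default allows.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections and timeouts alike
        return False


# Demo against a temporary local listener (no external network needed)
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
open_port = listener.getsockname()[1]

reachable = port_reachable("127.0.0.1", open_port)
listener.close()
unreachable = port_reachable("127.0.0.1", open_port, timeout=0.5)

print(reachable, unreachable)
```

With the timeout in place, the worst case per check is bounded by the timeout value rather than by the operating system defaults.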
SaltSSH
- A small change for the scan roster type to make salt-ssh accept a comma-separated list of minions. PR #57799 by Dmitry Kuzmenko
- Do not allow Python 2 modules to be added to salt-ssh tar archive by default. You either need to install Python 3 on the target host or use an older Salt version (e.g., 2019.2.5) with ssh_ext_alternatives. PR #58389 by Megan Wilhite
Certificates
- Add optional retcode argument to win_certutil.add_store and win_certutil.del_store functions so they return retcode instead of standard output. The win_certutil state will now succeed across different languages by observing return code rather than output. PR #57831 by Tyler
- Allow x509.certificate_managed state to use a CSR. PR #58282 by @alxwr
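The idea behind the win_certutil change (judge success by the return code, not by localized output) can be shown with a generic Python sketch. It does not call certutil: the command_succeeded helper is hypothetical, and the child processes below are stand-ins that print a made-up localized message.

```python
import subprocess
import sys


def command_succeeded(args):
    """Decide success from the exit code only, ignoring what the tool prints."""
    completed = subprocess.run(args, capture_output=True, text=True)
    return completed.returncode == 0


# Stand-in for a tool that reports success in German and exits with code 0
german_ok = command_succeeded(
    [sys.executable, "-c", "print('Der Vorgang wurde erfolgreich beendet.')"]
)

# Stand-in for a failing tool: non-zero exit code, whatever the output says
failed = command_succeeded([sys.executable, "-c", "import sys; sys.exit(1)"])

print(german_ok, failed)
```

A check that searched the output for an English success phrase would have misjudged the first command, while the exit code stays the same in every language.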
Files
- An ability to regex match a line in the managed file in order to place the block above or below that line. This capability is enabled with the two mutually exclusive parameters for file.blockreplace state (insert_before_match and insert_after_match), that accept regex expressions. PR #56691 by Nicholas Hughes
- The serializer argument has been added to the file.serialize state. It supersedes the mutually exclusive formatter argument, serves the same purpose, and brings more symmetry with the serializer_opts and deserializer_opts argument names. This change also avoids adding a trailing newline for binary serializers like msgpack. PR #57858 by Erik Johnson
- Ignore file permissions for symlinks in file.directory state. PR #57782 by Dmitry Kuzmenko
- Add new verify_ssl option (True by default) to file modules (file.managed state; cp.cache_file, file.get_source_sum, file.get_managed, file.check_managed_changes, file.check_file_meta, file.manage_file functions). When set to False, remote file sources (https://) and source_hash won’t attempt to validate the server certificate. PR #58451 by Megan Wilhite
- Create an ini file if it does not exist when using the ini.options_present state module. PR #58339 by Megan Wilhite
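The insert_before_match / insert_after_match idea from the first item can be sketched in a few lines of Python. This is only an illustration of the matching logic: the insert_block helper is hypothetical, it works on a list of lines instead of a file, and it skips everything else file.blockreplace does (such as managing the block markers).

```python
import re


def insert_block(lines, block, insert_before_match=None, insert_after_match=None):
    """Insert `block` before or after the first line matching a regex.

    Hypothetical helper; exactly one of the two patterns must be given,
    mirroring the mutually exclusive parameters described above.
    """
    if (insert_before_match is None) == (insert_after_match is None):
        raise ValueError("specify exactly one of insert_before_match / insert_after_match")
    before = insert_before_match is not None
    pattern = re.compile(insert_before_match if before else insert_after_match)
    for index, line in enumerate(lines):
        if pattern.search(line):
            position = index if before else index + 1
            return lines[:position] + list(block) + lines[position:]
    return list(lines)  # no match: this sketch leaves the content unchanged


config = ["[main]", "log_level = info", "[extra]"]
block = ["# managed block start", "timeout = 30", "# managed block end"]

after = insert_block(config, block, insert_after_match=r"^\[main\]")
before = insert_block(config, block, insert_before_match=r"^\[extra\]")

print("\n".join(after))
```

What the real state does when nothing matches is not covered here; returning the content unchanged is simply the most conservative choice for a sketch.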
Disks and volumes
- Add lvm grain to list LVM volume groups and their logical volumes (works on Linux and AIX). PR #57631 by Piter Punk
- Change the lvm.lv_present state to accept a resizefs switch that will resize the filesystem when the logical volume is resized. Add the pvresize and lvextend functions to linux_lvm module. Add the force flag to linux_lvm.pvcreate, linux_lvm.pvremove, linux_lvm.vgcreate, linux_lvm.vgextend, linux_lvm.vgremove, linux_lvm.lvremove, linux_lvm.lvresize. PR #58133 by Piter Punk
- If the size specified in a lvm.lv_present state is different than the logical volume actual size, the logical volume will be increased to the new specified size. PR #57659 by Piter Punk
- Add the fs_mount parameter to the mount.fstab_present state to determine if the filesystem will be mounted by mount all on AIX machines. PR #57669 by Piter Punk
MySQL
- Update mysql_database state to work in test mode. PR #58149 by Cesar Augusto Sanchez
- Add support of GRANT with per column privileges (GRANT SELECT(column1, column2) ON database.table TO user@host; should work now). PR #55980 by sizgiyaev
GitFS and Pillar
Per-remote git_pillar_base override
When you have multiple pillar git repositories, and the majority of them have a full set of branches (i.e., develop, qa, staging, and master), you can also have a couple of exceptions that have fewer branches (i.e., just develop and master). In this case, you may want to always point the latter ones to a specific branch on every salt-master (while keeping the ability to override pillarenv). Now you can do so:
git_pillar_base: develop
ext_pillar:
  - git:
    - __env__ https://gitserver/git-pillar.git
    - __env__ https://gitserver/git-pillar2.git:
      - base: master
PR #57288 by Mathieu Parent
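Here is how I picture the branch selection for the config above, written as a small Python sketch. It is a simplified model and not Salt’s code: the resolve_branch helper is hypothetical, and it assumes that an explicitly requested pillarenv (other than base) always wins, while the default environment maps to the per-remote base or, failing that, to git_pillar_base.

```python
def resolve_branch(pillarenv, global_base, remote_base=None):
    """Pick the branch an __env__ remote maps to in this simplified model."""
    if pillarenv and pillarenv != "base":
        return pillarenv  # an explicit pillarenv override wins
    # Default environment: the per-remote base takes priority over git_pillar_base
    return remote_base if remote_base is not None else global_base


# Remotes from the example config, with their optional per-remote base
remotes = {
    "https://gitserver/git-pillar.git": None,
    "https://gitserver/git-pillar2.git": "master",
}

default_branches = {
    url: resolve_branch(None, "develop", base) for url, base in remotes.items()
}
qa_branches = {
    url: resolve_branch("qa", "develop", base) for url, base in remotes.items()
}

print(default_branches)
print(qa_branches)
```

In this model the second repository stays on master by default, and both repositories still follow an explicit pillarenv.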
Update specific GitFS repos
A new remotes argument to specify which repositories to update:
salt-run fileserver.update backend=git remotes=myrepo,yourrepo
PR #55014 by Mathieu Parent
Bootstrap
There are a couple of improvements in the salt-bootstrap.sh script:
- Support for Linux Mint 20. PR #1502 by Tai Groot
- Improved Gentoo support: add git installation method; install any version available in Gentoo portage using the stable method; support both OpenRC and systemd; add tests. PR #1500 by Ivo Jánský
- Allow pinning minor 3xxx versions by specifying a fake .0 version suffix: salt-bootstrap.sh -x python3 stable 3001.0. PR #1491 by Max Arnold
- Improved FreeBSD support
Beta Tiamat packages
In addition to the traditional Salt packages, there are new beta Tiamat-based ones (generated nightly). Tiamat is a wrapper around PyInstaller that simplifies the process of building Python projects as a single frozen binary.
The packages are available through artifactory.saltstack.net and there is a README.md with instructions on how to install them on different operating systems.
Nifty tricks
Jinja profiling
This feature adds some fine-grained profiling capabilities to Jinja templates. It could help optimize pillar (think salt-master CPU usage!) and state render times, especially when you heavily use (or abuse) Jinja and various map files.
Let’s take a simple state file (that also uses a yaml map) to see how to profile it:
# magnesium/profile_example.sls
{% import_yaml 'magnesium/profile_map.yaml' as data %}
{% set local_data = {'counter': 0} %}
{% for i in range("0xB00"|int(base=16)) %}
{% do local_data.update({'counter': i}) %}
{% endfor %}
always-changes-and-succeeds:
  test.succeed_with_changes:
    - comment: "Count: {{ local_data['counter'] }}"

# magnesium/profile_map.yaml
{% set data = {'counter': 0} %}
{% for i in range("0x700BAD"|int(base=16)) %}
{% do data.update({'counter': i}) %}
{% endfor %}
data: {{ data }}
Here is how profiling-related log messages looked before Salt Magnesium:
% sudo salt-call state.apply magnesium.profile_example -l profile 2>&1 | grep PROFILE
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'jinja' renderer: 6.4014732837677
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'yaml' renderer: 0.0005137920379638672
We just see the basic render time for each renderer stage, and that’s it. In Salt Magnesium, all import_* statements are also automatically profiled! Note the first log line that is new here and is missing in the previous output:
% sudo salt-call state.apply magnesium.profile_example -l profile 2>&1 | grep PROFILE
[PROFILE ] Time (in seconds) to render import_yaml 'magnesium/profile_map.yaml': 7.446487665176392
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'jinja' renderer: 7.470700025558472
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'yaml' renderer: 0.002416372299194336
This automatic profiling is enabled for the import_yaml, import_json, and import_text statements. Unfortunately, it doesn’t work with native Jinja imports (import and from ... import ...). Below is a modified state file that uses from ... import instead of import_yaml:
# magnesium/profile_example.sls
{% from 'magnesium/profile_map.yaml' import data %}
{% set local_data = {'counter': 0} %}
{% for i in range("0xB00"|int(base=16)) %}
{% do local_data.update({'counter': i}) %}
{% endfor %}
always-changes-and-succeeds:
  test.succeed_with_changes:
    - comment: "Count: {{ local_data['counter'] }}"
Oops, we’re back to square one (just two default log lines):
% sudo salt-call state.apply magnesium.profile_example -l profile 2>&1 | grep PROFILE
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'jinja' renderer: 6.4014732837677
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'yaml' renderer: 0.0005137920379638672
Fortunately, the new profile Jinja block statement allows you to dig even further and measure how much time a specific Jinja line or block takes. Let’s add it to the state file to measure the import statement as well as the loop that modifies the local_data dictionary:
# magnesium/profile_example.sls
{% profile as 'import data' %}
{% from 'magnesium/profile_map.yaml' import data %}
{% endprofile %}
{% profile as 'local data' %}
{% set local_data = {'counter': 0} %}
{% for i in range("0xB00"|int(base=16)) %}
{% do local_data.update({'counter': i}) %}
{% endfor %}
{% endprofile %}
always-changes-and-succeeds:
  test.succeed_with_changes:
    - comment: "Count: {{ local_data['counter'] }}"
Now there are four logged messages! The first one shows how much time the from ... import statement takes, and the second one measures the for loop block:
% sudo salt-call state.apply magnesium.profile_example -l profile 2>&1 | grep PROFILE
[PROFILE ] Time (in seconds) to render profile block 'import data': 6.11327338218689
[PROFILE ] Time (in seconds) to render profile block 'local data': 0.002644062042236328
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'jinja' renderer: 6.132274866104126
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'yaml' renderer: 0.0004177093505859375
You can add as many profile block statements as you want (and even nest them!) to find what part of your template takes most of the time.
PR #57850 by Justin Findlay and Brian Harring
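If the idea of a named, nestable profiling block feels abstract, the same pattern can be sketched in plain Python with a context manager. This is an analogy only, not how Salt implements the Jinja extension: profile_block, the timings dict, and the logger name are all hypothetical.

```python
import logging
import time
from contextlib import contextmanager

log = logging.getLogger("profile_sketch")  # hypothetical logger name
timings = {}  # block name -> elapsed seconds


@contextmanager
def profile_block(name):
    """Measure how long the wrapped block takes and record it under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[name] = elapsed
        log.info("Time (in seconds) to render profile block '%s': %s", name, elapsed)


# Blocks can be nested, just like the Jinja profile statements
with profile_block("outer"):
    with profile_block("local data"):
        local_data = {"counter": 0}
        for i in range(0xB00):
            local_data.update({"counter": i})

for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f}s")
```

The outer block always takes at least as long as the inner one, which is a handy sanity check when nested measurements are compared.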
Pro Tip: if you want to perform some profiling on a remote minion and do not want to leave your cozy salt-master console (running salt-call remotely or checking the logs can be cumbersome), try the following command:
% sudo salt minion1 cmd.run \
'salt-call state.apply magnesium.profile_example -l profile 2>&1 | grep PROFILE'
minion1:
[PROFILE ] Time (in seconds) to render profile block 'import data': 6.6108009815216064
[PROFILE ] Time (in seconds) to render profile block 'local data': 0.002589702606201172
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'jinja' renderer: 6.627198219299316
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'yaml' renderer: 0.00037360191345214844
In one shot, you can run a minion process (the salt-call command) with an elevated logging level AND return the resulting logs back to the master.
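Once the PROFILE lines are back on the master, a few lines of Python can sort them by duration. The sketch below parses the log format shown above. The parse_profile_lines helper is hypothetical, and the regex assumes the message layout stays exactly as in these examples.

```python
import re

# Matches lines like:
# [PROFILE ] Time (in seconds) to render profile block 'import data': 6.61
PROFILE_LINE = re.compile(
    r"^\[PROFILE\s*\] Time \(in seconds\) to render (?P<what>.+): "
    r"(?P<seconds>\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)$"
)


def parse_profile_lines(text):
    """Return (seconds, description) tuples, slowest first."""
    entries = []
    for line in text.splitlines():
        match = PROFILE_LINE.match(line.strip())
        if match:
            entries.append((float(match.group("seconds")), match.group("what")))
    return sorted(entries, reverse=True)


sample = """\
minion1:
[PROFILE ] Time (in seconds) to render profile block 'import data': 6.6108009815216064
[PROFILE ] Time (in seconds) to render profile block 'local data': 0.002589702606201172
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'jinja' renderer: 6.627198219299316
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/magnesium/profile_example.sls' using 'yaml' renderer: 0.00037360191345214844
"""

entries = parse_profile_lines(sample)
for seconds, what in entries:
    print(f"{seconds:10.4f}s  {what}")
```

Lines that do not match the pattern (such as the minion ID header) are skipped, so the raw command output can be fed in as is.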
Cleanup of template context variables
This improvement fixes a couple of long-standing bugs with sls template context variables and also documents all of them:
Variable | How formula is applied | Old value | Value with enable_slsvars_fixes |
---|---|---|---|
tplfile | state.apply formula.formula | formula.sls | formula/formula.sls |
Variable | State file | Old value | Value with enable_slsvars_fixes |
---|---|---|---|
tplfile | salt://example.sls | (blank) | example.sls |
tplfile | salt://formula/example.sls | formula/example.sls | formula/example.sls |
Variable | How formula is applied | Old value | Value with enable_slsvars_fixes |
---|---|---|---|
slspath | state.apply formula | formula | formula |
slspath | state.apply formula.init | formula/init | formula |
sls_path | state.apply formula | formula | formula |
sls_path | state.apply formula.init | formula_init | formula |
slsdotpath | state.apply formula | formula | formula |
slsdotpath | state.apply formula.init | formula.init | formula |
slscolonpath | state.apply formula | formula | formula |
slscolonpath | state.apply formula.init | formula:init | formula |
tplfile | state.apply formula | formula/init.sls | formula/init.sls |
tplfile | state.apply formula.init | formula/init.sls | formula/init.sls |
tpldir | state.apply formula | formula | formula |
tpldir | state.apply formula.init | formula | formula |
tpldot | state.apply formula | formula | formula |
tpldot | state.apply formula.init | formula | formula |
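The fixed values in the last column follow a simple rule, which can be sketched in Python. This is my reading of the tables rather than the actual implementation: the sls_context helper and its resolves_to_init parameter are hypothetical, and the sketch only covers the case where the template is the sls file itself.

```python
def sls_context(sls, resolves_to_init):
    """Compute the fixed sls variables for a dotted sls name.

    resolves_to_init: True when `sls` names a directory whose init.sls is used
    (e.g. "formula" -> formula/init.sls), False when it names the file itself
    (e.g. "formula.init" -> formula/init.sls).
    """
    parts = sls.split(".")
    if resolves_to_init:
        tplfile = "/".join(parts + ["init.sls"])
        dir_parts = parts
    else:
        tplfile = "/".join(parts) + ".sls"
        dir_parts = parts[:-1]
    return {
        "tplfile": tplfile,
        "slspath": "/".join(dir_parts),
        "sls_path": "_".join(dir_parts),
        "slsdotpath": ".".join(dir_parts),
        "slscolonpath": ":".join(dir_parts),
    }


for name, is_init in [("formula", True), ("formula.init", False), ("formula.formula", False)]:
    print(name, "->", sls_context(name, is_init))
```

Both state.apply formula and state.apply formula.init end up with the same directory-based values, which is the whole point of the fix.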
To enable the improved behavior, add the following option to the minion config (and possibly the master config too, if you use those variables in reactors/orchestrations) and restart the daemon:
features:
  enable_slsvars_fixes: true
The feature flag will go away in Salt Phosphorus (3005), and the fixed functionality will become the default.
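To make the "fixed" column of the tables easier to reason about, here is a minimal Python sketch that reproduces those values from the resolved template path. It is not Salt's implementation: `derive_sls_vars` is a hypothetical helper, and it has only been checked against the table rows above (files at the root of the file server may be reported differently by Salt).

```python
import posixpath


def derive_sls_vars(tplfile: str) -> dict:
    """Hypothetical helper, not part of Salt.

    Derives the context variables from the resolved template path,
    matching the 'Value with enable_slsvars_fixes' column above.
    """
    tpldir = posixpath.dirname(tplfile)  # directory that holds the SLS file
    return {
        "tplfile": tplfile,
        "tpldir": tpldir,
        "tpldot": tpldir.replace("/", "."),
        "slspath": tpldir,
        "sls_path": tpldir.replace("/", "_"),
        "slsdotpath": tpldir.replace("/", "."),
        "slscolonpath": tpldir.replace("/", ":"),
    }


# Both `state.apply formula` and `state.apply formula.init` resolve to the
# same file, so with the fix they yield the same values.
print(derive_sls_vars("formula/init.sls"))
```

The point of the fix is visible here: every variable depends only on where the file actually lives, not on how the formula was referenced on the command line.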
PR #58238 by @mlasevich
<RANT INCOMING>
There were two issues with this changeset that I want to highlight.
1. The feature was broken out of the box
It had mandatory (since SEP 10) unit tests, passed multiple reviews, was merged, and was released in the Salt Magnesium Release Candidate. And yet:
- The docs used a feature toggle name that was different from the implementation
- When enabled, the code crashed with an exception
It was quite easy to find serious crashes in previous Salt releases by briefly trying new features. I did that a couple of times when I wrote these unofficial release notes in the past. And it is still quite easy to find such bugs right now, despite SEP 10 and more than a year of improvements. I’m happy to help in testing, but it is scary that I’m able to find issues like this with so little effort. Sometimes I feel like I’m the only person who does that… It is very worrying that SaltStack has no QA processes that can catch such obvious defects.
To me, this tweet made some time ago by Michael DeHaan (Ansible creator) still rings somewhat true:
Many projects have issues beyond just systems mgmt - Kubernetes is turning into a minefield. Salt remains swamped on tickets and still auto-merges PRs without apparent testing of any kind. Basically all projects owe their users a promise to not overcommit.
— laserllama (@laserllama) November 30, 2018
Here is another one:
they also shared a lot about how they were the friendly community. They merged EVERY patch. Which is SCARY. People said I wasn’t nice because I did some code review and said no to some things.
— laserllama (@laserllama) May 15, 2020
2. It also contained yet another feature flag subsystem
I understand the intent behind that, and 100% agree with it. Backward-incompatible and other dangerous changes need to be hidden behind feature flags. However, this topic is quite complex, and SaltStack lacks robust processes around that. There are more than 100 boolean config options in Salt that were introduced over time, and I bet at least some of them are feature flags:
salt-repo % grep ': bool,' salt/config/__init__.py | wc -l
117
They are all named differently, aren’t organized into namespaces, some of them do not have a defined deprecation timeline (i.e., are just there indefinitely), etc. Moreover, there are flags that only exist in code, aren’t mentioned in the docs, and can’t even be found by the above grep command. This is not sustainable, and there is no system in that…
And now we have this PR that sneaks in a generic feature flag subsystem implementation. The new features namespace can be set through config files and needs a minion restart. Okay, looks good.
But Salt already has an existing feature flag subsystem since 2017! The feature flag namespace is named use_superseded (although the name is terrible). And it is more powerful:
- Feature toggles can be set through config files or pillar data (can be set centrally and without a restart)
- It provides a decorator that can automatically switch implementations without introducing new conditionals in the code
- And it automatically prints a deprecation warning message!
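To illustrate what a decorator-based toggle of that kind buys you, here is a minimal Python sketch. It is not Salt's code: `superseded_by`, `OPTS`, `new_run`, and `legacy_run` are hypothetical names, and the plain dictionary merely stands in for config or pillar data.

```python
import functools
import warnings

# Hypothetical stand-in for config/pillar data (not Salt's real opts object).
OPTS = {"use_superseded": ["module.run"]}


def superseded_by(new_impl, flag_name, opts=OPTS):
    """Route calls to new_impl when flag_name is listed under use_superseded."""

    def decorator(old_impl):
        @functools.wraps(old_impl)
        def wrapper(*args, **kwargs):
            # The toggle is read on every call, so changing it needs no restart.
            if flag_name in opts.get("use_superseded", []):
                return new_impl(*args, **kwargs)
            warnings.warn(
                f"{flag_name}: the old implementation is deprecated; "
                "list it under use_superseded to opt in to the new one",
                DeprecationWarning,
                stacklevel=2,
            )
            return old_impl(*args, **kwargs)

        return wrapper

    return decorator


def new_run():
    return "new"


@superseded_by(new_run, "module.run")
def legacy_run():
    return "old"


print(legacy_run())  # the flag is set in OPTS, so the new implementation runs
```

The function body itself contains no conditionals: the switch and the deprecation warning live entirely in the decorator, which is the property praised in the list above.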
Maybe there were perfectly valid reasons to add the second implementation, but they weren’t communicated. All I have is just questions:
- Is it necessary to have both? Can the existing one be improved instead?
- What should be added to the Salt development docs about adding new feature toggles?
- What about the UX? How are end-users supposed to deal with deprecations? Heck, there is even a long-forgotten SEP, that nobody wants to implement…
Oh, and the old module.run syntax (deprecated since 2017) has been “undeprecated”, and it looks like instead of actively and safely managing (and owning) the transition, SaltStack has decided to leave it in limbo…
Am I too spoiled by the excellent feature deprecation process in Django, and how reluctant (in a positive way) they are about adding new settings?
</END OF RANT>
Unless/onlyif result parsing
This feature extends the unless/onlyif module calling capability that was released in Salt Neon and further improved in Salt Sodium. It introduces the new get_return key, which is used to determine the value to parse for modules that return deep data structures. Internally, it relies on the traverse Jinja lookup filter.
Given the state:
test:
  test.nop:
    - name: foo
    - unless:
      - fun: test.arg
        kwarg:
          deep: False
The test.arg execution module call returns kwargs: {deep: False}. We’d like to evaluate that key for the onlyif or unless behavior, but it is not possible. Having a result returned at all is evaluated as True, and thus the state does not run.
Here is how the state looks with the new get_return keyword added:
test:
  test.nop:
    - name: foo
    - unless:
      - fun: test.arg
        kwarg:
          deep: False
        get_return: kwargs:deep
The False return of the module can now be detected and evaluated by the unless/onlyif requisites, and the state is run.
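The difference between the two states comes down to which value gets truth-tested. The following Python sketch shows the idea with a simplified colon-delimited lookup. The `traverse` function here is a hypothetical stand-in written for illustration, not Salt's own traversal code, and it handles fewer cases.

```python
def traverse(data, path, default=None, delimiter=":"):
    """Simplified colon-delimited lookup (illustrative, not Salt's code)."""
    current = data
    for key in path.split(delimiter):
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif isinstance(current, list):
            # Treat the path component as a list index.
            try:
                current = current[int(key)]
            except (ValueError, IndexError):
                return default
        else:
            return default
    return current


# Return shape taken from the text above: kwargs: {deep: False}
ret = {"kwargs": {"deep": False}}

# Without get_return, the whole (non-empty) structure is truth-tested.
print(bool(ret))  # True

# With get_return: kwargs:deep, only the nested value is tested.
print(traverse(ret, "kwargs:deep"))  # False
```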
PR #57504 by Christian McHugh
Another interesting twist is buried in the PR comment:
We have an initiative right now to start standardizing returns from execution modules. Though, it’s going to take some time to complete those efforts. We’re aiming to get some validation in place during MG that can start warning of non-standard returns.
I believe that initiative is being explored in PR #58508 by Dmitry Kuzmenko
Other notable changes
- ansiblegate module/state improvements. Take care of failed, skipped and unreachable tasks during playbook execution. Propagate retcode from ansible-playbook CLI execution when calling ansible.playbooks to allow proper success/failure detection. Only execute ansible-playbook --check in test mode (test=True) to allow running playbooks that are using modules which do not allow --check. PR #58214 by Pablo Suárez Hernández
- Add only_fails=True option to saltcheck to only display test failures. PR #57824 by Christian McHugh
- Add a boolean strict flag to sdb.get to force it to fail if the URI is not valid. PR #57630 by Proskurin Kirill
- Pass cmd.run state arguments to unless and onlyif when they exist (restore the function behavior that was lost in Salt Sodium). PR #57825 by Christian McHugh
- Add the slackware_service module to support managing services in Slackware Linux. PR #58207 by Piter Punk
- Add a COPR option to pkgrepo.absent and pkgrepo.managed on RedHat based systems (Cool Other Package Repositories, similar to Ubuntu PPA repos). It is a shortcut for either manually downloading the repo file or using cmd.run to add the COPR repository. PR #57259 by @mymindstorm
- Handle the preservation of raw application/x-www-form-urlencoded traffic in cherrypy. PR #54901 by Nicholas Hughes
- Support Zabbix API >= 4.4. PR #56996 by Bruno Costa
- Add windows_codepage argument (65001 by default, which refers to the UTF-8 character set) to cmd.run, cmd.powershell, cmd.powershell_all, cmd.shell, cmd.run_stdout, cmd.run_stderr, cmd.run_all, cmd.retcode, cmd.script, cmd.script_retcode, and cmd.run_bg. PR #58008 by Charles McMarrow
- Update git_pillar on master start instead of waiting for git_pillar_update_interval. PR #56316 by Mathieu Parent
- Add display controller to supported gpu_classes, enabling Intel GPUs using that classname to be found. PR #57603 by Gijs Peskens
- Add append=True to various warnings.filterwarnings calls to avoid overriding previously defined filters. PR #57598 by Jean-Yves NOLEN
- Return binary data in binary mode from file.read. PR #58034 by Justin Findlay
- Support lists in grains.filter_by (utils.data.traverse_dict_and_list). PR #56689 by Akmod
- Add underscore to sanitize_host accepted charset, to prevent network.ping from mangling perfectly valid hostnames. PR #58069 by @mattp-
- Accept nested namespaces in spacewalk.api runner function. PR #57491 by Alexander Graul
- Rename _schedule.conf to _schedule.confYAMLError on YAML parse error to prevent high CPU usage. I’m surprised that this patch was merged because it feels more like a band-aid than a root cause fix, but I guess having unresponsive minions is much worse. PR #58179 by Markus
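The append=True change from the list above is easy to overlook, so here is a small standard-library sketch of why it matters. By default, warnings.filterwarnings puts the new filter at the front of the filter list, where it wins over anything registered earlier. With append=True it goes to the end and earlier filters keep priority. The function name `user_filter_survives` is made up for this illustration.

```python
import warnings


def user_filter_survives(append):
    """Return True if an earlier 'error' filter still wins after a library
    registers its own 'ignore' filter with the given append setting."""
    with warnings.catch_warnings():
        warnings.resetwarnings()
        # Filter registered first, e.g. by the user or a test suite.
        warnings.simplefilter("error", DeprecationWarning)
        # Filter registered later, e.g. by library code at import time.
        warnings.filterwarnings(
            "ignore", category=DeprecationWarning, append=append
        )
        try:
            warnings.warn("old API", DeprecationWarning)
        except DeprecationWarning:
            return True  # the earlier 'error' filter matched first
        return False  # the later 'ignore' filter took precedence


print(user_filter_survives(append=False))  # False
print(user_filter_survives(append=True))   # True
```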