New features in Salt 3004 Silicon
Salt 3004 Silicon didn’t follow the usual 4-month release cycle and was released seven months after the previous major version. I believe this slowdown is actually good, and I hope VMware won’t rush with the next version too. However, Salt 3004 ships with several new major components and internal changes that may (or may not) signal that something interesting is cooking under the hood.
Enjoy the reading (and check out the official announcement as well)!
- Pluggable transports
- DeltaProxy
- Salt extension modules for VMware
- Native minions
- Internal changes
- Memory leaks
- Slack
- Engines
- Beacons
- State system
- Transactional Systems
- New operating systems support
- Windows improvements
- Vault Enterprise namespaces
- Salt SSH
- Grains
- Packages
- Nifty tricks
- Other notable changes
Pluggable transports
This is a work in progress, but I want to mention it anyway because it can lead to some exciting developments in future versions.
The first PR #60852 by Daniel Wozniak removes transport singletons. It was done as part of tech-debt removal and is related to the second pull request.
The second PR #60867 by Daniel Wozniak wasn’t merged into the Salt 3004 release and is still a work in progress (UPD: it was superseded by #61450). It splits the transport module namespace into channel and transport parts and introduces a couple of classes for channels and transports:
ReqChannel
PushChannel
PullChannel
AsyncReqChannel
AsyncPubChannel
AsyncPushChannel
AsyncPullChannel
ReqServerChannel
PubServerChannel
RequestClient
RequestServer
PublishServer
PublishClient
It also adds a new transport module that uses a centralized RabbitMQ broker server. The rationale behind the new developments is explained in the Pluggable Transports SEP that was accidentally merged without any community discussion. The SEP also mentions an abandoned HTTP Transport PR created after all these disastrous CVEs in 2020. Oh, and salt-syndic doesn’t look like a recommended solution to scale Salt (it was even discussed whether it makes sense to deprecate it).
I’m pretty sure this new transport is not related in any way to the fact that VMware (who acquired SaltStack in 2020) sells Tanzu RabbitMQ - an enterprise version of the open-source RabbitMQ broker. Personally, I’d very much prefer a really secure and well-audited lightweight built-in transport that allows running untrusted minions over the internet than a centralized broker with who knows how many potential vulnerabilities… However, the scalability and fault tolerance bits, plus the potential of new community-driven transport implementations, are really interesting for some use-cases. UPD: there is another PR that has additional details, see #61464.
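To make the channel/transport split more concrete, here is a minimal sketch of the idea. It only borrows the class names ReqChannel, RequestClient, and RequestServer from the list above; the method names, the JSON serialization, and the in-memory "transport" are my own assumptions for illustration and do not mirror the actual Salt code.

```python
import json
from abc import ABC, abstractmethod
from typing import Callable


class RequestServer(ABC):
    """Transport layer, server side: raw bytes in, raw bytes out."""

    @abstractmethod
    def handle(self, payload: bytes) -> bytes:
        ...


class RequestClient(ABC):
    """Transport layer, client side: delivers raw bytes and returns the reply."""

    @abstractmethod
    def send(self, payload: bytes) -> bytes:
        ...


class InMemoryRequestServer(RequestServer):
    """Toy server that passes each payload to a handler function."""

    def __init__(self, handler: Callable[[bytes], bytes]) -> None:
        self._handler = handler

    def handle(self, payload: bytes) -> bytes:
        return self._handler(payload)


class InMemoryRequestClient(RequestClient):
    """Toy client that calls the server object directly (no sockets, no broker)."""

    def __init__(self, server: RequestServer) -> None:
        self._server = server

    def send(self, payload: bytes) -> bytes:
        return self._server.handle(payload)


class ReqChannel:
    """Channel layer: serializes messages and knows nothing about the wire."""

    def __init__(self, client: RequestClient) -> None:
        self._client = client

    def send(self, load: dict) -> dict:
        reply = self._client.send(json.dumps(load).encode("utf-8"))
        return json.loads(reply.decode("utf-8"))


def echo_handler(payload: bytes) -> bytes:
    """Hypothetical master-side handler that echoes the request back."""
    request = json.loads(payload.decode("utf-8"))
    return json.dumps({"echo": request}).encode("utf-8")


channel = ReqChannel(InMemoryRequestClient(InMemoryRequestServer(echo_handler)))
print(channel.send({"cmd": "ping"}))
```

The point of the split is that a different transport (ZeroMQ, TCP, a broker) would only need to provide its own client and server classes, while the channel code stays the same.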
DeltaProxy
DeltaProxy is a special kind of proxy minion that can control multiple devices per proxy process instead of a single device. Its development probably started somewhere in 2018, and the corresponding abstraction layer (MetaProxy) was released as part of Salt 2019.2.1. For some time, the DeltaProxy source code was proprietary. Then, in November 2019, SaltStack briefly considered open-sourcing it but ultimately postponed the decision. And finally, after some refactoring and stabilization efforts, it was open-sourced in Salt 3004 Silicon. To read this story in more detail, check out the MetaProxy section of my Salt Neon release notes.
Now to the feature itself. First, the documentation is non-existent. This fact can trigger me to write another rant, but I’m not in the right mood at the moment :) The only configuration examples I was able to find are located in #60177. Below is my (possibly incorrect) summary.
First, you need to have a node to run the salt-proxy process. It could be hosted on the same node as your salt-master, on any minion node, or a dedicated one. Second, to enable the feature, you need to add the following option to the /etc/salt/proxy configuration file on that node; otherwise the default metaproxy module will be used (metaproxy: proxy):
master: SALT_MASTER_ADDRESS
metaproxy: deltaproxy
Then you need to define the pillar data for each node:
# pillar/top.sls
base:
  controlproxy:
    - controlproxy
  device1:
    - device1
  device2:
    - device2
For the control proxy (DeltaProxy) node (where you run the salt-proxy process), you need to specify proxytype: deltaproxy and a list of proxied devices:
# pillar/controlproxy.sls
proxy:
  proxytype: deltaproxy
  ids:
    - device1
    - device2
And then, you need to add a pillar file for each proxied device. Since I do not have any real devices to test, I’m using the dummy proxy module:
# pillar/device1.sls
proxy:
  proxytype: dummy

# pillar/device2.sls
proxy:
  proxytype: dummy
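If you have more than a couple of devices, writing these pillar files by hand gets tedious. Below is a small sketch that builds the same data structures in Python. The function name and its parameters are hypothetical; the output is plain dictionaries that you would still need to dump to .sls files with a YAML library of your choice.

```python
def build_deltaproxy_pillars(control_id, device_ids, device_proxytype="dummy"):
    """Return {pillar file name: pillar data} for a control proxy and its devices."""
    device_ids = list(device_ids)
    pillars = {
        # The control proxy lists every device it is responsible for.
        f"{control_id}.sls": {
            "proxy": {"proxytype": "deltaproxy", "ids": device_ids},
        },
    }
    # Each proxied device gets its own pillar file with a regular proxytype.
    for device_id in device_ids:
        pillars[f"{device_id}.sls"] = {"proxy": {"proxytype": device_proxytype}}
    # The top file assigns each pillar file to the minion with the same id.
    pillars["top.sls"] = {
        "base": {minion_id: [minion_id] for minion_id in [control_id, *device_ids]},
    }
    return pillars


pillars = build_deltaproxy_pillars("controlproxy", ["device1", "device2"])
for file_name, data in pillars.items():
    print(file_name, data)
```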
The final step is to start the salt-proxy process and accept the keys sequentially (it is a known limitation):
salt-proxy --proxyid deltaproxy -l debug
salt-key -a deltaproxy
salt-key -a device1
salt-key -a device2
Now you should be able to ping the deltaproxy minion and its proxied dummy devices:
salt '*' test.ping
deltaproxy:
    True
device1:
    True
device2:
    True

salt '*' grains.item osfinger
deltaproxy:
    ----------
    osfinger:
        proxy-proxy
device1:
    ----------
    osfinger:
        proxy-proxy
device2:
    ----------
    osfinger:
        proxy-proxy
I’m not sure which proxy modules are safe to run through DeltaProxy, but the merged PR touches the following ones:
dummy
napalm
rest_sample
Also, it is not clear how many devices could be realistically controlled via a single DeltaProxy (Control Proxy) instance and the difference in consumed resources compared to the same number of regular salt-proxy processes. If you run DeltaProxy in production with real devices and are willing to share some stats, please drop an email to .
PRs #60090 and #60791 by Gareth J. Greenaway
Salt extension modules for VMware
The saltext.vmware collection of modules is not a part of the Salt 3004 release (but was announced around the same time). Instead, it is distributed as a separate Python library using the Salt Extensions mechanism.
The extensions rely on pyVmomi (the Python SDK for the VMware vSphere API to manage ESX, ESXi, and vCenter) and have the following modules:
- ESXi grains
- Proxy Minion interface module for managing ESXi hosts
- Execution modules to manage:
  - Datacenters
  - Clusters, DRS, and HA
  - Distributed vSwitch instances
  - ESXi hosts
  - NSX-T managers, IP address pools and blocks, licenses, segments, Tier 0/1 gateways, transport nodes, profiles, zones, uplink profiles
  - VMs
  - VMC DHCP profiles, Direct Connect, distributed firewall rules, DNS forwarders, NAT rules, networks, public IPs, SDDCs, security groups and rules, VPN stats
- State modules to manage:
  - Datacenters
  - NSX-T managers, IP address pools and blocks, licenses, segments, Tier 0/1 gateways, transport nodes, profiles, zones, uplink profiles
  - VMC security rules
For more details, see the Open Hour recording for September 30th on YouTube and read the introductory blog post. And check out the following howto: Salt SDDC Modules – Getting Started.
The modules are well documented and even have ADRs (yay!).
I also found a vRealize Automation module that is not a part of the main extension and is distributed through the _modules dir.
Native minions
Salt 3004 native minion packages are available for the following platforms:
- Arista 32-bit/64-bit
- Juniper (x86_64)
- Solaris 10 Intel and Sparc
- Solaris 11.4 Intel and Sparc
- AIX v7.1 and v7.2
Instructions for installing the latest packages can be found at gitlab.com.
Internal changes
Loader
- Allow the discovery of Salt extensions installed while Salt is running. Additionally, prevent loading utils from extensions which would be packed into __utils__. PR #60214 by Pedro Algarvio
- Stop using pkg_resources for Salt’s entry points loading. PR #60868 by Pedro Algarvio
- Restore support of loading generator-based entry points. PR #60175 by Pedro Algarvio
- Refactor loader into submodules. PR #60595 by Daniel Wozniak
- Simplify the LazyLoader. PR #60714 by Pedro Algarvio
Other changes
- Drop Python 2 code from the entire codebase. PR #59934 by Daniel Wozniak
- Add Python 3.10 requirements. PR #59953 by Pedro Algarvio
- Handle signals and properly exit instead of raising exceptions. PRs #60972 and #61013 by Pedro Algarvio
- Do more rigorous checking of __salt__ module docstrings on CI. PR #58539 by Pedro Algarvio
- Consolidate __getstate__ and __setstate__ methods of the Process class, to ensure that any forked process on Windows (and all other platforms which support forking) will behave properly without having to implement their own getstate/setstate functions. PR #55793 by Pedro Algarvio
- Do not break master_tops for minions with a version lower than 3003 (the compatibility alias will be deprecated in 3006). PR #60980 by Pablo Suárez Hernández
- Deprecate salt.payload.Serial. PR #60954 by Daniel Wozniak
- Redirect imports of salt.ext.six to six. PR #60967 by Pedro Algarvio
- New allow_one_of() and require_one_of() utility decorators. PR #58742 by Mark Ferrell
- Get rid of salt.utils.zeromq.ZMQDefaultLoop, because Salt no longer supports older versions of ZeroMQ. PR #60618 by Daniel Wozniak
Memory leaks
This is the long-awaited progress in mitigating memory leaks that were related to Gitfs backends. It was done in a crude but very simple and practical way: instead of fighting with memory leaks caused by third-party libraries, the Salt Master file server update thread will restart periodically to release held memory.
I do not believe that the restart interval is configurable (it is set to 300 seconds for now). However, if the gitfs_update_interval setting is higher than 300 seconds, it will be used as the update thread restart interval.
PR #60386 by Daniel Wozniak
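The restart interval rule described above boils down to taking the larger of the two values. Here is a tiny sketch of how I read it; the function and constant names are made up for illustration and are not taken from the Salt source.

```python
DEFAULT_RESTART_INTERVAL = 300  # seconds, the value mentioned above


def update_thread_restart_interval(gitfs_update_interval=None):
    """Return how often (in seconds) the file server update thread restarts."""
    if gitfs_update_interval is None:
        return DEFAULT_RESTART_INTERVAL
    # A longer gitfs_update_interval wins over the built-in default.
    return max(DEFAULT_RESTART_INTERVAL, gitfs_update_interval)


for interval in (None, 60, 900):
    print(interval, update_thread_restart_interval(interval))
```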
Another master memory leak was mitigated in PR #60262 by Daniel Wozniak
Slack
- Update the slack.post_message state, slack engine, slack execution module, and slack returner to account for the deprecation of passing a token as a query string param for web API requests. Finally! PR #60165 by @xeacott
- Add support for posting events to the slack_webhook returner. PR #57480 by Nate Mellendorf
Engines
Engine processes got enhanced process titles. They could be helpful if your custom engine consumes too many resources and you want to spot it just by looking at the process list. To enable this feature, install the python3-setproctitle package, then add some engines to master or minion config files and restart the services:
# /etc/salt/{master,minion}.d/engines.conf
engines:
  - test:
This is how the new titles look in the process list:
ps ax | grep salt | grep test
32647 ? Sl 0:00 /usr/bin/python3 /usr/bin/salt-master salt.engines.Engine(salt.loaded.int.engines.test)
33690 ? Sl 0:00 /usr/bin/python3 /usr/bin/salt-minion KeepAlive MultiMinionProcessManager MinionProcessManager salt.engines.Engine(salt.loaded.int.engines.test)
And if you use multiple instances of the same engine, the process titles will use instance aliases instead:
# /etc/salt/{master,minion}.d/engines.conf
engines:
  - test_instance1:
      engine_module: test
  - test_instance2:
      engine_module: test
ps ax | grep salt | grep test
34388 ? Sl 0:00 /usr/bin/python3 /usr/bin/salt-minion KeepAlive MultiMinionProcessManager MinionProcessManager salt.engines.Engine(salt.loaded.int.engines.test-test_instance1)
34389 ? Sl 0:00 /usr/bin/python3 /usr/bin/salt-minion KeepAlive MultiMinionProcessManager MinionProcessManager salt.engines.Engine(salt.loaded.int.engines.test-test_instance2)
34436 ? Sl 0:00 /usr/bin/python3 /usr/bin/salt-master salt.engines.Engine(salt.loaded.int.engines.test-test_instance1)
34437 ? Sl 0:00 /usr/bin/python3 /usr/bin/salt-master salt.engines.Engine(salt.loaded.int.engines.test-test_instance2)
PR #60260 by Daniel Wozniak
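Judging only by the ps output above, the engine part of the title follows a simple pattern. The sketch below reproduces that pattern as a string formatter; it is inferred from the output and is not the code Salt uses (the helper name and its parameters are hypothetical).

```python
def engine_process_title(engine_module, alias=None, loaded_base="salt.loaded.int.engines"):
    """Build the engine part of a process title as seen in the ps output."""
    name = f"{loaded_base}.{engine_module}"
    if alias is not None:
        # Multiple instances of one engine are told apart by their alias.
        name = f"{name}-{alias}"
    return f"salt.engines.Engine({name})"


print(engine_process_title("test"))
print(engine_process_title("test", alias="test_instance1"))
```

A pattern like this is handy for building a pgrep or grep expression that targets a single engine instance.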
Beacons
- Handle beacon exceptions by logging and firing an event that includes the exception. PR #60619 by Gareth J. Greenaway
- Refresh available beacons when the refresh_modules flag is passed as an argument to a state. PR #60542 by Gareth J. Greenaway
- Make the % sign optional when configuring usage beacons (diskusage, memusage, sensehat, and swapusage). PR #60685 by Gareth J. Greenaway
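The optional % sign in the last item can be handled with a very small normalization step. This is only a sketch of the idea with a hypothetical helper name; the beacons in Salt may parse their thresholds differently.

```python
def parse_percent(value):
    """Accept 90, '90', or '90%' and return the threshold as a float."""
    if isinstance(value, (int, float)):
        return float(value)
    # Drop surrounding whitespace and a trailing percent sign, if any.
    return float(str(value).strip().rstrip("%").strip())


for raw in ("90%", "63", 45, " 12.5 % "):
    print(repr(raw), parse_percent(raw))
```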
State system
- Allow onfail requisite to be used with onchanges and other requisites in a single state. PR #59985 by @xeacott
- Fix file.accumulated dependency handling, when a state_id dependency format is used instead of a function: state_id format. PR #60636 by Gareth J. Greenaway
- Make the state aggregation system properly handle requisites. PR #60604 by Gareth J. Greenaway
- Add ability to pass exclude kwarg to salt.state from orchestrations. PR #58062 by @vryzhenkin and Wayne Werner
- Make sure to always check state_type while compiling require_in, even if the name being searched for already exists at top-level in a highstate, because two different ids can exist with the same name. PR #59943 by @vin01
New operating systems support
The changes listed below do not mean official support as described in the Supported Operating Systems document. Instead, they mean that someone made an improvement in Salt for a specific operating system or the OS was included in the official test suite.
- Run tests on Debian 11 Bullseye. PR #60473 by Bryce Larson
- Run tests on AlmaLinux 8. PR #60209 by Bryce Larson
- Add a salt.utils.platform check to detect the AArch64 64-bit extension of the ARM architecture. PR #59915 by Kirill Ponomarev
- Run tests on CentOS Stream 8. PR #60141 by Bryce Larson
- Recognize Rocky Linux 8 as RedHat in the os_family grain. PR #59682 by @StackKorora, PR #60427 by Kirill Ponomarev
- Recognize Aliyun Linux as RedHat family. PR #59687 by @xuchunmei000
- Add support for Mendel Linux to be detected as Debian. PR #59893 by Morgan Kesler
- Astra Linux (AstraLinuxCE, AstraLinuxSE) is now considered a Debian family distro. PR #59353 by Anton Karmanov
- Add ARM64 support for Ubuntu 20 test pipeline. PR #57997 by Kirill Ponomarev
- Add Debian 11 on ARM64. PR #60901 by Bryce Larson
- Run tests on Fedora-34 instead of Fedora-32. PR #60124 by Bryce Larson
- pip-tools-compile now knows what FreeBSD is. PR #60138 by Pedro Algarvio
Transactional Systems
This feature adds support for transactional systems and openSUSE MicroOS in particular. MicroOS has a read-only root filesystem and the transactional-update tool that leverages snapper, zypper, btrfs and overlayfs to perform atomic updates. Salt 3004 ships with two new execution modules (transactional_update and rebootmgr) and a new executor module (transactional_update). The executor module wraps Salt module calls with transactions. Below is a rough summary of how the feature works:
- It can be activated by adding module_executors: [transactional_update, direct_call] to the minion config file, or by using the command line argument salt-call --module-executors='[transactional_update, direct_call]' test.version
- The list of functions that are wrapped by default: state.single, state.sls, state.apply, state.highstate (it can be controlled via delegated_functions or add_delegated_functions minion options)
- These modules are also wrapped by default (the list can be controlled via delegated_modules or add_delegated_modules minion options)
- You can also schedule a reboot if needed: salt-call --module-executors='[transactional_update]' state.sls stuff activate_transaction=True
- It also adds three new grains (efi, efi-secure-boot, and transactional) and a new function (chroot.in_chroot)
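To illustrate the delegation rules from the list above, here is a sketch of how an executor could decide whether a call should run inside a transaction. I assume that the delegated_* options replace the defaults and the add_delegated_* options extend them; the function name is hypothetical, and the default module list is left as a parameter because I do not reproduce it here.

```python
DEFAULT_DELEGATED_FUNCTIONS = ["state.single", "state.sls", "state.apply", "state.highstate"]


def should_delegate(fun, opts, default_modules=()):
    """Decide whether a module call is wrapped in a transaction (a guess, not Salt's code)."""
    # Assumption: delegated_* replaces the defaults, add_delegated_* extends them.
    functions = list(opts.get("delegated_functions", DEFAULT_DELEGATED_FUNCTIONS))
    functions += list(opts.get("add_delegated_functions", []))
    modules = list(opts.get("delegated_modules", default_modules))
    modules += list(opts.get("add_delegated_modules", []))
    module_name = fun.split(".", 1)[0]
    return fun in functions or module_name in modules


print(should_delegate("state.apply", {}))
print(should_delegate("test.ping", {}))
print(should_delegate("pkg.install", {"add_delegated_modules": ["pkg"]}))
```

Calls that are not delegated would fall through to the next executor in the module_executors list (direct_call in the example config).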
I wonder if NixOS support could be added to this transactional framework (it seems like a good fit).
PR #58520 by Alberto Planas
UPDATE: the internal implementation is going to be redesigned in PR #61188 that was submitted after the release.
Windows improvements
Install anywhere
The default install location will be %ProgramFiles%\Salt Project\Salt for the binary data and %ProgramData%\Salt Project\Salt for the Root Directory (root_dir). A couple of switches control the installer behavior:
- /install-dir allows the user to define the install location via the command line
- /move-config moves config from C:\salt (if found) to %ProgramData%
And for the uninstaller:
- /delete-install-dir deletes the installation directory that contains the config and pki directories. This applies to old method installations where the root directory and the installation directory are the same. The default is not to delete it.
- /delete-root-dir deletes the root directory that contains the config and pki directories. Default is to not delete it.
For more details on how this feature is designed, read the SEP-31.
PRs #60267 and #60952 by Shane Lee
file.patch
I needed this feature a couple of times to self-patch Windows minions and had to implement my own workarounds for that using patch.exe and msys-2.0.dll from Git for Windows Portable. The good thing is that the feature is now built-in; the not-so-good thing is that the patch executable is not bundled with the Salt installer and needs to be delivered to a minion using a separate state. Anyway, the feature is quite helpful, and I’ll definitely try it.
Other Windows-related changes
- Surface the errors that occur when user account creation fails on a Windows box (e.g., when a password does not meet the password policy requirements). PR #59563 by @xeacott
- Fix win_servermanager.install so it will reboot when restart=true is passed. PR #60111 by Shane Lee
- Standardize on using the “Success and Failure” for all auditing policies (both normal and advanced ones). PR #60178 by Shane Lee
- Do not ship unmaintained PythonWin IDE (that is installed with PyWin32) with Salt installer. PR #60754 by Shane Lee
- Update Windows build deps & DLLs, use Python 3.8. PR #59870 by Shane Lee
- Update the build scripts to use a standalone installer for Visual C++ Build Tools 2015 that is a part of VS Build Tools 2017. PR #60093 by Shane Lee
Vault Enterprise namespaces
Namespaces is a set of features within Hashicorp Vault Enterprise that allows Vault environments to support Secure Multi-tenancy (or SMT) within a single Vault infrastructure. Through namespaces, Vault administrators can support tenant isolation for teams and individuals as well as empower delegated administrators to manage their own tenant environment. API operations performed under a namespace can be done by providing the relative request path along with the namespace path using the X-Vault-Namespace header.
To enable the feature, add an optional namespace key to the vault master config section:
vault:
  # ...
  namespace: vault_enterprise_namespace
  # ...
PR #58586 by Edmund Adderley
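On the wire, the namespace is just one more HTTP header next to the token. The sketch below shows how request headers could be assembled; the helper name and the sample values are hypothetical, and this is not how the Salt vault module is implemented.

```python
def vault_request_headers(token, namespace=None):
    """Build HTTP headers for a Vault API request."""
    headers = {"X-Vault-Token": token}
    if namespace:
        # Only Vault Enterprise understands namespaces, so the header is optional.
        headers["X-Vault-Namespace"] = namespace
    return headers


print(vault_request_headers("s.example-token"))
print(vault_request_headers("s.example-token", namespace="team-a"))
```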
Salt SSH
Directory roster
This is a new type of roster called “directory roster”. The directory roster is a flat directory of files. Each file’s name is a minion id, and the contents of each file must yield the data structure expected within each roster entry after being rendered with the salt rendering system. It was introduced to help solve the following use-case:
We maintain our roster in a git repo. As our team grows and we add and remove systems from the roster, the number of merge conflicts in the flat roster file has increased significantly. Switching to this directory roster system has significantly decreased the headache of git merge semantics when multiple git users introduce different roster changes at the same time.
Configuration example:
# /etc/salt/master.d/roster.conf
roster: dir
roster_dir: config/roster.d
# roster_domain: example.com
# config/roster.d/minion-x:
host: minion-x.example.com
port: 22
sudo: true
user: ubuntu
# config/roster.d/minion-y:
host: minion-y.example.com
port: 22
sudo: true
user: gentoo
If you uncomment the roster_domain setting, you can omit the domain part in the individual roster files.
PR #60364 by @kojiromike and Gareth J. Greenaway
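Conceptually, the directory roster maps file names to roster entries. Here is a rough sketch of that mapping. It uses a naive "key: value" parser instead of the Salt rendering system (so all values stay strings), and it assumes that roster_domain is used to derive a missing host from the file name; both are simplifications of mine, and all function names are hypothetical.

```python
from pathlib import Path


def parse_flat_entry(text):
    """Parse flat 'key: value' lines. A stand-in for Salt's renderers, not a YAML parser."""
    entry = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        entry[key.strip()] = value.strip()
    return entry


def build_roster(files, roster_domain=None):
    """Turn {file name: file text} into {minion id: roster entry}."""
    roster = {}
    for minion_id, text in files.items():
        entry = parse_flat_entry(text)
        # Assumption: with roster_domain set, a missing host is derived
        # from the file name plus the domain.
        if roster_domain and "host" not in entry:
            entry["host"] = f"{minion_id}.{roster_domain}"
        roster[minion_id] = entry
    return roster


def load_dir_roster(roster_dir, roster_domain=None):
    """Read every regular file in roster_dir and build the roster from it."""
    files = {
        path.name: path.read_text()
        for path in sorted(Path(roster_dir).iterdir())
        if path.is_file()
    }
    return build_roster(files, roster_domain)
```

Because each minion lives in its own file, two people adding different minions never touch the same file, which is exactly what removes the merge conflicts described in the quote above.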
Heist minion presence events
Allows presence events to work with Heist-Salt minions. If you set the master configuration option detect_remote_minions to True it will try to detect connected minions over port 22 unless the port specified is changed with the configuration remote_minions_port.
Another feature with almost useless documentation. As of the time of this writing, Google search on docs.saltproject.io gives zero matches for the heist keyword. How is a Salt user supposed to learn what Heist is, how it differs from Salt SSH or enable_ssh_minions, and what is the use-case for this option?
Below is what I was able to understand after many hours of research.
There are two googleable repositories that are located on GitLab and have very cryptic descriptions and the motto of “making deployment and management of Salt easy”:
- heist - “ephemeral software tunneling and delivery system”
- heist-salt - “App-merge components for deploying salt with heist”
Using Heist
So, let’s install heist first:
pip3 install heist==5.0.0
Create the necessary folders and files:
mkdir -p /etc/heist/rosters
touch /etc/heist/heist.conf
cat << EOF > /etc/heist/rosters/roster.conf
minion1:
  # host: minion1
  host: 10.211.55.25
  username: vagrant
EOF
I struggled for 5 hours trying to make Heist work. Here is the list of problems I found:
- I had to run pip3 install aiologger; otherwise pop_config crashed when I ran heist. Same as a year ago, the dependencies in the pop ecosystem are not pinned, and the installs are not repeatable.
- I had to install the latest Heist from git by running pip uninstall heist && pip install git+https://gitlab.com/saltstack/pop/heist.git - the grains subcommand was removed in 5.0.0 (but not from the CLI args!), the test one mentioned in the README was not yet released
- I had to downgrade pop-config (pip install pop-config==6.11); otherwise I got nothing (not even log messages with --log-level debug). Did I say already that pop-based installs are not reproducible?
- I was unable to make the roster dir work automatically (got KeyError: 'ssh_scan_ports'), so I had to use the explicit -R /etc/heist/rosters/roster.conf argument
- The cryptic ValueError: The roster scan did not return data when rendered error took the most time to solve. Using a Python debugger, I was unable to understand where exactly the CLI options are passed down to the heist.roster.init.read() and why they are empty. As it turned out, the order of the arguments does matter: heist --log-level info -t minion1 -R /etc/heist/rosters/roster.conf test fails, and heist --log-level info test -t minion1 -R /etc/heist/rosters/roster.conf works. Good luck finding that via heist --help…
So, after so much wasted time, I was able to make the bare Heist work (yeah, the majority of the INFO messages should be logged as DEBUG ones):
heist --log-level info test -R /etc/heist/rosters/roster.conf
[INFO ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/curio_loop returned virtual FALSE: No module named 'curio'
[INFO ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/selector_win returned virtual FALSE: WindowsSelectorEventLoop only runs on windows
[INFO ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/proactor returned virtual FALSE: WindowsProactorEventLoop only runs on windows
[INFO ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/qt returned virtual FALSE: No module named 'qasync'
[INFO ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/trio_loop returned virtual FALSE: No module named 'trio_asyncio'
[INFO ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/uv_loop returned virtual FALSE: No module named 'uvloop'
[INFO ] Picking default roster: flat
[INFO ] This is a test heist manager. You have installed heist correctly. Install a heist manager to use full functionality
It looks like Heist is just a skeleton library with some functions. Here is what they can do:
- Grab host lists from different rosters (clustershell, scan, flat with optional fernet encryption)
- Connect to remote hosts using asyncssh Python library, send files back and forth, run commands (with optional sudo support)
- Create reverse tcp tunnels (via SSH) from ports on the target system back to ports on the source system
- Detect the target OS family and CPU architecture
- Fetch binary artifacts (based on the target OS/arch), verify their signatures and unpack the artifacts
- Manage remote services using systemd
In summary, Heist is a tool to deploy some binaries to remote systems through ssh and run them, with optional tcp tunneling back to the source system. To do something useful with it, you need to install a manager. I’m not sure why Heist is split into two tools (thus complicating the user experience), because I was unable to find any other addons that use it other than heist-salt.
Using Heist Salt
Now let’s install heist-salt as well:
pip3 install heist-salt==4.0.0
Below is what heist-salt is supposed to do:
- Download a single-binary Salt minion package from https://repo.saltproject.io/salt/singlebin/
- Deploy the Salt minion package to a remote system via SSH
- Establish two tcp tunnels back to the source system - 44505 -> 4505, 44506 -> 4506 (you are supposed to run a Salt master on the source host)
- Add a minion configuration file that connects it to localhost master address with ports 44505 and 44506 (i.e., via the SSH tunnel)
- Set the remote minion grain minion_type: heist
- Generate minion keys
- Start the salt-minion service
- Accept the keys on the master
Alternatively, it can bootstrap a Salt minion and connect it to an existing master (Heist will skip the grain and tunnels setup in this mode)
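To show how the tunnel ports and the minion configuration fit together, here is a sketch that derives a minion config from the tunnel mapping listed above. The master, publish_port, and master_port keys are regular Salt minion options; the way the grain is set and the helper names are my assumptions, and the config generated by heist-salt may look different.

```python
def reverse_tunnels():
    """Map ports on the target (minion) side to ports on the source (master) side."""
    return {44505: 4505, 44506: 4506}


def heist_minion_config(tunnels=None):
    """Build a minion config that reaches the master through the SSH tunnels."""
    tunnels = tunnels or reverse_tunnels()
    # Invert the mapping to find which remote port leads to each master port.
    by_master_port = {local: remote for remote, local in tunnels.items()}
    return {
        "master": "127.0.0.1",                   # the tunnel endpoint on the minion
        "publish_port": by_master_port[4505],    # master publish port via the tunnel
        "master_port": by_master_port[4506],     # master return port via the tunnel
        "grains": {"minion_type": "heist"},      # assumption: grain set via config
    }


print(heist_minion_config())
```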
As you can guess, it didn’t work for me either. Setting aside the unreasonably chatty INFO logging, this is what I got when I ran heist --log-level info salt.minion -t minion1 -R /etc/heist/rosters/roster.conf:
- AssertionError: version 3004rc1-1 is not valid. I was running this from a Salt 3004 RC1 master installed from Git, and it looks like the release candidates are not supported (although the RC1 single-binary is available in the repo). I’m not sure how I’m supposed to try a feature that needs 3004rc1 to work.
- Simultaneously, I got the NameError: name 'open' is not defined exception deep in the standard Python logging library. It is an asyncio-induced error; go to bugs.python.org for more details.
- Then I added an explicit version number (3003.3 didn’t work, so I had to use 3003) heist --log-level info salt.minion -t minion1 -R /etc/heist/rosters/roster.conf --artifact-version 3003 and Heist proceeded a little bit further. It was able to download Salt single binary, but the salt-call command exited with error, and everything failed with the json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0) exception.
- When I ran the /var/tmp/heist_vagrant/6547/salt call --config-dir /var/tmp/heist_vagrant/6547 --local grains.items --out json command on the target host manually, it returned a perfectly valid JSON data structure.
- Then I ran chmod go+rwx /var/log/salt && rm -f /var/log/salt/minion on the target system, the Heist command progressed a bit more (it printed some grains), but then failed to create the /etc/systemd/system/salt-minion.service over SFTP due to permission issues.
- I tried to add sudo: true to the roster, but it failed early with asyncssh.sftp.SFTPPermissionDenied: Permission denied trying to copy the Salt binary (a temp folder is created using sudo and is owned by root, but the sftp operation runs under a regular user). Apparently, there is no way to win here with a regular user account and sudo.
I enabled SSH root login, re-ran the command, and voila! The heist command generated and accepted a minion key and stayed in the foreground, keeping the tunnels alive. After 8+ hours, I was finally able to run a command on a Salt Heist minion:
salt minion1 grains.get minion_type
minion1:
    heist
Now let’s get back to my original questions:
1. How is a Salt user supposed to learn what Heist is?
- Use these notes until better docs are available.
- Ask in the #pop Slack channel (visit saltproject.io and search for “slack” for an invitation link).
- Google and dig into the source code a lot.
2. How does Heist differ from Salt SSH or enable_ssh_minions?
- It is more complex than Salt SSH because it requires you to run a Salt master and keep the heist command running on the master (not always, just when you need to run something on Heist minions).
- Heist deploys an experimental single binary Salt minion package, and uses reverse TCP tunnels on top of SSH to connect via ZeroMQ to Salt master. Salt SSH just packs Salt python modules and your state tree into a tarball, uploads to a remote system using SSH, runs salt-call --local and sends the results back.
- Heist is slower to deploy than Salt SSH, but once the tunnels are up, you can run subsequent commands much faster.
- It needs a roster file like the enable_ssh_minions mode, but it can use all the features available for regular ZeroMQ-connected minions (Salt SSH is more limited).
- It can be used to deploy regular Salt minions that connect to any master via ZMQ and do not need Heist and reverse tunnels running, with the exception that the minions use experimental single-binary packages. But you can do the same with Salt SSH by writing some states to bootstrap a minion. Or you can use Fabric or Ansible as well for this task.
- There is also a Salt extension that provides the heist.deploy runner to deploy a Heist minion via salt-run
3. What is the use-case for these new detect_remote_minions and remote_minions_port master options?
This turned out to be a pretty obscure feature:
- Some Salt master features (stalekey engine, manage runner, minions wheel module, master cache worker, AES key rotation) need to know which minions are connected to the master right now.
- Because the default ZeroMQ protocol does not expose client IP addresses, Salt master has no direct ability to know which minion IDs are connected.
- As a workaround, the master uses the ss Linux command output to find connected minions.
- To distinguish minion tcp connections from any other connections, it filters them by remote address and local or remote port number.
- To check if a specific remote address belongs to a minion, it compares it with the cached ipv4 and ipv6 grains.
- And to check the port number, it matches connections against master ports (4505 and 4506).
- Because remote Heist minions are connected via SSH tunnels that originate from localhost, the master checks established SSH connections in addition to standard master ports. To enable this behavior, you need to set detect_remote_minions: true.
- And because SSH can use a non-standard port, you can account for that by setting remote_minions_port.
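The filtering logic from the list above can be sketched in a few lines. The function below takes already-parsed connection tuples instead of calling ss, and a set of addresses that would normally come from the cached ipv4 and ipv6 grains. The function name and the tuple layout are hypothetical; only the two option names and the port numbers come from the text.

```python
def connected_minion_addresses(
    connections,
    minion_addresses,
    master_ports=(4505, 4506),
    detect_remote_minions=False,
    remote_minions_port=22,
):
    """Return remote addresses that look like connected minions.

    connections: iterable of (local_addr, local_port, remote_addr, remote_port)
    minion_addresses: addresses known from cached minion grains
    """
    ports = set(master_ports)
    if detect_remote_minions:
        # Heist minions are reached over SSH, so SSH connections count too.
        ports.add(remote_minions_port)
    found = set()
    for _local_addr, local_port, remote_addr, remote_port in connections:
        port_matches = local_port in ports or remote_port in ports
        if remote_addr in minion_addresses and port_matches:
            found.add(remote_addr)
    return found


sample_connections = [
    ("10.0.0.1", 4505, "10.0.0.5", 50000),  # regular ZeroMQ minion
    ("10.0.0.1", 40000, "10.0.0.6", 22),    # SSH session to a Heist minion
    ("10.0.0.1", 443, "10.0.0.7", 51000),   # unrelated HTTPS connection
]
known_addresses = {"10.0.0.5", "10.0.0.6", "10.0.0.7"}

print(connected_minion_addresses(sample_connections, known_addresses))
print(connected_minion_addresses(sample_connections, known_addresses, detect_remote_minions=True))
```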
To summarize:
- This is a quite clever trick to make some existing Salt master features work with minions that are deployed via Heist and connected via reverse SSH tunnels
- You only need to enable the detect_remote_minions flag if you use Heist minions
- You do not want to use Heist minions in the near future unless you like testing highly experimental software and are ready to dig into source code and submit bug reports and patches
PR #60633 by Megan Wilhite. A little bit more context can be found in the salt-heist repo.
Grains
- Ignore the enable_fqdns_grains setting on AIX, Solaris, and Juniper (always use False). PR #60533 by David Murphy
- Clear the cached network interface grain information upon init of minion or when saltutil.refresh_grains is requested. PR #60130 by @xeacott
- Improve virtual* grain handling for LXC containers. Now the virtual grain is set to container and virtual_subtype to LXC even when Salt is running inside of another virtual machine. PR #60196 by Piter Punk
- Rename manufacture grain to manufacturer for Solaris on SPARC. PR #60514 by Lukas Raska
- Implement grains.uuid on Windows. PR #59928 by Piter Punk
Packages
- Add rpm_vercmp Python library for version comparison. It is needed for Tiamat-based builds to avoid pulling a binary C library that does the same thing. PR #60815 by Megan Wilhite
- Use apt CLI to manage repos (as an alternative to python-apt library). Again, it will be used in Tiamat-based builds. PR #60900 by Megan Wilhite
- Handle various architecture formats (e.g., amd64) in the aptpkg module. PR #60986 by Megan Wilhite
Nifty tricks
Override command retcode based on output
This feature was inspired by Ansible’s failed_when directive:
Installer script:
  file.managed:
    - name: /tmp/installer.sh
    - mode: 755
    - contents: |
        #!/bin/sh
        # This is a contrived example of idempotent command that exits with 1 on a second run
        if [ ! -f /tmp/.installed ]; then
          touch /tmp/.installed
          echo "The thing was installed successfully"
          exit 0
        else
          echo "The thing is already installed"
          exit 1
        fi

# Note that the command will run each time you apply the state
# With success_stdout it just won't fail on subsequent runs
Run the installer:
  cmd.run:
    - name: /tmp/installer.sh
    - require:
      - file: Installer script
    - success_stdout:
      - The thing is already installed
It is supported in cmd.wait, cmd.wait_script, cmd.run and cmd.script states. You can specify a list of lines to match (using the success_stdout and success_stderr arguments), and if any of them is found in the command output, then the resulting retcode will be overridden with zero.
The example above is a bit contrived because you can achieve the same result with the unless directive that checks for /tmp/.installed and prevents the state from being run the second time. However, if a command does something less obvious (for example, interacts with a network service), then this feature could be useful.
PR #59841 by Gareth J. Greenaway and Loren Gordon
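To make the override rule concrete, here is a minimal Python sketch of the logic. It is not the code from the PR: the effective_retcode function is hypothetical, and it assumes that each entry is compared against individual output lines, which may differ from the exact matching Salt performs.

```python
def effective_retcode(retcode, stdout, stderr, success_stdout=(), success_stderr=()):
    """Return 0 if any expected line shows up in the output, else the real retcode."""
    # Assumption: entries are matched against whole lines, ignoring surrounding whitespace
    out_lines = {line.strip() for line in stdout.splitlines()}
    err_lines = {line.strip() for line in stderr.splitlines()}

    matched_out = any(expected in out_lines for expected in success_stdout)
    matched_err = any(expected in err_lines for expected in success_stderr)
    return 0 if (matched_out or matched_err) else retcode


# Second run of the installer script from the example above: exit code 1,
# but the output line is listed in success_stdout
second_run = effective_retcode(
    1,
    "The thing is already installed\n",
    "",
    success_stdout=["The thing is already installed"],
)
```

A command that fails with any other output keeps its original non-zero retcode, so genuine failures are still reported.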
File lookup functions
The slsutil.findup function was originally written to help state files locate a Jinja file to be imported. It will find the first path matching a filename or list of filenames in a specified directory or the nearest ancestor directory. It could be useful for formulas that typically contain a map.jinja that needs to be included by every state file.
New functions:
- slsutil.findup: find the first path matching a filename or list of filenames in a specified directory or the nearest ancestor directory. Returns the full path to the first file found.
- slsutil.file_exists: return True if a file exists in the state tree, False otherwise (uses cp.list_master internally)
- slsutil.dir_exists: return True if a directory exists in the state tree, False otherwise (uses cp.list_master_dirs internally)
- slsutil.path_exists: return True if a path exists in the state tree, False otherwise. The path could refer to a file or directory (uses both cp.list_master and cp.list_master_dirs)
Example:
{% from salt['slsutil.findup']('formulas/shared/nginx', 'map.jinja') import nginx with context %}
The following folders (relative to the salt:// file tree) will be searched for the map.jinja file:
- formulas/shared/nginx
- formulas/shared
- formulas
- . (the root of the file tree)
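The lookup order listed above can be reproduced with a short Python sketch. It is not the slsutil implementation: this findup takes the list of state tree files as a plain argument (the real function queries the fileserver), and the LookupError raised when nothing is found is a placeholder for whatever error Salt actually raises.

```python
from pathlib import PurePosixPath


def findup(startpath, filenames, state_tree_files):
    """Return the first matching path, searching startpath and then its ancestors."""
    if isinstance(filenames, str):
        filenames = [filenames]
    files = set(state_tree_files)
    start = PurePosixPath(startpath)

    # The start directory itself, then every ancestor up to "." (the tree root)
    for directory in [start, *start.parents]:
        for name in filenames:
            candidate = (directory / name).as_posix()
            if candidate in files:
                return candidate
    # Hypothetical error type, used only in this sketch
    raise LookupError(f"{filenames} not found in {startpath} or its ancestors")


# Made-up state tree for illustration
tree = {"formulas/shared/map.jinja", "map.jinja", "top.sls"}
found = findup("formulas/shared/nginx", "map.jinja", tree)
```

In this sample tree, formulas/shared/nginx has no map.jinja of its own, so the search moves one level up and returns formulas/shared/map.jinja instead of the copy in the root.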
Other notable changes
- Netbox pillar enhancements: Virtual Machines, Interfaces, IP Addresses, Documentation. PR #59500 by Gary T. Giesen
- Remove all Silicon deprecations. PR #60895 by Wayne Werner
- Remove glance state module in favor of glance_image. PR #59784 by Megan Wilhite
- Bump keystone deprecation warning to Phosphorus. PR #59813 by Gareth J. Greenaway
- Drop support of Ubuntu 16.04. PR #59869 by Bryce Larson
- Make relative Jinja includes work with Jinja 3.0. PR #60811 by Alberto Planas
- Update AWS API so salt-cloud can create VMs with IPv6 addresses. PR #60804 by Bryce Larson
- Many Zabbix inventory handling improvements. PR #60400 by Piter Punk
- Fix salt-ssh extra-filerefs option handling. PRs #61014 and #60891 by Daniel Wozniak
- Introduce a mechanism to figure out the actual Python version available inside the container when executing dockermod.call, in the same way as for salt-ssh. PR #60229 by Pablo Suárez Hernández
- Update pcs module to support versions > 0.10. PR #60257 by @waynegemmell
- Honor the --log-file CLI argument in salt-api. PR #59881 by Daniel Wozniak
- Add poudriere -i -j jail_name option to list jail information for poudriere on FreeBSD. PR #59831 by Kirill Ponomarev
- Allow GCE Salt Cloud to use previously created IP addresses. PR #60043 by @dawidpogorzelski and Gareth J. Greenaway
- Reinstate ignore_cidr option in salt-cloud openstack driver. PR #59778 by Mark Hyde
- Add nosync switch to lvm.lvcreate to disable initial raid synchronization. PR #59193 by Jerzy Drozdz
- Update schedule state and module to report changes. PR #59997 by Gareth J. Greenaway
- Remove the unnecessary nginx/ prefix from nginx.version return, so it can actually be used in functions like pkg.version_cmp. PR #57111 by @syphernl and @alexey-zhukovin
- Handle volumes on stopped pools in virt.vm_info. PR #60133 by Cedric Bosdonnat
- Use /dev/kvm to detect KVM. PR #60420 by Cedric Bosdonnat
- Pass emulator when getting domain capabilities from libvirt. PR #60492 by Cedric Bosdonnat
- Better handling of bad public keys from minions. PRs #60662 and #60688 by Daniel Wozniak
- Add psutil as a dependency on all platforms. This prevents a minion from starting if an instance is already running. PR #60946 by Charles McMarrow
- Make pillar_roots order deterministic. PR #59212 by @mkirkland4874
- Multiple fixes for Ansible modules in Salt. PRs #60208 by Pedro Algarvio and #59746 by Pablo Suárez Hernández
You can find other changes and bugfixes in the official CHANGELOG.md and Release Notes.