New features in Salt 3004 Silicon



Salt 3004 Silicon didn’t follow the usual 4-month release cycle and was released seven months after the previous major version. I believe this slowdown is actually good, and I hope VMware won’t rush the next version either. However, Salt 3004 ships with several new major components and internal changes that may (or may not) signal that something interesting is cooking under the hood.

Enjoy the reading (and check out the official announcement as well)!

New features in Salt 3004 Silicon: Pluggable transports, DeltaProxy, Loader refactoring, Vault Enterprise, VMware extensions, Transactional systems, Salt SSH, Memory leaks mitigations

Pluggable transports

This is a work in progress, but I want to mention it anyway because it can lead to some exciting developments in future versions.

The first PR #60852 by Daniel Wozniak removes transport singletons. It was done as part of tech-debt removal and is related to the second pull request.

The second PR #60867 by Daniel Wozniak wasn’t merged into the Salt 3004 release and is still a work in progress (UPD: it was superseded by #61450). It splits the transport module namespace into channel and transport parts and introduces a couple of classes for channels and transports:

It also adds a new transport module that uses a centralized RabbitMQ broker server. The rationale behind the new developments is explained in the Pluggable Transports SEP that was accidentally merged without any community discussion. The SEP also mentions an abandoned HTTP Transport PR created after all these disastrous CVEs in 2020. Oh, and salt-syndic doesn’t look like a recommended solution to scale Salt (it was even discussed whether it makes sense to deprecate it).

I’m pretty sure this new transport is not related in any way to the fact that VMware (who acquired SaltStack in 2020) sells Tanzu RabbitMQ - an enterprise version of the open-source RabbitMQ broker. Personally, I’d much prefer a really secure and well-audited lightweight built-in transport that allows running untrusted minions over the internet to a centralized broker with who knows how many potential vulnerabilities… However, the scalability and fault tolerance bits, plus the potential of new community-driven transport implementations, are really interesting for some use-cases. UPD: there is another PR that has additional details, see #61464.

DeltaProxy

DeltaProxy is a special kind of proxy minion that can control multiple devices per proxy process instead of a single device. Its development probably started somewhere in 2018, and the corresponding abstraction layer (MetaProxy) was released as part of Salt 2019.2.1. For some time, the DeltaProxy source code was proprietary. Then, in November 2019, SaltStack briefly considered open-sourcing it but ultimately postponed the decision. And finally, after some refactoring and stabilization efforts, it was open-sourced in Salt 3004 Silicon. To read this story in more detail, check out the MetaProxy section of my Salt Neon release notes.

Now to the feature itself. First, the documentation is non-existent. This fact can trigger me to write another rant, but I’m not in the right mood at the moment :) The only configuration examples I was able to find are located in #60177. Below is my (possibly incorrect) summary.

First, you need to have a node to run the salt-proxy process. It could be hosted on the same node as your salt-master, on any minion node, or a dedicated one. Second, to enable the feature, you need to add the following option to /etc/salt/proxy configuration file on that node; otherwise the default metaproxy module will be used (metaproxy: proxy):

master: SALT_MASTER_ADDRESS
metaproxy: deltaproxy

Then you need to define the pillar data for each node:

# pillar/top.sls
base:
  deltaproxy:  # must match the ID passed to salt-proxy --proxyid
    - controlproxy
  device1:
    - device1
  device2:
    - device2

For the control proxy (DeltaProxy) node (where you run the salt-proxy process), you need to specify proxytype: deltaproxy and a list of proxied devices:

# pillar/controlproxy.sls
proxy:
  proxytype: deltaproxy
  ids:
    - device1
    - device2

And then, you need to add a pillar file for each proxied device. Since I do not have any real devices to test, I’m using the dummy proxy module:

# pillar/device1.sls
proxy:
  proxytype: dummy

# pillar/device2.sls
proxy:
  proxytype: dummy
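
With more than a couple of devices, writing these pillar files by hand gets repetitive. Below is a minimal Python sketch that builds the same pillar layout as plain dictionaries. The function name and its parameters are hypothetical (they are not part of Salt), and the control proxy ID is a parameter because it has to match whatever you pass to salt-proxy --proxyid.

```python
def build_deltaproxy_pillar(control_id, device_ids,
                            control_sls="controlproxy",
                            device_proxytype="dummy"):
    """Return {file name: pillar data} for a DeltaProxy setup.

    Hypothetical helper, not a Salt API. It only mirrors the layout
    of the pillar files shown above.
    """
    # top.sls: the control proxy minion gets the control pillar file,
    # every proxied device gets a pillar file named after its ID.
    top = {"base": {control_id: [control_sls]}}
    files = {}
    for device_id in device_ids:
        top["base"][device_id] = [device_id]
        files[f"{device_id}.sls"] = {"proxy": {"proxytype": device_proxytype}}

    # The control proxy lists all the devices it manages.
    files[f"{control_sls}.sls"] = {
        "proxy": {"proxytype": "deltaproxy", "ids": list(device_ids)}
    }
    files["top.sls"] = top
    return files
```

The returned dictionaries can be dumped with any YAML serializer to get the actual .sls files.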

The final step is to start the salt-proxy process and accept the keys sequentially (it is a known limitation):

salt-proxy --proxyid deltaproxy -l debug
salt-key -a deltaproxy
salt-key -a device1
salt-key -a device2

Now you should be able to ping the deltaproxy minion and its proxied dummy devices:

salt '*' test.ping

deltaproxy:
    True
device1:
    True
device2:
    True

salt '*' grains.item osfinger

deltaproxy:
    ----------
    osfinger:
        proxy-proxy
device1:
    ----------
    osfinger:
        proxy-proxy
device2:
    ----------
    osfinger:
        proxy-proxy

I’m not sure which proxy modules are safe to run through DeltaProxy, but the merged PR touches the following ones:

Also, it is not clear how many devices could be realistically controlled via a single DeltaProxy (Control Proxy) instance and the difference in consumed resources compared to the same number of regular salt-proxy processes. If you run DeltaProxy in production with real devices and are willing to share some stats, please drop me an email.

PRs #60090 and #60791 by Gareth J. Greenaway

Salt extension modules for VMware

The saltext.vmware collection of modules is not a part of the Salt 3004 release (but was announced around the same time). Instead, it is distributed as a separate Python library using the Salt Extensions mechanism.

The extensions rely on pyVmomi (the Python SDK for the VMware vSphere API to manage ESX, ESXi, and vCenter) and have the following modules:

For more details, see the Open Hour recording for September 30th on Youtube and read the introductory blog post. And check out the following howto: Salt SDDC Modules – Getting Started.

The modules are well documented and even have ADRs (yay!).

I also found a vRealize Automation module that is not a part of the main extension and is distributed through the _modules dir.

Native minions

Salt 3004 native minion packages are available for the following platforms:

Instructions for installing the latest packages can be found at gitlab.com.

Internal changes

Loader

Other changes

Memory leaks

This is the long-awaited progress in mitigating memory leaks that were related to Gitfs backends. It was done in a crude but very simple and practical way - instead of fighting with memory leaks caused by third-party libraries, the Salt Master file server update thread will restart periodically to release held memory.

I do not believe that the restart interval is configurable (it is set to 300 seconds for now). However, if the gitfs_update_interval setting is higher than 300 seconds, it will be used as the update thread restart interval.
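
My reading of the rule above can be written down as a tiny Python sketch. The function and constant names are hypothetical, and the 300-second value is just the number quoted above, not something I looked up in the Salt source for this example.

```python
DEFAULT_RESTART_INTERVAL = 300  # seconds, the value mentioned above


def update_thread_restart_interval(gitfs_update_interval=None):
    """Return the assumed restart interval of the file server update thread.

    Hypothetical helper that only restates the rule from the text:
    the larger of 300 seconds and gitfs_update_interval wins.
    """
    if gitfs_update_interval is None:
        return DEFAULT_RESTART_INTERVAL
    return max(DEFAULT_RESTART_INTERVAL, gitfs_update_interval)
```

In other words, lowering gitfs_update_interval below 300 seconds would not make the thread restart more often under this assumption.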

PR #60386 by Daniel Wozniak

Another master memory leak was mitigated in PR #60262 by Daniel Wozniak

Slack

Engines

Engine processes got enhanced process titles. They could be helpful if your custom engine consumes too many resources and you want to spot it just by looking at the process list. To enable this feature, install the python3-setproctitle package, then add some engines to master or minion config files and restart the services:

# /etc/salt/{master,minion}.d/engines.conf
engines:
  - test:

This is how the new titles look in the process list:

ps ax | grep salt | grep test

  32647 ?        Sl     0:00 /usr/bin/python3 /usr/bin/salt-master salt.engines.Engine(salt.loaded.int.engines.test)
  33690 ?        Sl     0:00 /usr/bin/python3 /usr/bin/salt-minion KeepAlive MultiMinionProcessManager MinionProcessManager salt.engines.Engine(salt.loaded.int.engines.test)

And if you use multiple instances of the same engine, the process titles will use instance aliases instead:

# /etc/salt/{master,minion}.d/engines.conf
engines:
  - test_instance1:
      engine_module: test
  - test_instance2:
      engine_module: test

ps ax | grep salt | grep test

  34388 ?        Sl     0:00 /usr/bin/python3 /usr/bin/salt-minion KeepAlive MultiMinionProcessManager MinionProcessManager salt.engines.Engine(salt.loaded.int.engines.test-test_instance1)
  34389 ?        Sl     0:00 /usr/bin/python3 /usr/bin/salt-minion KeepAlive MultiMinionProcessManager MinionProcessManager salt.engines.Engine(salt.loaded.int.engines.test-test_instance2)
  34436 ?        Sl     0:00 /usr/bin/python3 /usr/bin/salt-master salt.engines.Engine(salt.loaded.int.engines.test-test_instance1)
  34437 ?        Sl     0:00 /usr/bin/python3 /usr/bin/salt-master salt.engines.Engine(salt.loaded.int.engines.test-test_instance2)
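
If you want to spot engines in the process list from a script, the title pattern seen in the output above can be sketched in Python. Both helpers are hypothetical, and the salt.loaded.int prefix is simply copied from the sample output (I assume it differs for custom engines that are not shipped with Salt).

```python
import re

# Matches the "salt.engines.Engine(<name>)" part of a process title.
ENGINE_TITLE_RE = re.compile(r"salt\.engines\.Engine\(([^)]*)\)")


def engine_process_title(engine_module, alias=None):
    """Rebuild the title fragment as it appears in the sample output."""
    name = f"salt.loaded.int.engines.{engine_module}"
    if alias is not None:
        # Multiple instances: the alias is appended after a dash.
        name = f"{name}-{alias}"
    return f"salt.engines.Engine({name})"


def engine_from_ps_line(line):
    """Return the engine name found in a ps output line, or None."""
    match = ENGINE_TITLE_RE.search(line)
    return match.group(1) if match else None
```

Feeding each line of ps ax output to engine_from_ps_line gives you the engine name next to the PID, which should be enough to tell which instance is eating the CPU or memory.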

PR #60260 by Daniel Wozniak

Beacons

State system

New operating systems support

The changes listed below do not mean official support as described in the Supported Operating Systems document. Instead, they mean that someone made an improvement in Salt for a specific operating system or the OS was included in the official test suite.

Transactional Systems

This feature adds support for transactional systems and openSUSE MicroOS in particular. MicroOS has a read-only root filesystem and the transactional-update tool that leverages snapper, zypper, btrfs and overlayfs to perform atomic updates. Salt 3004 ships with two new execution modules (transactional_update and rebootmgr) and a new executor module (transactional_update). The executor module wraps Salt module calls with transactions. Below is a rough summary of how the feature works:

I wonder if NixOS support could be added to this transactional framework (it seems like a good fit).

PR #58520 by Alberto Planas

UPDATE: the internal implementation is going to be redesigned in PR #61188 that was submitted after the release.

Windows improvements

Install anywhere

The default install location will be %ProgramFiles%\Salt Project\Salt for the binary data and %ProgramData%\Salt Project\Salt for the Root Directory (root_dir). A couple of switches control the installer behavior:

And for the uninstaller:

For more details on how this feature is designed, read the SEP-31.

PRs #60267 and #60952 by Shane Lee

file.patch

I needed this feature a couple of times to self-patch Windows minions and had to implement my own workarounds for that using patch.exe and msys-2.0.dll from Git for Windows Portable. The good thing is that the feature is now built-in; the not-so-good thing is that the patch executable is not bundled with the Salt installer and needs to be delivered to a minion using a separate state. Anyway, the feature is quite helpful, and I’ll definitely try it.

PR #60399 by @xeacott

Vault Enterprise namespaces

Namespaces is a set of features within Hashicorp Vault Enterprise that allows Vault environments to support Secure Multi-tenancy (or SMT) within a single Vault infrastructure. Through namespaces, Vault administrators can support tenant isolation for teams and individuals as well as empower delegated administrators to manage their own tenant environment. API operations performed under a namespace can be done by providing the relative request path along with the namespace path using the X-Vault-Namespace header.

To enable the feature, add an optional namespace key to the vault master config section:

vault:
  # ...
  namespace: vault_enterprise_namespace
  # ...
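
To illustrate what the namespace option boils down to on the wire, here is a minimal Python sketch that builds the HTTP headers for a Vault API request. The function name is hypothetical and this is not how the Salt vault module is written; the only things taken as given are the X-Vault-Namespace header mentioned above and the standard X-Vault-Token header.

```python
def vault_request_headers(token, namespace=None):
    """Build headers for a Vault API call (hypothetical helper).

    When a namespace is given, the request path stays relative and
    the namespace travels in the X-Vault-Namespace header.
    """
    headers = {"X-Vault-Token": token}
    if namespace:
        headers["X-Vault-Namespace"] = namespace
    return headers
```

Without the namespace key the header is simply omitted, which matches the option being optional.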

PR #58586 by Edmund Adderley

Salt SSH

Directory roster

This is a new type of roster called “directory roster”. The directory roster is a flat directory of files. Each file’s name is a minion id, and the contents of each file must yield the data structure expected within each roster entry after being rendered with the salt rendering system. It was introduced to help solve the following use-case:

We maintain our roster in a git repo. As our team grows and we add and remove systems from the roster, the number of merge conflicts in the flat roster file has increased significantly. Switching to this directory roster system has significantly decreased the headache of git merge semantics when multiple git users introduce different roster changes at the same time.

Configuration example:

# /etc/salt/master.d/roster.conf
roster: dir
roster_dir: config/roster.d
# roster_domain: example.com

# config/roster.d/minion-x:
host: minion-x.example.com
port: 22
sudo: true
user: ubuntu

# config/roster.d/minion-y:
host: minion-y.example.com
port: 22
sudo: true
user: gentoo

If you uncomment the roster_domain setting, you can omit the domain part in the individual roster files.
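
The idea behind the directory roster can be sketched in a few lines of Python. This is not the Salt implementation: the function names are hypothetical, the tiny key: value parser is only a stand-in for the Salt rendering system, and the roster_domain handling (a missing host defaults to the minion id plus the domain) is my assumption about how that setting is applied.

```python
import os


def parse_flat_yaml(text):
    """Stand-in for the Salt renderer: flat 'key: value' lines only."""
    data = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value.isdigit():
            value = int(value)
        elif value in ("true", "false"):
            value = value == "true"
        data[key.strip()] = value
    return data


def load_dir_roster(roster_dir, roster_domain=None):
    """Build {minion_id: roster entry} from a flat directory of files."""
    roster = {}
    for minion_id in sorted(os.listdir(roster_dir)):
        path = os.path.join(roster_dir, minion_id)
        if not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8") as handle:
            entry = parse_flat_yaml(handle.read())
        # Assumption: with roster_domain set, a missing host is derived
        # from the file name (the minion id) and the domain.
        if roster_domain and "host" not in entry:
            entry["host"] = f"{minion_id}.{roster_domain}"
        roster[minion_id] = entry
    return roster
```

Since every minion lives in its own file, two people adding different minions never touch the same lines, which is exactly what removes the merge conflicts.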

PR #60364 by @kojiromike and Gareth J. Greenaway

Heist minion presence events

This change allows presence events to work with Heist-Salt minions. If you set the master configuration option detect_remote_minions to True, the master will try to detect connected minions over port 22, unless the port is changed with the remote_minions_port option.

Another feature with almost useless documentation. As of the time of this writing, Google search on docs.saltproject.io gives zero matches for the heist keyword. How is a Salt user supposed to learn what Heist is, how it differs from Salt SSH or enable_ssh_minions, and what is the use-case for this option?

Below is what I was able to understand after many hours of research.

There are two googleable repositories that are located on GitLab and have very cryptic descriptions and the motto of “making deployment and management of Salt easy”:

Using Heist

So, let’s install heist first:

pip3 install heist==5.0.0

Create the necessary folders and files:

mkdir -p /etc/heist/rosters
touch /etc/heist/heist.conf
cat << EOF > /etc/heist/rosters/roster.conf
minion1:
  # host: minion1
  host: 10.211.55.25
  username: vagrant
EOF

I struggled for 5 hours trying to make Heist work. Here is the list of problems I found:

So, after so much wasted time, I was able to make the bare Heist work (yeah, the majority of the INFO messages should be logged as DEBUG ones):

heist --log-level info test -R /etc/heist/rosters/roster.conf

[INFO    ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/curio_loop returned virtual FALSE: No module named 'curio'
[INFO    ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/selector_win returned virtual FALSE: WindowsSelectorEventLoop only runs on windows
[INFO    ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/proactor returned virtual FALSE: WindowsProactorEventLoop only runs on windows
[INFO    ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/qt returned virtual FALSE: No module named 'qasync'
[INFO    ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/trio_loop returned virtual FALSE: No module named 'trio_asyncio'
[INFO    ] Module /usr/local/lib/python3.8/dist-packages/pop_loop/loop/uv_loop returned virtual FALSE: No module named 'uvloop'
[INFO    ] Picking default roster: flat
[INFO    ] This is a test heist manager. You have installed heist correctly. Install a heist manager to use full functionality

It looks like Heist is just a skeleton library with some functions. Here is what they can do:

In summary, Heist is a tool to deploy some binaries to remote systems through ssh and run them, with optional tcp tunneling back to the source system. To do something useful with it, you need to install a manager. I’m not sure why Heist is split into two tools (thus complicating the user experience), because I was unable to find any other addons that use it other than heist-salt.

Using Heist Salt

Now let’s install heist-salt as well:

pip3 install heist-salt==4.0.0

Below is what heist-salt is supposed to do:

Alternatively, it can bootstrap a Salt minion and connect it to an existing master (Heist will skip the grain and tunnels setup in this mode)

As you can guess, it didn’t work for me either. Setting aside the unreasonably chatty INFO logging, this is what I got when I ran heist --log-level info salt.minion -t minion1 -R /etc/heist/rosters/roster.conf:

I enabled SSH root login, re-ran the command, and voila! The heist command generated and accepted a minion key and stayed in the foreground, keeping the tunnels alive. After 8+ hours, I was finally able to run a command on a Heist-Salt minion:

salt minion1 grains.get minion_type

minion1:
    heist

Now let’s get back to my original questions:

1. How is a Salt user supposed to learn what Heist is?

2. How does Heist differ from Salt SSH or enable_ssh_minions?

  1. It is more complex than Salt SSH because it requires you to run a Salt master and keep the heist command running on the master (not always, just when you need to run something on Heist minions).
  2. Heist deploys an experimental single binary Salt minion package, and uses reverse TCP tunnels on top of SSH to connect via ZeroMQ to Salt master. Salt SSH just packs Salt python modules and your state tree into a tarball, uploads to a remote system using SSH, runs salt-call --local and sends the results back.
  3. Heist is slower to deploy than Salt SSH, but once the tunnels are up, you can run subsequent commands much faster.
  4. It needs a roster file like the enable_ssh_minions mode, but it can use all the features available for regular ZeroMQ-connected minions (Salt SSH is more limited).
  5. It can be used to deploy regular Salt minions that connect to any master via ZMQ and do not need Heist and reverse tunnels running, with the exception that the minions use experimental single-binary packages. But you can do the same with Salt SSH by writing some states to bootstrap a minion. Or you can use Fabric or Ansible as well for this task.
  6. There is also a Salt extension that provides the heist.deploy runner to deploy a Heist minion via salt-run

3. What is the use-case for these new detect_remote_minions and remote_minions_port master options?

This turned out to be a pretty obscure feature:

To summarize:

  1. This is quite a clever trick to make some existing Salt master features work with minions that are deployed via Heist and connected via reverse SSH tunnels
  2. You only need to enable the detect_remote_minions flag if you use Heist minions
  3. You do not want to use Heist minions in the near future unless you like testing highly experimental software and are ready to dig into the source code and submit bug reports and patches

PR #60633 by Megan Wilhite. A little bit more context can be found in the salt-heist repo.

Grains

Packages

Nifty tricks

Override command retcode based on output

This feature was inspired by Ansible’s failed_when directive:

Installer script:
  file.managed:
    - name: /tmp/installer.sh
    - mode: 755
    - contents: |
        #!/bin/sh
        # This is a contrived example of an idempotent command that exits with 1 on a second run
        if [ ! -f /tmp/.installed ]; then
           touch /tmp/.installed
           echo "The thing was installed successfully"
           exit 0
        else
           echo "The thing is already installed"
           exit 1
        fi

# Note that the command will run each time you apply the state
# With success_stdout it just won't fail on subsequent runs
Run the installer:
  cmd.run:
    - name: /tmp/installer.sh
    - require:
        - file: Installer script
    - success_stdout:
        - The thing is already installed

It is supported in cmd.wait, cmd.wait_script, cmd.run and cmd.script states. You can specify a list of lines to match (using the success_stdout and success_stderr arguments), and if any of them is found in the command output, then the resulting retcode will be overridden with zero.
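
The override rule can be sketched in Python as follows. This is a hypothetical helper, not the Salt code, and it assumes the strictest reading of the rule: an entry matches when it equals the whole (whitespace-stripped) output of the command. A substring or per-line match would accept more cases, so check the behavior with your Salt version before relying on multi-line output.

```python
def effective_retcode(retcode, stdout="", stderr="",
                      success_stdout=(), success_stderr=()):
    """Return 0 if the output matches a success entry, else retcode.

    Assumption: an entry matches only when it is equal to the whole
    stripped stdout (or stderr) of the command.
    """
    if stdout.strip() in success_stdout or stderr.strip() in success_stderr:
        return 0
    return retcode
```

For the installer script above, the second run exits with 1 and prints a single line that is listed in success_stdout, so the state would see a zero retcode.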

The example above is a bit contrived because you can achieve the same result with the unless directive that checks for /tmp/.installed and prevents the state from being run the second time. However, if a command does something less obvious (for example, interacts with a network service), then this feature could be useful.

PR #59841 by Gareth J. Greenaway and Loren Gordon

File lookup functions

The slsutil.findup function was originally written to help state files locate a Jinja file to be imported. It will find the first path matching a filename or list of filenames in a specified directory or the nearest ancestor directory. It could be useful for formulas that typically contain a map.jinja that needs to be included by every state file.

New functions:

Example:

{% from salt['slsutil.findup']('formulas/shared/nginx', 'map.jinja') import nginx with context %}

The following folders (relative to salt:// file tree) will be searched for the map.jinja file:

PR #60159 by @amendlik
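
To make the ancestor lookup more tangible, here is a minimal Python sketch of the search order. It is not the slsutil.findup implementation: both functions are hypothetical, the file tree is modeled as a plain set of relative paths, and I assume that the walk ends at the root of the salt:// tree.

```python
import posixpath


def findup_candidates(startpath):
    """Yield startpath and each ancestor, ending with the tree root ('')."""
    path = posixpath.normpath(startpath).strip("/")
    if path == ".":
        path = ""
    while True:
        yield path
        if not path:
            break
        path = posixpath.dirname(path)


def findup(startpath, filenames, existing_files):
    """Return the first matching path in startpath or a nearest ancestor.

    existing_files is a set of paths relative to the tree root. It is
    a stand-in for the real file server lookup.
    """
    if isinstance(filenames, str):
        filenames = [filenames]
    for directory in findup_candidates(startpath):
        for name in filenames:
            candidate = posixpath.join(directory, name) if directory else name
            if candidate in existing_files:
                return candidate
    raise LookupError(f"none of {filenames} found at or above {startpath!r}")
```

With a tree that only contains formulas/map.jinja, a lookup that starts in formulas/shared/nginx checks that directory first, then formulas/shared, and finds the file in formulas.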

Other notable changes

You can find other changes and bugfixes in the official CHANGELOG.md and Release Notes