Tips on writing a k8s ansible operator
k8s Ansible Operators promise a simple way to automate infrastructure management by bringing Ansible into the k8s cluster and driving it from k8s custom resources and other information held in the k8s API.
Most of the examples in documentation and posts focus on using ansible inside the k8s cluster, and while valuable, there are plenty of other ways to solve those problems. I wanted to use ansible operators to manage infrastructure outside of the k8s cluster. Here are a few things that I needed to figure out to achieve this.
Can I watch any resource?
You can watch any resource in the cluster: just add it to the watches file and make sure that the operator has permission to read it. The default roles do not include pods/status. Remember that when you're executing a role triggered by a resource other than the CR, variables (extra_vars) from the CR will not be available, so it's necessary to look them up independently.
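As a rough sketch, a watches file that handles both a CR and core Pods could look like the following. The role paths and the pod_watch role name are hypothetical, and the empty group for core resources is my assumption about how the core API group is written here.

```yaml
# watches.yaml (sketch): one entry per watched kind.
# Role paths below are hypothetical; adjust to your operator layout.
- version: v1alpha1
  group: test.example.com
  kind: Test
  role: /opt/ansible/roles/test

# Assumption: core resources such as Pod use the empty ("") API group.
# The operator's Role/ClusterRole must also allow get/list/watch on pods.
- version: v1
  group: ""
  kind: Pod
  role: /opt/ansible/roles/pod_watch
```

The role behind the Pod entry will not see the CR's spec as variables, so it has to query the CR itself.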
The Operator Reconciler.
By default the role or playbook associated with the CR runs every 60 seconds; this interval is the reconcilePeriod. It is configured per resource in the watches file using duration strings such as 30s or 1m.
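For illustration, a single watches entry with a shorter period might look like this (the role path is hypothetical):

```yaml
# watches.yaml entry (sketch) that reconciles the Test CR every 30 seconds.
- version: v1alpha1
  group: test.example.com
  kind: Test
  role: /opt/ansible/roles/test   # hypothetical path
  reconcilePeriod: 30s
```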
Querying k8s API.
It's straightforward to query the API. There are multiple methods; I ended up using the lookup plugin as it enables the use of Python functions and pipes (filters). Don't spend too much time with the "field selector": it requires a specific implementation in the Kubernetes API server for each field and it's not extensively implemented, so you can only count on metadata.name and metadata.namespace.
Responses to queries contain a lot of information and it’s likely that you are only looking for a few fields. You can descend into the structure just as you would with any ansible variable.
lookup('k8s', api_version='v1', kind='ConfigMap', namespace='metallb-system').data.config
The easiest way to figure out the structure is to pipe the output to JSON and inspect it in a JSON editor.
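A minimal sketch of that approach as role tasks follows. It assumes the ConfigMap is named config (passed via resource_name so that a single object comes back) and uses a hypothetical variable name, metallb_cm.

```yaml
# Sketch: fetch one ConfigMap, dump it as JSON, then use a single field.
# Assumption: the ConfigMap is named "config"; metallb_cm is a made-up variable name.
- name: Look up the MetalLB ConfigMap
  set_fact:
    metallb_cm: "{{ lookup('k8s', api_version='v1', kind='ConfigMap', namespace='metallb-system', resource_name='config') }}"

- name: Print the whole object as JSON to study its structure
  debug:
    msg: "{{ metallb_cm | to_nice_json }}"

- name: Descend to the one field that is actually needed
  debug:
    msg: "{{ metallb_cm.data.config }}"
```

The JSON printed by the second task can be pasted into a JSON editor to find the path to the field you want.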
Getting the specific information.
Loading ansible vars with specific information sourced from k8s lookups is unfortunately tedious. Data returned from the API is converted into nested Python lists and dicts. While I'm sure that some ansible experts can navigate these nested structures, I resorted to writing the result to a JSON file and writing a Python program that extracts the specific data I needed into a simple list or dict, writes it to a file, and then loading the vars from that file. Untangling nested looping of vars in ansible was too frustrating, and it's much more transparent to debug simple conversion code.
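The file-based workflow described above can be sketched as three tasks. The extract.py script, the /tmp paths, the Service lookup and the service_vars name are all hypothetical placeholders; the script is assumed to take an input path and an output path and to write a flat JSON dict.

```yaml
# Sketch of the dump / convert / reload workflow. All file names are hypothetical.
- name: Write the raw lookup result to a JSON file
  copy:
    content: "{{ lookup('k8s', api_version='v1', kind='Service') | to_nice_json }}"
    dest: /tmp/services.json

# extract.py is a hypothetical helper shipped in the role's files/ directory.
- name: Reduce the nested data to a flat dict with a small Python script
  command: "{{ ansible_playbook_python }} {{ role_path }}/files/extract.py /tmp/services.json /tmp/service_vars.json"

- name: Load the simplified data back into ansible vars
  include_vars:
    file: /tmp/service_vars.json
    name: service_vars
```

Keeping the conversion in a standalone script means it can be run and debugged against the saved JSON file without the operator in the loop.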
Communicating with devices outside the k8s complex.
By default the ansible operator is not set up for outbound connections. SSH is not installed in the default container, and I also ran into user ID problems. Note that in both containers inside the pod the ansible user is ansible-operator, UID 1001. Once SSH and the user IDs were sorted out, it was simple to project keys into the pod from k8s secrets for access.
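As a sketch, projecting a key from a secret is a volume plus a mount in the operator Deployment. The secret name, volume name and mount path are hypothetical, and file mode and ownership are left at the defaults, so whether SSH accepts the key as UID 1001 needs checking against your image.

```yaml
# Fragment of the operator Deployment pod template (spec.template.spec).
# Assumptions: container name "ansible", a Secret named remote-host-ssh,
# and a mount path chosen for illustration only.
containers:
  - name: ansible
    volumeMounts:
      - name: ssh-key
        mountPath: /opt/ansible/ssh-keys
        readOnly: true
volumes:
  - name: ssh-key
    secret:
      secretName: remote-host-ssh
```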
Inventory and where do the roles run?
The ansible operator runs roles on localhost, which is the ansible container in the operator pod. Instead of trying to figure out how to make changes to the inventory, the solution I used was to add the remote hosts from within the role using add_host, combined with the delegate_to keyword to run each task on the remote host.
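A minimal sketch of that pattern is below. The host name, address, user and key path are placeholders, not values from my operators.

```yaml
# Sketch: register a remote host at runtime, then delegate a task to it.
# router1, 192.0.2.10, admin and the key path are hypothetical.
- name: Add the remote host to the in-memory inventory
  add_host:
    name: router1
    ansible_host: 192.0.2.10
    ansible_user: admin
    ansible_ssh_private_key_file: /opt/ansible/ssh-keys/id_rsa

- name: Run a command on the remote host instead of localhost
  command: uptime
  delegate_to: router1
  register: remote_uptime

- name: Show the result back in the operator log
  debug:
    var: remote_uptime.stdout
```

The play itself still targets localhost; only the delegated task connects out over SSH.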
Workflow and debugging.
I found the best way to develop the ansible was to do it independently of the operator, on a workstation that has k8s admin keys installed. This allows all of the k8s functions to return data. Once the roles were complete, I moved them into the operator.
Debugging the operator is simple. Set the following annotation in the CR:
apiVersion: test.example.com/v1alpha1
kind: Test
metadata:
  name: example-test
  annotations:
    "ansible.operator-sdk/verbosity": "4"
spec:
  enable: True
Once the pod is running, inspect the logs for the ansible container.
If you would like to look at my operators, they are hosted on GitHub at https://github.com/acnodal