Arista Network Test Automation (ANTA) Example
When I was at AutoCon1, I visited the Arista booth and had an interesting chat with their team. They mentioned a tool called Arista Network Test Automation (ANTA), which sounded promising, so I wanted to try it out in my lab. As with anything related to automation, I like to experiment and share my findings with my readers.
What is ANTA?
In a nutshell, if you work with Arista devices, you can use the ANTA Python library to write tests using a simple YAML declarative syntax. From a very high level, you define some tests, run them, and they either pass or fail. This straightforward approach helps you quickly verify the health and configuration of your network.
Here are some of the tests you can write.
- Checking that all your EOS devices are running a specific EOS version
- Ensuring all BGP peers are up and running
- Verifying that a specific route exists on all or specific devices
- Verifying port channels and interfaces
- And many more
Here is a very simple test that checks if all my routers (6 in total) have the specified two routes in their routing tables. The test will fail if the route doesn't exist.
---
anta.tests.routing:
  generic:
    - VerifyRoutingTableEntry:
        vrf: default
        routes:
          - 0.0.0.0
          - 1.1.1.0
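To make the pass/fail idea concrete, here is a minimal Python sketch of the logic a check like this boils down to. It is not ANTA's implementation. The `missing_routes` helper and the sample routing tables are hypothetical, made up purely for illustration.

```python
def missing_routes(routing_table, expected):
    """Return the expected routes that are absent from a device's routing table."""
    return [route for route in expected if route not in routing_table]


# Hypothetical data: the set of route prefixes seen on each device.
tables = {
    "R1": {"0.0.0.0", "1.1.1.0", "2.2.2.0"},
    "R2": {"0.0.0.0", "2.2.2.0"},
}
expected = ["0.0.0.0", "1.1.1.0"]

results = {}
for device, table in tables.items():
    missing = missing_routes(table, expected)
    # A device passes only when every expected route is present.
    results[device] = "failure" if missing else "success"
    print(device, results[device], missing)
```

In this made-up data, R2 lacks 1.1.1.0, so it is the device that would be reported as failing.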
Why Do You Need It?
From my experience, when there are changes to networks, applications, or systems, we usually ask different teams to check if everything is working as expected. Other teams, like application support or systems, often have scripts that they run to perform tests. They then come to us, network engineers, and say, “All good.”
But how do we test the network? In my experience, we lag behind in automated tests. Typically, we check our monitoring systems to see if everything is green. We might also log in to some devices and run show commands to verify everything is functioning as expected. But what if we have several hundred devices? What if I want to check something very specific that is not available via SNMP?
Usually, I write my own custom Python scripts to achieve this, and they work well. However, I'm excited to see an open-source solution from the vendor itself. Our ultimate goal here is to have something that we can run to ensure our network is functioning as expected.
Installation and Initial Setup
You can use ANTA as a CLI application or integrate it into your own Python scripts. To get started, install both ANTA and the CLI package with the commands below. As always, creating a Python virtual environment first is optional but recommended.
python -m venv anta-env
source anta-env/bin/activate
pip install anta
pip install 'anta[cli]'
Defining the Inventory
As with any other tool, you need to define your devices in an inventory file. The contents of the YAML inventory file must start with the anta_inventory key and then define your devices using one or more of the following methods.
- hosts - Define each device individually.
- networks - Scan a network for devices.
- ranges - Scan a range for devices.
#inventory.yaml
---
anta_inventory:
  hosts:
    - host: 192.168.200.11
      name: R1
      tags: ['cpe']
    - host: 192.168.200.12
      name: R2
    - host: 192.168.200.13
      name: R3
    - host: 192.168.200.14
      name: R4
    - host: 192.168.200.15
      name: R5
    - host: 192.168.200.16
      name: R6
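For the networks and ranges methods, it helps to picture which addresses a given definition would cover. The sketch below uses Python's standard ipaddress module to expand a network or a start/end range into individual addresses. The helper names are hypothetical, and this only illustrates the address arithmetic; how ANTA itself probes each address is not covered here.

```python
import ipaddress


def hosts_in_network(cidr):
    """List the usable host addresses in a network such as 192.168.200.8/30."""
    return [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]


def hosts_in_range(start, end):
    """List every address from start to end, inclusive."""
    first = int(ipaddress.ip_address(start))
    last = int(ipaddress.ip_address(end))
    return [str(ipaddress.ip_address(n)) for n in range(first, last + 1)]


# The six lab routers sit on consecutive management addresses.
lab_range = hosts_in_range("192.168.200.11", "192.168.200.16")
print(lab_range)
print(hosts_in_network("192.168.200.8/30"))
```

A range from 192.168.200.11 to 192.168.200.16 covers exactly the six management addresses listed in the hosts example above.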
For context, here is the network diagram that I'm using, which consists of 6 routers. The neighboring routers have eBGP peering between them. Each router advertises a /24 network: R1 advertises 1.1.1.0/24, R2 advertises 2.2.2.0/24, and so on.
Test Catalog - The Things We Want to Test
You can find all the available tests in the official ANTA documentation. Just to give you an idea, you can check whether the devices are running the intended EOS version, whether they have specific BGP peers, and even how many BGP peers each router should have. There are also tests for OSPF, interfaces, VXLAN, multicast, and more. In this example, I will show you just a few of them and how easy it is to set them up.
#test.yaml
---
anta.tests.software:
  - VerifyEOSVersion: # Verifies the device is running one of the allowed EOS versions.
      versions: # List of allowed EOS versions.
        - 4.25.4M
        - 4.32.0.1F-36950381.43201F (engineering build)
anta.tests.configuration:
  - VerifyRunningConfigDiffs:
anta.tests.connectivity:
  - VerifyReachability:
      hosts:
        - source: Management0
          destination: 192.168.200.1
  - VerifyLLDPNeighbors:
      neighbors:
        - port: Et1
          neighbor_device: R2
          neighbor_port: Ethernet1
        - port: Et2
          neighbor_device: R3
          neighbor_port: Ethernet1
      filters:
        tags: ['cpe']
anta.tests.routing:
  generic:
    - VerifyRoutingTableEntry:
        vrf: default
        routes:
          - 0.0.0.0
          - 1.1.1.0
  bgp:
    - VerifyBGPPeersHealth:
        address_families:
          - afi: "ipv4"
            safi: "unicast"
            vrf: "default"
So, here are the tests I’m running for this example. As you can see, the tests are defined in a YAML file, making it very easy to set up. I’m checking for things like:
- EOS Version - Ensuring the devices are running the intended EOS versions.
- Configuration Drift - Verifying that the startup and running configs are the same, so there is no drift.
- Connectivity Check - Checking connectivity to the IP 192.168.200.1 sourced from the management interface. You can also check against multiple IPs.
- LLDP Neighbors - Running a test only against devices that match a specific filter (the cpe tag) to check for LLDP neighbors.
- Routing:
  - Routing Table - Ensuring the devices have specific routes (0.0.0.0 and 1.1.1.0) in their routing table.
  - BGP Peers Health - Making sure all BGP peers are in the ‘established’ state.
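The tag filter on the LLDP test is worth a closer look, because it decides which devices a test runs against. Here is a rough Python sketch of that selection logic. It assumes a test applies to a device when they share at least one tag and that an unfiltered test applies to every device. The `devices_for_test` helper is hypothetical, and ANTA's exact matching rules may differ.

```python
def devices_for_test(devices, filter_tags=None):
    """Return the device names a test applies to; no filter means every device."""
    if not filter_tags:
        return [d["name"] for d in devices]
    wanted = set(filter_tags)
    # Keep a device when it shares at least one tag with the filter (assumption).
    return [d["name"] for d in devices if wanted & set(d.get("tags", []))]


# Mirrors part of the lab inventory: only R1 carries the 'cpe' tag.
devices = [
    {"name": "R1", "tags": ["cpe"]},
    {"name": "R2"},
    {"name": "R3"},
]

print(devices_for_test(devices))           # unfiltered test
print(devices_for_test(devices, ["cpe"]))  # test filtered on the 'cpe' tag
```

With the lab inventory, this is why the LLDP check only concerns R1, while the other tests cover all six routers.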
ANTA CLI
Now we have the inventory and the tests we want to run, but how do we run them? That’s where the ANTA CLI comes in. Remember we installed the CLI before? You can use it to run the tests.
But before we proceed, you might have noticed that we didn’t define the credentials in the inventory file. We can pass the credentials
- via the CLI as parameters, or
- as environment variables.
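In our example, the following CLI command kicks off the tests. Here, I’m passing the username as a parameter, and the -P flag makes ANTA prompt for the password so it doesn't appear on the command line. Either way, make sure not to expose credentials in production environments.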
anta nrfu --username admin -P --inventory inventory.yaml --catalog test.yaml
The output was too long, so here are the truncated test results. Everything looks green and all the tests passed.
If you don't want to pass the credentials directly into the CLI, you can use environment variables as shown below.
export ANTA_USERNAME=admin
export ANTA_PASSWORD=admin
export ANTA_ENABLE=True
export ANTA_ENABLE_PASSWORD=admin
export ANTA_INVENTORY=inventory.yaml
anta nrfu --catalog test.yaml
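If you end up calling ANTA from your own Python scripts, you may want to read the same variables there instead of hardcoding credentials. The following is a minimal sketch using only the standard library. The `credentials_from_env` helper is hypothetical, and it simply collects the variable names shown above into a dictionary.

```python
import os


def credentials_from_env(env=None):
    """Collect the ANTA_* settings used above from an environment mapping."""
    env = os.environ if env is None else env
    return {
        "username": env.get("ANTA_USERNAME"),
        "password": env.get("ANTA_PASSWORD"),
        # Treat the value as a boolean flag, regardless of letter case.
        "enable": env.get("ANTA_ENABLE", "").lower() == "true",
        "enable_password": env.get("ANTA_ENABLE_PASSWORD"),
        "inventory": env.get("ANTA_INVENTORY"),
    }


# A sample mapping stands in for the real environment in this illustration.
sample = {"ANTA_USERNAME": "admin", "ANTA_PASSWORD": "admin", "ANTA_ENABLE": "True"}
creds = credentials_from_env(sample)
print(creds["username"], creds["enable"])
```

Passing a mapping explicitly keeps the helper easy to test. Called with no arguments, it reads the real process environment.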
Let's Make a Test Fail
To show you a failing test, I’m going to make a small change on one of the devices, but I’m not going to save the running configuration to the startup configuration. This should cause our test that checks the diff between running and startup configs to fail on that device. To keep the output to a minimum, I’m going to remove all the other tests.
R1#
R1#conf ter
R1(config)#ntp server 10.10.10.10
R1(config)#end
As you can see, not only is the test marked as failed for R1, but it also shows you the difference between the startup and running configurations. This makes it easy to identify and rectify the issue.
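If you want a feel for what such a diff looks like, the sketch below compares two configuration snippets with Python's standard difflib module. This is only an illustration of the idea and not how ANTA obtains the diff. The two config strings are made-up samples, and `config_diff` is a hypothetical helper.

```python
import difflib


def config_diff(startup, running):
    """Return unified-diff lines between two configuration texts."""
    return list(
        difflib.unified_diff(
            startup.splitlines(),
            running.splitlines(),
            fromfile="startup-config",
            tofile="running-config",
            lineterm="",
        )
    )


# Made-up sample configs: the running config has one unsaved NTP line.
startup = "hostname R1\nip routing"
running = "hostname R1\nntp server 10.10.10.10\nip routing"

diff = config_diff(startup, running)
print("\n".join(diff))
# Any difference at all means the check should be treated as failed.
print("failure" if diff else "success")
```

An empty diff means the two configs match, which corresponds to the passing case from the earlier run.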
Practical Uses and Closing Up
These tests can be useful in scenarios where you need to check crucial aspects of your network like the routing table, BGP peering, etc. By running these tests and ensuring they pass, you can be confident that critical services are working as expected. You can run these tests every morning as part of your daily checks or after any major changes to the network. This ensures that your network remains stable and any issues are quickly identified and addressed.
You don’t have to write tests in YAML and use the CLI. You can write these tests in Python and integrate them with the rest of your workflow or CI/CD pipelines. You can even create your own custom tests if the test you need is not available.
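As a simple starting point for a pipeline, you could wrap the same CLI command in a small Python script. The sketch below does not use ANTA's Python API. It only builds and runs the `anta nrfu` command from earlier, passing credentials through the ANTA_USERNAME and ANTA_PASSWORD environment variables. The helper names are hypothetical, and I'm assuming a non-zero exit code signals a problem, so check the ANTA documentation for the exact exit codes before relying on it.

```python
import os
import subprocess


def build_nrfu_command(catalog, inventory):
    """Build the `anta nrfu` invocation used earlier, without the -P prompt."""
    return ["anta", "nrfu", "--inventory", inventory, "--catalog", catalog]


def run_nrfu(catalog, inventory, username, password):
    """Run the tests and return the CLI exit code (assumed non-zero on problems)."""
    # Credentials travel via the environment rather than the command line.
    env = dict(os.environ, ANTA_USERNAME=username, ANTA_PASSWORD=password)
    completed = subprocess.run(build_nrfu_command(catalog, inventory), env=env)
    return completed.returncode


command = build_nrfu_command("test.yaml", "inventory.yaml")
print(" ".join(command))
```

In a pipeline job, you would call `run_nrfu` with secrets taken from the CI system and fail the job when the returned code is not zero.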
I’m new to this tool, so I’m still learning. If you like it and want to see more, please let me know in the comments. Your feedback helps me create content that matters to you.