Palo Alto High Traffic Latency Troubleshooting
We all know that firewalls are limited by hardware resources. Larger devices support higher throughput, while smaller ones may not perform as well. When experiencing slow traffic or latency issues on a firewall, we typically check resource usage and session counts to see if we are reaching these limits. If we are, that often concludes our troubleshooting. But what if we aren't hitting these limits and still experience traffic slowness? In this blog post, we'll explore a few methods to troubleshoot high latency issues on Palo Alto firewalls.
Please note that this troubleshooting is applicable when the dataplane CPU and session count are well below the limit, but you are still experiencing some form of latency issues or random packet loss. If this issue sounds familiar, please continue reading.
Packet Descriptors (on-chip)
If you find yourself in a situation where resource usage is well under the limit but you are still experiencing high latency, the next step is to identify sessions that consume too much of the on-chip packet descriptor.
You can run the following command on any hardware-based firewall model (not a VM-Series firewall) to identify, for each slot and dataplane, the on-chip packet descriptor percentage used, the top five sessions that each use 2% or more of the on-chip packet descriptor, and the source IP addresses associated with those sessions.
show running resource-monitor ingress-backlogs
The command displays a maximum of the top five sessions that each use 2% or more of the on-chip packet descriptor.
High on-chip descriptor and packet buffer usage can occur under various conditions. For example, a device might be flooding syslog traffic, and if the firewall is configured to deny and log this traffic, it can lead to high descriptor usage. Another scenario could involve a policy that allows streaming telemetry or sending some stats over an HTTP(S) session. While the traffic may not be bandwidth-intensive, the session could send a larger number of packets, requiring the firewall to perform extensive Layer 7 lookups.
Slowpath Policy Deny
Let's look at a scenario where high-volume UDP syslog traffic is being sent out, but the firewall is configured to deny it. It is tempting to think, "Come on, syslog packets are very small and the firewall should handle these packets easily," but there is a caveat. UDP syslog traffic usually keeps the same source port, so every denied packet shares the same 6-tuple (source/destination IP, source/destination port, protocol and ingress zone). Let's explore when and why a policy denial causes a spike in on-chip descriptor usage.
If an incoming packet does not match an existing session, it is subjected to slowpath. If the packet matches a deny policy in slowpath (with session logging enabled), the packet is dropped and a traffic log entry is created, but a session is not installed. The next packet with the same 6-tuple therefore goes through the same path as the previous packet. Slowpath, as the name suggests, can take more processing cycles because this is the step in which all the tasks associated with establishing a session are done.
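The difference between a denied and an allowed flow can be sketched with a toy model. This is only an illustration of the behaviour described above, not how PAN-OS is implemented; the `process` function, the policy dictionary and the addresses are all hypothetical.

```python
from collections import Counter


def process(packets, policy):
    """Count how many packets take slowpath vs fastpath in a toy firewall.

    packets is a list of 6-tuples; policy maps a 6-tuple to "allow" or
    "deny". Tuples missing from the policy are denied.
    """
    sessions = set()          # installed sessions, keyed by 6-tuple
    paths = Counter()
    for key in packets:
        if key in sessions:
            paths["fastpath"] += 1   # session exists: no policy evaluation
            continue
        paths["slowpath"] += 1       # session miss: full policy evaluation
        if policy.get(key, "deny") == "allow":
            sessions.add(key)        # allowed: later packets hit fastpath
        # denied: packet dropped, no session installed,
        # so the next identical packet misses again
    return paths


# Hypothetical flows: (src_ip, dst_ip, src_port, dst_port, proto, zone)
syslog = ("10.0.0.5", "10.0.1.9", 514, 514, "udp", "SERVER")
web = ("10.0.0.6", "10.0.1.9", 40000, 443, "tcp", "SERVER")
policy = {syslog: "deny", web: "allow"}

denied = process([syslog] * 1000, policy)
allowed = process([web] * 1000, policy)
print("denied flow :", dict(denied))
print("allowed flow:", dict(allowed))
```

In this model, only the first packet of the allowed flow pays the slowpath cost, while every packet of the denied flow does.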
The time taken to complete the slowpath depends on the firewall configuration and traffic pattern. For instance, if there are a large number of security or NAT policies, slowpath takes longer to complete. All packets with the same 6-tuple are subjected to ATOMIC, or serial, packet processing in their incoming order (one at a time). Because these packets have to be processed serially, they cannot be sent to different cores or threads for parallel processing.
While the packets wait to be processed by the DP CPU, they can accumulate, depending on the incoming packet rate and the time taken to complete the slowpath for each packet. If a significant rate of traffic with the same 6-tuple hits the firewall and is denied in slowpath, the on-chip descriptors and eventually the packet buffers can fill up. At this point, you would start to see traffic issues.
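A rough back-of-the-envelope calculation shows why the backlog builds. The sketch below assumes a constant arrival rate, a fixed slowpath time per packet and a fixed number of descriptors; the function name and every number in it are hypothetical and not taken from any Palo Alto specification.

```python
def seconds_until_full(arrival_pps, slowpath_us, descriptors):
    """Return seconds until the backlog reaches `descriptors`.

    Packets of one 6-tuple are processed serially, so the drain rate is
    capped at 1 / slowpath time no matter how many cores are available.
    Returns None when the firewall drains at least as fast as packets arrive.
    """
    service_pps = 1_000_000 / slowpath_us      # packets drained per second
    growth_pps = arrival_pps - service_pps     # net backlog growth per second
    if growth_pps <= 0:
        return None
    return descriptors / growth_pps


# Hypothetical values: 100 microseconds of slowpath per packet, 4096 descriptors.
for arrival in (5_000, 20_000, 50_000):
    t = seconds_until_full(arrival, slowpath_us=100, descriptors=4096)
    if t is None:
        print(f"{arrival} pps: backlog does not grow")
    else:
        print(f"{arrival} pps: descriptors exhausted after about {t:.3f} s")
```

The point of the sketch is that a flow only needs to exceed the serial drain rate of a single 6-tuple, not the overall throughput of the firewall, for descriptors to run out.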
Policy Allow
Allowed traffic can cause the same problem as denied traffic. The best way to identify the offending traffic is to run the following command.
show running resource-monitor ingress-backlogs
suresh@pa-440> show running resource-monitor ingress-backlogs
Sat Dec 14 08:50:22 2024
-- SLOT: s1, DP: dp0 --
USAGE - ATOMIC: 96.42390621% TOTAL: 100.0%
TOP SESSIONS:
SESS-ID  PCT  GRP-ID         COUNT  Special Notes
5238     92%  flow_fastpath  835
SESSION DETAILS
SESS-ID  PROTO  SZONE   SRC          SPORT  DST          DPORT  IGR-IF       TYPE  APP
5238     6      SERVER  10.10.12.15  47524  10.10.20.11  8080   ethernet1/1  FLOW  web-browsing
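If you collect this output regularly, a small script can pull out the key fields. The following is a minimal sketch that parses the sample output above; the function and field names are hypothetical, and it assumes a single slot/dataplane block, zone and interface names without spaces, and the exact column layout shown here, which may differ between PAN-OS versions.

```python
import re

SAMPLE = """\
-- SLOT: s1, DP: dp0 --
USAGE - ATOMIC: 96.42390621% TOTAL: 100.0%
TOP SESSIONS:
SESS-ID PCT GRP-ID COUNT Special Notes
5238 92% flow_fastpath 835
SESSION DETAILS
SESS-ID PROTO SZONE SRC SPORT DST DPORT IGR-IF TYPE APP
5238 6 SERVER 10.10.12.15 47524 10.10.20.11 8080 ethernet1/1 FLOW web-browsing
"""

USAGE_RE = re.compile(r"ATOMIC:\s*([\d.]+)%\s+TOTAL:\s*([\d.]+)%")
TOP_RE = re.compile(r"^(\d+)\s+(\d+)%\s+(\S+)\s+(\d+)")
DETAIL_FIELDS = ("proto", "szone", "src", "sport", "dst",
                 "dport", "igr_if", "type", "app")


def parse_ingress_backlogs(text):
    """Parse one slot/DP block into usage figures and per-session details."""
    result = {"atomic_pct": None, "total_pct": None, "sessions": {}}
    for raw in text.splitlines():
        line = raw.strip()
        usage = USAGE_RE.search(line)
        if usage:
            result["atomic_pct"] = float(usage.group(1))
            result["total_pct"] = float(usage.group(2))
            continue
        top = TOP_RE.match(line)          # TOP SESSIONS row: id, pct, group, count
        if top:
            sess_id, pct, group, count = top.groups()
            entry = result["sessions"].setdefault(sess_id, {})
            entry.update(pct=int(pct), group=group, count=int(count))
            continue
        parts = line.split()              # SESSION DETAILS row: 10 columns
        if len(parts) == 10 and parts[0].isdigit():
            entry = result["sessions"].setdefault(parts[0], {})
            entry.update(zip(DETAIL_FIELDS, parts[1:]))
    return result


parsed = parse_ingress_backlogs(SAMPLE)
for sess_id, s in parsed["sessions"].items():
    print(f"session {sess_id}: {s['pct']}% "
          f"{s['src']}:{s['sport']} -> {s['dst']}:{s['dport']} app={s['app']}")
```

Port and protocol values are kept as strings here; convert them if you need to compare or sort numerically.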
Here we can see that session 5238, TCP traffic from 10.10.12.15 to 10.10.20.11 on port 8080, accounts for 92% of the on-chip descriptor usage. Once you have identified the offending traffic, you can decide what to do with it. One test is to stop the traffic for a while and see whether that makes a difference. If it does and the traffic is necessary, consider creating an App-Override policy that disables Layer 7 deep packet inspection for it, thus reducing the load.
An Application Override policy configures the firewall to override the normal Application Identification (App-ID) of specific traffic passing through it. As soon as the Application Override policy takes effect, all further App-ID inspection of that traffic stops and the session is identified with the custom application.
References
https://knowledgebase.paloaltonetworks.com/KCSArticleDetail?id=kA14u000000HBjNCAW