Action 5: Provide Monitoring and Debugging Tools
5-1 Provide a Looking Glass for Members
A looking glass is an important facility that can help debug routing incidents or anomalies and prevent or shorten potential outages. An IXP should offer a looking glass interface of its Route Server to its members.
5-2 Internal Monitoring of IXP Infrastructure
The following tools are important to help IXP staff monitor the status of the IXP and track growth trends.
- SmokePing (https://oss.oetiker.ch/smokeping/)
SmokePing is a deluxe latency measurement tool. It can measure, store, and display latency, latency distribution, and packet loss. SmokePing uses RRDtool to maintain a long-term data store and to draw graphs, giving up-to-the-minute information on the state of each network connection. At an IXP, SmokePing graphs can help measure latency to the different ISP endpoints peering at the IXP, and SmokePing can send alerts when latency increases, which could point to a congested or faulty link.
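The alerting idea described above can be sketched in a few lines of Python. This is not SmokePing's implementation or its configuration syntax; the function names, the representation of a lost probe as `None`, and the thresholds (twice the baseline latency, 5% loss) are assumptions chosen purely for illustration.

```python
from statistics import median
from typing import Optional, Sequence


def summarize_round(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize one round of probes; None marks a lost probe."""
    if not rtts_ms:
        raise ValueError("at least one probe is required")
    answered = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    return {
        "median_ms": median(answered) if answered else None,
        "loss_pct": loss_pct,
    }


def should_alert(summary: dict, baseline_ms: float,
                 max_ratio: float = 2.0, max_loss_pct: float = 5.0) -> bool:
    """Alert on high loss or on a median RTT above max_ratio times the baseline."""
    if summary["median_ms"] is None:
        return True  # every probe in the round was lost
    return (summary["loss_pct"] > max_loss_pct
            or summary["median_ms"] > max_ratio * baseline_ms)


# Hypothetical round of four probes towards one member router, one probe lost.
latest = summarize_round([1.0, 1.2, None, 1.1])
print(latest, should_alert(latest, baseline_ms=1.0))
```

Comparing against a per-target baseline rather than a fixed value keeps the check meaningful for members with different normal latencies.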
- IXP-Watch (https://github.com/euro-ix/IXP-Watch)
IXP-Watch is a tool to continuously monitor layer 2 traffic on the exchange. As well as storing a regular traffic sample, it will generate alerts for the following:
- Excessive ARP;
- Excessive traffic captured;
- Spanning Tree;
- Non-IP/IPv6 Traffic (for example CDP);
- Multicast/Traffic directed to 255.255.255.255 - DHCP/OSPF/IGP etc.;
- Stray SNMP.
At an IXP, keeping an eye on Layer 2 traffic helps to detect issues related to infrastructure that could negatively affect performance.
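To make the alert categories above more concrete, here is a minimal Python sketch of how sampled frames might be classified. It is not how IXP-Watch itself is implemented; the `Frame` type, the `alerts` function, and the ARP limit are hypothetical. The EtherType values and the spanning tree and CDP destination addresses are the standard well-known ones.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806
ETHERTYPE_IPV6 = 0x86DD
STP_DST_MAC = "01:80:c2:00:00:00"  # IEEE 802.1D bridge group address
CDP_DST_MAC = "01:00:0c:cc:cc:cc"  # Cisco multicast address used by CDP


class Frame(NamedTuple):
    dst_mac: str
    ethertype: int  # for 802.3/LLC frames this field holds a length instead


def classify(frame: Frame) -> str:
    """Place one sampled frame into a coarse category."""
    dst = frame.dst_mac.lower()
    if dst == STP_DST_MAC:
        return "spanning-tree"
    if dst == CDP_DST_MAC:
        return "cdp"
    if frame.ethertype == ETHERTYPE_ARP:
        return "arp"
    if frame.ethertype in (ETHERTYPE_IPV4, ETHERTYPE_IPV6):
        return "ip"
    return "other-non-ip"


def alerts(frames: Iterable[Frame], arp_limit: int = 100) -> List[str]:
    """Return alert messages for one sample window (arp_limit is an assumed threshold)."""
    counts = Counter(classify(f) for f in frames)
    found = []
    if counts["arp"] > arp_limit:
        found.append("excessive ARP")
    if counts["spanning-tree"]:
        found.append("spanning tree seen")
    if counts["cdp"] or counts["other-non-ip"]:
        found.append("non-IP traffic seen")
    return found
```

In practice the frames would come from a capture on a monitoring port connected to the peering LAN; the sketch takes them as already-parsed records to stay self-contained.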
- IXP Manager (https://www.ixpmanager.org/)
IXP Manager is a full-stack management platform that includes an administration and customer portal, provides end-to-end provisioning, and both teaches and implements best practice. It is a powerful platform used at IXPs globally. Its features include:
  - Administrative portal for managing an IXP
  - Abstracted model of an IXP, which includes infrastructures, VLANs, locations, cabinets, patch panels, switches, switch ports, IP addresses, MAC addresses, IXP members, user accounts, route servers, and IRRDB configuration
  - Monitoring information, including per-member statistics (bits, packets, errors, discards), p2p traffic from sFlow telemetry, and a peering matrix
  - Integration with third-party packages (Birdseye Looking Glass, BIRD, BIND, Mailman, SmokePing, tac_plus, Nagios, etc.)
  - Member login system that provides Peering Manager, a route server prefix analysis tool, and graph views
- Zabbix/Nagios/Observium
These tools monitor IXP devices and services for uptime and send alerts when a device or service (for example, an email or web service) goes offline.
  - Nagios: https://nagios.org
  - Zabbix: https://zabbix.com
  - Observium: https://observium.org
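The kind of uptime check these tools perform can be illustrated with a small Python sketch. It only tests whether a TCP port accepts connections and does not use any Nagios, Zabbix, or Observium API; the function names and the example hostnames are hypothetical.

```python
import socket
from typing import Dict, List, Tuple


def tcp_service_up(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def down_services(services: Dict[str, Tuple[str, int]],
                  timeout: float = 3.0) -> List[str]:
    """Return the names of services that did not accept a connection."""
    return [name for name, (host, port) in services.items()
            if not tcp_service_up(host, port, timeout)]


if __name__ == "__main__":
    # Hypothetical services; replace with the IXP's own hosts and ports.
    checks = {
        "member portal (HTTPS)": ("portal.ixp.example", 443),
        "mail (SMTP)": ("mail.ixp.example", 25),
    }
    for name in down_services(checks):
        print(f"ALERT: {name} is not reachable")
```

A dedicated monitoring system adds what this sketch lacks: scheduling, retries before alerting, notification channels, and history.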
- Log Collector
A log collector aggregates logs from across the IXP's infrastructure into a single searchable store, which aids troubleshooting, security investigation, and auditing. Examples include Splunk and the ELK (Elasticsearch, Logstash, Kibana) stack.
- Cacti (https://cacti.net)
Cacti is a complete network graphing solution designed to harness the power of RRDtool's data storage and graphing functionality. With Cacti, one can monitor traffic usage across many devices as well as additional metrics such as CPU and RAM usage. Cacti is well suited to showing the IXP's traffic growth over months and years.
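Graphing tools of this kind typically derive traffic rates from SNMP interface octet counters, which only ever increase and eventually wrap around. The following Python sketch shows the arithmetic involved; it is not Cacti code, and the function names and sample values are assumptions made for illustration.

```python
COUNTER64_MODULUS = 2 ** 64  # e.g. ifHCInOctets; use 2 ** 32 for ifInOctets


def bits_per_second(prev_octets: int, curr_octets: int, interval_s: float,
                    modulus: int = COUNTER64_MODULUS) -> float:
    """Average bit rate between two counter readings taken interval_s apart."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # The modulo makes the difference correct across a single counter wrap.
    delta_octets = (curr_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_s


def growth_percent(old_peak_bps: float, new_peak_bps: float) -> float:
    """Relative growth between two peak rates, in percent."""
    if old_peak_bps <= 0:
        raise ValueError("old_peak_bps must be positive")
    return 100.0 * (new_peak_bps - old_peak_bps) / old_peak_bps


# Hypothetical readings five minutes apart: 37,500,000 octets transferred.
rate = bits_per_second(1_000, 37_501_000, interval_s=300)
print(rate)  # 1000000.0, that is 1 Mbit/s
```

Polling 64-bit counters where the device supports them avoids multiple wraps between polls on fast ports, which the modulo arithmetic cannot detect.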
5-3 Other Useful Tools to Consider
Here is a list of other tools to consider running at an IXP:
- BGPalerter (https://github.com/nttgin/BGPalerter)
- Netflow tools
  - PMACCT: http://www.pmacct.net/
  - NFSEN: https://nfsen.org
- aRouteServer (https://arouteserver.readthedocs.io/en/latest/)