Troubleshooting Ethernet bridging on Linux
To diagnose problems arising from use of the Linux bridging module.
An Ethernet bridge (or switch) is a device for forwarding packets between two or more Ethernets so that they behave in most respects as if they were a single network. It could be a physical device, but it is also possible for a bridge to be implemented entirely in software. The Linux kernel has the ability to perform bridging by means of the bridge module.
The most likely symptoms of a bridging problem are that:
- the bridge does not forward traffic,
- the bridge forwards traffic intermittently,
- the bridge causes a storm of duplicate traffic, or
- the machine hosting the bridge appears to freeze.
If the bridge is not forwarding traffic then there are at least six possibilities to consider:
- The bridge has not been created.
- The appropriate interfaces have not been attached to the bridge.
- The bridge or the attached interfaces are not in the ‘up’ state.
- The bridge ports are not in the ‘forwarding’ state.
- The traffic to be bridged is not reaching the relevant interface.
- The traffic is being filtered by ebtables.
Intermittent forwarding usually has some form of intermittent connectivity as its root cause; however, there are two ways in which the use of bridging can exacerbate what might otherwise have been a less serious problem:
- If STP is enabled then the spanning tree may become unstable due to the topology changing faster than the tree can converge.
- Even without STP, the bridge forwarding delay typically adds 15 seconds to the recovery time for even the briefest of outages.
If the problem is likely to recur frequently then it may be possible to tune the bridge parameters so that the network is more resilient to outages of this nature.
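As a rough illustration of what such tuning might look like, the sketch below checks a proposed set of STP timers against the relationships recommended by IEEE 802.1D before they are applied. The function name stp_timers_ok and the example values are assumptions made for illustration; suitable values depend on the size of the network and on every other bridge taking part in the spanning tree.

```shell
# Check proposed STP timers (whole seconds) against the relationships
# recommended by IEEE 802.1D:
#   2 * (forward_delay - 1) >= max_age >= 2 * (hello_time + 1)
# Returns 0 if the values are consistent, 1 otherwise.
# (stp_timers_ok is a hypothetical helper, not part of bridge-utils.)
stp_timers_ok() {
    hello=$1
    max_age=$2
    fwd_delay=$3
    [ $((2 * ($fwd_delay - 1))) -ge "$max_age" ] &&
        [ "$max_age" -ge $((2 * ($hello + 1))) ]
}

# Example (requires root; the values are illustrative only):
#   if stp_timers_ok 1 6 4; then
#       brctl sethello  br0 1
#       brctl setmaxage br0 6
#       brctl setfd     br0 4
#   fi
```

Shorter timers reduce recovery time at the cost of a smaller safety margin, so they are best changed consistently across the whole network rather than on one bridge alone.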
A storm of duplicate traffic almost certainly indicates that the network contains one or more loops. You then have a choice between:
- finding the loops and breaking them manually, or
- enabling STP (the Spanning Tree Protocol) or an equivalent, which automatically disables any link that would cause a loop.
(Be aware that loops are sometimes created deliberately in order to provide redundancy. It is then necessary to have either some form of failover or load balancing mechanism. STP can be used to provide failover, whereas load balancing requires use of a protocol such as LACP.)
If the machine appears to freeze after adding a network interface to a bridge then this could be because:
- you are administering it remotely via that interface (for example using SSH), or
- the machine depends on that interface for vital services (for example NFS or LDAP).
Removing the interface from the bridge will solve the immediate problem. The underlying issue is that when an interface is attached to a bridge then any network addresses need to be bound to the bridge, not to the interface.
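A minimal sketch of moving an IPv4 address from an attached interface to the bridge is given below. The helper name move_address_to_bridge, the RUN variable and the address values are all assumptions made for illustration. By default it only prints the commands, which seems the safer choice on a machine you might be locked out of.

```shell
# Print the commands that move an IPv4 address from an attached interface
# to the bridge. Dry-run is the default; set RUN to the empty string to
# execute the commands instead (requires root).
# (move_address_to_bridge and RUN are hypothetical names.)
move_address_to_bridge() {
    iface=$1
    bridge=$2
    addr=$3
    netmask=$4
    ${RUN-echo} ifconfig "$iface" 0.0.0.0
    ${RUN-echo} ifconfig "$bridge" "$addr" netmask "$netmask" up
}

# Example, using made-up values:
#   move_address_to_bridge eth0 br0 192.168.0.1 255.255.255.0
#   RUN= move_address_to_bridge eth0 br0 192.168.0.1 255.255.255.0
```

Removing the address from the interface will normally remove any routes that depended on it, so a default route will probably need to be added again afterwards.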
Remember that changes made using the brctl and ifconfig commands are not persistent. Most GNU/Linux distributions provide a mechanism for creating a persistent bridge; however, the configuration method varies.
A list of bridges can be displayed using the brctl show command:
brctl show
the output from which should be of the form:
bridge name     bridge id               STP enabled     interfaces
br0             8000.0200c0a80091       no              eth0
                                                        eth1
Verify that the bridge exists, has the name you expect, and is attached to the appropriate interfaces.
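This check can be scripted by flattening the brctl show output into one "bridge interface" pair per line. The sketch below assumes the column layout shown above; the function names list_bridge_ports and bridge_has_port are made up for illustration.

```shell
# Read `brctl show` output on stdin and print one "bridge interface" pair
# per line. Assumes the four-column layout shown above, in which further
# interfaces of the same bridge appear alone on continuation lines.
# (list_bridge_ports and bridge_has_port are hypothetical helpers.)
list_bridge_ports() {
    awk 'NR == 1 { next }                      # skip the header line
         NF >= 3 { bridge = $1                 # first line for a bridge
                   if (NF >= 4) print bridge, $4
                   next }
         NF == 1 { print bridge, $1 }'         # continuation line
}

# Succeed if the given interface is attached to the given bridge.
bridge_has_port() {
    list_bridge_ports | grep -qxF "$1 $2"
}

# Example:
#   brctl show | bridge_has_port br0 eth1 || echo "eth1 is not attached to br0"
```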
Bridges, like network interfaces, have an ‘up’ state and a ‘down’ state and they will not pass any traffic unless they are up. You can check whether a bridge is up or down using the ifconfig command:
ifconfig br0
Here is an example of the output from this command for an interface that is down. The relevant line is the second one, which lists the interface flags and here does not include UP:
br0       Link encap:Ethernet  HWaddr 36:0a:79:b5:4e:66
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
and for the same interface when up:
br0       Link encap:Ethernet  HWaddr 36:0a:79:b5:4e:66
          inet6 addr: fe80::340a:79ff:feb5:4e66/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:168 (168.0 B)
If the bridge needs to be brought up then this can be done using the ifconfig command:
ifconfig br0 up
The same considerations apply to each of the attached Ethernet interfaces: these can be brought up or down independently of the bridge, and they will only pass traffic if they are up.
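Checking the bridge and each attached interface in turn can be automated by looking for the UP flag in the ifconfig output. The following is a sketch only: iface_is_up is a made-up helper, and it relies on UP appearing as a separate word in the flags, which is an assumption about the ifconfig output format.

```shell
# Read the `ifconfig <interface>` output for one interface on stdin and
# succeed if its flags include UP.
# (iface_is_up is a hypothetical helper.)
iface_is_up() {
    grep -qw 'UP'
}

# Example:
#   for i in br0 eth0 eth1; do
#       ifconfig "$i" | iface_is_up || echo "$i is down"
#   done
```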
At any given time, a Linux bridge port will be in one of five possible states: ‘disabled’, ‘listening’, ‘learning’, ‘forwarding’ or ‘blocking’. You can find out which using the brctl showstp command:
brctl showstp br0
the output from which should be of the form:
br0
 bridge id              8000.e0699577868f
 designated root        8000.e0699577868f
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay            15.00                 bridge forward delay      15.00
 ageing time             300.01
 hello timer               0.64                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  15.64
 flags

eth0 (1)
 port id                8001                    state                forwarding
 designated root        8000.e0699577868f       path cost                  4
 designated bridge      8000.e0699577868f       message age timer          0.00
 designated port        8001                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags

eth1 (2)
 port id                8002                    state                forwarding
 designated root        8000.e0699577868f       path cost                100
 designated bridge      8000.e0699577868f       message age timer          0.00
 designated port        8002                    forward delay timer        0.00
 designated cost           0                    hold timer                 0.00
 flags
The relevant field is the state reported for each port. ‘forwarding’ is the state you want the port to be in so that it can carry traffic. In this state you can be reasonably confident that the bridge and interface are up, that there is a physical connection, and that forwarding has not been blocked by STP.
‘blocking’ indicates that the port has been prevented from forwarding traffic by STP or an equivalent in order to prevent a bridge loop from being formed. This should only happen when there is another path that the network traffic can take (excepting the brief loss of connectivity that occurs when the spanning tree changes).
If you want a particular network segment to be used in preference to any other paths that might be available then there are two ways to achieve that safely: either change the network topology manually so that it becomes the only path, or adjust the STP path costs so that it becomes the cheapest path. Otherwise, be assured that the ‘blocking’ state is a normal part of the operation of STP and does not by itself indicate that there is a problem.
‘listening’ indicates that the STP implementation has not yet decided whether the port should enter the ‘forwarding’ or ‘blocking’ state. ‘learning’ indicates that it is about to enter the ‘forwarding’ state, but is attempting to populate its MAC address table first to avoid an immediate burst of packets echoed to all ports. These should be transient states that last for a few tens of seconds at most. If you find that a link is spending an excessive amount of time in one or other of these states then that could indicate there is a link that is flapping up and down (not necessarily a local one), or that the size and complexity of the network has caused the spanning tree to become unstable.
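When a bridge has many ports it can be easier to reduce the brctl showstp output to one line per port. The sketch below assumes the output layout shown earlier; the function names port_states and non_forwarding_ports are made up for illustration.

```shell
# Read `brctl showstp <bridge>` output on stdin and print "port state"
# for each port. Assumes each port section starts with a line of the
# form "eth0 (1)" and contains the word "state" followed by its value.
# (port_states and non_forwarding_ports are hypothetical helpers.)
port_states() {
    awk '/^[^ \t].*\([0-9]+\)[ \t]*$/ { port = $1 }   # port header line
         port != "" {
             for (i = 1; i < NF; i++)
                 if ($i == "state") print port, $(i + 1)
         }'
}

# Print only the ports that are not currently forwarding.
non_forwarding_ports() {
    port_states | awk '$2 != "forwarding"'
}

# Example:
#   brctl showstp br0 | non_forwarding_ports
```

Running the last command repeatedly, a few seconds apart, gives a rough indication of whether a port is stuck in a transient state or merely passing through it.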
In the course of its operation a bridge must attempt to determine which MAC addresses are reachable through each of its attached interfaces. It does this by inspecting the source address of each packet that arrives at the bridge and recording it in a table. In the case of the Linux bridging module it is possible to inspect the content of this table using the brctl showmacs command:
brctl showmacs br0
The output is typically of the form:
port no mac addr                is local?       ageing timer
  1     02:54:65:73:74:31       no                 1.42
  1     02:54:65:73:74:32       no                 3.34
  1     02:54:65:73:74:33       no                 2.46
  1     02:54:65:73:74:34       yes                0.00
  2     02:54:65:73:74:35       no                 1.42
  2     02:54:65:73:74:36       no                 3.34
  2     02:54:65:73:74:37       no                 2.46
The value of this information for troubleshooting is that it tells you whether any packets from a given machine are being processed by the bridge. Possible explanations for the non-appearance of a MAC address are that:
- packets from the machine in question are not reaching the bridge for some reason;
- the receiving interface is down (see above);
- the bridge port is disabled (see above); or
- the address was in the table but has since expired.
Addresses typically expire after 5 minutes, so this is unlikely to be an issue if packets are being actively sent at the time you check the table, but it is a point to bear in mind if there has been any substantial delay between sending and checking.
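Looking up a particular machine in a long table is easily scripted. The sketch below assumes the brctl showmacs column layout shown above (port number first, MAC address second); mac_port is a made-up helper name.

```shell
# Read `brctl showmacs <bridge>` output on stdin and print the port
# number(s) on which the given MAC address has been seen. The comparison
# is case-insensitive. Exits with status 1 if the address is absent.
# (mac_port is a hypothetical helper.)
mac_port() {
    awk -v mac="$1" 'tolower($2) == tolower(mac) { print $1; found = 1 }
                     END { exit !found }'
}

# Example:
#   brctl showmacs br0 | mac_port 02:54:65:73:74:35 ||
#       echo "address not in the forwarding table"
```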
Ebtables is a packet filter that is similar in concept to iptables, except that it operates at the link layer rather than the network layer (acting on Ethernet frames as they are bridged as opposed to IP datagrams as they are routed). Ebtables is transparent by default, and this is the state you are likely to find it in on most machines, but it is worth checking because ebtables rules are capable of blocking or altering bridge traffic in an almost arbitrary manner.
If the ebtables command is available then you can view the rulebase using the -L option. For example, for the filter table:
ebtables -t filter -L
Normally you would expect this to be empty:
Bridge table: filter

Bridge chain: INPUT, entries: 0, policy: ACCEPT

Bridge chain: FORWARD, entries: 0, policy: ACCEPT

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT
The same applies to the other ebtables tables (nat and broute). If the ebtables command is not installed then that strongly suggests ebtables is not being used, although it is conceivable that rules could have been added by some other means.
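A quick way to confirm that a table is empty is to add up the entry counts reported for each chain. The sketch below assumes the "Bridge chain: ..., entries: N, ..." line format shown above; count_ebtables_rules is a made-up helper, and the loop in the comment assumes the three traditional tables (filter, nat and broute) are all present, which may not hold for every ebtables implementation.

```shell
# Read `ebtables -t <table> -L` output on stdin and print the total
# number of rules across all chains in that table.
# (count_ebtables_rules is a hypothetical helper.)
count_ebtables_rules() {
    awk '/^Bridge chain:/ {
             for (i = 1; i < NF; i++)
                 if ($i == "entries:") {
                     n = $(i + 1)
                     sub(/,$/, "", n)      # strip the trailing comma
                     total += n
                 }
         }
         END { print total + 0 }'
}

# Example (requires root):
#   for t in filter nat broute; do
#       printf '%s: ' "$t"
#       ebtables -t "$t" -L | count_ebtables_rules
#   done
```

A total of zero for every table, together with a policy of ACCEPT on each chain, suggests that ebtables is not interfering with bridged traffic.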
If the rulebase is non-empty then you can obtain some insight into what effect it might be having by inspecting the counters associated with each rule:
ebtables -t filter -L --Lc
Each rule has two counters: pcnt (the number of packets) and bcnt (the number of bytes). As with iptables, it can be helpful to insert additional rules for the purpose of monitoring.
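One possible form for such a monitoring rule is sketched below: it matches frames from a given source MAC address in the FORWARD chain and uses the CONTINUE target, so that matching frames are counted but otherwise treated as before. The helper name add_counting_rule and the RUN variable are assumptions made for illustration, and the choice of chain and match is only an example. By default the command is printed rather than executed.

```shell
# Print the ebtables command that inserts a counting-only rule at the head
# of the FORWARD chain for frames with the given source MAC address.
# Dry-run is the default; set RUN to the empty string to execute the
# command instead (requires root).
# (add_counting_rule and RUN are hypothetical names.)
add_counting_rule() {
    mac=$1
    ${RUN-echo} ebtables -t filter -I FORWARD 1 -s "$mac" -j CONTINUE
}

# Example, using a made-up address:
#   add_counting_rule 02:54:65:73:74:31
#   RUN= add_counting_rule 02:54:65:73:74:31
#   ebtables -t filter -L --Lc
# The rule can be removed afterwards with:
#   ebtables -t filter -D FORWARD -s 02:54:65:73:74:31 -j CONTINUE
```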
Bridge loops are by their nature difficult to track down because the resulting packet storms will propagate throughout the entire network unless stopped. The packet source addresses are unlikely to be helpful because they say only where the traffic was originally sent from and not where it was replicated. Tracing the packet flow with a tool like tcpdump is not straightforward when all copies of a given packet are identical.
A more effective method is to partition the network until the symptoms disappear, then cautiously reconnect it one link at a time until they reappear. This is not something you would normally want to do to a network of any importance, but if it has already been incapacitated by a bridge loop then the potential for further harm is likely to be limited.
A packet capture tool such as tcpdump or Wireshark can be used for monitoring. It does not matter greatly where this is attached, provided that you are not using bridges that have active protection against packet storms (see below). You should disable reverse DNS lookup of captured IP addresses (the -n option in the case of tcpdump) because the DNS is unlikely to work reliably if there is a bridge loop. If you do not do this then there is likely to be a delay between when packets are captured and when they are displayed, which will prevent you from obtaining a timely view of what is happening on the network.
You should also ensure that there is a source of broadcast traffic on the network, so that a packet storm will occur promptly whenever a loop is created. An ongoing attempt to ping a non-existent IP address on a local subnet will have the required effect. (If a responsive IP address were chosen then the traffic would largely consist of unicast ICMP echo requests which would not necessarily be amplified. Choosing an address that does not exist will instead result in broadcast ARP requests.)
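One way to make a storm easy to recognise is to count ARP requests per second in the capture: a single ping should produce roughly one request per second, whereas a loop will produce a far higher rate. The sketch below assumes the line format printed by recent versions of tcpdump ("HH:MM:SS.ffffff ARP, Request who-has ..."), which may differ on older releases. The function name arp_requests_per_second, the interface br0 and the address 192.168.0.254 are all placeholders.

```shell
# Read `tcpdump -n` output on stdin and print the number of ARP requests
# seen in each one-second interval, as "HH:MM:SS count".
# (arp_requests_per_second is a hypothetical helper.)
arp_requests_per_second() {
    awk '/ARP, Request who-has/ {
             split($1, t, ".")       # drop the fractional seconds
             n[t[1]]++
         }
         END { for (s in n) print s, n[s] }' | sort
}

# Example (requires root; 192.168.0.254 stands for an unused local address):
#   ping 192.168.0.254 >/dev/null 2>&1 &
#   timeout 10 tcpdump -n -i br0 arp 2>/dev/null | arp_requests_per_second
```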
If the act of reconnecting a network segment causes the symptoms to reappear then there are two possibilities to consider:
- The segment may form part of the loop that you are investigating, in which case reconnecting will have caused the loop to be reestablished.
- The loop may lie beyond that segment, in which case it existed throughout the test but was unreachable from the monitoring point while the segment was disconnected.
A characteristic of the first condition is that there will be connectivity between the two parts of the network even when the segment under test is disconnected. (This connectivity might be unidirectional, but for a loop to form there must be a return path of some description.) By sending a stream of test packets from one side of the disconnected segment to the other it should be a relatively straightforward matter to trace the path they are taking.
If the loop lies beyond the disconnected segment then you can reconnect the remainder of the network and then repeat this diagnostic procedure for the problematic region in isolation.
A complicating factor is that some networking equipment attempts to detect packet storms and actively protect against them, typically by disabling the port receiving the traffic. In the best case, where the loop is located at the edge of the network, this can both contain the effects of the loop and greatly simplify diagnosis (as the cause may be obvious once you know which port has been disabled). In other circumstances it can hinder troubleshooting by making the network more stateful, and there may be a case for temporarily turning this feature off despite the immediate negative consequences.
Rather than attempting to find and break loops manually you can use the Spanning Tree Protocol (STP) to achieve the same result automatically:
brctl stp br0 yes
Ideally STP should be enabled on all bridges throughout the network. Failing that, any bridges that form part of a loop and are not STP-aware must be transparent to it. For obvious reasons there must be at least one STP-aware bridge in each loop.
For small networks STP should just work without further configuration. If the network is larger, or has frequent topology changes, then some tuning may be necessary to achieve acceptable results.
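To confirm that STP is enabled on a given bridge you can inspect the third column of the brctl show output. The following sketch assumes the column layout of brctl show as produced by bridge-utils; bridge_stp_enabled is a made-up helper name.

```shell
# Read `brctl show` output on stdin and succeed only if the named bridge
# is listed with STP enabled.
# (bridge_stp_enabled is a hypothetical helper.)
bridge_stp_enabled() {
    awk -v br="$1" 'NR > 1 && NF >= 3 && $1 == br {
                        found = 1
                        ok = ($3 == "yes")
                    }
                    END { exit !(found && ok) }'
}

# Example:
#   brctl show | bridge_stp_enabled br0 || brctl stp br0 yes
```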
As noted above, adding an interface to a bridge causes it to stop acting as an Internet Protocol endpoint. This could result in the machine appearing to freeze if:
- you are administering it remotely via that network interface, for example using SSH, or
- the machine depends on the network for vital services, for example NFS or LDAP.
The solution is to remove the interface from the bridge by the most graceful means possible. In order of preference:
- Log on using the console and issue a brctl delif command, for example brctl delif br0 eth0.
- Reboot the machine gracefully, for example by sending control-alt-delete to the console. Note that, depending on what has been disabled, this may take considerably longer than it would do if the network were available.
- Forcibly reboot the machine, for example by power-cycling it.
If the bridging commands have been inserted into the startup scripts then you will need to remove them. You may be able to do this by booting into a recovery mode or from a live CD; however, for a remotely hosted machine you may have to resort to reimaging it (with loss of all data).