Ensure symmetric routing on a server with multiple default gateways
|Debian (Squeeze, Wheezy)|
|Ubuntu (Precise, Trusty)|
On a host with multiple network interfaces, each with its own default gateway, ensure that routing is symmetric (so that inbound and outbound traffic take the same path) for any given pair of endpoints
Routing on a multihomed host is straightforward when each interface provides connectivity to a different set of hosts. There is only one path that any given packet can take, and for outbound traffic this can be deduced from the destination IP address. All that is needed is a routing table that reflects which hosts are reachable through each interface.
If there are two or more possible paths then a decision must be made as to which one to use. A condition that you should normally try to avoid is ‘asymmetric routing’, whereby traffic sent to a given IP address arrives on one interface, but traffic originating from that address leaves by a different interface. There are a number of problems that this can cause:
- Packets sent through the wrong interface will appear to have a spoofed source address, and may therefore be blocked as a matter of policy. It is common practice for Internet service providers to do this.
- Network devices which depend on connection tracking, such as stateful firewalls and intrusion detection systems, usually need to see both inbound and outbound traffic if they are to function correctly.
- Routing most outbound traffic through one interface can cause unnecessary congestion.
- If the intention was to provide a redundant path by which the server can be reached then routing all return traffic through a single interface will frustrate this.
Suppose that a server has two connections to the public Internet:
- the first via eth0 with an IP address of 198.51.100.87/24 and a gateway address of 198.51.100.1; and
- the second via eth1 with an IP address of 203.0.113.144/24 and a gateway address of 203.0.113.1.
The server must be able to accept inbound TCP connections from arbitrary locations on the Internet to either of its public IP addresses. You wish to ensure that the outbound traffic associated with each connection is sent through the same interface as the inbound traffic.
For the traffic at issue there are two possible source IP addresses which an outbound packet could have: 198.51.100.87 or 203.0.113.144. In order to achieve the desired effect, the former must be sent via eth0 and the latter via eth1. The destination IP address is not relevant to how outbound packets should be routed.
Routing in this manner is beyond the capability of a traditional routing table, which can match against the destination address but not the source address.
The required behaviour can be obtained using the policy-based routing capabilities of the Linux kernel, which allow routing decisions to be based on criteria other than the destination address. The method described here has four steps:
- Create a separate routing table for each of the interfaces.
- Add policy rules to direct outbound traffic to the appropriate routing table.
- Ensure that the main routing table has a default route.
- Flush the routing cache.
This approach can be scaled trivially to any number of interfaces if required.
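As an illustration of how the steps repeat per interface, here is a minimal sketch of a shell helper. The function name policy_route_cmds is hypothetical and not part of the method described here. It only prints the ip commands (a dry run) so that they can be reviewed before anything is changed.

```shell
# Hypothetical helper: print (do not execute) the commands that set up
# source-based routing for one interface.
# Arguments: interface, local address, local network prefix, gateway,
#            routing table ID, rule priority.
policy_route_cmds() {
    iface=$1 addr=$2 net=$3 gw=$4 table=$5 prio=$6
    echo "ip route add $net dev $iface table $table"
    echo "ip route add default via $gw table $table"
    echo "ip rule add from $addr/32 table $table priority $prio"
}

# Steps 1 and 2, once per interface in the example scenario.
policy_route_cmds eth0 198.51.100.87 198.51.100.0/24 198.51.100.1 1 100
policy_route_cmds eth1 203.0.113.144 203.0.113.0/24 203.0.113.1 2 110
# Step 3: default route in the main table. Step 4: flush the cache.
echo "ip route add default via 198.51.100.1"
echo "ip route flush cache"
```

A third interface would need only one more call with its own table ID and priority. Once the printed commands look right they can be run as root.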
The requirement cannot be met by a single routing table, but with policy-based routing it is possible to have multiple tables. These are created using the ip route add command. The table for eth0 should route traffic via the gateway at 198.51.100.1 if it cannot be delivered directly:
ip route add 198.51.100.0/24 dev eth0 table 1
ip route add default via 198.51.100.1 table 1
whereas the table for eth1 should use the gateway at 203.0.113.1:
ip route add 203.0.113.0/24 dev eth1 table 2
ip route add default via 203.0.113.1 table 2
The first argument after the add keyword is the address prefix to which the route is applicable. default is equivalent to 0.0.0.0/0, so matches any destination address, but the other route in each table takes precedence because it is more specific.
The via keyword sets the address of the next hop. As always this must be directly reachable, and in this instance is one of the ISP-provided gateway addresses. The interface name can be explicitly specified using the dev keyword, but this is not necessary if it can be deduced from the IP address.
The table keyword selects the routing table to which the route should be added. This can be either a numeric table ID (in the form of an unsigned 32-bit integer) or a name from the file /etc/iproute2/rt_tables. The table IDs 0, 253, 254 and 255 are reserved; otherwise it does not matter which numbers are chosen provided they are not being used for other purposes.
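If you prefer names to numeric IDs, each line of /etc/iproute2/rt_tables pairs an ID with a name. The following is a minimal sketch of a helper that appends such a line only if neither the ID nor the name is already in use. The function name add_table_name and the table name isp1 used below are assumptions made for illustration.

```shell
# Hypothetical helper: register a table name in an rt_tables-style file
# unless that ID or name is already present.
# Arguments: file path, table ID, table name.
add_table_name() {
    file=$1 id=$2 name=$3
    # Each non-comment line holds a numeric ID followed by a name.
    if awk -v id="$id" -v name="$name" \
        '!/^#/ && ($1 == id || $2 == name) { found = 1 } END { exit !found }' "$file"
    then
        echo "table $id or name $name already present in $file" >&2
        return 1
    fi
    printf '%s\t%s\n' "$id" "$name" >> "$file"
}
```

For example, after running add_table_name /etc/iproute2/rt_tables 1 isp1 as root, a route could be added with table isp1 in place of table 1.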
The purpose of the first route in each table is to prevent traffic from being unnecessarily routed via the gateway. These could reasonably be omitted if the interface is a point-to-point link, because in that case the gateway is the only directly reachable destination (other than the local interface itself).
Two policy rules are needed, one for each of the above routing tables, to arrange for the tables to be consulted when packets from the corresponding source address are seen:
ip rule add from 198.51.100.87/32 table 1 priority 100
ip rule add from 203.0.113.144/32 table 2 priority 110
The priority argument determines the order in which the rules are applied. In this case it does not matter which rule has the higher priority because only one of them will match any given packet. You should, however, check that there are no other rules in the table that would interfere with the ones listed above. The rules cannot be given the same priority because priorities are required to be unique.
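One way to look for rules that might interfere is to list everything that would be consulted before a given priority. This is a minimal sketch. The function name rules_before is hypothetical, and it assumes the usual ip rule show layout in which each line begins with the priority followed by a colon.

```shell
# Hypothetical helper: read `ip rule show` output on stdin and print any
# rule, other than the built-in priority-0 rule, that is consulted before
# the given priority.
rules_before() {
    awk -F: -v limit="$1" '$1 + 0 > 0 && $1 + 0 < limit + 0 { print }'
}
```

Running ip rule show | rules_before 100 should print nothing on a system with no other policy rules. Any line it does print deserves a closer look.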
When a program initiates an outbound connection it is normal for it to use the wildcard source address (0.0.0.0), indicating no preference as to which interface is used provided that the relevant destination address is reachable. This is not replaced by a specific source address until after the routing decision has been made. Traffic associated with such connections will not therefore match either of the above policy rules, and will not be directed to either of the newly-added routing tables. Assuming an otherwise normal configuration, it will instead fall through to the main routing table.
The main routing table is the best place to handle this traffic because it does not require any special treatment on account of its source address. An ordinary default route via one of the available gateways will suffice:
ip route add default via 198.51.100.1
(Alternatively it is possible to load balance between the two interfaces, but that is beyond the scope of these instructions.)
When the routing tables are queried the outcome is cached for efficiency, but according to the iproute2 documentation the cache is not flushed automatically when rules are added or removed. For this reason, the cache should be flushed explicitly once you have finished making changes:
ip route flush cache
In practice recent kernels do appear to perform an implicit cache flush. Relying on this behaviour would, however, be very much at your own risk so long as the documentation says otherwise. Flushing twice is not harmful.
You may sometimes see this command written as ip route flush cached or ip route flush table cache. The effect is the same regardless.
You can verify that traffic is being sent via the appropriate interface by inspecting it using a tool such as tcpdump or Wireshark. There are three types of traffic which you should try to observe:
- outbound traffic associated with inbound connections to each of the two interfaces, and
- outbound traffic associated with outbound connections.
For these tests you should preferably use a remote endpoint that is sufficiently distant from the machine under test that there is no possibility of any special treatment by the routing tables. (For example it would be best to avoid using either of the gateway machines, or indeed any host connected to either of the local networks.) It does not matter whether the traffic is inspected locally, remotely, or at some point in between.
Before undertaking any detailed investigation you should check that the routing tables and policy database contain what they should. You can view the content of either routing table with the ip route show command:
ip route show table 1
For the scenario described above, here is the expected output for table 1:
198.51.100.0/24 dev eth0 scope link
default via 198.51.100.1 dev eth0
and for table 2:
203.0.113.0/24 dev eth1 scope link
default via 203.0.113.1 dev eth1
The policy database can be listed using the ip rule show command:
ip rule show
Provided that no rules have been added for other purposes, the list should contain five entries:
0: from all lookup local
100: from 198.51.100.87 lookup 1
110: from 203.0.113.144 lookup 2
32766: from all lookup main
32767: from all lookup default
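If you want to script this check, the listing can be searched for the table associated with a given source address. This is a minimal sketch. The function name rule_table_for is hypothetical, and it assumes the output layout shown above (priority, from, source address, lookup, table).

```shell
# Hypothetical helper: read `ip rule show` output on stdin and print the
# table looked up by the first rule whose source is exactly the address
# given as the first argument.
rule_table_for() {
    awk -v src="$1" \
        '$2 == "from" && $3 == src && $4 == "lookup" { print $5; exit }'
}
```

In the example scenario, ip rule show | rule_table_for 203.0.113.144 would be expected to print 2. Empty output means that no rule matches that source address.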
If the configuration appears to be correct then you can test the behaviour of the routing tables using the ip route get command. Usually the only argument to this command would be the destination address, but in this instance it is necessary to give the source address too in order to exercise the three different routing tables that could be invoked.
A destination address of 192.0.2.7 is used here to illustrate the process, but it would be better to use the address of a real machine which exhibits the problem you are attempting to troubleshoot. There are three tests to perform. Firstly, the route taken by traffic from 198.51.100.87 to the test address:
ip route get 192.0.2.7 from 198.51.100.87
which should leave via eth0:
192.0.2.7 from 198.51.100.87 via 198.51.100.1 dev eth0
    cache mtu 1500 advmss 1460 hoplimit 64
Secondly, the route taken by traffic from 203.0.113.144 to the test address:
ip route get 192.0.2.7 from 203.0.113.144
which should leave via eth1:
192.0.2.7 from 203.0.113.144 via 203.0.113.1 dev eth1
    cache mtu 1500 advmss 1460 hoplimit 64
Finally, check the route taken by traffic from the wildcard address to the test address (as would be relevant when making outbound connections from the server):
ip route get 192.0.2.7
Either interface would be acceptable as an answer to this third test (although you may have a view as to which one you would prefer to be used). You would not, however, want the address to be unroutable.
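The three tests can be scripted by extracting the interface name from each response. This is a minimal sketch. The function name route_dev is hypothetical, and it assumes that the interface follows the dev keyword, as in the sample output above.

```shell
# Hypothetical helper: read `ip route get` output on stdin and print the
# word that follows the first "dev" keyword.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}
```

For example, ip route get 192.0.2.7 from 203.0.113.144 | route_dev would be expected to print eth1 in the scenario described here. Empty output suggests that the address is unroutable or that the command failed.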
If some or all of the tests using ip route get give an unexpected result then that strongly suggests a problem within the routing tables. This could be an error in one of the routes or policy rules that you have added. If you have checked these already then you should consider the possibility that they are being overridden by another route or rule with a higher priority.
Correct responses from ip route get do not exclude the possibility of a routing problem, because the outcome could depend on some obscure property of the traffic not addressed by your testing, but in the normal course of events they are a good indication that routing is working. In that case you may want to check:
- that there is nothing in iptables or ebtables which would prevent the configuration from working, and
- that the network links behave as they should when used individually (with the other disabled and the simplest possible routing configuration).
For further guidance see:
The configuration described above can be made persistent on Debian-based systems using post-up and pre-down options within the relevant iface stanzas of /etc/network/interfaces:
auto eth0
iface eth0 inet static
    address 198.51.100.87
    netmask 255.255.255.0
    gateway 198.51.100.1
    post-up ip route add 198.51.100.0/24 dev eth0 table 1
    post-up ip route add default via 198.51.100.1 table 1
    post-up ip rule add from 198.51.100.87/32 table 1 priority 100
    post-up ip route flush cache
    pre-down ip rule del from 198.51.100.87/32 table 1 priority 100
    pre-down ip route flush table 1
    pre-down ip route flush cache

auto eth1
iface eth1 inet static
    address 203.0.113.144
    netmask 255.255.255.0
    post-up ip route add 203.0.113.0/24 dev eth1 table 2
    post-up ip route add default via 203.0.113.1 table 2
    post-up ip rule add from 203.0.113.144/32 table 2 priority 110
    post-up ip route flush cache
    pre-down ip rule del from 203.0.113.144/32 table 2 priority 110
    pre-down ip route flush table 2
    pre-down ip route flush cache
(If network-manager is installed then you will either need to uninstall it or tell it not to manage these interfaces.)
- ip(8) (Ubuntu manpage)
- Linux Advanced Routing & Traffic Control HOWTO
- Matthew G. Marsh, Policy Routing With Linux (Online Edition)
- Policy Routing (Linux kernel documentation)