Send a UDP datagram in C
Tested on: Debian (Lenny)
Objective
To send an outbound UDP datagram in C
Scenario
Suppose that you wish to write a client that implements the UDP-based variant of the Daytime Protocol, as defined by RFC 867.
This is a very simple protocol whereby the client sends a datagram to the server, then the server responds with a datagram containing a human-readable copy of the current date and time. The datagram from the client is not required to have any particular content.
Method
Overview
The method described here has three steps:
- Construct the remote socket address.
- Create a UDP socket.
- Send the datagram.
The following header files will be needed:
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
and, if using sendmsg to send the datagram:
#include <sys/uio.h>
Construct the remote socket address
To send a UDP datagram it is necessary to specify the remote IP address and port number to which it should be directed. The combination of these two values is treated as a single entity called the socket address, which is represented by a struct sockaddr_in for IPv4 or a struct sockaddr_in6 for IPv6.
A local socket address may also be specified, however it is rarely necessary to do so. By default the local address is chosen automatically by the network stack.
Most common network services have an assigned port number on which they are normally expected to listen. It makes sense for the client to use this as a default, however it is important that an alternative can be selected. The user of the client will not necessarily have any control over how the server is configured, so the onus is on the client software to provide access to whichever port the server has been instructed to use.
It is often useful for the remote IP address to default to the loopback address, particularly for services such as databases where there is a good chance of the client and server being run on the same machine. Alternatively, it may be preferable to require that the destination be specified explicitly.
For most purposes the best way to construct the remote address is by calling getaddrinfo. This takes a string containing either a hostname or an IP address, and a second string containing either a service name or a port number. These are converted into a sockaddr_in or a sockaddr_in6 as appropriate:
const char* hostname=0; /* localhost */
const char* portname="daytime";
struct addrinfo hints;
memset(&hints,0,sizeof(hints));
hints.ai_family=AF_UNSPEC;
hints.ai_socktype=SOCK_DGRAM;
hints.ai_protocol=0;
hints.ai_flags=AI_ADDRCONFIG;
struct addrinfo* res=0;
int err=getaddrinfo(hostname,portname,&hints,&res);
if (err!=0) {
    die("failed to resolve remote socket address (err=%d)",err);
}
The hints argument contains additional information to help guide the conversion. In this example:
- The address family has been left unspecified so that both IPv4 and IPv6 addresses can be returned. In principle you could receive results for other address families too: you can either treat this as a feature, or filter out any unwanted results after the call to getaddrinfo.
- The socket type has been constrained to SOCK_DGRAM. This allows UDP but excludes TCP.
- The protocol has been left unspecified because it is only meaningful in the context of a specific address family. If the address family had been set to AF_INET or AF_INET6 then this field could have been set to IPPROTO_UDP (but it is equally acceptable to leave it set to zero).
- The AI_PASSIVE flag has not been set because the result is intended for use as a remote address, not as a local address. This causes the IP address to default to the loopback address (as opposed to the wildcard address).
- The AI_ADDRCONFIG flag has been set so that IPv6 results will only be returned if the local system has an IPv6 address configured, and similarly for IPv4.
The res argument is used to return a linked list of addrinfo structures containing the address or addresses that were found. If multiple records are returned then the recommended behaviour (from RFC 1123) is to try each address in turn, stopping when a successful outcome is achieved. This assumes that you have some way to distinguish success from failure, which may not always be the case, but if you are able to do this then you should. If not then an acceptable alternative is to use the first result and discard the remainder.
The memory occupied by the result list should be released by calling freeaddrinfo once it is no longer needed, however this cannot be done until after the datagram has been sent.
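As an illustration of trying each address in turn, here is a minimal sketch. The function name send_to_any and its parameters are hypothetical, and "success" here means only that sendto reported no local error, which is a much weaker test than receiving a reply. The socket is closed immediately, so a real client that expects a response would need to keep it open.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: resolve host and port, then try each returned
 * address in turn until a datagram is sent without error.
 * Returns 0 on success, -1 on failure. */
int send_to_any(const char* host, const char* port, int ai_flags,
                const void* buf, size_t len) {
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_DGRAM;
    hints.ai_flags = ai_flags;

    struct addrinfo* res = 0;
    int err = getaddrinfo(host, port, &hints, &res);
    if (err != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
        return -1;
    }

    int result = -1;
    for (struct addrinfo* ai = res; ai != 0 && result == -1; ai = ai->ai_next) {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd == -1) {
            continue; /* this family is not usable here: try the next result */
        }
        if (sendto(fd, buf, len, 0, ai->ai_addr, ai->ai_addrlen) != -1) {
            result = 0;
        }
        close(fd);
    }

    freeaddrinfo(res); /* safe now: ai_addr is no longer needed */
    return result;
}
```

A call such as send_to_any(0, "daytime", AI_ADDRCONFIG, "", 0) would correspond to the scenario described above.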
Create the client socket
The socket that will be used to send the datagram should be created using the socket function. This takes three arguments:
- the domain (AF_INET or AF_INET6 in this case, corresponding to IPv4 or IPv6 respectively),
- the socket type (SOCK_DGRAM in this case, meaning that the socket should provide connectionless and potentially unreliable transfer of datagrams), and
- the protocol (IPPROTO_UDP in this case, corresponding to UDP).
A value of 0 for the protocol requests the default for the given address family and socket type, which for AF_INET or AF_INET6 and SOCK_DGRAM would be IPPROTO_UDP. It is equally acceptable for the protocol to be deduced in this manner or specified explicitly.
Assuming you previously used getaddrinfo to construct the remote address then the required values can be obtained from the addrinfo structure:
int fd=socket(res->ai_family,res->ai_socktype,res->ai_protocol);
if (fd==-1) {
    die("%s",strerror(errno));
}
Send the datagram
Datagrams can be sent using any function that is capable of writing to a file descriptor, however unless you have connected the socket to a particular remote address (as described below) it is necessary to use either sendto or sendmsg so that a destination address can be specified. Of these sendmsg is the more flexible option, but at the cost of a significantly more complex interface. Details for each function are given below.
Regardless of which function you choose, each function call will result in a separate datagram being sent. For this reason you must either compose each datagram payload as a single, contiguous block of memory, or make use of the scatter/gather capability provided by sendmsg.
Send the datagram (using sendto)
To call sendto you must supply the content of the datagram and the remote address to which it should be sent:
if (sendto(fd,content,sizeof(content),0,
    res->ai_addr,res->ai_addrlen)==-1) {
    die("%s",strerror(errno));
}
The fourth argument is for specifying flags which modify the behaviour of sendto, none of which are needed in this example.
The value returned by sendto is the number of bytes sent, or -1 if there was an error. UDP datagrams are sent atomically, so unlike when writing to a TCP socket there is no need to wrap the function call in a loop to handle partially-sent data.
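The all-or-nothing behaviour can be illustrated with a small wrapper that sends one datagram and reports whether it was accepted whole. This is a sketch only: the function name send_whole_datagram is hypothetical, and the use of EIO for an unexpected short count is an arbitrary choice made for this example.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: send one datagram and confirm that it was
 * accepted whole. Returns 0 on success, -1 on error with errno set. */
int send_whole_datagram(int fd, const struct sockaddr* addr, socklen_t addrlen,
                        const void* buf, size_t len) {
    ssize_t sent = sendto(fd, buf, len, 0, addr, addrlen);
    if (sent == -1) {
        return -1; /* nothing was sent; errno was set by sendto */
    }
    if ((size_t)sent != len) {
        errno = EIO; /* not expected for UDP, but cheap to check */
        return -1;
    }
    return 0;
}
```

A payload that is too large to fit in a single UDP datagram is refused outright rather than truncated or split. On Linux the error in that case is EMSGSIZE.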
Send the datagram (using sendmsg)
To call sendmsg, in addition to the datagram content and remote address you must also construct an iovec array and a msghdr structure:
struct iovec iov[1];
iov[0].iov_base=content;
iov[0].iov_len=sizeof(content);

struct msghdr message;
message.msg_name=res->ai_addr;
message.msg_namelen=res->ai_addrlen;
message.msg_iov=iov;
message.msg_iovlen=1;
message.msg_control=0;
message.msg_controllen=0;

if (sendmsg(fd,&message,0)==-1) {
    die("%s",strerror(errno));
}
The purpose of the iovec array is to provide a scatter/gather capability so that the datagram payload need not be stored in a contiguous region of memory. In this example the entire payload is stored in a single buffer, therefore only one array element is needed.
The msghdr structure exists to bring the number of arguments to recvmsg and sendmsg down to a manageable number. On entry to sendmsg it specifies where the destination address, the datagram payload and any ancillary data are stored. In this example no ancillary data has been provided.
If you wish to pass any flags into sendmsg then this cannot be done using msg_flags, which is ignored on entry. Instead you must pass them using the third argument to sendmsg (which is zero in this example).
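To show the scatter/gather capability in use, the following sketch sends a payload that is held in two separate buffers as a single datagram. The function name send_two_part and the header/body split are hypothetical and chosen purely for illustration.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>

/* Hypothetical helper: send a header and a body, held in separate
 * buffers, as the payload of one datagram. Returns the number of
 * bytes sent, or -1 on error. */
ssize_t send_two_part(int fd, const struct sockaddr* addr, socklen_t addrlen,
                      const void* header, size_t header_len,
                      const void* body, size_t body_len) {
    struct iovec iov[2];
    iov[0].iov_base = (void*)header; /* iov_base is not const-qualified */
    iov[0].iov_len = header_len;
    iov[1].iov_base = (void*)body;
    iov[1].iov_len = body_len;

    struct msghdr message;
    memset(&message, 0, sizeof(message)); /* no ancillary data, no flags */
    message.msg_name = (void*)addr;
    message.msg_namelen = addrlen;
    message.msg_iov = iov;
    message.msg_iovlen = 2;

    return sendmsg(fd, &message, 0);
}
```

The receiver sees one datagram whose payload is the header followed immediately by the body, with nothing to indicate that they were stored separately by the sender.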
Variations
Sending to the IPv4 broadcast address
By default, attempts to send a datagram to the broadcast address are rejected with an error (typically EACCES, however it is not obvious from the POSIX specification which error should occur). This is a safety measure intended to reduce the risk of making unintended broadcasts. It can be overridden by setting the SO_BROADCAST socket option:
int broadcast=1;
if (setsockopt(fd,SOL_SOCKET,SO_BROADCAST,
    &broadcast,sizeof(broadcast))==-1) {
    die("%s",strerror(errno));
}
Replying to a datagram
When replying to a UDP datagram the response should normally be sent to the IP address and port number from which the request originated. This can be arranged by capturing the source address of the request using recvfrom or recvmsg, then passing it to sendto or sendmsg as the destination address for the response.
There is also the question of where the response should be sent from. In most cases the best choice will be from the port and IP address to which the request was directed. This is not a requirement of the User Datagram Protocol itself, however there are several reasons why it is desirable:
- Generic firewalls and NAT gateways normally use both source and destination port numbers and IP addresses for connection tracking (as per RFC 2663) so will fail to associate the response with the request if it is not sent from the appropriate port and IP address.
- The behaviour of the connect function in relation to UDP strongly encourages the assumption that any response will originate from a matching IP address and port number. When a UDP socket is in the connected state, datagrams from any other source are rejected.
- RFC 1123 recommends (but does not require) that when replying to a UDP datagram on a multihomed host, the response should be sent from the IP address to which the request was directed.
- Some application-layer protocols (such as DNS) explicitly require that replies be sent from a matching port.
An exception would be where the application-layer protocol explicitly requires or allows the response to originate from a different port (for example, as is the case for TFTP).
Replying from a matching port number can be achieved very easily by sending the response using the socket that received the request. This method will reply from a matching IP address if the socket is bound to a specific address, but not necessarily if it is bound to the wildcard address and the server is multihomed.
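A minimal sketch of replying through the socket that received the request is given below. The function name reply_once and the fixed 1500-byte request buffer are assumptions made for this example. The source address is captured into a sockaddr_storage so that the same code can handle both IPv4 and IPv6.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: receive one datagram on fd, then send the given
 * reply back to its source using the same socket, so that the reply
 * originates from the port on which the request arrived.
 * Returns 0 on success, -1 on error. */
int reply_once(int fd, const void* reply, size_t reply_len) {
    char request[1500]; /* assumed large enough; longer requests are truncated */
    struct sockaddr_storage src;
    socklen_t src_len = sizeof(src);

    ssize_t n = recvfrom(fd, request, sizeof(request), 0,
                         (struct sockaddr*)&src, &src_len);
    if (n == -1) {
        return -1;
    }
    if (sendto(fd, reply, reply_len, 0,
               (struct sockaddr*)&src, src_len) == -1) {
        return -1;
    }
    return 0;
}
```

This sketch discards the content of the request, which is sufficient for a protocol such as Daytime where the request carries no information.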
Unfortunately the POSIX API does not provide a satisfactory way to reply from a matching IP address in a portable manner. Briefly, the available options include:
- using a non-portable mechanism such as IP_PKTINFO or the combination of IP_RECVDSTADDR and IP_SENDSRCADDR to obtain and set the local IP address,
- binding a separate socket to each local IP address, having non-portably obtained a list of addresses using a mechanism such as SIOCGIFCONF, or
- sending the response from the wildcard address in cases where use of a matching address is non-mandatory, accepting that there are some use cases in which this will fail.
This is a substantial topic in its own right and will be the subject of a future microHOWTO.
Connecting to a remote host
When exchanging many datagrams with a particular remote host it may be beneficial for a UDP socket to be connected to that host. This removes the need for the remote address to be explicitly checked every time a datagram is received, and for the address to be specified every time one is sent. The connection is made using the connect function:
if (connect(fd,res->ai_addr,res->ai_addrlen)==-1) {
    die("%s",strerror(errno));
}
This is superficially identical to the call that would be made to establish a TCP connection, however unlike TCP there is no handshake. This has two notable consequences:
- Calling connect on a UDP socket does not (by itself) result in any network activity.
- The call to connect will succeed even if the remote machine is unreachable or nonexistent.
A UDP socket in the connected state will only receive datagrams that originate from the given remote address. It is therefore feasible to use functions such as read or recv in place of recvfrom. Similarly the given remote address becomes the default for outgoing datagrams, therefore it is feasible to use write or send in place of sendto. (Being connected does not, however, prevent you from sending datagrams to arbitrary destinations using sendto if you so wish, at least on Linux: POSIX permits an implementation to reject such a call with EISCONN.)
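The following sketch shows a connected UDP socket being used with send in place of sendto. The function name connect_and_send is hypothetical, and the remote address is taken as a parameter so that it can come from getaddrinfo or be constructed by hand.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: connect a UDP socket to a remote address, then
 * send one datagram to it without naming the destination again.
 * Returns 0 on success, -1 on error. */
int connect_and_send(int fd, const struct sockaddr* addr, socklen_t addrlen,
                     const void* buf, size_t len) {
    if (connect(fd, addr, addrlen) == -1) {
        return -1; /* no packets are exchanged by this call */
    }
    if (send(fd, buf, len, 0) == -1) {
        return -1;
    }
    return 0;
}
```

Once connected, the remote address can be retrieved using getpeername, and further datagrams can be sent using send or write without repeating the call to connect.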
See also
- Listen for and receive UDP datagrams in C
- Establish a TCP connection in C
- Send an arbitrary IPv4 datagram using a raw socket in C