🏴 send(), recv() and the corner cases
Designing, implementing and documenting corner cases is hard because it’s boring. So we tend to assume that the general description fits the corner case. Even if it doesn’t, it’s “close enough”, it doesn’t happen too often and, as long as it doesn’t crash, it’s OK. But this is software: it has no soul and plays no favorites among values or (corner) cases. A corner case should be “just another case”.
Sockets sending and receiving nothing
So, let’s take an example of the well-known sockets functions send() and recv(). Both take a buffer (to send or receive) and a length (of said buffer). In C, this looks like:
int send(socket_t skt, uint8_t* buffer, size_t length);
int recv(socket_t skt, uint8_t* buffer, size_t length);
In some higher-level language an abstraction of a “vector” or some such thing might be provided, which “bundles” the buffer and its length into one value (variable). But, even if these two are bundled, they still exist, so this doesn’t change the fundamentals of what we’re going to describe here.
Depending on your sockets implementation, the length might be a signed or unsigned integer of various sizes. But, let’s say that, even if it’s signed, passing negative values is an obvious error and we don’t care much.
What we want to explore is what happens if the length is zero: 0.
When you think about it, it doesn’t make sense. You want to send nothing? Just don’t send at all. You want to receive nothing? Good for you, but keep it to yourself.
It doesn’t make sense, so why do it?
Well, you might have some generic code that takes the length from somewhere and just passes it along. Actually, you have quite a few such places in your code. You don’t want to sow if (length > 0) all around your calls to send()/recv().
int send_b64_encoded(socket_t skt,
                     uint8_t* buffer, size_t length)
{
    uint8_t* encoded = ab64_encode(buffer, &length);
    /* length may well be 0 here; it is passed straight through to send() */
    int rslt = send(skt, encoded, length);
    free(encoded);
    return rslt;
}
In an isolated example such as this, adding the if is not too bad. But, if this thing starts to spread and you have, say, send_ascii85_encoded(), send_base41_encoded()…, then it becomes a nuisance.
On another matter, this can also come about as a consequence of an error. For example, it is usual to send (or receive) data in “parts”, for various reasons. You have the whole data to send with length = total_length, then send it part by part, reducing length -= length_of_sent_data after each partial send. Obviously, by the end, you’ll reach length == 0, at which time you should stop. But, some bug in the code might make you not stop and try to send with length == 0.
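The part-by-part sending described above can be sketched roughly like this. It is only an illustration, assuming POSIX sockets, where send() takes an int socket and an extra flags argument (unlike the simplified signature used in this text). The name send_all() is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: sends the whole buffer part by part.
   Returns 0 when everything was sent, -1 if send() failed
   or made no progress. */
int send_all(int skt, const uint8_t* buffer, size_t length)
{
    assert(buffer != NULL);
    /* The stop condition: once length reaches 0 we must not
       call send() again. Getting this test wrong is exactly
       the bug that ends up calling send() with length == 0. */
    while (length > 0) {
        ssize_t sent = send(skt, buffer, length, 0);
        if (sent <= 0) {
            return -1;
        }
        buffer += sent;            /* skip what was already sent */
        length -= (size_t)sent;    /* length -= length_of_sent_data */
    }
    return 0;
}
```

Note that calling send_all() itself with length == 0 simply does nothing and reports success, which is one possible answer to the question this text is asking.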
OK, stuff happens, now what?
So, what shall send() and recv() do in this case?
In all the sockets (or even “sockets-ish”) implementations I found, this is not documented. It’s always something like:
This function will send length number of bytes from buffer.
Sure, there’s other text there, but, in general, there’s nothing describing what happens if length==0. Is that an error? And what error is reported (via errno, in most sockets)? If it’s not an error, what’s the behavior? Is it different depending on some socket options, foremost the “(non-)blocking I/O”?
What’s the “big deal here”? Well, most, if not all, sockets also document this:
The function returns -1 on error (with error code in errno) or the number of bytes sent (received for recv()). The return value (result) is 0 if the socket was lost (shutdown).
The “socket was lost” case is obviously meant for connection-oriented sockets (for datagram sockets you would actually use sendto()). Also, sure, not all sockets use errno (Windows uses WSAGetLastError and other libraries have other means of indicating the error), but let’s not get lost in the details here. Assume that errno means “actual errno or its cousin in a particular sockets implementation”.
So, if the result is the number of bytes written, and you “told” the function to write zero bytes, it makes sense that it thinks it succeeded and returns 0. But that is at odds with the idea that 0 indicates that the connection was lost.
The thing is, you’re not sure “what to think here”.
This is a perfect example of a corner case that was not thought about. At design time, a better interface could have been devised. At (post) implementation time, a better description/specification could have been done, indicating what happens.
How does it actually work?
You’re probably wondering what actually happens. I didn’t make a detailed survey, but my limited testing shows this:
- recv(length=0) will return -1 with “WOULDBLOCK” in errno if the socket is non-blocking
- recv(length=0) will return 0 when the socket is blocking, because it will never read 0 bytes. It will essentially wait until the other side closes the socket, and then return 0.
- similar goes for send(length=0)
- I didn’t try sendto() nor recvfrom()
Sure, from a certain POV, it makes sense, but, it’s actually bad, as the same fundamental issue, which is bad usage of the API, has different error indicators depending on some setting which has nothing to do with the actual problem at hand.
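If you want to check your own platform, a small probe along these lines might do. It is a sketch assuming POSIX sockets and fcntl(). The struct and function names are invented for this example, and it only covers the non-blocking case, since the blocking one may simply hang. What it reports may differ between systems and socket types, so treat the result as an observation, not a specification.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical record of what recv(length=0) gave back */
struct zero_probe {
    ssize_t rslt;   /* return value of recv(), or -1 if setup failed */
    int     err;    /* errno after the failing call, 0 otherwise */
};

/* Switches the socket to non-blocking mode, calls recv() with
   length 0 and reports the return value and errno. */
struct zero_probe probe_recv_zero_nonblocking(int skt)
{
    struct zero_probe p = { -1, 0 };
    char dummy = 0;
    int flags;

    assert(skt >= 0);
    flags = fcntl(skt, F_GETFL, 0);
    if (flags < 0 || fcntl(skt, F_SETFL, flags | O_NONBLOCK) < 0) {
        p.err = errno;          /* could not set up the probe */
        return p;
    }
    errno = 0;
    p.rslt = recv(skt, &dummy, 0, 0);
    p.err = (p.rslt < 0) ? errno : 0;
    return p;
}
```

Printing rslt and err for a connected socket with no pending data shows which of the behaviors listed above your implementation picked.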
How should it work?
So as not to only point out errors, let’s think about solutions.
The immediate solution would be to make length=0 an error, return -1 always with a special error indicator BUFFER_CANT_BE_EMPTY (a particular sockets implementation might have something like that already, if not, make a new one).
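As a rough illustration of this immediate solution, here is a thin wrapper over POSIX send() and recv(). The names strict_send() and strict_recv() are made up, and mapping BUFFER_CANT_BE_EMPTY to EINVAL is just an assumption for the sketch, as POSIX has no dedicated error code for this.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical stand-in: no such code exists in POSIX, so this
   sketch borrows EINVAL ("invalid argument") for it. */
#define BUFFER_CANT_BE_EMPTY EINVAL

/* Like send(), but length == 0 is always an error, regardless
   of blocking mode or any other socket option. */
ssize_t strict_send(int skt, const uint8_t* buffer, size_t length)
{
    if (length == 0) {
        errno = BUFFER_CANT_BE_EMPTY;
        return -1;
    }
    assert(buffer != NULL);
    return send(skt, buffer, length, 0);
}

/* Same rule for the receiving side. */
ssize_t strict_recv(int skt, uint8_t* buffer, size_t length)
{
    if (length == 0) {
        errno = BUFFER_CANT_BE_EMPTY;
        return -1;
    }
    assert(buffer != NULL);
    return recv(skt, buffer, length, 0);
}
```

With this in place, a result of 0 from strict_recv() can only mean that the other side closed the connection, never “you asked for nothing”.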
A better interface would have been to not have “special values” for the result. Have the result always be an error code (or 0 if there’s no error) and give the number of bytes actually sent/received back in an (in-)out parameter. The closing of the connection would produce an error “CONNECTION_CLOSED”. In this case, for length=0, simply return 0 (no error).
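One way such an interface could look, layered on top of POSIX send() and recv(), is sketched below. All the names here (SKT_OK, SKT_CONNECTION_CLOSED, send_ex(), recv_ex()) are invented for the illustration. The sketch assumes that errno values are positive, as POSIX requires, so a negative constant can stand for “CONNECTION_CLOSED” without clashing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result codes; any other non-zero result is an errno value. */
enum { SKT_OK = 0, SKT_CONNECTION_CLOSED = -1 };

/* *length: in = bytes to send, out = bytes actually sent. */
int send_ex(int skt, const uint8_t* buffer, size_t* length)
{
    assert(length != NULL);
    size_t wanted = *length;
    *length = 0;
    if (wanted == 0) {
        return SKT_OK;              /* nothing to do, and no error */
    }
    assert(buffer != NULL);
    ssize_t sent = send(skt, buffer, wanted, 0);
    if (sent < 0) {
        return errno;
    }
    *length = (size_t)sent;
    return SKT_OK;
}

/* *length: in = buffer capacity, out = bytes actually received. */
int recv_ex(int skt, uint8_t* buffer, size_t* length)
{
    assert(length != NULL);
    size_t wanted = *length;
    *length = 0;
    if (wanted == 0) {
        return SKT_OK;              /* zero bytes asked, zero delivered */
    }
    assert(buffer != NULL);
    ssize_t got = recv(skt, buffer, wanted, 0);
    if (got < 0) {
        return errno;
    }
    if (got == 0) {
        return SKT_CONNECTION_CLOSED;   /* only reachable when wanted > 0 */
    }
    *length = (size_t)got;
    return SKT_OK;
}
```

The caller now checks a single thing, the result code, and a byte count of 0 no longer has to double as a “connection lost” signal.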
Moral of the story
One does need to handle corner cases just like any other cases, even if they don’t “make sense” and “don’t matter much” (“who cares what happens, as long as nothing crashes”). The amount of time lost trying to make sense of it all when you do find yourself in the corner (case) is way too big, compared to a little documentation/specification and making the code actually work per spec.