The term ‘cookie’ should be familiar not only to programmers, but also to many of the more tech-aware ordinary computer users. I’m not, unfortunately, talking about sweet pastry, but about HTTP cookies.
Those cookies that we all know and love (if we’re web developers) or hate (if we’re overly paranoid about privacy) are not the only thing in computing known by this name, however. I learned this quite recently when talking to a friend of mine who works in IT security. As it turns out, ‘cookie’ can refer to quite a diverse array of solutions, all unified by a similar underlying concept.
Let’s start with one that belongs to a relatively low level of software. Even if we have only a vague understanding of software security, we should have heard at some point about the phenomenon of buffer overflow. It happens when we attempt to put more data into a byte buffer (array) than its size allows.
A canonical example looks pretty much like this:
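A minimal sketch in C, assuming a 16-byte stack array named buffer; the function name unsafe_copy and the constant BUFFER_SIZE are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFFER_SIZE 16  /* hypothetical size, chosen for illustration */

/* Hypothetical function: copies input into a fixed-size stack buffer
   WITHOUT checking its length, then reports how many characters landed there. */
size_t unsafe_copy(const char *input) {
    char buffer[BUFFER_SIZE];
    assert(input != NULL);
    /* BUG (deliberate): writes past the end of buffer
       whenever strlen(input) >= BUFFER_SIZE. */
    strcpy(buffer, input);
    return strlen(buffer);
}
```

A call such as unsafe_copy("hello") is harmless. Passing 16 or more characters writes beyond the array, which is undefined behaviour in C.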
Because we’re talking about unprotected, unmanaged memory buffers, no IndexOutOfBoundsException or out_of_range error will be thrown. At first, we might not notice anything at all. And that’s bad: data written outside of the designated buffer could reach and overwrite the return address for the current function call.
What’s wrong with that? The return address has been put on the stack prior to calling the function, and it’s intended as a “savepoint”. In other words, it indicates where the program must jump once the function ends, and normally this is the next instruction after the function call.
When the buffer overflows, it overwrites data at higher memory addresses, where the rest of the stack resides (at least on the common architectures where the stack grows downwards). Eventually, it can reach the return address and alter it, causing the function to “return” to a completely different place than it’s supposed to. If the overflow was caused by malicious intent rather than a bug, the target address will likely contain harmful code, typically injected by the exact same means.
This is a very severe security hole and it should always be plugged by carefully checking the limits of the destination buffer (e.g. by using bounded variants of strcpy, like strncpy, which does not add the terminating null byte by itself when the source is too long) and zeroing it prior to copying. There is also one more line of defense against such attacks – and this is where we can find our cookies.
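The advice about checking limits and zeroing the destination can be sketched as follows. The helper name bounded_copy and its return convention are assumptions made for this example, not part of any standard library:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst, which holds dst_size bytes.
   Returns true if src fit entirely, false if it had to be truncated. */
bool bounded_copy(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);
    /* Zero the buffer first, so the result is always null-terminated. */
    memset(dst, 0, dst_size);
    /* Copy at most dst_size - 1 bytes; the last byte stays zero. */
    strncpy(dst, src, dst_size - 1);
    return strlen(src) < dst_size;
}
```

Copying only dst_size - 1 bytes matters here: strncpy alone leaves the destination unterminated when the source is too long, and the zeroed last byte is what keeps the result a valid string.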
A stack cookie, or stack canary, is a marker value placed on the stack between the function’s local buffers and the return address. It is typically put there by the function’s prologue, before any of the function’s own code runs. As its name indicates, it serves as a virtual “canary” whose “death” (alteration) indicates stack corruption, just like actual canaries kept in mine shafts can warn about a rise in the concentration of deadly gases.
So, if the stack cookie has been changed unexpectedly, the return address is potentially tampered with and can no longer be trusted. Because the function will check the stack cookie before using that address, this technique can prevent exploiting vulnerabilities involving buffer overflows. It won’t allow the program to continue normally, though: once the return address is lost, there is nowhere we can go, so the whole process must be aborted.
Hopefully this happens only while debugging :)
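Real stack cookies are inserted by the compiler, so they are hard to show directly. The sketch below only simulates the idea with a plain byte array standing in for a stack frame. All names (fake_frame, frame_enter, frame_write, frame_leave) and the fixed cookie value are invented for illustration; actual implementations use a random value chosen at program startup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout of the simulated frame: 8-byte buffer, then cookie, then return address. */
enum { BUF_SIZE = 8, COOKIE_OFF = 8, RET_OFF = 12, FRAME_SIZE = 16 };

/* Fixed here for readability; real cookies are random per process. */
static const uint32_t COOKIE = 0xDEADC0DEu;

typedef struct { unsigned char bytes[FRAME_SIZE]; } fake_frame;

/* "Function prologue": store the cookie and the return address. */
void frame_enter(fake_frame *f, uint32_t return_address) {
    assert(f != NULL);
    memset(f->bytes, 0, FRAME_SIZE);
    memcpy(f->bytes + COOKIE_OFF, &COOKIE, sizeof COOKIE);
    memcpy(f->bytes + RET_OFF, &return_address, sizeof return_address);
}

/* Deliberately unchecked write into the buffer. It is clamped only to the
   whole frame, so that the simulation itself stays well-defined. */
void frame_write(fake_frame *f, const void *data, size_t len) {
    if (len > FRAME_SIZE) len = FRAME_SIZE;
    memcpy(f->bytes, data, len);
}

/* "Function epilogue": trust the return address only if the cookie survived.
   A real program would abort instead of returning false. */
bool frame_leave(const fake_frame *f, uint32_t *return_address) {
    uint32_t seen;
    memcpy(&seen, f->bytes + COOKIE_OFF, sizeof seen);
    if (seen != COOKIE) return false;
    memcpy(return_address, f->bytes + RET_OFF, sizeof *return_address);
    return true;
}
```

Writing up to 8 bytes leaves the cookie alone and frame_leave hands back the stored address. Writing 9 bytes or more tramples the cookie first, so the corruption is detected before the return address is ever used.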
For another application of cookies, we go back to network protocols. This time, though, it will be a tad lower than HTTP, for there is an interesting solution which operates at the level of TCP.
When establishing a TCP connection, a specific ritual between both peers (client and server) needs to be performed. The client initiates it by sending a SYN packet (a TCP packet with the SYN flag on), and the server responds with a SYN-ACK packet, acknowledging the request. The client then replies with a final ACK. Once both sides have sent such acknowledgements, the new connection is ready and operational.
But this flow assumes a well-behaved client. We know, however, that the Internet is a big and scary place, full of nasty software which doesn’t behave all that nicely. One of the possible degrees of malice here is causing a SYN flood: a large volume of initial SYN packets which the server must respond to, and which are not followed by the final ACK-nowledgement. With a well-intentioned but naive TCP implementation, the server will quickly exhaust its resources by allocating them to service incoming connections – connections that never actually happen. Consequences of such an attack fall into the broad category of Denial of Service (DoS).
To prevent this from occurring, the server should not prepare anything when the initial SYN packet arrives, but only once the whole handshake is completed. While this makes intuitive sense, there is one problem to address: how do we track the state of the whole process if we cannot keep it on the server explicitly?
Fortunately, there is a simple way. Among the dozen or so fields in the TCP packet header, we have the sequence number, which is used for reconstructing the TCP data stream from its constituent packets. Because its purpose is to maintain reliability of the whole communication channel, it advances with every byte of data sent by a particular endpoint (the SYN flag itself counts as one). The other side can later ACK-nowledge delivery by putting the next sequence number it expects into the acknowledgement number field of a packet it sends back.
As it stands, both sides have their own, distinct sequence numbers. They also get to choose their initial values, and this is where it gets interesting. Since any value works here, the server can craft it in a particularly clever way that encodes the aforementioned connection state (and also the client’s address). Once done, the server sends it in response to the initial SYN packet and essentially forgets about the whole affair (!).
If the client is legitimate, it will eventually finish the handshake by sending the third packet, an ACK, whose acknowledgement number will carry our special value (incremented by one, as TCP requires). At this point we will recognize it and proceed with a normal TCP connection, preparing all resources necessary to handle it. But if the initial SYN packet was just a spoof, no harm has been done. The attacker won’t respond, obviously, but we don’t really care: we haven’t allocated anything yet. All we did was a few simple operations to compute the special value for the sequence number.
As you might guess, this special value is a cookie – a SYN cookie to be exact.
Both of the ingenious solutions presented above (and HTTP cookies) rely on the same core concept. The idea is to have some piece of data which is meaningful to us but opaque to the outside world (including, especially, the other end of the communication channel). After some time, the data is expected to “come back” to us; we can then use it as a “ticket” to identify a particular client, event, transaction, connection, etc., and validate it.
This way, we can keep track of user sessions on websites; state of new connections; integrity of call stack; and other, similar things.