The Dazah API uses Redis to handle rate limiting. The goal is to keep each client_id/user_id pair from making more than 5,000 requests every 5 minutes. We use CodeIgniter 3.x, and the code looks something like this:
$key = "user_limit:{$token_obj->client_id}:{$token_obj->user_id}";
$flood_control = $CI->cache->get($key);
if ($flood_control === false)
{
    // No counter yet: start a new 5-minute (300-second) window
    $CI->cache->save($key, 0, 300);
}
else if ($flood_control > 5000)
{
    // Flood control is over the rate limit
    print_json(array(
        'status' => 'token_limit',
        'error' => 'Rate limit exceeded: Please try your request again in a few minutes.',
    ));
}
else
{
    // Record the request for flood control
    $CI->cache->increment($key);
}
From my understanding, when you increment a Redis key, it keeps the original time to live of when the key was initially set.
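One detail worth checking against that understanding: Redis keeps the TTL when INCR hits an existing key, but INCR on a key that does not exist creates it with no expiry at all, and TTL then reports -1 for it. The sketch below is a toy in-memory model of those semantics, not a real Redis client. `ToyRedis` and its `$now` clock are hypothetical names made up for illustration.

```php
<?php
// Toy in-memory model of Redis key/TTL semantics (not a real Redis client).
// ToyRedis and its $now clock are hypothetical, for illustration only.
class ToyRedis
{
    private $values = array();
    private $expires = array();   // key => absolute expiry timestamp
    public $now = 0;              // fake clock, in seconds

    // Drop the key if its expiry time has passed
    private function purge($key)
    {
        if (isset($this->expires[$key]) && $this->expires[$key] <= $this->now) {
            unset($this->values[$key], $this->expires[$key]);
        }
    }

    // SETEX: store a value with a time to live
    public function setex($key, $ttl, $value)
    {
        $this->values[$key] = $value;
        $this->expires[$key] = $this->now + $ttl;
    }

    // INCR: a missing key is treated as 0 and created WITHOUT an expiry
    public function incr($key)
    {
        $this->purge($key);
        if (!isset($this->values[$key])) {
            $this->values[$key] = 0;
        }
        return ++$this->values[$key];
    }

    // TTL: -2 if the key is missing, -1 if it exists but has no expiry
    public function ttl($key)
    {
        $this->purge($key);
        if (!isset($this->values[$key])) {
            return -2;
        }
        return isset($this->expires[$key]) ? $this->expires[$key] - $this->now : -1;
    }
}

$redis = new ToyRedis();
$redis->setex('user_limit:1:2', 300, 0);
$redis->incr('user_limit:1:2');
$ttl_existing = $redis->ttl('user_limit:1:2');   // TTL is preserved by INCR

$redis->now = 301;                               // the key has now expired
$redis->incr('user_limit:1:2');                  // INCR recreates the key
$ttl_recreated = $redis->ttl('user_limit:1:2');  // no expiry attached
```

In this model, an increment that lands just after the key expired leaves behind a counter that never resets. Whether that matches what the CodeIgniter driver does under the hood is an assumption that would need to be verified against a real Redis instance.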
Unfortunately, poor cereal keeps getting locked out of his DaniWeb account. Upon investigating, I found that the client_id/user_id pair that couples the DaniWeb app with cereal's user ID had its Redis value at 5001, so he was being cut off. However, a deeper investigation showed that he didn't have DaniWeb open on multiple computers, and he wasn't making an absurd amount of requests. In fact, I had made about 10X the number of requests that he had over the past 24 hours, and I've never suffered from this problem!
So then I started logging and found that a very small handful of other users besides cereal are suffering from the same problem. A Google search pointed me to a bunch of race conditions that can cause issues with API rate limiters, but I can't wrap my mind around whether a race condition could be at play here.
I temporarily increased the limit to 10K and have been keeping an eye on the logs I created. It is not the case that cereal's Redis key keeps increasing, increasing, increasing. In fact, it's completely hovering in the 5000-5100 range, going both up and down.
There is one strange thing that I noticed from my recent logging of both the current value of each key and when it's set to expire. I started logging the difference between the expiry time and the current time(). For all normal keys, the expiry time is anywhere from 1-4 minutes away, which makes sense for keys that reset every 5 minutes. However, for the keys having an issue, the difference is always -1: the expiry time is 1 second behind the current time. As in, when I look up a key, why is it telling me its value is 5100 and that it expired a moment ago, instead of restarting it?!
The only thing I can think of doing at this point is adding logic that if the expiry time is in the past, reset the value to 0 and begin anew. But why is this happening in the first place?!
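The reset logic described above could be sketched roughly as follows. This is a workaround sketch, not a diagnosis. `StubCache` and `reset_if_stale()` are hypothetical names, and the stub only stands in for the CodeIgniter cache driver so the example runs on its own. It assumes `get_metadata()` returns an array with `expire` and `data` keys, or false when the key is missing.

```php
<?php
// Hypothetical stand-in for the CodeIgniter cache driver, so the sketch runs
// on its own. Only the two calls used below are modeled.
class StubCache
{
    public $store = array();   // key => array('data' => ..., 'expire' => ...)

    public function save($key, $data, $ttl)
    {
        $this->store[$key] = array('data' => $data, 'expire' => time() + $ttl);
        return true;
    }

    public function get_metadata($key)
    {
        return isset($this->store[$key]) ? $this->store[$key] : false;
    }
}

// Hypothetical helper: restart the counter when its expiry is not in the future.
function reset_if_stale($cache, $key, $window = 300)
{
    $meta = $cache->get_metadata($key);
    if ($meta !== false && $meta['expire'] <= time()) {
        $cache->save($key, 0, $window);
        return true;    // counter was stale and has been reset
    }
    return false;       // key missing or still inside its window
}

$cache = new StubCache();
$key = 'user_limit:1:2';

// Simulate a stuck key: value above the limit, expiry 1 second in the past
$cache->store[$key] = array('data' => 5100, 'expire' => time() - 1);
$was_reset = reset_if_stale($cache, $key);
$after = $cache->get_metadata($key);
```

Calling `reset_if_stale()` before the existing limit check would clear a stuck counter on the next request, though it treats the symptom and does not explain how the key got into that state.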