[99s-extend] cowboy and chromium

Sasa Juric sasa.juric at gmail.com
Wed Apr 10 14:00:47 CEST 2013


Hi,

I have recently in my production system replaced mochiweb with cowboy. The server generally works fine, except for a bizarre behavior which I cannot quite explain, so I post here, hoping to get some pointers.

After replacing mochiweb with cowboy, I noticed that in chromium (other major browsers work fine) often (but not always) a request lasts a little more than a minute. Further inspection in chrome://net-internals showed that browser tries to send a request, times out after 60 sec, retries and then succeeds immediately. The key point is that it doesn't happen always. First couple of requests work fine, then all of a sudden one doesn't work. At the same time requests from other browsers (including chrome) on the same machine work fine.

If I revert to mochiweb, the problem disappears. Other than web server related code, everything else is the same: the rest of my code, the server setup etc... In addition, I return same responses and headers in both versions.

After many attempts and failures, I might have worked around the issue. Namely, I included <<"connection">>, <<"close">> in all responses. After this change, it seems that long requests are not occurring. In any case, I can't reproduce it anymore, whereas before the change I could have reproduce it easily.

However, I'm not sure if I have really resolved the issue, I'm also not happy with connection closes since it degrades performance. And finally, I'm not sure if I quite understand the problem. 
The only theory I have is that due to keep-alive, chromium holds the connection, while cowboy closes it (I read somewhere that hardcoded timeout is 5 seconds, right?). In this case it might happen that chromium sends a request to a non existing socket and then hangs for a minute, waiting for the response which never arrives. 
This might further be amplified by the fact that in production, between browser and cowboy, there is a proxy/load balancer, so maybe load balancer still holds the connection despite the fact that server had closed it.

This is the only theory I currently have, and I would like to hear if you guys have some other idea or any kind of helpful pointer?

Thank you very much in advance and best regards,
Sasa


More information about the Extend mailing list