30

It fucking staggers me how many backend/devops-y people don't understand what a client side "request timeout" is, versus a server side one.

What does it mean:

The client was fed up with the server's bullshit, and decided to piss off and not wait around for the server to take forever to respond, because life's too short.
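To make that concrete, here's a minimal sketch using only the Python standard library. Everything in it is made up for illustration: the `/slow` path, the 0.5 second server delay and the 0.1 second client timeout are assumptions, not anyone's real setup. The client gives up waiting, while the server carries on and still records a 200 for the same request.

```python
import http.client
import socket
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

SERVER_DELAY_S = 0.5    # hypothetical: how long the slow endpoint takes
CLIENT_TIMEOUT_S = 0.1  # hypothetical: how long the client is willing to wait

server_log = []                    # status codes the server "logged"
handler_done = threading.Event()   # set once the server has finished the request


class SlowHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(SERVER_DELAY_S)   # stand-in for a slow query or a long loop
        self.send_response(200)      # the 200 is logged here, before any bytes are sent
        try:
            self.end_headers()
            self.wfile.write(b"ok")
        except OSError:
            pass                     # the client already hung up
        finally:
            handler_done.set()

    def log_request(self, code="-", size="-"):
        # Collect the status code instead of printing an access-log line.
        server_log.append(int(code))


def run_demo():
    server_log.clear()
    handler_done.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection(
        "127.0.0.1", server.server_port, timeout=CLIENT_TIMEOUT_S
    )
    try:
        conn.request("GET", "/slow")
        conn.getresponse()
        client_result = "response"
    except socket.timeout:
        client_result = "timeout"    # the client got fed up and left
    finally:
        conn.close()
    handler_done.wait(timeout=5)     # let the server finish its side of the story
    server.shutdown()
    server.server_close()
    return client_result, list(server_log)
```

Calling `run_demo()` gives you the client's view and the server's view of the same request side by side: the client reports a timeout, the server log holds a 200.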

How not to solve/debug this issue:

- "I've checked the API request in tool xyz, and it works fine for me"

Congratulations, you've figured out how to call an API once, in isolation from the rest of the application, and without any excessive load. And using a different client to me, with a different configuration. Let's get back to actually looking at the issue, shall we?

- "I only see HTTP 200's in the logs"

Yep, you probably will in most circumstances, because it's the client complaining about it taking too long, not the server. If the server was telepathic and knew what the client was thinking/doing at all times, we wouldn't have half of the errors we do.

- "Ah ok, I understand ... so how do I solve this?"

You're asking me? I don't fucking know, I didn't build the server! Put better logging in place and figure out why sometimes it takes forever.
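"Better logging" doesn't have to be a big project. As a rough sketch, assuming the handlers are plain Python functions (the `get_orders` handler, the logger name and the 0.2 second threshold are all hypothetical), timing every request and flagging the slow ones could look something like this:

```python
import functools
import logging
import time

log = logging.getLogger("response_times")  # hypothetical logger name

SLOW_THRESHOLD_S = 0.2  # hypothetical cut-off for "this took too long"


def log_response_time(handler):
    """Log how long each call to the wrapped handler takes."""
    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        try:
            return handler(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - started
            # Slow calls are logged at WARNING so they stand out from the 200s.
            level = logging.WARNING if elapsed >= SLOW_THRESHOLD_S else logging.INFO
            log.log(level, "%s took %.3fs", handler.__name__, elapsed)
    return wrapper


@log_response_time
def get_orders(delay_s):
    # Hypothetical handler; the sleep stands in for the real work.
    time.sleep(delay_s)
    return 200
```

With something like this in place, "I only see HTTP 200's" turns into "I see HTTP 200's, and these ones took way too long", which is the part that actually matters.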

Jesus fucking christ

Comments
  • 6
    You guys hiring vegetables as dev ops folks or something?
  • 3
    This is why you always log response times, people
  • 4
    Well, correct me please, but any 4xx is the client's fault, unless the server or whatever is in the middle is designed to not respond to that client or request configuration, in which case it will show up in the server logs.

    So, there are very few scenarios where backend / ops people would be able to help, so you either give them what to check specifically or debug your network and path to the server.

    If the server is not getting anything, it is not the server's fault.

    So again, please correct me, but it sounds like you know what a client request time out is, but have no clue how to fix it either.

    For all I know, if I were one of your backend engs, you either give me good shit to check, or I would blame your internet connection and even your computer.

    Chances are I will join in a debug call with you and everything will work fine.
  • 5
    A client side request timeout is very unspecific.

    What protocol?

    Who's involved?

    Proxying?

    Most likely you mean a timeout set in HTTP client, e.g. 2 mins.

    Client _should_ properly terminate the connection after 2 mins, but some implementations are fucked up.

    Server _could_ forcefully close lingering connections... Or they linger forever.

    If the server is swamped, logging won't help - overflow is most likely dropped silently. Monitoring is required.

    If a component in the middle, e.g. router / proxy / ... is at fault, the server logs will not be helpful at all as most likely the connection was dropped before it reached the server (frontend e.g. in proxy swamped, backend connection to server not established).

    The client itself can be the problem, too.

    Depending on what "client" is.

    If it's anything multithreaded / asynchronous / ... , it can be a lack of resources. Too few threads, threads block, softlock exhausting all available sockets / connections / ..., finally giving up and closing connections as nothing could be received.

    Server logs won't help here, as the client is bonkers.

    It's like playing russian roulette with a gatling gun.
  • 2
    @mundo03 HTTP 4xx are client issues, and the server only logs a 408 (server side timeout) when the server thinks the client is gone.

    If the issue is the server is taking too long to respond (e.g. database queries taking too long, bad logic causing long loops, lack of resources etc), the server's not going to log a 408. The server is going to try to send the response, log the fact that it returned a HTTP 200, and then probably discover that the client is gone and may or may not log anything useful to denote this.

    The fact that the client side is reporting request timeout, and not "path not found", HTTP 404 etc, and the fact the server has a record of the request being received ... means there is no client side networking issue. The issue is some problem on the server, where sometimes the request takes 2 seconds and sometimes it takes over 30 seconds. I, as a client developer, have no idea why that is. It's up to the backend guys to investigate, and not respond "I see HTTP 200's, it's all fine"
  • 0
    @IntrusionCM you're assuming I sent a mail to the backend / devops team saying "got a request time out please investigate" and nothing else

    I sent them the specific request, I sent them multiple timestamps for when it occurred, I sent examples of how and at what point in the app these are requested.

    As for proxying and all that other stuff, I as the client developer don't know anything about that. Backend gives us a URL to an API they built, I hit it, sometimes it takes too long. It's up to them to figure out why, and investigate the setup they put in place.

    Client side request time outs are very unlikely to be caused by multithreading / async issues. Those would likely throw "connection closed by xxxx" or a callback wouldn't fire or a null pointer exception would be thrown. A request time out means "i'm still getting keep alives, but I can't wait forever, terminate". The fact that my code sees this error and logs it, also means the threading is working fine
  • 0
    @IntrusionCM This instance also involves 3 different clients, on 3 different platforms, using 3 different stacks, built by 3 different developers ... this is something outside of the client developers' control.

    As suggested by the second comment above, the only way to debug this stuff is to log response times on the server side and investigate that, rather than saying "I see HTTP 200's, it's all fine". It might be fine, but the issue is performance: it's taking too long to return the thing that's fine
  • 0
    Part 1 / 2

    @practiseSafeHex I wasn't assuming anything at all.

    I just pointed out multiple scenarios and what might be the problem.

    And that wasn't meant to offend you in any way ;)

    I can understand your frustration, but debugging HTTP can be a pain in the arse.

    Your last comment adds useful information.

    "Client side request time outs are very unlikely to be caused by multithreading / async issues ..."

    Hahahahahahahahahahahahaha.

    Depends on the implementation of the HTTP client - now I assume you're just using a simple library like e.g. libcurl without any wrapper / abstraction or sth. like that?

    Then what you said should be true - but I've seen implementations which had so many layers of obfuscation through abstraction that it seemed like a miracle that shit worked at all.

    "A request time out means "i'm still getting keep alives, but I can't wait forever, terminate"."

    Can be. Doesn't have to be. Keep Alive isn't a "must", it's a "should".
  • 0
    Part 2 / 2

    HTTP 1.0 / 1.1 are specifications, but most services like apache / nginx / haproxy / ... leave you a wide field of options to modify behaviour as needed.

    You're right, backend should have monitoring.

    Monitoring is nowadays pretty easy, too - Prometheus / Grafana, bit of configuration, be done.

    PS: The scenarios I mentioned all happened in reality.

    It's a sad reality, but while HTTP is a standard, in the end it always is a gentleman's agreement whether devs / admins / ... follow the standard or not.

    Judging from experience, most do not. :)
  • 0
    @practiseSafeHex all that is assuming your PC and internet are fine, and the path to the server too.

    So, we are right back at where we started.
  • 1
    @mundo03 again, if I have a log on my side saying I did a handshake and it went fine, I sent a request and the server acknowledged it, the server has a log showing it was received, server has a log showing it processed at least a piece of the request, the client continued to receive keep alives but no payload .... then it is a performance issue on the server.

    I have been through this dozens of times in different companies. The obsession backend folks have with dismissing client side timeouts as client issues is staggering. Never once has this been a client side issue, or a connection issue. Because this error means the connection is fine, but the server is taking too long.

    As the second comment highlighted, log response times, and spend 15 seconds checking did it take too long. Or spend 6 days arguing about all the possible things it could be
  • 0
    @practiseSafeHex
    Well that is good to know.
    Do you mind pointing me to where I can get logs for the handshake and all that you mentioned?
  • 0
    @mundo03 specifically for iOS / Mac, there are several advanced networking tools available.

    Overview:
    https://developer.apple.com/documen...

    Low level network debug logging:
    https://developer.apple.com/documen...

    Setting up a packet trace:
    https://developer.apple.com/documen...

    This is all quite time consuming and cumbersome: set it up, wait for a random occurrence of the issue, and go through all the logs to verify that it's all fine.

    That's why it's so frustrating when the server side team doesn't have a means to check how long responses are taking, and inflicts all this work on the client side, when the error specifically says the connection is fine.
  • 0
    Disgusting.