gitpush
5y

What are the symptoms of a broken OpenSSH server?

I completely lost access with "connection reset by peer". Earlier today my session kept auto-disconnecting on only one server, and the only way I could regain access was to remove and then reinstall OpenSSH.

But on this particular server I have not even changed the SSH port, so why did it break?

Comments
  • 2
    @Linux @Linuxxx @Condor any thoughts?

    auth.log is empty and fail2ban doesn't show any sign of a ban
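
    For reference, a couple of quick checks along those lines; the jail name "sshd" is the usual default but may differ, and the auth.log path is the Debian/Ubuntu one:

        sudo fail2ban-client status           # list the active jails
        sudo fail2ban-client status sshd      # bans recorded for the SSH jail
        sudo tail -n 50 /var/log/auth.log     # recent authentication events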
  • 2
    It seems something else is wrong, the connection dropped again :\
  • 3
    Sounds like bad network
  • 2
    @Linux but my other servers don't have this issue; all services on that server ran without interruption, and all of them are on Scaleway
  • 3
    @gitpush
    Node issue or switch issue, open a ticket and ask for help :)
  • 3
    @Linux so that's it. Good to know, I was afraid I messed things up lol. Thanks man 😀
  • 1
    @gitpush Yup, what @linux said :)
  • 1
    Troubleshooting, for me at least, would start with pinging the server from the client (and preferably from another server that can reasonably be expected to reach your problematic server), and making sure the firewall isn't set up in a bad way. Basic networking knowledge applies here: e.g. if you can only access the server from an authenticated (VPN) network and you can't connect to that network, you're screwed (contact support, and pray to the gods of Linux that they can help). If it's publicly accessible, is the port itself reachable (nc -vz $yerserver 22)? Can you log in to an emergency shell from the Scaleway infrastructure to investigate? If so, you can continue from there. Make sure your server can access the internet and has port 22 open.

    That said, that's just general troubleshooting of course. "Connection reset by peer" usually means that the port is open but the SSH server isn't running, so be sure to check that. I'm afraid that can only be checked from an emergency shell, using Scaleway's infrastructure/support. Check (or ask) whether the sshd service starts up properly, and if not, what its error messages are. (Ask to) edit sshd_config to get it sorted out again, as the issue likely lives there. Once sshd starts up again, you can continue from a normal SSH session. (I've put a rough sketch of these checks at the bottom of this comment.)

    (edit the first: never mind, those were just assumptions about an inaccessible server over a stable connection)

    (edit the second: nvm, I'm an idiot, refused connections are "Connection refused".. I entirely blame the booze :v)

    (edit the third, how many of these will I need?: "connection reset by peer" after a successful SSH connection means that the SSH server went down. That usually happens to me when I reboot the server.. although booze guidelines apply to this statement, i.e. take it as a random brainfart of mine)
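
    As promised, a rough sketch of those checks. The hostname is a placeholder, ufw is an assumption (use whatever firewall tool you actually run), and the systemd unit is "ssh" on Debian/Ubuntu but "sshd" on most other distros:

        # From the client (or another server): is the host up, and is the port reachable?
        ping -c 4 yourserver
        nc -vz yourserver 22          # "succeeded" = port open; refused/timeout = firewall or no sshd

        # From an emergency/rescue shell on the server itself:
        systemctl status sshd         # is the service running? (unit may be called "ssh")
        journalctl -u sshd -b         # its error messages from the current boot
        sshd -t                       # validate /etc/ssh/sshd_config without restarting anything
        ufw status                    # make sure 22/tcp is allowed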
  • 1
    Oh btw, if you can connect but get disconnected after a while, it's indeed probably a network issue on your client's side. A while ago I was in a situation like this as well (some shitty network maintained by shitadmins) and I found mosh to be a great drop-in replacement for SSH. You essentially run a mosh server process on your server (it can run as a regular user, keeping in mind that unprivileged users can only bind to ports above 1024, like 2022 or something like that) and connect to it with the mosh client. It's pretty much the same as SSH and uses the same binaries and keys, but the advantage is that keystrokes don't have to be delivered in realtime (you can press a key without connectivity, and whenever you reconnect the client replays it to the server to get back in sync) and timeouts are effectively nullified. I haven't tested mosh extensively, but I'm fairly sure an extremely generous timeout of several minutes applies to it. Highly recommended!
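
    A minimal sketch of what that looks like, assuming a Debian/Ubuntu-style setup; note that by default mosh picks a UDP port in the 60000-61000 range, so that range (or one pinned port) needs to be open:

        sudo apt install mosh                       # on both client and server
        sudo ufw allow 60000:61000/udp              # mosh's default UDP port range
        mosh user@yourserver                        # authenticates over SSH, then switches to UDP
        mosh -p 60001 user@yourserver               # pin a single server-side UDP port if the firewall is strict
        mosh --ssh="ssh -p 2222" user@yourserver    # if sshd listens on a non-default TCP port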
  • 1
    If anything, drunken fool me would try rebooting the server to see whether the issue was just the server's SSH daemon and systemd (btw Microshit, systemd is an actual term, go suck on that you dirty slut Cuntana) fucking up somehow. But given these intermittent disconnects, sober me would probably consider mosh instead.. it's a really useful little utility :)
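
    If you do go the reboot route, a small sketch of how I'd watch for sshd to come back up (the hostname and root login are placeholders):

        ssh root@yourserver 'systemctl reboot'
        until nc -z yourserver 22; do
            echo 'waiting for sshd...'; sleep 5
        done
        ssh root@yourserver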
  • 1
    @Condor Thanks man, really appreciate your help. I contacted them and they said they have no reported issues like mine, but I'm not sure what happened; it started suddenly and I haven't touched anything on the server, I've only been working with my Docker containers.

    I'll check out mosh if the issue persists, and I'll also try accessing the server from another machine to see whether the issue is on my end. I read somewhere that a Linux kernel update caused this, not sure if it was in the mainline source or a vendor modification (if any) lol
  • 1
    @gitpush glad I could help 🙂 If it's a kernel issue, it's most likely a downstream issue introduced by the maintainers. That happens a lot in Manjaro and Arch in particular tbh, so I chose to compile my own kernel from upstream source. That way I can schedule kernel upgrades along with reboots, instead of having the distribution maintainers do it in my stead, and the kernel config is always known-good or at least possible to troubleshoot and rebuild. (Rough sketch of that workflow below.)
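
    For what it's worth, a rough sketch of that workflow; the kernel version is just an example, and the install step assumes your distro's installkernel hook updates the bootloader for you:

        wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.6.tar.xz
        tar xf linux-6.6.tar.xz && cd linux-6.6
        cp "/boot/config-$(uname -r)" .config    # start from the running kernel's config
        make olddefconfig                        # take the defaults for any new options
        make -j"$(nproc)"
        sudo make modules_install install
        sudo reboot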
  • 1
    @Condor well, the issue was solved. I'm guessing it was a kernel issue; I got an update yesterday and everything went fine. I'm thinking of compiling mine too, but I don't think it makes a difference since I'm running on a VM, and I don't want to burn my SSD lol
  • 1
    @gitpush on VMs I wouldn't really do custom builds either. It's unnecessary overhead, and containers use the host kernel anyway, so compiling only on the host makes more sense to me (quick demo below). Anyway, if a system upgrade fixed it for you, it doesn't matter much. Glad you got it resolved 🙂
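
    Since you're already running Docker, a quick way to see that for yourself (assumes the alpine image can be pulled): both commands should print the same kernel version, because the container simply uses the host's kernel.

        uname -r
        docker run --rm alpine uname -r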
  • 1
    @Condor thanks man 😀