I have a fairly complicated home network setup, which should come as a surprise to absolutely nobody. I recently dealt with an issue that had been bugging me for ages, but first some background. I want to be able to connect to my home systems from pretty much anywhere, and ssh is the obvious tool for that. Occasionally I've been somewhere where they've blocked outgoing connections to port 22 (the ssh port), but I can instead connect to port 443 (the https port). So if I instead run ssh on port 443, that gets around the problem. But I also want to have a web server on port 443. Fortunately there's this neat little tool called 'sslh' that can sit and listen for connections on a port, and when something connects and sends a message to the server, it determines what protocol the client is using and forwards the connection to the appropriate program. So now I have multiple services running on the one port that should never be blocked.
But there's a problem.
The server logs for ssh and apache show the source of the connections as being from the local system, which is technically correct since they are coming from sslh on my local system. I could merge the logs from sslh into the logs for apache and ssh, but that would be a pain. What I really want is for the original source IP to show up in the applications' logs as if there weren't a proxy in the middle.
Wishful thinking, right? But apparently some really smart people wished for it, so they made it happen.
There are instructions for how to make this work for sslh, and if you follow them exactly, and if you're lucky, it does work. I say you have to be lucky because there are some subtle issues that you'll hit if you think you're smarter than the instructions or try to do things a little differently. Which is exactly the sort of thing I'm obviously going to do.
The way sslh works is that it accepts a connection and then waits for the first message the client sends, which it uses to determine which application to forward the connection to. Fortunately, network protocols tend to expect the client to send the first message, so for things like ssh, ssl, and http, sslh will know what to do. (However, some protocols, like imap, have the server send the first message upon establishing a connection, so sslh can only service one such protocol, by defaulting to it after a timeout.)
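To make the idea concrete, here is a minimal sketch in C of guessing a protocol from the first bytes a client sends. This is not sslh's actual probing code; the names `guess_protocol`, `starts_with`, and the `proto` enum are made up for illustration, and the checks are deliberately simplistic (an ssh identification string starts with "SSH-", a TLS handshake record starts with the bytes 0x16 0x03, and an http request starts with a method name).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical protocol labels, for illustration only. */
enum proto { PROTO_UNKNOWN, PROTO_SSH, PROTO_TLS, PROTO_HTTP };

/* Returns nonzero if the first bytes of buf match prefix. */
static int starts_with(const unsigned char *buf, size_t len, const char *prefix)
{
    size_t n = strlen(prefix);
    return len >= n && memcmp(buf, prefix, n) == 0;
}

/* Guess the protocol from the first message a client sent. */
enum proto guess_protocol(const unsigned char *buf, size_t len)
{
    static const char *const http_methods[] = {
        "GET ", "POST ", "HEAD ", "PUT ", "DELETE ", "OPTIONS "
    };
    size_t i;

    assert(buf != NULL);

    /* ssh: both sides identify themselves with "SSH-<version>-..." */
    if (starts_with(buf, len, "SSH-"))
        return PROTO_SSH;

    /* TLS: record type 0x16 (handshake), then major version 0x03 */
    if (len >= 2 && buf[0] == 0x16 && buf[1] == 0x03)
        return PROTO_TLS;

    /* http: the request line starts with a method name and a space */
    for (i = 0; i < sizeof(http_methods) / sizeof(http_methods[0]); i++)
        if (starts_with(buf, len, http_methods[i]))
            return PROTO_HTTP;

    return PROTO_UNKNOWN;
}
```

A multiplexer built this way would read (or peek at) the first packet, call something like `guess_protocol` on it, and then open a connection to whichever backend matches. A server-speaks-first protocol never produces that first packet, which is why it can only be reached through the timeout fallback.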
So my real setup is more complicated than what I described (which should come as no surprise). The issue I hit was with connections using ssl (or really tls, as the newer versions have been renamed). If I am going to have multiple services using ssl encryption, then I need to decrypt the incoming connection first and then use sslh to multiplex it to the different applications. The decryption is done by the program stunnel, which also has transparent proxy support.
So an incoming connection comes in on port 443. First sslh gets it and sees what protocol it's using. If it's SSL/TLS, it sends it to stunnel, which then sends it back to sslh, which finally sends it on to apache, ssh, imap, or whatever else I have hidden behind that port.
Support for transparent proxying is included with both sslh and stunnel, so I'm good, right?
Nope.
I can get it working with one of them. I can get it working with sslh going to stunnel and on to apache. But if I have stunnel going to sslh, it breaks badly.
Why is that?
Well here's the problem. A quick search brings up instructions on how to make the transparent proxy work, but while they give you the formula, they don't explain how it actually works. And without understanding the reasoning behind the instructions, you're stuck if something goes wrong or if you want to try some creative variation on the same concept.
So I decided to figure out what's actually happening, and here's the technical meat of my post:
There are two parts to making this work. The first is that the transparent proxy has to send outgoing packets that appear to come from the original host, not from itself. The second is that the network layer of the operating system has to know to route the return packets back to the proxy application, even though they'll be addressed to some other system.
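/* Let this socket bind to an address that isn't local to this machine. */
int transparent = 1;
res = setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &transparent, sizeof(transparent));
/* Use the original client's address as the source of the outgoing connection. */
getpeername(fd_from, from.ai_addr, &from.ai_addrlen);
res = bind(fd, from.ai_addr, from.ai_addrlen);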
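Put together, those two calls cover the first part: the proxy marks its outgoing socket as transparent, binds it to the client's address, and only then connects to the backend. Here is a self-contained sketch of that sequence. The helper name `connect_as_client` is hypothetical (it isn't taken from sslh or stunnel), it only handles IPv4, and it assumes a Linux system where the process has the CAP_NET_ADMIN capability, which IP_TRANSPARENT requires.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19   /* value from the Linux headers, for older libcs */
#endif

/*
 * Hypothetical helper: open a connection to dest whose source address is
 * the peer address of the already-accepted socket fd_from.
 * Returns the new socket, or -1 with errno set.
 */
int connect_as_client(int fd_from, const struct sockaddr *dest, socklen_t dest_len)
{
    struct sockaddr_storage from;
    socklen_t from_len = sizeof(from);
    int one = 1;
    int fd;

    /* Who is on the other end of the incoming connection? */
    if (getpeername(fd_from, (struct sockaddr *)&from, &from_len) < 0)
        return -1;

    /* This sketch is IPv4 only; IPv6 has its own IPV6_TRANSPARENT option. */
    if (from.ss_family != AF_INET) {
        errno = EAFNOSUPPORT;
        return -1;
    }

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* The option has to be set before bind(), or binding to a
     * non-local address is refused. */
    if (setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &one, sizeof(one)) < 0 ||
        bind(fd, (struct sockaddr *)&from, from_len) < 0 ||
        connect(fd, dest, dest_len) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Note that this only handles the outgoing direction. Without the routing half, the backend's replies are addressed to the real client and never find their way back to the proxy.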