Monday, October 30, 2023

Apache as a Proxy

I have a number of different devices and services running on my home network with web interfaces, and I would like to be able to access them all from anywhere.  Some of these are smart devices, some are software packages, but it doesn't really matter.  I want to be able to access them all by connecting to a web page on the server running on my firewall (or wherever port 443 is forwarded).

To start with, I did an inventory of all the devices and servers on my network.  The tool 'nmap' is ideal for this.  Something like this works:

nmap 192.168.0.1-254

My network is a little more complicated, but I just run the equivalent command on each subnet.  For any device showing something on port 80 or 443, I connect with a web browser and see what's there.  For other ports that might be web pages, I can construct a URL by adding 'http://' or 'https://' to the front and adding ':8443' or whatever the port is.  If the web browser brings up anything interesting, I note that, too.
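
Nmap can also scan a list of likely web ports across the whole range in one pass; a sketch (the port list here is just my guess at common ones):

nmap -p 80,443,8080,8081,8443,9090 192.168.0.1-254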

For the first pass, I create a web page that simply has a link to each service of interest on my network.  Of course, this will only work from inside my network, but at least I now have a page listing everything that I want to connect to.  Some of these will be http, others https, often with bad certificates (like my printer).  That's fine, I'll eventually deal with that using my proxy.

There are two categories of applications that I want to proxy.

The simple case is a web application that runs on another web server (on a different port of the same system, or on a different system; it doesn't matter) but already serves its content from a subdirectory, so I just want to forward anything going to that subdirectory on the main web server to the application.

The complicated case is typically a device that assumes it is its own web server, so everything is relative to the root of the server.  I want to proxy it from a subdirectory on my main server, and that means modifying all the links that pass through.  Sometimes this is easy, but sometimes it's quite difficult.

For the simple case, I'll use an application called "mythweb" as an example.  This is running on an internal server at http://192.168.0.5/mythweb/.  I've set it up in my /etc/hosts file as myth.mydomain.com, so I don't have to reference the IP number.  My apache installation has my server defined in /etc/apache2/vhosts.d/mydomain.conf, and I'll be adding entries to that file.  So for mythweb, it's really quite simple:

  <Location /mythweb/>
        ProxyPass http://myth.mydomain.com/mythweb/
        ProxyPassReverse http://myth.mydomain.com/mythweb/
        # mythweb sets a global cookie; change path from / to /mythweb/
        ProxyPassReverseCookiePath "/" "/mythweb/"
  </Location>

That's it.  Since I'm going to be defining multiple proxies as virtual subdirectories, I'm defining each as a location.  The "ProxyPass" line says where to send requests for anything in this directory.  The "ProxyPassReverse" is used for rewriting any redirects that the internal server may send back to the location.  The final note is that this application was setting a global cookie, so the "ProxyPassReverseCookiePath" option restricts any cookies it sets to only be for the subdirectory so they don't get seen by any other programs.

Doing a proxy for something that isn't already packaged in a subdirectory can get much more complicated, as you have to tell Apache how to modify the files being served as well as the URLs being requested.  This potentially means modifying .js, .css, and .html files, and what has to be done is different for each application.

Perhaps my simplest example is my NAS management.  I'm running XigmaNAS.

    <Location /nas/>
        ProxyPass http://nas.mydomain.com/
        ProxyPassReverse http://nas.mydomain.com/
        ProxyPassReverse /
        ProxyHTMLEnable On
        ProxyHTMLExtended On
        RequestHeader    unset  Accept-Encoding
        ProxyHTMLURLMap ^/ /nas/ R
        ProxyHTMLURLMap http://[w.]*mydomain[^/]*/ /nas/ R
    </Location>

This starts out the same, but there's a good bit more.  Now we need another ProxyPassReverse line to direct '/' to the location (/nas/).  Since we're modifying file contents, we unset the "Accept-Encoding" request header so the backend sends uncompressed responses that the filters can parse.  I use ProxyHTMLURLMap to do regular expression substitutions on the links in html files.  The ProxyHTMLExtended option tries to also modify JavaScript and CSS within html files.  The two mapping lines switch any top-level links to the subdirectory, as well as any fully-qualified links.  Fortunately this program doesn't use any CSS or JS files with links in them that need to be modified.

However, there's some setup needed for the ProxyHTMLURLMap command to define what gets treated as a URL.  I've seen some references to including "proxy_html.conf," but my Apache didn't have that file, so I put these lines directly into my config file.  Note that these are not specific to a location, so I put them before the <Location> sections:

    ProxyHTMLEvents onclick ondblclick onmousedown onmouseup onmouseover onmousemove onmouseout onkeypress onkeydown onkeyup onfocus onblur onload onunload onsubmit onreset onselect onchange
    ProxyHTMLLinks  a          href
    ProxyHTMLLinks  area       href
    ProxyHTMLLinks  link       href
    ProxyHTMLLinks  img        src longdesc usemap
    ProxyHTMLLinks  object     classid codebase data usemap
    ProxyHTMLLinks  q          cite
    ProxyHTMLLinks  blockquote cite
    ProxyHTMLLinks  ins        cite
    ProxyHTMLLinks  del        cite
    ProxyHTMLLinks  form       action
    ProxyHTMLLinks  input      src usemap
    ProxyHTMLLinks  head       profile
    ProxyHTMLLinks  base       href
    ProxyHTMLLinks  script     src for
    ProxyHTMLLinks  iframe     src

For my managed network switch, I had to also do some additional modifications to the html files to get it to work.  In the process, I double-modified some URLs, so I had to undo them:

        AddOutputFilterByType SUBSTITUTE text/html
        Substitute s|action="/|action="/switch/|n
        Substitute s|="/"|="/switch/"|n
        Substitute s|location.href="/|location.href="/switch/|n
        Substitute s|"/switch/switch/|"/switch/|n

So how did I figure that out?  It's an iterative process of loading it through the proxy, looking at which files are being requested, and figuring out how the browser got the wrong information when something wasn't translated correctly.  This isn't too hard if you open the developer tools (Control-Shift-I in both Chrome and Firefox).  There you can see every request and response as seen by the web browser.  You can compare connecting directly and through the proxy.

From here it keeps getting more complicated, but it boils down to creating substitute rules for other types.  I had to watch carefully in the console, as in one application I had to modify application/javascript, while in another it was application/x-javascript.
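
As an illustration, adding a JavaScript type to the filter and rewriting an absolute link inside it would look something like this; the content type, pattern, and /device/ prefix here are all hypothetical and will differ for your application:

        AddOutputFilterByType SUBSTITUTE text/html application/javascript
        Substitute s|location.href="/|location.href="/device/|n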

Another issue that I've encountered is proxying an internal device that uses https with a bad certificate.  I have my own legitimate certificate, and I just want to ignore the internal one, and it turns out that's easy to do.  I just put the following into my config file (before the <Location> tags):

    # Enable SSL proxy and ignore local certificate errors
    # (Apache doesn't allow comments after a directive on the same line)
    SSLProxyEngine On
    SSLProxyVerify none
    SSLProxyCheckPeerCN off
    SSLProxyCheckPeerName off
    SSLProxyCheckPeerExpire off

With that I can set a proxy to https:// instead of http:// and everything else just works.

What about security?  Many of the things I'm proxying already have password protection, but some don't.  Fortunately it's easy to add a password inside any <Location> block:

        # Authentication
        AuthType Basic
        AuthName "Password"
        AuthUserFile /var/www/localhost/accounts/webfrontend
        require valid-user

That's just the same as you would do for a <Directory> if you weren't doing a proxy.  Note that my server only runs on https, or I wouldn't use "basic" authentication.
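
The password file itself can be created with the htpasswd utility that ships with Apache; a sketch using the path from the config above:

htpasswd -c /var/www/localhost/accounts/webfrontend someuser

The -c flag creates the file, so leave it off when adding additional users.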

Another problem is doing a proxy for something that uses websockets.  I hit this with my security camera, and the following does the trick:

        # See: https://httpd.apache.org/docs/2.4/mod/mod_proxy_wstunnel.html
        RewriteEngine on
        RewriteCond %{HTTP:Upgrade} websocket [NC]
        RewriteCond %{HTTP:Connection} upgrade [NC]
        RewriteRule ^/?(.*) "ws://camera.mydomain.com/$1" [P,L]

Unfortunately I haven't succeeded in doing a proxy of everything.  One device fails to load all the pages even though it appears Apache is modifying all the links correctly.  What I have been able to do in that case is have Apache listen on a separate port with a new entry in /etc/apache2/vhosts.d/, and that vhost is a straight proxy for the device without any link rewriting to push it into a subdirectory.  If you're having trouble getting something to work with the rewrites, that's a good first step.
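
For reference, such a straight-proxy vhost can be as simple as this minimal sketch (the port and hostname are invented, and the usual SSL certificate directives are omitted):

    <VirtualHost *:8444>
        SSLProxyEngine On
        ProxyPass / https://device.mydomain.com/
        ProxyPassReverse / https://device.mydomain.com/
    </VirtualHost>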

The Apache proxy feature is very powerful.  It's great to be able to take all my different devices and put them in a single interface and appear as if they are all on the same server.  Unfortunately this can be very complicated in some cases.  If developers would avoid absolute links, especially in CSS and JavaScript, it would make proxying much easier.  It would also be nice if there were some community database of proxy recipes for devices and web applications.  You would think there would be a wiki for this with pages for each device or application that someone had done a proxy for.

Transparent Mode Proxies

I have a fairly complicated home network setup, which should come as a surprise to absolutely nobody.  I recently dealt with an issue that had been bugging me for ages, but first some background.  I want to be able to connect to my home systems from pretty much anywhere, and ssh is the obvious tool for that.  Occasionally I've been somewhere where they've blocked outgoing connections to port 22 (the ssh port), but I can instead connect to port 443 (the https port).  So if I instead run ssh on port 443, that gets around the problem.  But I also want to have a web server on port 443.  Fortunately there's this neat little tool called 'sslh' that can sit and listen for connections on a port, and when something connects and sends a message to the server, it determines what protocol the client is using and forwards the connection to the appropriate program.  So now I have multiple services running on the one port that should never be blocked.

But there's a problem.

The server logs for ssh and apache show the source of the connections as being from the local system, which is technically correct since they are coming from sslh on my local system.  I could merge the logs from sslh into the logs for apache and ssh, but that would be a pain.  What I really want is to have the original source IP to show up in the logs for the applications as if there wasn't a proxy in the middle.

Wishful thinking, right?  But apparently some really smart people wished for it, so they made it happen.

There are instructions for how to make this work for sslh, and if you follow them exactly, and if you're lucky, it does work.  I say you have to be lucky because there are some subtle issues that you'll hit if you think you're smarter than the instructions or try to do things a little differently.  Which is exactly the sort of thing I'm obviously going to do.

The way sslh works is that it accepts a connection and then looks at the first message the client sends, which it uses to determine which application to forward the connection to.  Fortunately, network protocols tend to expect the client to send the first message, so for things like ssh, ssl, and http, sslh knows what to do.  (However, some protocols like imap have the server send the first message upon establishing a connection, so sslh can only support one such protocol, by falling back to it after a timeout.)

So my real setup is more complicated than what I described (which should come as no surprise).  The issue I hit was with connections using ssl (or really tls as the newer versions have been renamed).  If I am going to have multiple services using ssl encryption, then I need to decrypt the incoming connection and then use sslh to multiplex to different applications.  This is done by using the program stunnel.  And it also has transparent proxy support.

So an incoming connection comes in on port 443.  First sslh gets it and sees what protocol it's using.  If it's SSL/TLS, it sends it to stunnel, which then sends it back to sslh, which finally sends it on to apache, ssh, imap, or whatever else I have hidden behind that port.
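
To make the chain concrete, the first hop is an sslh invocation along these lines (a sketch; the ports are invented, and the transparent-mode options are left out until they're covered below):

sslh --listen 0.0.0.0:443 --tls localhost:4443 --ssh localhost:22 --http localhost:80

Here stunnel would be listening on 4443, decrypting, and feeding the payload back into a second sslh listener.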

Support for transparent proxying is included with both sslh and stunnel, so I'm good, right?

Nope.

I can get it working with one of them.  I can get it working with sslh going to stunnel and on to apache.  But if I have stunnel going to sslh, it breaks badly.

Why is that?

Well, here's the problem.  A quick search brings up instructions on how to make the transparent proxy work, but while they give you the formula, they don't explain how it actually works.  And without understanding the reasoning behind the instructions, you're stuck if something goes wrong or if you want to try some creative variation on the same concept.

So I decided to figure out what's actually happening, and here's the technical meat of my post:

There are two parts of making this work.  The first is the transparent proxy has to send outgoing packets that appear to be from the original host, not from itself.  The second is the network layer of the operating system has to know to route the return packets back to the proxy application, even though they'll be addressed to some other system.

To send packets with the original source IP, the transparent proxy has to do two things between creating the socket and connecting to the target.  First, it has to enable transparent mode, which requires either running as root or having the CAP_NET_ADMIN capability.  This is done with a couple lines of code like:
  int transparent = 1;
  res = setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &transparent, sizeof(transparent));
And then it has to say where the packet is to appear to originate from:
  struct sockaddr_storage from;
  socklen_t fromlen = sizeof(from);
  getpeername(fd_from, (struct sockaddr *)&from, &fromlen);  /* address of the original client */
  res = bind(fd, (struct sockaddr *)&from, fromlen);         /* use it as our source address */
Normally you only bind a socket to an address like that when listening for incoming connections, but that's how you tell the kernel what address you're sending from in transparent mode.

The other part of this is to have the operating system route the return packets for your application back to you.  I won't go into the details here, but there are two approaches I've seen (a sketch of the first follows below).  One is to run your proxy as a specific user and have a firewall rule flag all packets from that user, with further rules sending the flagged return packets to a different routing table that delivers them to the local machine.  The other, and the one I prefer, is to connect to a local IP address other than 127.0.0.1, such as 127.255.0.1.  Really, anything under 127.x.x.x works besides the all-zeros and all-255 addresses.  (I have a separate post about localhost being 127.0.0.0/8 instead of a single address, giving you a ton of addresses to use for things like this.)
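
For reference, here is roughly the recipe from the sslh documentation for this (it marks by the hidden service's source port rather than by user; the interface and port are illustrative):

ip rule add fwmark 0x1 lookup 100
ip route add local 0.0.0.0/0 dev lo table 100
iptables -t mangle -A OUTPUT --protocol tcp --out-interface eth0 --sport 22 --jump MARK --set-mark 0x1

The first two lines say that anything carrying mark 0x1 is looked up in table 100, whose single rule delivers everything locally; the iptables rule marks replies from the hidden service so they come back to the transparent socket instead of heading out the default route.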

Note that the above assumes that the target of the transparent proxy is on the same system.  It's probably possible to create firewall rules that will make this work with proxying for a separate internal server, but I haven't explored that, as I haven't needed it yet.

Of course, what would be really nice is to just tell the networking layer to check all local bindings before forwarding packets, but there's no option for that.  I think that would make a nasty mix of code layers in the routing code, so I can understand why the fancy firewall rules are required.


So now that I understand how it's supposed to work, why didn't it work for me with my complicated setup?

The problem was the bind() call.  That's not just binding an IP address, it's also binding a port.  What does that mean?  Every network connection is bound to a combination of an IP address and a port.  This is done explicitly for any service listening for incoming connections.  For outgoing connections, it's done implicitly with the local IP and some automatically assigned high-numbered port.  But in transparent proxy mode, that outgoing connection binding is controlled directly by the program.  And if you have multiple layers of transparent proxies, you get into trouble.  You simply can't have two connections bound to the same IP address and port.  In transparent mode, you're using the original IP and port, so a second hop doesn't work.

Except it does if stunnel is the second hop.  How does it do this?  It gets a bind error, but then retries with a new port.  This means the end application sees a different origin port number, but the IP address is right.  And in most cases, that port number isn't logged or meaningful, so it doesn't matter.  It does break the 'ident' protocol, but I don't think anyone uses that anymore; certainly not for connections from outside a firewall.
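
A sketch of that retry in C, continuing the snippet above (error handling trimmed; set_port() is a hypothetical helper that stores a port number into the sockaddr):

  res = bind(fd, (struct sockaddr *)&from, fromlen);
  if (res < 0 && errno == EADDRINUSE) {
      /* The original source port is already taken by the previous hop,
         so ask the kernel to pick a port by binding to port 0. */
      set_port((struct sockaddr *)&from, 0);
      res = bind(fd, (struct sockaddr *)&from, fromlen);
  }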

So the fix, once I understood what was really happening and why it was failing, was to put the same retry on bind failures into sslh, using a new port.  And since it's open source, I sent my patch to the developer of the program, and it will be in the next release.  That's the power of open source.

Tuesday, November 2, 2021

Don't Use Modular Arithmetic With Negative Numbers

Most programmers are familiar with the "modulus" operator, which is the percent symbol in the language C and many other programming languages that have followed.  Anyone who remembers learning division in elementary school will recall that when you divide a number, say 21, by a smaller number, say 5, you get a whole number part, 4 in this case, but there's a little left over.  Later in arithmetic, you learned to continue the division and get 4.2, but when first learning, you would be told to stop at 4 and say you had a remainder of 1.  This concept turns out to be incredibly useful given that computers do a lot of work with integers.

In C, when working with integer math, you will observe:

   21 / 5 = 4
   21 % 5 = 1

When I was a math major in college, we worked with modular arithmetic quite a bit both in my number theory class and in my modern algebra class when working with ring theory.  Number theory in particular was a wonderful class for a computer scientist, as it is the mathematical basis for pretty much all integer computing, as well as the basis for cryptography.

Now in modular arithmetic, as any mathematician knows, if you're working with mod 5, you always end up with numbers from 0 through 4.  Here, 4+1 is 0, and 0-1 is 4.  You've just taken the number line and wrapped it into a tight circle.

In computers, this is used in many applications.

Anything needing random numbers typically calls a function that returns a random unsigned integer from zero to however large the integer is (about four billion for a 32-bit integer).  If you're writing code to simulate rolling dice for Dungeons and Dragons, then you'll often want a random number from 1 through 20, so you'll take your huge random number, mod it by 20 to get a number from 0 through 19, and add one.
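
In C, that's typically a one-liner like this (using the standard rand() for illustration, and ignoring the slight bias the modulus introduces):

   roll = rand() % 20 + 1;   /* 0 through 19, shifted to 1 through 20 */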

Another common data structure is an array.  It doesn't matter here what the elements are, they could be simple integers or complex structures.  What matters is that the array has a known number of elements, and you reference elements using integer indices of zero through one less than the array length.  So if the array has 50 elements, they're 0 through 49.  Now sometimes we have an array where we are keeping a history of events, so we save the index of the next event to write, and after writing it, we add one to move to the next element.  But to avoid going past the end, we mod the index by the array length to make it wrap around at the end.
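
A minimal sketch of that forward wrap in C (the names are invented):

   history[write_index] = event;
   write_index = (write_index + 1) % HISTORY_LENGTH;   /* wraps back to 0 at the end */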

That all works great.

Now suppose we have a circular array as I described, and we want to read it backwards.  We should be able to simply keep subtracting one and taking the result mod the array size so that when we go from zero to negative one it will jump back instead to the end (49 in the above example).  But we have two problems here.

First, often programmers use unsigned integers, especially for array indices.  When you subtract 1 from 0 with an unsigned number, it's an integer underflow, which wraps around to be the largest possible integer (that 4 billion-ish number for 32-bit integers).  And unless your array size is a power of 2, that modular arithmetic will not work out right.  If you think back to how modular arithmetic is like wrapping the number line into a circle, you can see this if you imagine wrapping a 72-inch tape measure into a 5-inch circle.  This would work great most of the way, but since you would have that last two inches at the end, moving backwards from zero would be as if the tape connected instantly back to 72, so subtracting one would go to 71, which would be read mod 5 as 1, not 4 like a mathematician would expect.

Second, even if you're carefully using signed integers for the array indices, the modular operator doesn't follow arithmetic rules as mathematicians would expect.  In fact, it's not really a modular operator at all, but really a remainder operator.  But those are the same thing, right?  Not really.  The catch is how the division operator works.

What is -1/5 in integer arithmetic?

Is that 0 or -1?

If it's -1, then you would have a remainder of 4.  That's also what you would want from a modular operator, keeping the value always in the 0 through 4 range.

If it's 0, then you would have a remainder of -1.  That's not how modular arithmetic works, as you can't have a negative modular result.

In C, the answer is 0.  And the so-called modular operator will tell you that -1%5 is -1.

In C, division of negative numbers rounds towards zero instead of negative infinity, which breaks true modular arithmetic.

Fortunately, it's not hard to work around this problem.

Say you're moving backwards through a circular array, and what you want to do is:

   previous_index = ( current_index - 1 ) % number_of_elements;

But since that breaks at zero, just add in the array size:

   previous_index = ( number_of_elements + current_index - 1 ) % number_of_elements;

That simple change prevents the index from ever going negative, and except when wrapping around below zero, the added array size is simply erased by the modular operator.  For example, with 50 elements, an index of 10 gives (50 + 10 - 1) % 50 = 9, as if nothing had been added, while an index of 0 gives (50 + 0 - 1) % 50 = 49, which is exactly the wrap we wanted.

Tuesday, July 30, 2019

Merge Conflicts for Non-Coders

I was discussing my day at work with my family, and I had spent much of the day resolving conflicts after doing a merge of two divergent branches.  Of course, neither my wife nor my son understood what I was talking about, so I came up with a very simple analogy to explain what I was doing.

Imagine a teacher wrote out a plot outline for a book and then assigned each student to write one chapter.  The student writing chapter 5 decides that the heroes need to use a shovel, so he makes a small change in the draft of chapter 3 so that they find a shovel at the beach.  Meanwhile, the student writing chapter 3 decides she doesn't like the beach scene, so she changes it so that they go for a hike in the woods instead.

Eventually the students have to put all their work together.  The conflicting changes demonstrate two types of conflicts:

First, in adding the shovel to chapter 3, the first student changed a paragraph that the other student also modified in changing the scene from the beach to the woods.  When using an automated merge tool, this sort of conflict is automatically detected, and the person doing the merge is shown the conflict.

Second, in resolving that direct conflict, it's possible that the shovel is removed from chapter 3.  The merge has been completed, but no automated tool will tell you that the shovel used in chapter 5 now comes out of nowhere.  If this were computer code, this would either show up at compile time ("Error: Reference to undefined object "shovel" in chapter 5, paragraph 12."), or worse, would introduce a bug that only shows up when the code runs under certain circumstances.  If the book merge left the shovel out, hopefully it would be found by the editor (testing by the quality assurance team for a software project), not left for readers to be confused by (a bug in the field).

The right analogy can often make it much easier to explain things.  If you like this one, or have questions about other aspects of software engineering that you would like me to find similar analogies for, please comment.

Monday, February 25, 2019

Linux With a 4K Laptop Display

I have a new laptop with a 4K display.  This is double the resolution in each direction of a typical laptop, but with the same physical dimensions.  That means each pixel takes up a quarter of the space of a single pixel on a traditional display.  By default, many things will run using the same number of pixels as on a smaller display, resulting in text too tiny to read.

Making the problem more complicated, I want to be able to connect my laptop to an external display running at a lower resolution, and be able to disconnect it, walk away, and still have things work on the built-in screen.  I also want things to look right if I plug in a much larger 4K display.  In other words, I want everything to work based on the DPI of the screen, not the pixels.

The simple solution is to run the display at a lower resolution.  This quick-and-dirty hack is good enough for most people, but you lose the crispness of the 4K display, and lose the ability to play 4K video at full resolution (which I may someday care about).  It also just feels like the wrong approach.

So I set my goal to make everything work nicely regardless of whether I'm using an external display or not, and handle changing between external and internal displays on the fly as seamlessly as possible.

Many people discussing this use the term "HiDPI" for high dots-per-inch displays, so if you're searching for solutions that I don't cover here, that's one term to search on.

I'm running Gentoo Linux, so some of my comments are going to be specific to that distribution, but for the most part it's simply a matter of finding the different locations for the equivalent files in your distribution.

Another good resource that I found in researching this is the Arch Linux Wiki page on the same topic:  https://wiki.archlinux.org/index.php/HiDPI

HiDPI Linux Console


First, there are two different things to worry about.  Almost everything I've seen online focuses on X, but I also want the console working correctly.  For the console, there are three issues to address:

  1. Setting up the Grub boot menu to look right.
  2. Setting the console font to a reasonable size.
  3. Changing console fonts and resolution when changing displays.

Grub Boot Menu

After some searching, I found that I can install fonts for Grub to use.  The command I used was:

mount /boot
SIZE=48
grub-mkfont --output=/boot/grub/fonts/DejaVuSansMono-${SIZE}.pf2 \
   --size=${SIZE} /usr/share/fonts/dejavu/DejaVuSansMono.ttf
umount /boot

In Gentoo, you can select the font in /etc/default/grub, so when building the configuration file, it will use the selected font:

GRUB_FONT=/boot/grub/fonts/DejaVuSansMono-48.pf2

The only problem with this font and size is the vertical lines have slight gaps.  From what I've found, this is a font issue due to using the wrong vertical line character (ASCII vertical bar instead of a unicode vertical line, or the other way around).  I would like to figure out what the offending character is and remove it from the character set or swap it with a good one, but I haven't figured out yet which character it's using and which character I want it to use.

Another problem is that while this makes the text perfectly readable on my built-in display, the text is huge if I boot with an external monitor.  I looked into rewriting the grub scripts to select the font based on the screen resolution (from the 'videoinfo' command).  Normally the grub.cfg is generated dynamically from a script provided as part of the grub installation, but I could write my own or have a post-processing tweak script that added my changes.  Unfortunately, the grub scripting language just isn't powerful enough to do this.  There is simply no mechanism to have the script behave differently based on the output of a command (like 'videoinfo').  There's no ability to pipe and grep to test for certain output or to put the output of a command into a variable.  If there were, I could check the resolution of the active display and select from the installed fonts to get an appropriate size.  Of course, if I could do that, I could also parse the output of a directory listing and dynamically generate the menu to select from the installed kernels, eliminating the need to update the grub config every time I install a new kernel.

So for now, I'm stuck with selecting a single predetermined font for grub regardless of what screen happens to be attached at boot time.  I'm sticking with the -48 font, which looks good on the laptop screen even though it is ridiculously huge on my external display, but still has room for my menu options.  I submitted a feature request for grub to add dynamic font selection support (bug 51914).

Also, I've noticed that interactions with grub are very slow.  This is bug 46133.  It's mostly unrelated to the topic at hand, but the bug only shows up at high screen resolutions, so don't be surprised if you hit it.

Linux Console Font

For the console, I needed to find a font that looked reasonable.  I used a for loop in bash to try every font in /usr/share/consolefonts (sketched below), and the only one that looked right for me was latarcyrheb-sun32.psfu.gz.  As a quick fix, 'setfont -C /dev/tty1 /usr/share/consolefonts/latarcyrheb-sun32.psfu.gz' worked, repeating for each console.  That's still a bit small, resulting in a text console of 240x67.  (Note: I've now installed the Terminus font, but the largest there is the same size; ter-i32b.psf.)
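
The loop was something like this sketch; it steps to the next font each time you press Enter:

for f in /usr/share/consolefonts/*; do
    echo "${f}"
    setfont -C /dev/tty1 "${f}"
    read -r
done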

For a real solution, I could use the consolefont service on Gentoo, but that still leaves the kernel boot messages too tiny to read.  I found a great writeup of a solution for putting a font into the kernel:  https://www.artembutusov.com/modify-linux-kernel-font/  That's the real solution for booting with a good console font.  I followed his instructions to install Terminus font as the kernel font, and now all the boot messages are readable.

Still, I'm limited to a maximum size of a -32 font, which is readable but small on my screen, and I can't make it dynamically select the boot font based on the screen resolution.  That's a fundamental limitation in Linux for now.

For the Linux console, the kernel is stuck with one font as far as I can tell, but I can use the consolefont service to select the font based on the screen resolution.  All I need is to set /etc/conf.d/consolefont to check the screen size and then adjust the font if needed.  But what I really want is a solution that detects and adjusts the font if I switch between screens.
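
Since OpenRC sources /etc/conf.d/consolefont as shell, a computed value should work; a sketch of the idea (the framebuffer sysfs path and the threshold are my assumptions):

fb_width=$(cut -d, -f1 /sys/class/graphics/fb0/virtual_size 2>/dev/null)
if [ "${fb_width:-0}" -ge 3000 ]; then
    consolefont="latarcyrheb-sun32"
else
    consolefont="default8x16"
fi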

Generating a console font from a TrueType font like DejaVu Sans Mono would probably be ideal, as then a script could generate the font for exactly the right size based on the DPI.  Everything I've seen online suggests that this is difficult at best.  Supposedly fontforge can do this, but I haven't figured out how to use it.  The grub-mkfont program shows that it can be done, but that only generates grub fonts, and I haven't found anything that converts out of that format.  Some research showed that there was a ttf2bdf program with older versions of Freetype, which has been supplanted by otf2bdf.  The page for this is down, but I was able to find an ebuild and the source elsewhere.  The output of that can be sent to bdf2psf (a package available in Gentoo).  Put this all together, and we have console fonts dynamically sized to the active display.  But for some reason the resulting fonts don't work, so something is broken.

The good news is that the kernel developers are aware of this problem, and one of the features of the 5.0 Linux kernel is a new console font for HiDPI displays.  I look forward to trying this, especially seeing if they have a solution for changing monitors.

Dynamically Changing Console Resolution

The command 'fbset -a -xres 1920 -yres 1200' corrects the resolution (provided it's connected to a 1920x1200 monitor).  The command 'hwinfo --monitor' will provide information about connected monitors, allowing this to be scripted.  Now all that's needed is a trick to activate a script anytime a monitor is added or removed.

To detect video changes, it appears that I can't rely on any system events, which is unfortunate.  You would think that connecting or disconnecting a monitor would trigger something like udev, but no such luck.  My laptop uses an external USB-C dongle for most video connections, and it would probably work for detecting the connection of the dongle (since that changes the devices), but if the video cable is connected after the dongle is plugged in, that doesn't help.  It also doesn't help if I use the built-in HDMI port on the laptop.  I want a solution that will always work and isn't dependent on other coinciding changes.

What does work is monitoring changes in /sys.  In particular, watch for changes to the directory /sys/class/drm and for changes in any file /sys/class/drm/*/status.  Unfortunately, this is a virtual file system, so the kernel only updates the files when someone looks.  If something changes in sysfs, but nobody looks, did it really change?  No, it didn't.  That means you can't use inotify to find out about changes.

So getting this to work requires a script that actively polls files in /sys to watch for resolution changes.
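
A minimal sketch of such a poller (the interval is arbitrary, and the handler script is hypothetical):

#!/bin/bash
# Poll /sys/class/drm for connector status changes; reading the status
# files is what forces the kernel to refresh them.
prev=""
while sleep 2; do
    cur=$(cat /sys/class/drm/*/status 2>/dev/null)
    if [ "${cur}" != "${prev}" ]; then
        prev="${cur}"
        /usr/local/bin/display-changed   # hypothetical handler script
    fi
done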

Of course, there will be things I will want to tweak in X on monitor changes, too, so those can go in the same script.  If requested in the comments, I'll post the script.

Dynamically Changing Console Fonts

The same script that changes the console resolution can also change the console fonts.  After playing with this, though, I decided it was a better solution to simply leave everything at the -32 font, which is a bit small on the 4K display and a bit big on other displays, but usable everywhere.


HiDPI Linux X11 Display

Running X Windows on a high DPI display is challenging because there are several different ways that applications adjust.  A few settings will cover most applications, but then others either require application-specific settings or simply don't have scaling options.  Also, changing DPI on the fly is difficult; most applications will need to be restarted to adjust.  We're in the early days of 4K displays right now, so hopefully everything that is still actively maintained will automatically adjust within the next year or two.

I expect most desktop environments like KDE and Gnome have their own ways of configuring the DPI, just like they do with other configuration options.  I don't run a desktop environment, so I'm not going to talk about them.  Some of the settings I'll discuss are hopefully adjusted under the covers by your desktop settings, but you can fall back to setting them manually like I do if that doesn't work.

Before setting the DPI, I need a good way of determining what the DPI is.  Fortunately, the screen dimensions are reported by the monitor's EDID information, and it's usually accurate.  In some cases, the information is wrong, or you'll find that things look better with a different DPI.  Until recently, a DPI of 96 was considered typical, so that's what you probably get if you don't adjust anything.

The command I use to determine the DPI extracts the screen size (in mm) from xrandr and converts it to inches:

DPI=$(xrandr | egrep ' connected .* [1-9][0-9]*mm x [1-9][0-9]*mm|[*]' | \
   sed -e N -e 's/\n/ /' | \
   sed -e 's/^[^ ]*//' -e 's/([^)]*)//' -e 's/^[^0-9]*[^ ]*//' -e 's/[^0-9]/ /g' | \
   awk '{printf("%g %g\n",$3/($1 * 0.0393701),$4/($2 * 0.0393701))}' | \
   awk '{printf("%d\n",($1+$2+0.5)/2)}' | head -n 1)

General Settings

Ideally we should set some value on the X server, and then your applications will use that value, select the right fonts, and adjust the scaling of everything automatically.  The good news is that since most programs use a small number of toolkits like gtk+, qt, and the like, we're in reasonably good shape.  The toolkits are DPI-aware, so if you tell them what DPI to use, programs based on them will just work.

Much of the configuration of the X server can now be done dynamically with xrandr.  This includes setting the DPI.  This is the most important setting, and hopefully everything will move to use this one setting in the future:

xrandr --dpi ${DPI}

For GTK, the DPI comes from an X resource.  In most systems, your ~/.Xresources file is loaded when X starts, but if you want something dynamic, the line to use is:

xrdb -override <<< "Xft.dpi: ${DPI}"

That uses a Bash "here string."  It's roughly equivalent to:

echo "Xft.dpi: ${DPI}" | xrdb -override

But it doesn't create a subshell.

That takes care of applications like Thunderbird and emacs (for the menus if compiled with gtk).

Cursors / Pointers

I've never played with modifying the cursors in X before, so I learned a lot about how they work.  There's one cursor that is displayed in the root window, which defaults to an 'x'.  This is the only one set by X itself, and you change it with the command xsetroot.  All the other cursors are set by the programs you're running (which includes the window manager).  Some programs will automatically adjust the cursor they use based on the DPI settings they use for other scaling.  Other applications read the information from X resources.

In playing with setting cursors, I found that you can install a bunch of different cursor themes.  I installed and tried a bunch, and rather liked the whiteglass theme, which is part of the x11-themes/xcursor-themes package in Gentoo.

xrdb -override <<< "Xcursor.theme: whiteglass"
xrdb -override <<< "Xcursor.size: $(( DPI / 6 ))"

For the root window cursor, I picked a cursor from a different theme, but stuck with the default X_cursor.  This is set using the full path (which may be different on your system or even with different themes).  X sets this cursor as part of its startup before the resources are loaded, so it can't be controlled by the resources.

xsetroot -xcf /usr/share/cursors/xorg-x11/Adwaita/cursors/X_cursor $(( DPI / 6 ))

You can change the root cursor anytime you like, but the cursors inside applications, including those set by the window manager, require that you restart your application to pick up the changes.  This means restarting your window manager when you change monitors.

Application-Specific Settings

Xterm: I've never worried about fonts in xterm before, but it turns out to be a fairly simple problem to solve.  You just select a TrueType font for xterm and set a reasonable point size.  Then it will select the font size to be correct for the DPI of the display (as of the time the terminal launches).  To make this work, I set two X resources:

XTerm*faceName: Dejavu Sans Mono Bold
XTerm*faceSize: 8

Chromium: You have to specify a command-line option for a forced scale factor.  Hopefully they'll fix this in the future, but you'll still want this if you don't like the default:

 --force-device-scale-factor=$(bc <<< "scale=5;${DPI} / 96" )

That's a little on the large side, so use a higher base DPI than 96 if it doesn't look right to you.

VNC viewer: I've been using tightvnc viewer, but it's been pretty much abandoned, and it doesn't have any scaling option.  There's a fork of the project, though, that solves this problem: ssvnc.  This is mostly a set of wrappers to tunnel VNC through ssl or ssh, but I've been doing that with my own wrappers already, so I'm ignoring that.  What's really useful is that ssvncviewer is command-line compatible with TightVNC's vncviewer, and it adds a scaling factor.  Since I'm already using a wrapper script for ssh, I just modified it to add a scale of 2 if the DPI is over 200.

ssvncviewer -scale 2 ...

twm: Yes, I'm the last person on Earth still running twm, but the information here might apply elsewhere, too.

First, your window manager has the job of launching programs, and some of those require special parameters based on the DPI.  If you don't want to have to restart your window manager when you switch monitors, the solution is to use wrapper scripts for all of those programs.  Second, you'll need some trick to have a dynamic configuration for your window manager.  I use twm, but I patched it to use popen() instead of fopen() when reading the configuration file if it finds that it's executable, so my .twmrc is already generated by script.  (Perhaps I'll add a blog entry on my hacks of twm.)

It's not surprising that twm doesn't support TTF fonts, so you can't just specify a point size and let it scale (though I've seen some references to patches, so it's not an unreasonable approach; moving to vtwm or another window manager might also make sense).  Instead, the fonts are hard-coded in the .twmrc file.  Since I already have that file dynamically generated, I can set the fonts based on the DPI.  One simpler method would be to have two config files and have a script that detects monitor changes set a symbolic link to the correct one.

Others: If you find other applications that don't just work, I would suggest finding their bug tracking system and opening a bug.  Someday everything should just work.

Future Work

What if you have multiple displays at different resolutions?  Unfortunately, you can only set one DPI with xrandr or X resources at a time.  Every time someone asks about setting per-monitor DPI, the answer seems to be to scale the higher DPI display.  Making this work correctly would also require that applications supported changing DPI on the fly, so that they would adjust automatically if dragged to a different screen.  For those moving from X to Wayland, apparently it's using per-monitor scaling as its solution, too.

Tuesday, August 15, 2017

A Subtle Difference between C and C++: String Literals

For the most part, C++ is a superset of C.  You can write C code, rename the file with a .cpp extension, and the compiler will compile it in C++ mode, generating essentially the same code.  In fact, the very first C++ compilers were actually C compilers with extra pre-processing.  But there are some subtle differences, and I recently ran into one that has some important implications.

In C, string literals are not constant, but in C++ they are.


For most practical purposes, string literals are constants in C.  The compiler puts them into a data segment that the code can't modify.  All the libc string functions take (const char *) parameters.  But if you assign a (char *) pointer to the start of a string literal, the compiler won't warn you about dropping the 'const' from the type, and you can write code that modifies the string (only to have it fault at run time).  In fact, the flood of 'dropping const' warnings that existing code would suddenly generate is probably the reason that newer versions of C haven't changed string literals to be constant.
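
A tiny illustration of the trap; C accepts this without a warning by default and it faults at run time, while C++ rejects the assignment outright:

   char *p = "hello";   /* fine in C; ill-formed in C++11 and later */
   p[0] = 'H';          /* compiles in C, faults at run time */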

Why does anyone care?

I've found two specific instances where code works when compiled as C++, but not as C because of this.

You can't use string literals as the targets of case statements.


This may seem obvious, as a case statement takes an integer type, not a string.  But suppose you want to do a switch based on the first four characters of a string (after ensuring that it's at least four characters long).  Imagine the following macro:

#define STR2INT(s) ( ((s)[0] << 24) | ((s)[1] << 16) | ((s)[2] << 8) | (s)[3] )

Now you could write code like:

   switch(STR2INT(option)) {
      case STR2INT("help"):
         ...
      case STR2INT("read"):
         ...
   }

That works in C++.  But in C, the compiler complains that the case statements aren't constant expressions.  To make it work in C, you have to have a much uglier version of the macro for the case statements:

#define CHARS2INT(w, x, y, z) (((w) << 24) | ((x) << 16) | ((y) << 8) | (z))

Then the code looks like:

   switch(STR2INT(option)) {
      case CHARS2INT('h','e','l','p'):
         ...
      case CHARS2INT('r','e','a','d'):
         ...
   }

That works in both C and C++, but is a pain to write.  At least the STR2INT macro works fine in other situations where the compiler insists on constant values.

You can't write asserts based on macro names.


In large software projects, it's not unusual to have sets of macros for specific purposes.  These macros are by convention supposed to follow some project-specific format.  There even may be a defined correlation between the name of the macro and the value.  It would be nice to be able to write asserts based on the macro name to enforce those conventions.

A quick aside on asserts:

Both C and C++ now support compile-time asserts.  It used to be that you would write code that would generate a negative shift or something like that if the expression wasn't true.  When the assert failed, you would get a compile-time error that was rather confusing until you looked at the offending line.  With the new mechanism, added in C11 and C++11, the compiler displays whatever message you tell it.  You use static_assert(expression, "message");  In C, you have to include <assert.h> or use _Static_assert.
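
For a concrete (if unrelated) example of the syntax:

   static_assert(sizeof(int) == 4, "this code assumes a 32-bit int");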

So for a trivial example, suppose we have macros like:

#define VAL_7B 0x7b

Now somewhere we use those macros:

   process_value(VAL_7B);

Obviously real code would have other parameters, but this is enough for our purposes.

To have asserts based on the macro name, what appears to be a function call must also be a macro; presumably a macro wrapper around the real function call.  Consider this definition:

#define process_value(v) \
   do { \
      _process_value(v); \
   } while(0)

That's a basic wrapper, forcing a semicolon at the end of the do-while loop.  This lets us add in asserts using the '#' preprocessor operator to stringify the input parameter:

#define CHAR2HEX(c) ( (c) <= '9' ? (c) - '0' : (c) - 'A' + 10 ) // Assumes uppercase
#define process_value(v) \
   do { \
      static_assert( (#v)[0]=='V' && (#v)[1]=='A' && (#v)[2]=='L' && (#v)[3]=='_', "Must use a 'VAL_xx' macro here" ); \
      static_assert( CHAR2HEX((#v)[4]) == ((v)>>4)  , "'VAL_xx' macro doesn't match defined value" ); \
      static_assert( CHAR2HEX((#v)[5]) == ((v)&0x0f), "'VAL_xx' macro doesn't match defined value" ); \
      static_assert( (#v)[6]==0, "'VAL_xx' macro format wrong" ); \
      _process_value(v); \
   } while(0)

In C++, that works great.  In C, you just can't do that.

And here's something interesting:  Why not change the above example to look like:

#define CHAR2HEX(c) ( (c) <= '9' ? (c) - '0' : (c) - 'A' + 10 ) // Assumes uppercase
#define process_value(v) \
   do { \
      static_assert( (#v)[0]=='V' && (#v)[1]=='A' && (#v)[2]=='L' && (#v)[3]=='_', "Must use a 'VAL_xx' macro here" ); \
      static_assert( CHAR2HEX((#v)[4]) <= 0xf  , "'VAL_xx' macro with bad hex value" ); \
      static_assert( CHAR2HEX((#v)[5]) <= 0xf  , "'VAL_xx' macro with bad hex value" ); \
      static_assert( (#v)[6]==0, "'VAL_xx' macro format wrong" ); \
      _process_value( CHAR2HEX((#v)[4])<<4 | CHAR2HEX((#v)[5]) ); \
   } while(0)

This uses the value directly out of the macro name, so you can leave the value off entirely when defining the macro, right?  Yes.  But it goes further than that.  Since the above code only uses the stringification of the parameter, it never expands it.  That means it's perfectly happy if you never define the VAL_xx macros at all, which is probably not what you want.  Be sure that the wrapper macro expands the macro somewhere if you want to be sure it's actually a defined macro; one way is sketched below.
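
Adding a line like this sketch to the wrapper does the trick; the (void) cast makes the compiler expand and type-check v with no run-time effect, so an undefined VAL_xx macro becomes a compile error:

      (void)(v); \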

Conclusion


So if you've followed my other writing up to this point, you're probably expecting some clever hack to make this work in C.  Sorry, but not this time.  It would probably be relatively simple to add a compiler option or #pragma directive to make string literals constant in C, but gcc doesn't have one, and I'm not aware of any other compiler that does.  (Please comment if you know otherwise.)  There are plenty of tricks you could do if you're willing to add tools to your build process, like an extra step between the preprocessor and the compiler that finds characters extracted from string literals and converts them into character constants (and you could tell the compiler to use a wrapper script as its preprocessor), but that's not likely to be an acceptable option.

You just can't do that in C.

Resources

This is the test file I used to be sure my above examples were correct: c_vs_cpp_example.c

Friday, August 11, 2017

Installing Gentoo Linux on a Dell Precision 5510

From time to time I realize just how slow and outdated my computer is.  Usually that's long after the corporate policy says I can order a new one, and this time was no different.  The engineering laptop is a Dell (no surprise if you know where I work), and the default model is the Precision 5510.  New computers always come with Windows installed, so my first task is to install Linux.  I figured other people with this or similar models will want to hear about what issues I encountered, so I'm documenting the process here.

The first thing I did was update the BIOS.  There have been some low-level security issues in the press recently, so I figured I should start with the latest base.  This was easy: I booted into the installed Windows, logged in, and went to http://dell.com/support.  From there I entered my service tag, clicked on downloads, and selected the BIOS.  As I expected, the installed BIOS was a few months older than the latest one.  I'm glad I didn't try to install this under Linux, though I assume there is a way to do so.  It was a simple process of clicking the obvious choices, and then it rebooted and completed flashing.

Now the challenges began.

This is my first laptop without a CD/DVD drive (unless you go back to the floppy days).  So to boot Linux, I needed a bootable USB flash drive.  I had followed some very complicated instructions to set up a flash drive, but then found that I had much better results by simply taking the Gentoo live CD image and using 'dd if=source_file.iso of=/dev/sdb bs=8192k' to copy it over.

So it should be simple to boot from the flash drive, right?  Wrong!  Windows set up a "fast boot" where it bypasses most of the BIOS, including the part that watches for holding down F12 for the boot menu.  After some searching, I found that you need to enter "control panel" in the Windows search thing that replaces the Start menu (this is my first time to ever run Windows 10).  From there I used one of the control panels to disable fast boot.  Even with that, it took several tries to get it to recognize the F12 for the boot menu--it checks it very early in the process and moves past it very quickly.

So I booted the Gentoo USB, and started following the installation guide.  I've done this before, so I wasn't expecting too much trouble, but I did have some surprises.  First, I wiped the partition table and set it up for Gentoo.  I chose to use 1GB VFAT for the EFI partition that is also used as my /boot partition.  When everything is good, I'll only mount it to copy over new kernels when updating.  Most of the rest of the drive is my root partition, for which I selected ext4.  The last 64GB I used as a swap partition, which is excessive, but I hate running out of memory and having the OOM killer sacrifice the one thing I wanted to keep alive.

At this point, I found that I had no network.  This is my first modern laptop without an ethernet port.  It came with three different USB-C dongles that I could use: one is just ethernet; one is ethernet, video, and USB; and one is essentially a docking port with lots of ports on it.  What I quickly learned is that USB-C isn't like traditional USB.  USB-C is the connector, but it can carry several different protocols, including PCI, in which case the marketing name is Thunderbolt.  None of that was working in Linux.  More on this in a bit...

So for the time being, I tried copying over stuff using a second flash stick, but that didn't get me very far.  I found an old regular USB ethernet dongle, and that worked fine.  Once I set up my kernel to boot from EFI, everything should have been mostly good, but of course it wasn't.  I had to go back and forth between booting from USB and failing with my kernel.  This was a challenge because I couldn't get the F12 menu to come up once I had wiped the Windows partition.  I ended up switching back and forth between legacy and EFI booting in the BIOS; legacy for USB and EFI for my attempt at my installed kernel.  This probably wasn't necessary, but it worked to get me past the problem.  (I needed to go into the BIOS anyway to tell it to boot the grub loader from EFI instead of Windows.)

Getting the system to boot natively was a huge pain.  Part of the problem is that after the boot completes, Gentoo clears the screen for the login prompt, so you lose the boot messages.  I decided to fix this by commenting out the getty command for tty1 in /etc/inittab.  Now I have to switch consoles to log in, but I can always see the last screenful of the boot.  I eventually got my installed kernel to work.  It turned out that I had failed to configure tmpfs, and Gentoo fails the boot process miserably if it can't put a tmpfs file system on /run.  This is what I get for configuring my own kernels--a bit of pain, but I always learn things.
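
For reference, the line I commented out looks something like this on a Gentoo system (the exact arguments may vary):

#c1:12345:respawn:/sbin/agetty 38400 tty1 linux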

So I finally dived into the problem of the Thunderbolt devices simply not working.  The main problem was again the BIOS.  It was configured with security options for Thunderbolt.  This is a good idea--without security, any device plugged into that port can do anything it wants with your computer.  But for setting up Linux, it's not a good idea, so I turned off all the security options.  I would like to turn them back on at some point, but I'll have to do some research to see if and how Linux handles Thunderbolt security.

Issues

Now I'm able to get the system up and running for the most part, but I'm still having some issues.  I'll document each one here:


Issue:

Switching from X back to a console works, but then moving back to X doesn't

Solution:

  1. Load the firmware as mentioned in https://wiki.gentoo.org/wiki/Intel
  2. Switch acceleration from SandyBridge's New Acceleration (SNA) to UXA
  3. Set 'Option "DRI" "False"'--this works; switching is consistent, though there's an 8-10 second black screen when switching back to X.

Issue:

The trackpad works fine in the console as long as you don't want to do a right or middle click.  If you're used to using gpm, you can't do cut-and-paste operations or anything else requiring a middle or right mouse button.

Solution:

I have not found any solution to this problem.

Issue:

X has no middle-click on the trackpad.

Solution:

You can set the logical location of the middle button to be in the middle of the bottom of the trackpad with the synclient command.  Unfortunately, this doesn't help with gpm on the console.  I call the following script from my X startup:

#!/bin/bash
eval $(synclient -l | egrep '(Left|Right)Edge|RightButtonAreaTop|MiddleButtonAreaRight' | sed -e 's/ //g')
if [ "${MiddleButtonAreaRight}" != 0 ]; then
    echo Middle mouse button already set up
    exit 0
fi
# The order here is critical or it will be rejected:
synclient RightButtonAreaLeft=$(( RightEdge - (RightEdge - LeftEdge) / 3 + 1 ))
synclient MiddleButtonAreaRight=$(( RightEdge - (RightEdge - LeftEdge) / 3 ))
synclient MiddleButtonAreaTop=${RightButtonAreaTop}
synclient MiddleButtonAreaLeft=$(( LeftEdge + (RightEdge - LeftEdge) / 3 ))

You can play with that to get the buttons to work however you like.  I chose to have three equal buttons, and I didn't play with the other settings, but you can look at the output of 'synclient -l' to see what all you can fiddle with.  Do note my comment about the order.  You can't have overlapping buttons, so I had to be sure to shrink the right button before defining the middle button (the left button is implicit, so you don't have to adjust it).

Issue:

The Thunderbolt devices work fine if they're connected when the system boots.  If you unplug and replug them, the system sometimes needs a kick to rescan the PCI bus:

echo 1 > /sys/bus/pci/rescan

That usually takes care of it.  However, after several repeated disconnect/connect cycles, the PCI scan fails, and it can't use the device.

Solution:

I don't do a lot of disconnecting/reconnecting, so I haven't dug any further into this.

Kernel Config

Getting the kernel config right for all the devices can be tricky.  I'm old fashioned enough that I configure my kernel with support for exactly the devices I use, avoiding modules as much as possible.  I'm noting here some of the drivers you'll want to be sure to include:

Trackpad: From dmesg, I see it's a SynPS/2 Synaptics TouchPad.

Ethernet: The dock uses a Realtek 8153 USB ethernet chip.

Sound: The WD-15 dock uses a 0bda:4014 Realtek USB audio device.

Touch Screen: It has an Elan touchscreen.  Select the "HID Multitouch panels" driver under "Special HID drivers" in the kernel configuration.  It's working, but I haven't tuned it.  At some point I may want to investigate, and this looks like a promising guide:  https://wiki.archlinux.org/index.php/Calibrating_Touchscreen

WiFi: It has an Intel 8260 chip, which uses the iwlwifi driver.  It also requires firmware.  I used this guide:  https://wiki.gentoo.org/wiki/Iwlwifi

Bluetooth: It has an Intel 8087:0a2b USB Bluetooth device.  I configured it so that the kernel recognizes it and is happy, but I don't have a wireless keyboard or mouse, so I may not use it anytime soon; I am thinking about using it for audio, though.  The only catch was that before it would connect to my Amazon Echo, I had to tell it, "Alexa, pair with my phone."  A volume control shows up with "alsamixer -D bluealsa."  The guide I used is: https://wiki.gentoo.org/wiki/Bluetooth

Video: It has both Intel and Nvidia graphics, but unlike some laptops, you can't just ignore the Intel and use the Nvidia graphics--you have to access the Nvidia through the Intel GPU.  So you can't install the Nvidia drivers without first installing the Intel drivers.

Flash Reader: It's a Realtek RTS525A PCI device.  It requires the kernel to be built with CONFIG_MFD_RTSX_PCI and CONFIG_MMC_REALTEK_PCI.  Unlike my old laptop, the card sticks out when inserted, so I can't leave an empty microSD adapter in the laptop.

WebCam: I haven't looked into using the webcam.