Mad Man with a Compiler: 2017

Tuesday, August 15, 2017

A Subtle Difference between C and C++: String Literals

For the most part, C++ is a superset of C. You can write C code, rename the file with a .cpp extension, and the compiler will compile it in C++ mode, generating essentially the same code. In fact, the very first C++ compilers were actually C compilers with extra pre-processing. But there are some subtle differences, and I recently ran into one that has some important implications.

In C, string literals are not constant, but in C++ they are.

For most practical purposes, string literals are constants in C. The compiler typically puts them into a data segment that the code can't modify, and the libc string functions take (const char *) parameters for any string they only read. But if you assign a string literal to a (char *) pointer, the compiler won't warn you about dropping the 'const' from the type, and you can write code that modifies the string (only to have it fault at run time). In fact, the flood of warnings about dropping the 'const' in existing code is probably the reason that newer versions of C haven't changed string literals to be constant.
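
A quick way to see the type difference is to ask the compiler directly. The sketch below is only an illustration and assumes a C11 compiler for _Generic; LITERAL_KIND, literal_kind(), and array_kind() are names made up for this example.

```c
/* Classify the type of a pointer to the first character of x.
 * LITERAL_KIND is a hypothetical helper, not a standard macro. */
#define LITERAL_KIND(x) _Generic(&(x)[0], \
        char *: 1,        /* elements are plain char */ \
        const char *: 2)  /* elements are const char */

/* A string literal: its elements are plain char in C. */
int literal_kind(void)
{
    return LITERAL_KIND("hello");
}

/* An explicitly const array, for comparison. */
int array_kind(void)
{
    static const char text[] = "hello";
    return LITERAL_KIND(text);
}
```

Compiled as C, literal_kind() returns 1 and array_kind() returns 2: the literal's elements are plain char even though writing to them is undefined behavior.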

Why does anyone care?

I've found two specific instances where code works when compiled as C++, but not as C because of this.

You can't use string literals as the targets of case statements.

This may seem obvious, as a case statement takes an integer type, not a string. But suppose you want to do a switch based on the first four characters of a string (after ensuring that it's at least four characters long). Imagine the following macro:

#define STR2INT(s) ( ((s[0]) << 24) | ((s[1]) << 16) | ((s[2]) << 8) | (s[3]) )

Now you could write code like:

   switch(STR2INT(option)) {
      case STR2INT("help"):
         ...
      case STR2INT("read"):
         ...
   }

That works in C++. But in C, the compiler complains that the case statements aren't constant expressions. To make it work in C, you have to have a much uglier version of the macro for the case statements:

#define CHARS2INT(w, x, y, z) (((w) << 24) | ((x) << 16) | ((y) << 8) | (z))

Then the code looks like:

   switch(STR2INT(option)) {
      case CHARS2INT('h','e','l','p'):
         ...
      case CHARS2INT('r','e','a','d'):
         ...
   }

That works in both C and C++, but is a pain to write. At least the STR2INT macro works fine in other situations where the compiler insists on constant values.
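
Putting the two macros together, a complete C function might look like the sketch below. The classify_option() function and the option_kind enum are hypothetical names for illustration, and the macros assume plain ASCII characters so the shifts stay within a positive int.

```c
#include <string.h>

/* Run-time packing of the first four characters of a string. */
#define STR2INT(s) ( ((s)[0] << 24) | ((s)[1] << 16) | ((s)[2] << 8) | (s)[3] )
/* Compile-time packing from character constants, usable in case labels in C. */
#define CHARS2INT(w, x, y, z) (((w) << 24) | ((x) << 16) | ((y) << 8) | (z))

enum option_kind { OPT_UNKNOWN, OPT_HELP, OPT_READ };

enum option_kind classify_option(const char *option)
{
    /* STR2INT reads four characters, so reject shorter strings first. */
    if (strlen(option) < 4)
        return OPT_UNKNOWN;

    switch (STR2INT(option)) {
        case CHARS2INT('h','e','l','p'):
            return OPT_HELP;
        case CHARS2INT('r','e','a','d'):
            return OPT_READ;
        default:
            return OPT_UNKNOWN;
    }
}
```

Note that only the first four characters are compared, so "reading" is classified the same as "read".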

You can't write asserts based on macro names.

In large software projects, it's not unusual to have sets of macros for specific purposes. These macros are by convention supposed to follow some project-specific format. There even may be a defined correlation between the name of the macro and the value. It would be nice to be able to write asserts based on the macro name to enforce those conventions.

A quick aside on asserts:

Both C and C++ now support compile-time asserts. It used to be that you would write code that would generate a negative shift if the expression wasn't true or something like that. When the assert failed, you would get a compile-time error that was rather confusing until you looked at the offending line. With the new mechanism, the compiler displays whatever message you tell it. You write static_assert(expression, "message"). In C, you have to include <assert.h> to get that spelling, or use the _Static_assert keyword directly. This was added in C11 and C++11.
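
For comparison, here is a sketch of the two styles side by side. The old-style macro uses a negative array size, which is one common variant of the trick rather than the negative shift mentioned above; OLD_STATIC_ASSERT is a made-up name for this example.

```c
#include <assert.h>   /* provides the static_assert spelling in C11 */

/* Old style: the typedef declares an array of size -1 when expr is false,
 * which is a compile error with a message that doesn't explain much. */
#define OLD_STATIC_ASSERT(expr, name) \
    typedef char old_assert_##name[(expr) ? 1 : -1]

OLD_STATIC_ASSERT(sizeof(int) >= 2, int_is_at_least_16_bits);

/* New style: the compiler prints the message you supply. */
static_assert(sizeof(int) >= 2, "int must be at least 16 bits");
```

Both lines compile silently when the condition holds; change the condition to something false to see how different the two error messages are.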

So for a trivial example, suppose we have macros like:

#define VAL_7B 0x7b

Now somewhere we use those macros:

   process_value(VAL_7B);

Obviously real code would have other parameters, but this is enough for our purposes.

To have asserts based on the macro name, what appears to be a function call must also be a macro; presumably a macro wrapper around the real function call. Consider this definition:

#define process_value(v) \
   do { \
      _process_value(v); \
   } while(0)

That's a basic wrapper, forcing a semicolon at the end of the do-while loop. This lets us add in asserts using the '#' preprocessor operator to stringify the input parameter:

#define CHAR2HEX(c) ( (c) <= '9' ? (c) - '0' : (c) - 'A' + 10 ) // Assumes uppercase
#define process_value(v) \
   do { \
      static_assert( (#v)[0]=='V' && (#v)[1]=='A' && (#v)[2]=='L' && (#v)[3]=='_', "Must use a 'VAL_xx' macro here" ); \
      static_assert( CHAR2HEX((#v)[4]) == ((v)>>4)  , "'VAL_xx' macro doesn't match defined value" ); \
      static_assert( CHAR2HEX((#v)[5]) == ((v)&0x0f), "'VAL_xx' macro doesn't match defined value" ); \
      static_assert( (#v)[6]==0, "'VAL_xx' macro format wrong" ); \
      _process_value(v); \
   } while(0)

In C++, that works great. In C, you just can't do that.

And here's something interesting: Why not change the above example to look like:

#define CHAR2HEX(c) ( (c) <= '9' ? (c) - '0' : (c) - 'A' + 10 ) // Assumes uppercase
#define process_value(v) \
   do { \
      static_assert( (#v)[0]=='V' && (#v)[1]=='A' && (#v)[2]=='L' && (#v)[3]=='_', "Must use a 'VAL_xx' macro here" ); \
      static_assert( CHAR2HEX((#v)[4]) <= 0xf  , "'VAL_xx' macro with bad hex value" ); \
      static_assert( CHAR2HEX((#v)[5]) <= 0xf  , "'VAL_xx' macro with bad hex value" ); \
      static_assert( (#v)[6]==0, "'VAL_xx' macro format wrong" ); \
      _process_value( CHAR2HEX((#v)[4])<<4 | CHAR2HEX((#v)[5]) ); \
   } while(0)

This uses the value directly out of the macro name, so you can leave the value off entirely when defining the macro, right? Yes. But it goes further than that. Since the above code only uses the stringification of the parameter, it never expands it. That means it's perfectly happy if you never define the VAL_XX macros at all, which is probably not what you want. Be sure that the wrapper macro expands the macro somewhere if you want to be sure it's actually a defined macro.
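
The expansion behavior is easy to check on its own. In this sketch, STRINGIFY and EXPAND_THEN_STRINGIFY are hypothetical helper names, and VAL_ZZ is deliberately never defined anywhere.

```c
#define VAL_7B 0x7b

/* '#' stringifies the argument exactly as written, without expanding it. */
#define STRINGIFY(x) #x
/* An extra level of macro call expands the argument before stringifying. */
#define EXPAND_THEN_STRINGIFY(x) STRINGIFY(x)

/* Yields "VAL_7B": the macro name itself. */
const char *unexpanded(void) { return STRINGIFY(VAL_7B); }

/* Yields "0x7b": the macro's replacement text. */
const char *expanded(void) { return EXPAND_THEN_STRINGIFY(VAL_7B); }

/* Yields "VAL_ZZ", and compiles even though VAL_ZZ is not defined. */
const char *undefined_name(void) { return STRINGIFY(VAL_ZZ); }
```

One simple way to force the check is to have the wrapper also evaluate the parameter, for example with (void)(v); that fails to compile when the name isn't a defined macro or a declared identifier.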

Conclusion

So if you've followed my other writing up to this point, you're probably expecting some clever hack to make this work in C. Sorry, but not this time. It would probably be relatively simple to add a compiler option or #pragma directive to make string literals constant in C, but gcc doesn't have this, and I'm not aware of any other compiler that does. (Please comment if you know otherwise.) There are plenty of tricks you could do if you're willing to use additional tools in your build process. For example, you could add a step between the regular preprocessor and the compiler that looks for code extracting characters from string literals and converts it into character constants (and you could tell the compiler to use a wrapper script as the preprocessor to do that), but that's not likely to be an acceptable option.

You just can't do that in C.

Resources

This is the test file I used to be sure my above examples were correct: c_vs_cpp_example.c

Friday, August 11, 2017

Installing Gentoo Linux on a Dell Precision 5510

From time to time I realize just how slow and outdated my computer is. Usually that's long after the corporate policy says I can order a new one, and this time was no different. The engineering laptop is a Dell (no surprise if you know where I work), and the default model is the Precision 5510. New computers always come with Windows installed, so my first task is to install Linux. I figured other people with this or similar models will want to hear about what issues I encountered, so I'm documenting the process here.

The first thing I did was update the BIOS. There have been some low-level security issues in the press recently, so I figured I should start with the latest base. This was easy: I booted into the installed Windows, logged in, and went to http://dell.com/support. From there I entered my service tag, clicked on downloads, and selected the BIOS. As I expected, the installed BIOS was a few months older than the latest one. I'm glad I didn't try to install this under Linux, though I assume there is a way to do so. It was a simple process of clicking the obvious choices, and then it rebooted and completed flashing.

Now the challenges began.

This is my first laptop without a CD/DVD drive (unless you go back to the floppy days). So to boot Linux, I needed a bootable USB flash. I had followed some very complicated instructions to set up a flash drive, but then found that I had much better results by simply taking the Gentoo live CD and using 'dd if=source_file.iso of=/dev/sdb bs=8192k' to copy it.

So it should be simple to boot from the flash drive, right? Wrong! Windows set up a "fast boot" where it bypasses most of the BIOS, including the part that watches for holding down F12 for the boot menu. After some searching, I found that you need to enter "control panel" in the Windows search thing that replaces the Start menu (this is my first time to ever run Windows 10). From there I used one of the control panels to disable fast boot. Even with that, it took several tries to get it to recognize the F12 for the boot menu--it checks it very early in the process and moves past it very quickly.

So I booted the Gentoo USB, and started following the installation guide. I've done this before, so I wasn't expecting too much trouble, but I did have some surprises. First, I wiped the partition table and set it up for Gentoo. I chose to use 1GB VFAT for the EFI partition that is also used as my /boot partition. When everything is good, I'll only mount it to copy over new kernels when updating. Most of the rest of the drive is my root partition, for which I selected ext4. The last 64GB I used as a swap partition, which is excessive, but I hate running out of memory and having the OOM killer sacrifice the one thing I wanted to keep alive.

At this point, I found that I had no network. This is my first modern laptop without an ethernet port. It came with three different USB-C dongles that I could use: one is just ethernet; one is ethernet, video, and USB; and one is essentially a docking port with lots of ports on it. What I quickly learned is that USB-C isn't like traditional USB. USB-C is the connector, but it can carry several different protocols, including PCI Express, in which case the marketing name is Thunderbolt. None of that was working in Linux. More on this in a bit...

So for the time being, I tried copying over stuff using a second flash stick, but that didn't get me very far. I found an old regular USB ethernet dongle, and that worked fine. Once I set up my kernel to boot from EFI, everything should have been mostly good, but of course it wasn't. I had to go back and forth between booting from USB and failing with my kernel. This was a challenge because I couldn't get the F12 menu to come up once I had wiped the Windows partition. I ended up switching back and forth between legacy and EFI booting in the BIOS; legacy for USB and EFI for my attempt at my installed kernel. This probably wasn't necessary, but it worked to get me past the problem. (I needed to go into the BIOS anyway to tell it to boot the grub loader from EFI instead of Windows.)

Getting the system to boot natively was a huge pain. Part of the problem is that after the boot completes, Gentoo clears the screen for the login prompt, so you lose the boot messages. I decided to fix this by commenting out the getty command for tty1 in /etc/inittab. Now I have to switch consoles to log in, but I can always see the last screenful of the boot. I eventually got my installed kernel to work. It turned out that I had failed to configure tmpfs, and Gentoo fails the boot process miserably if it can't put a tmpfs file system on /run. This is what I get for configuring my own kernels--a bit of pain, but I always learn things.

So I finally dived into the problem of the Thunderbolt devices simply not working. The main problem was again the BIOS. It was configured with security options for Thunderbolt. This is a good idea--without security, any device plugged into that port can do anything it wants with your computer. But for setting up Linux, it's not a good idea, so I turned off all the security options. I would like to turn them back on at some point, but I'll have to do some research to see if and how Linux handles Thunderbolt security.

Issues

Now I'm able to get the system up and running for the most part, but I'm still having some issues. I'll document each one here:

Issue:

Switching from X back to a console works, but then moving back to X doesn't

Solution:

Need to load firmware as mentioned in https://wiki.gentoo.org/wiki/Intel
Switch acceleration from Sandybridge New Architecture (SNA) to UXA
Setting 'Option "DRI" "False"' works--switching is consistent, though there's an 8-10 second black screen when switching back to X.

Issue:

The trackpad works fine in the console as long as you don't want to do a right or middle click. If you're used to using gpm, you can't do cut-and-paste operations or anything else requiring a middle or right mouse button.

Solutions:

I have not found any solution to this problem.

Issue:

X has no middle-click on the trackpad.

Solutions:

You can set the logical location of the middle button to be in the middle of the bottom of the trackpad with the synclient command. Unfortunately, this doesn't help with gpm on the console. I added the following script to my X startup script:

#!/bin/bash
eval $(synclient -l | egrep '(Left|Right)Edge|RightButtonAreaTop|MiddleButtonAreaRight' | sed -e 's/ //g')
if [ "${MiddleButtonAreaRight}" != 0 ]; then
    echo Middle mouse button already set up
    exit 0
fi
# The order here is critical or it will be rejected:
synclient RightButtonAreaLeft=$(( RightEdge - (RightEdge - LeftEdge) / 3 + 1 ))
synclient MiddleButtonAreaRight=$(( RightEdge - (RightEdge - LeftEdge) / 3 ))
synclient MiddleButtonAreaTop=${RightButtonAreaTop}
synclient MiddleButtonAreaLeft=$(( LeftEdge + (RightEdge - LeftEdge) / 3 ))

You can play with that to get the buttons to work however you like. I chose to have three equal buttons, and I didn't play with the other settings, but you can look at the output of 'synclient -l' to see what all you can fiddle with. Do note my comment about the order. You can't have overlapping buttons, so I had to be sure to shrink the right button before defining the middle button (the left button is implicit, so you don't have to adjust it).

Issue:

The Thunderbolt devices work fine if they're connected when the system boots. If you unplug and replug them, the system sometimes needs a kick to rescan the PCI bus. Running this as root usually takes care of it:
echo 1 > /sys/bus/pci/rescan
However, after several repeated disconnect/connect cycles, the PCI scan fails, and it can't use the device.

Solution:

I don't do a lot of disconnecting/reconnecting, so I haven't dug any further into this.

Kernel Config

Getting the kernel config right for all the devices can be tricky. I'm old fashioned enough that I configure my kernel with support for exactly the devices I use, avoiding modules as much as possible. I'm noting here some of the drivers you'll want to be sure to include:

Trackpad: From dmesg, I see it's a SynPS/2 Synaptics TouchPad.

Ethernet: The dock uses a Realtek 8153 USB ethernet chip.

Sound: The WD-15 dock uses a 0bda:4014 Realtek USB audio device

Touch Screen: It has an Elan touchscreen. Select the "HID Multitouch panels" driver under "Special HID drivers" in the kernel configuration. It's working, but I haven't tuned it. At some point I may want to investigate, and this looks like a promising guide: https://wiki.archlinux.org/index.php/Calibrating_Touchscreen

WiFi: It has an Intel 8260 chip, which uses the iwlwifi driver. It also requires firmware. I used this guide: https://wiki.gentoo.org/wiki/Iwlwifi

Bluetooth: It has an Intel 8087:0a2b USB Bluetooth device. I configured it so that the kernel recognizes it and is happy, but I don't have a wireless keyboard or mouse, so I may not use it anytime soon, but I'm thinking about using it for audio. The only catch was that before it would connect to my Amazon Echo, I had to tell it, "Alexa, pair with my phone." A volume control shows up with "alsamixer -D bluealsa." The guide I used is: https://wiki.gentoo.org/wiki/Bluetooth

Video: It has both Intel and Nvidia graphics, but unlike some laptops, you can't just ignore the Intel and use the Nvidia graphics--you have to access the Nvidia through the Intel GPU. So you can't install the Nvidia drivers without first installing the Intel drivers.

Flash Reader: It's a Realtek RTS525A PCI device. It requires the kernel to be built with CONFIG_MFD_RTSX_PCI and CONFIG_MMC_REALTEK_PCI. Unlike my old laptop, the card sticks out when inserted, so I can't leave an empty microSD adapter in the laptop.

WebCam: I haven't looked into using the webcam.

Monday, June 19, 2017

Go To Statement Considered Helpful (part 5: Fun with Switches)

For my final segment on the use of gotos, let us not forget one important hidden goto. The switch statement is essentially a goto with a variable label. This means that the if(0) trick can apply here, too.

Before I get into the meat, I want to point out that many of the examples in this section aren't great code. In most cases the best way is to find a more traditional solution. Unlike the previous segments I've written on using gotos, this is mostly fun hacking food for thought, not something you're likely to include in your code.

You're probably vaguely familiar with Duff's Device. Typically you run across it as a loop unrolling technique that a professor mentions in some computer science class, but you then forget about it. Anyway, here's an example. Without Duff's Device, you have a standard loop as below:

for(i=0; i<n; ++i) do_it_once();

With Duff's Device, the loop is unrolled manually as below:

int loops = (n + 3) / 4; // Assumes n > 0
switch(n%4) {
    do {
        case 0: do_it_once();
        case 3: do_it_once();
        case 2: do_it_once();
        case 1: do_it_once();
    } while ( --loops > 0 );
}

That takes a bit to make sense of. The trick is that first time in, the switch jumps to only do the odd cases, then it hits the do/while loop and goes through all the cases for as many times as is needed. The most common case cited for using that code is writing out a buffer to a register. If you're writing to a serial port or something like that, then that's the right way to do it, but if you're doing something more like a buffer copy, a simpler way that is easier to parse would be to pull the loop outside the switch as in the code below. To keep this example parallel to the Duff's device example, the odd instances are done first, though there's no reason for doing it one way or the other. (In fact, for a buffer copy, you might do odd bytes first and last to optimize for alignment, depending on the needs of the system.)

switch(n%4){
    case 3: do_it_once();
    case 2: do_it_once();
    case 1: do_it_once();
}
for(i=0;i<n/4;++i) do_it_four_times();

That's rarely used, but in very specific cases it can be a big win. For example, a memory copy routine may use something very similar to use large registers for the bulk of the data while still supporting odd numbers of bytes. Duff's Device (with the do loop inside the switch statement) has the advantage of eliminating redundant code, especially when the do_it_four_times() call needs to be the same statement four times, and when a larger number is used. If you're working at this level of manual optimization, the key is to try a number of implementations and run performance testing. When combining clever code with optimizing compilers and specific hardware, there are usually more factors than you're taking into account, and you may be surprised at the results.
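
To convince yourself that the interleaved switch and loop really run the body exactly n times, here is a small self-checking sketch. The call counter and the run_plain() and run_duff() wrappers are hypothetical scaffolding added for the test, and the guard for n <= 0 is my addition, since the device itself assumes a positive count.

```c
static int calls;                       /* counts do_it_once() invocations */
static void do_it_once(void) { ++calls; }

/* The straightforward loop, for reference. */
int run_plain(int n)
{
    calls = 0;
    for (int i = 0; i < n; ++i) do_it_once();
    return calls;
}

/* Duff's Device, unrolled by four. */
int run_duff(int n)
{
    calls = 0;
    if (n <= 0) return 0;               /* the device needs n > 0 */

    int loops = (n + 3) / 4;            /* passes through the do/while */
    switch (n % 4) {
        do {
            case 0: do_it_once();       /* fall through */
            case 3: do_it_once();       /* fall through */
            case 2: do_it_once();       /* fall through */
            case 1: do_it_once();
        } while (--loops > 0);
    }
    return calls;
}
```

For n = 5, the switch jumps to case 1 for a single call, and then the loop makes one more full pass of four.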

Now for what I like to call Crow's Corollary to Duff's Device. I sometimes find that I have a series of cases that handle mostly the same thing, but need a slightly different setup. Other cases have completely different code. See the example below.

switch(var) {
    case A:
        buf_to_use = A_BUF;
        use_buf(buf_to_use);
        break;
    case B:
        buf_to_use = B_BUF;
        use_buf(buf_to_use);
        break;
    case C:
        buf_to_use = C_BUF;
        use_buf(buf_to_use);
        break;
    case D:
        // Completely different code from A, B, C
        ...
}

Now imagine that "use_buf()" is 15 or 20 lines of code instead of a simple function. (Yes, I'll accept that the best solution is probably to refactor those 15 to 20 lines into a simple function, but I'm having fun looking at other options.) In order to avoid repeating the code that is the same for the similar cases, you can put the case labels inside an if(0) just like with a goto! Here is the example without the repeated code.

switch(var) {
    case A:
        buf_to_use = A_BUF;
        // Fall through but skip over assignment
        if ( 0 ) {
    case B:
            buf_to_use = B_BUF;
        }
        // Fall through but skip over assignment
        if ( 0 ) {
    case C:
            buf_to_use = C_BUF;
        }
        // General code for cases: A, B, C
        use_buf(buf_to_use);
        break;
    case D:
        // Completely different code from A, B, C
        ...
}

The important point here is that the if(0) construct allows you to avoid duplicating the code that is common to several of the cases without requiring that the initial portion of the cases be identical. This could be implemented with a nested switch statement to set up A, B, and C, but programmers are lazy; I've never seen anyone implement the above example with a nested switch to avoid the use_buf() duplication, even if it's 20 lines of code instead of one. Rarely will someone create a function to factor out the duplicated code. Usually I just see the 20 lines of code duplicated. That's bad. That's what if(0) is there to solve.
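
If the control flow is hard to believe, here is a compilable sketch of the same shape. The enum values and the buffer_for() function are hypothetical stand-ins; the common code is reduced to returning whichever buffer id was selected.

```c
enum { A, B, C, D };
enum { A_BUF = 10, B_BUF = 20, C_BUF = 30, NO_BUF = -1 };

int buffer_for(int var)
{
    int buf_to_use = NO_BUF;

    switch (var) {
        case A:
            buf_to_use = A_BUF;
            /* Fall through but skip over assignment */
            if (0) {
        case B:
                buf_to_use = B_BUF;
            }
            /* Fall through but skip over assignment */
            if (0) {
        case C:
                buf_to_use = C_BUF;
            }
            /* General code for cases: A, B, C */
            return buf_to_use;
        case D:
            /* Completely different code from A, B, C */
            return NO_BUF;
    }
    return NO_BUF;
}
```

Each case label jumps into the middle of its if(0) block, runs its own assignment, and then falls out the bottom of the block into the shared code.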

Of course, this can be implemented more directly using goto:

switch(var) {
    case A:
        buf_to_use = A_BUF;
        goto use_buf;
    case B:
        buf_to_use = B_BUF;
        goto use_buf;
    case C:
        buf_to_use = C_BUF;
        goto use_buf;
    // General code for cases: A, B, C
    use_buf:
        use_buf(buf_to_use);
        break;
    case D:
        // Completely different code from A, B, C
        ...
}

The former version with if(0) blocks interleaved with the case statements is more fun, but it's certainly more confusing and harder to read than simply using a goto. The goto version is also much more friendly to your text editor's auto-indent configuration. (I'm not even sure what "proper" indentation is for the interleaved if(0) construct!)

And here's what you would need to write if you were being pedantic about not using gotos:

switch(var) {
    case A:
    case B:
    case C:
        switch(var) {
            case A:
                buf_to_use = A_BUF;
                break;
            case B:
                buf_to_use = B_BUF;
                break;
            case C:
                buf_to_use = C_BUF;
                break;
            default:
                flag_error(); // Should be impossible to reach
                break; // avoid warning
        }
        // General code for cases: A, B, C
        use_buf(buf_to_use);
        break;
    case D:
        // Completely different code from A, B, C
        ...
}

That's clunkier to write, harder to read, and more difficult to maintain. If you change the list of cases that use_buf() applies to, you have to modify the case statements in both switch statements, which might as well be begging for bugs. The repeated case statements in the nested switch statement still constitute duplicated code--exactly what we're trying to avoid by factoring out the use_buf() code. Keep your code clean and concise. Use the right tool for the job, and sometimes 'goto' is the right tool.

If you really want to write the same code without a goto or repeated cases, there is a way, but I don't recommend it. You can interleave another loop inside the switch statement so the breaks for the cases break the loop, not the switch. It may be a fun hack, but it's not good code:

switch(var) {
    // do loop to change where break goes
        do {
    case A:
            buf_to_use = A_BUF;
            break; // exits do loop, not switch
    case B:
            buf_to_use = B_BUF;
            break; // exits do loop, not switch
    case C:
            buf_to_use = C_BUF;
            break; // exits do loop, not switch
        } while (0);
        // General code for cases: A, B, C
        use_buf(buf_to_use);
        break; // regular switch exit
    case D:
        // Completely different code from A, B, C
        ...
}

This is the closest corollary to Duff's Device, but again, don't do that. The above example should be viewed by the compiler as being identical to the goto version. Most C programmers aren't used to mixing up switch statements with other control structures like that, which also makes it harder to understand. Unless you have some arbitrary "no gotos" rule handed down from management, the goto version is the simplest to read and understand. Use the gotos.

Go To Statement Considered Helpful (part 4: Tail Recursion)

One other place where goto can be useful is in implementing tail recursion. As you should recall, tail recursion is when a function returns with a call to itself. For example:

int foo(int a)
{
    ...
    if ( recurse )
    {
        return(foo(a+1)); // tail recursion
    }
}

Most compilers will optimize that into a goto to the top of the function. However, in some cases the compiler will fail you, and you have a runtime fault when the stack overflows. Perhaps you turned off optimization to aid in debugging. Perhaps the compiler isn't as smart as it should be. Who knows? If you're running with a limited stack (and stacks are always limited), don't trust your compiler. Replace the recursion with a goto as below:

int foo(int a)
{
  tail_recursion:
    ...
    if ( recurse )
    {
        a=a+1;
        goto tail_recursion; // return(foo(a)) without recursion
    }
}

Now you won't blow away your stack if the optimization settings change.

Of course, you can implement that by putting the body of the function inside a while(1){...; break;} loop, and then replace the goto with a continue. However, doing that is just as much of a hack as a goto. It also adds an unnecessary indentation level, and it is definitely harder to read than the simple goto.

Unlike the previous examples, I'm not hiding the goto here in a macro. I am using a label name that should tell any experienced programmer what I'm doing. If I were to hide this in a macro, it would be less clear what's going on.

Now you might object that previously I've said to trust the compiler to optimize your code and don't worry about whether it looks like you're adding a few extra instructions. I still stand by that, but this isn't a matter of trusting optimization for a few instructions. This is a matter of your program not crashing. I ran into this exact situation in real code. When the correctness of your code requires optimization, you need to force the code to always be optimized, which is exactly why we need the goto in this case.
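
As a concrete sketch of the transformation, here is a small accumulator function written both ways. The sum_recursive() and sum_goto() functions are hypothetical examples, not the real code I ran into.

```c
/* Tail-recursive version: stack depth grows with n unless the
 * compiler happens to optimize the call into a jump. */
unsigned long long sum_recursive(unsigned long long n, unsigned long long acc)
{
    if (n == 0)
        return acc;
    return sum_recursive(n - 1, acc + n);   /* tail recursion */
}

/* Same logic with the recursion replaced by a goto: constant stack
 * use regardless of optimization settings. */
unsigned long long sum_goto(unsigned long long n, unsigned long long acc)
{
  tail_recursion:
    if (n == 0)
        return acc;
    acc = acc + n;      /* update the parameters as the call would have */
    n = n - 1;
    goto tail_recursion; /* return(sum_goto(n, acc)) without recursion */
}
```

The goto version handles a million iterations without touching the stack, while the recursive version may or may not survive that depending on the compiler flags.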

Go To Statement Considered Helpful (part 3: Finding Matches in a Loop)

Another time I often find use for a goto is in a for loop that is scanning an array to find a match. If a match is found, a break terminates the loop. See the standard version of this below. It repeats the condition check for the loop to see if the loop terminated early. The other option would be to use a flag to indicate if a match was found. This requires initializing the variable and then setting it if a match is found. Both of these variations are very common.

for (i=0; i<ARRAY_SIZE(data); ++i)
{
    if ( data[i].key == value ) break;
}
if ( i<ARRAY_SIZE(data) )
{
    // data[i].key is our match
    ...
}
else
{
    // No match was found
    ...
}

I hate repeating code; that's a source of bugs, as years later someone might need to make a change and only change one of the two copies. Adding a tracking variable is clunky. Again, goto can save the day. Consider the code below:

for (i=0; i<ARRAY_SIZE(data); ++i)
{
    if ( data[i].key == value ) goto match_found;
}
if ( 0 )
{
    // data[i].key is our match
    match_found:
    ...
}
else
{
    // No match was found
    ...
}

This eliminates the repeated comparison, so if something changes, it only has to change once. A quick reaction that many will have is that this also will be faster than the alternative, but generally compilers are very good at eliminating redundant comparisons, and processors are fast enough that it's rarely worth worrying about saving a few machine instructions; in fact, it's usually best to be willing to sacrifice machine instructions for better code.

Without the goto, you can't scope the loop index to be just the loop, or you end up using an extra boolean tracking variable. Most importantly, without the goto, the code is longer and more complicated, and that's exactly what you don't want. Of course, in most cases, you'll still need the loop index to know which item matched, but sometimes you only need to know if there was a match.

Once again, the C preprocessor comes to the rescue with a clever pair of macros below:

#define EXIT_LOOP_WITH_MATCH(_name) goto _loop_match_ ## _name
#define IF_LOOP_EXITED_WITH_MATCH(_name) if(0) _loop_match_ ## _name:

The macro works just like a regular if statement. You can use it with or without braces, and you can use a regular else condition. The only catch is that the compiler will likely complain about an unused label if you have the IF macro without a matching EXIT macro. That may well be more of a benefit than a restriction.

With the macro, our previous example is nice and clean as seen below:

for (i=0; i<ARRAY_SIZE(data); ++i)
{
    if ( data[i].key == value ) EXIT_LOOP_WITH_MATCH(my_loop);
}
IF_LOOP_EXITED_WITH_MATCH(my_loop)
{
    // data[i].key is our match
    ...
}
else
{
    // No match was found
    ...
}

Those macros may look familiar. They should. They're exactly the same as the THROW and CATCH macros I discussed previously for exception handling in C.
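
Here is a complete function using the pair of macros, as a sketch. The sample_data table, the entry struct, and find_index() are hypothetical; ARRAY_SIZE is defined the usual way since it isn't part of standard C.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define EXIT_LOOP_WITH_MATCH(_name) goto _loop_match_ ## _name
#define IF_LOOP_EXITED_WITH_MATCH(_name) if(0) _loop_match_ ## _name:

struct entry { int key; };
static const struct entry sample_data[] = { { 11 }, { 22 }, { 33 } };

/* Returns the index of the entry whose key equals value, or -1. */
int find_index(int value)
{
    int result;
    size_t i;

    for (i = 0; i < ARRAY_SIZE(sample_data); ++i)
    {
        if ( sample_data[i].key == value ) EXIT_LOOP_WITH_MATCH(my_loop);
    }
    IF_LOOP_EXITED_WITH_MATCH(my_loop)
    {
        /* sample_data[i].key is our match */
        result = (int)i;
    }
    else
    {
        /* No match was found */
        result = -1;
    }
    return result;
}
```

The loop bound appears exactly once, and the else branch is reached only when the loop runs to completion.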

It is interesting to note that this and the previous example for the use of goto (breaking out of a nested loop and avoiding extra code to handle exiting a loop early) are exactly the two cases cited in Kernighan and Ritchie's The C Programming Language book when it discusses the goto statement in section 3.8.

Go To Statement Considered Helpful (part 2: Nested Loops)

If you're in a nested loop in C code, and you want to issue a break or continue statement, but want it to apply to the outer loop, you're going to have some awkward code.

  for(j=0;j<j_end;++j)
  {
      for(i=0;i<end;++i)
      {
          ...
          if (condition) next_j; // break then continue?
          ...
      }
      ...
  }

Implementing that 'next_j' statement could be done with setting a flag, issuing a break, and then checking the flag after the inner loop terminates. What we want is a "goto continue_outer_loop." And when you do this, it can be a perfect place for one of my favorite tricks, the use of "if (0)" for live code. At the end of the outer loop, put in the block of code, "if (0) { continue_outer_loop: continue; }" and you can then use your goto to implement the feature that the language is missing. This looks like:

  for(j=0;j<j_end;++j)
  {
      for(i=0;i<end;++i)
      {
          ...
          if (condition) goto continue_outer_loop;
          ...
      }
      ...
      if(0) continue_outer_loop: continue;
  }
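
Filled in with real work, the pattern looks like the sketch below. The sample_matrix data and the count_rows_without_zero() function are hypothetical, chosen only to give the inner loop a reason to skip the rest of the outer iteration.

```c
const int sample_matrix[3][3] = {
    { 1, 2, 3 },
    { 4, 0, 6 },   /* contains a zero, so this row is not counted */
    { 7, 8, 9 },
};

/* Counts the rows of m that contain no zero entries. */
int count_rows_without_zero(const int (*m)[3], int rows)
{
    int count = 0;

    for (int j = 0; j < rows; ++j)
    {
        for (int i = 0; i < 3; ++i)
        {
            /* Abandon this row and move on to the next j. */
            if (m[j][i] == 0) goto continue_outer_loop;
        }
        ++count;   /* only reached when the inner loop ran to completion */
        if (0) continue_outer_loop: continue;
    }
    return count;
}
```

With the sample data, the middle row jumps straight to the continue, so only the first and last rows are counted.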

A nice thing to do would be to imagine how the language might be extended to eliminate this need for a goto. Loops could easily be named, allowing the name of the loop to be an optional parameter to a break or continue statement. What if instead of for(i=0;i<10;++i), you could write for(i=0;i<10;++i;index_loop). Then you could use "break index_loop;" inside an inner loop with no need for a goto. Good luck getting the language committees to agree to this anytime soon. On the bright side, Java recognized this shortcoming, so the language provides for named loops; in C and C++, goto lets you implement the missing feature.

In this case, all this can be included in macros, as shown below. The first pass at trying this can result in compiler warnings about unused labels, which is why I added the extra do-nothing gotos. With even the most minimal compiler optimizations, this should provide exactly the same execution as if the language provided for named loops natively. The only restriction is that you can't use the same name for more than one loop in the same function.

/*
 * Named loops:
 *
 * Name any loop, typically the first thing after the opening brace.
 * Then in a nested loop, break or continue from it with
 * BREAK_LOOP/CONTINUE_LOOP.
 *
 * Code Copyright 2015 by Preston Crow
 * used by permission
 * (You have permission to use this as long as you keep
 *  this comment block intact.)
 */
#define NAME_THIS_LOOP(_name)            \
if(0)                                    \
{                                        \
    goto _continue_loop_ ## _name;       \
    _continue_loop_ ## _name : continue; \
}                                        \
if(0)                                    \
{                                        \
    goto _break_loop_ ## _name;          \
    _break_loop_ ## _name : break;       \
}                                        \
do { ; } while (0) /* Force a semicolon */
#define CONTINUE_LOOP(_name) goto _continue_loop_ ## _name
#define BREAK_LOOP(_name) goto _break_loop_ ## _name

Note that the naming macro can go anywhere in the loop where a continue or break statement would work as expected. Using those macros, the code is quite readable:

  for(j=0;j<j_end;++j)
  {
      NAME_THIS_LOOP(outer_loop);
      for(i=0;i<end;++i)
      {
          ...
          if (condition) CONTINUE_LOOP(outer_loop);
          ...
      }
      ...
  }
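For a version that compiles as-is, here is a small sketch that uses both CONTINUE_LOOP and BREAK_LOOP from inside an inner loop. The macros are repeated so the example is self-contained. The function, its name, and its stop-at-zero rule are invented purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Named loops:
 *
 * Name any loop, typically the first thing after the opening brace.
 * Then in a nested loop, break or continue from it with
 * BREAK_LOOP/CONTINUE_LOOP.
 *
 * Code Copyright 2015 by Preston Crow
 * used by permission
 * (You have permission to use this as long as you keep
 *  this comment block intact.)
 */
#define NAME_THIS_LOOP(_name)            \
if(0)                                    \
{                                        \
    goto _continue_loop_ ## _name;       \
    _continue_loop_ ## _name : continue; \
}                                        \
if(0)                                    \
{                                        \
    goto _break_loop_ ## _name;          \
    _break_loop_ ## _name : break;       \
}                                        \
do { ; } while (0) /* Force a semicolon */
#define CONTINUE_LOOP(_name) goto _continue_loop_ ## _name
#define BREAK_LOOP(_name) goto _break_loop_ ## _name

/* Hypothetical example: count rows with no negative values, but stop
 * scanning entirely at the first row that contains a zero. */
int count_clean_rows_until_zero(const int *m, int rows, int cols)
{
    int count = 0;
    assert(m != NULL);
    for (int j = 0; j < rows; ++j)
    {
        NAME_THIS_LOOP(row_loop);
        for (int i = 0; i < cols; ++i)
        {
            int v = m[j * cols + i];
            if (v == 0) BREAK_LOOP(row_loop);    /* leave the outer loop */
            if (v < 0)  CONTINUE_LOOP(row_loop); /* skip to the next row */
        }
        ++count;
    }
    return count;
}
```

Note the semicolon after NAME_THIS_LOOP(row_loop): the macro ends in a do/while(0) precisely so that it needs one.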

I thought I was very clever the day I came up with that. I felt a little less clever when researching revealed that others have also suggested very similar macros for the same purpose. In any case, I highly recommend that you use these macros or the like in all C and C++ projects where appropriate.

Go To Statement Considered Helpful (part 1: Exceptions)

The most obvious use of a goto statement in C is for exception handling. Many people before have noted that this is one good use of gotos.

It's not unusual to have a function with a number of checks for errors. When an error is encountered, the function must release locks and free memory acquired and allocated at the start of the function before returning an error to the caller. The simple way to do this is to have all the errors jump to a label near the end of the function where this is done. Often this is the same code that a successful exit from the function also executes. This is certainly better than repeating the same code many different times in the same function. Newer languages have recognized this and implemented this functionality in the language. In C++, you can use throw/catch. In Java, the try/finally construct can also handle many of these situations.

In C, to handle exceptions, you will often see code with a goto as below:

    buf=malloc(size);
    if ( !buf ) goto error_exit;
    ...
  error_exit:
    printf("Out of memory\n");
    ...

This code is typical enough that many coders are happy with it like that. But what if we could implement exception handling like C++ in C using macros? Well, we can. Two language tricks make this possible. First, you can use "if (0)" to create code that can only be reached by a goto. Second, the target of an if statement can be a labeled statement or block. I'll get back to that in a bit. Here are the macros:

#define THROW(_name) goto handle_exception_ ## _name
#define CATCH(_name) if(0) handle_exception_ ## _name:

Now there's no need to complain that C lacks exception handling. This works much like the C++ version, but without the parameter. Also, they are scoped for the whole function, as they aren't connected to a “try{...}” block of statements. See the example below:

buf=malloc(size);
if ( !buf ) THROW(out_of_memory);
...
CATCH(out_of_memory) {
    printf("Out of memory\n");
}
if (buf) free(buf);
return;
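Here is the same idea as a complete function, as a rough sketch. The function copy_string, its status codes, and the exception names are all hypothetical. THROW is written here without a trailing semicolon so the caller supplies it, which keeps it safe in front of an else.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define THROW(_name) goto handle_exception_ ## _name
#define CATCH(_name) if(0) handle_exception_ ## _name:

/* Hypothetical helper: copy src into a newly allocated buffer.
 * Returns 0 on success, -1 for a bad argument, -2 if malloc fails.
 * On success *out owns the copy and the caller must free it. */
int copy_string(const char *src, size_t max_len, char **out)
{
    char *buf = NULL;
    int status = 0;
    size_t len;

    assert(out != NULL);            /* programming error, not an exception */

    if (!src) THROW(bad_argument);
    len = strlen(src);
    if (len > max_len) THROW(bad_argument);

    buf = malloc(len + 1);
    if (!buf) THROW(out_of_memory);

    memcpy(buf, src, len + 1);      /* includes the terminating NUL */
    *out = buf;
    buf = NULL;                     /* ownership handed to the caller */

    /* Handlers: skipped on the normal path because of the if(0). */
    CATCH(bad_argument) {
        status = -1;
    }
    CATCH(out_of_memory) {
        status = -2;
    }

    /* Common exit code, run on success and on every error. */
    free(buf);                      /* free(NULL) is a no-op */
    return status;
}
```

After a handler runs, control falls into the next statement. If that is another CATCH, its if(0) skips it, so only one handler runs before the shared cleanup.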

Wait a second. What's going on with those macros? That's some crazy code that can't be right.

The first trick is to protect the target of the goto in the target of an if(0) statement. That means the target of the goto is in code that can't be reached without the goto. This trick reduces some of the spaghetti code issues normally associated with goto statements.

But what about that label after the if(0)?

Well, I'm taking advantage of a surprising glitch in the C language definition. When I first wrote the code, I was thinking that what I needed for my macro was to be able to write: if(0) label: { statements; } but putting the label in there certainly wouldn't be allowed. Much to my surprise, it compiled and worked! I went back and checked the language definition, and found that it is, in fact, legal code: in the grammar, a labeled statement is itself a statement, so it can be the body of an if. I can't believe they intended this when defining the language, but as a lucky artifact of how they set up the parsing, it is how they defined it. (Perhaps it was because case statements in a switch work just like labels. The reason isn't particularly important here.) This allows you to macro-ize the CATCH without having any ugliness with the target and braces. You can follow the CATCH macro with a block of statements, a single statement, or just a semicolon, in which case a throw simply continues with whatever code comes next.
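To see the three forms side by side, here is a small sketch with one handler of each kind: a block, a single statement, and a bare semicolon. The function checked_double and its error codes are invented for the example, and THROW is written without a trailing semicolon.

```c
#include <assert.h>
#include <stddef.h>

#define THROW(_name) goto handle_exception_ ## _name
#define CATCH(_name) if(0) handle_exception_ ## _name:

/* Hypothetical example: double a value in the range 1..1000.
 * *error is set to 0 (none), 1 (negative input) or 2 (zero input).
 * Inputs above 1000 are returned unchanged with no error code. */
int checked_double(int value, int *error)
{
    int result = value;
    assert(error != NULL);
    *error = 0;

    if (value < 0)    THROW(negative);
    if (value == 0)   THROW(zero);
    if (value > 1000) THROW(too_big);
    result = value * 2;

    CATCH(negative) { *error = 1; result = 0; } /* a block            */
    CATCH(zero) *error = 2;                     /* a single statement */
    CATCH(too_big);                             /* just a semicolon   */

    return result;
}
```

The too_big case shows the bare-semicolon form: the throw skips the doubling and lands directly on the code after the handlers.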

By hiding all this in the macro, I've effectively extended the C language to add a missing feature.

Go To Statement Considered Harmful (revisited)

Hopefully you've heard of the famous letter by Edsger Dijkstra, "Go To Statement Considered Harmful." This was one of the most influential letters ever written in Computer Science, published in Communications of the ACM in March, 1968. This letter effectively defined structured programming for generations of coders. It has been taken as gospel and taught in so many programming classes that most developers take it for granted. The goto statement is just too low-level for modern programming.

This has been taken so far that some languages simply didn't include goto in the language. Java is a prime example. Though in that case, they were hedging their bets, as they kept a reserved word for it, and they did include it in the bytecode.

The letter itself is rather dry and academic. You can find it online, and it's worth the read. (I found it at https://www.cs.utexas.edu/~EWD/transcriptions/EWD02xx/EWD215.html under the original title; it was the ACM editor that selected the now-famous "harmful" title.) It has good points. In general, the use of goto results in spaghetti code that is hard to debug and harder to maintain. Clean code breaks things into functions, loops, and other well-defined control structures, with little obvious need for direct jumps.

I first heard of this when my world of programming consisted of Atari BASIC. The idea of not using goto was rather shocking in that context, but then most BASIC implementations of the day lacked functions, while loops, and a number of other higher-level structures that even shell scripts have today. When I moved on to Pascal and C, there seemed to be little use for the old goto, and I readily accepted the idea that it was a bad idea.

But the idea of avoiding direct jumps has been treated as a religious dogma instead of as a general guideline, and this has been harmful itself. There are many reasons why using an explicit goto is a good idea. In fact, no matter how nicely structured your code is, it's impossible to avoid that pesky goto. How do you think all those nice loops are implemented? At the assembly level, it's all comparisons and jumps.
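As a rough illustration of that point, here is a for loop next to a hand-written equivalent built only from a comparison and jumps. It is a sketch of the general shape, not the output of any particular compiler, and the function names are made up.

```c
#include <assert.h>
#include <stddef.h>

/* The structured version: an ordinary for loop. */
int sum_structured(const int *a, size_t n)
{
    int sum = 0;
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}

/* The same loop spelled out as a test, a body, and two jumps. */
int sum_with_jumps(const int *a, size_t n)
{
    int sum = 0;
    size_t i = 0;                      /* loop initialization        */
    assert(a != NULL || n == 0);
loop_test:
    if (!(i < n)) goto loop_done;      /* compare, then jump out     */
    sum += a[i];                       /* loop body                  */
    ++i;                               /* loop increment             */
    goto loop_test;                    /* jump back to the test      */
loop_done:
    return sum;
}
```

The two functions compute the same result; the second just makes visible the jumps that the for statement hides.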

Now you can say that that's just a technical nitpick, and you're mostly correct. Dijkstra even referenced it in his original letter. Just because the good structured statements your language provides are implemented with jumps doesn't mean you should use them directly. However, what it does do is give a good consistent reason for when you should use gotos in your code.

Use a goto only when you need to implement something that your language doesn't provide for.

Following this, I've written a series of examples where a goto statement provides some useful and interesting features for C programmers.

Thursday, May 25, 2017

Nice Code Blocks in Blogger

This post is mostly for my own use, but I figure some others might like it, too. You may have noticed that I have some nice code blocks in some of my posts. I may start with something like:

function(args)
{
if ( maybe ) do_something();
}

(I set the font to offset the above code using the font-selector to make it stand out a bit.)

How do I get those nice looking blocks?

To start, I use a site to format the blocks:

http://hilite.me/

That generates HTML that I can paste into my post. When editing, normally you're in "compose" mode, but near the top left, you can hit "HTML" to get into the raw code of the page. I just paste the results from the web page. Make sure you don't put it in the middle of a <div></div> pair. I often put a lone line of text where I want to paste into the page to make it easy to find. Something like this:

==========PASTE_CODE_BLOCK_HERE==========

So doing that, now I have this:

function(args)
{
    if ( maybe ) do_something();
}

After that, I hit "preview." There I'll see that with a white background, I see lots of off-white text on white. Not good. (This is because my normal page style is light text on dark, and the highlighter page assumes you have dark text on light and doesn't override it.) At the top of the HTML block I paste in, there are some HTML style options separated by semicolons. I need to add in, "color:black;" to correct the color issue:

function(args)
{
    if ( maybe ) do_something();
}

That's nice, but I don't like that it highlights spaces. This is a simple matter of removing the "<span ...> </span>" tags around the spaces. But wait, there's a reason those are there: it's based on where I'm pasting from. If I'm pasting from a web page with formatting, some of that is getting picked up by the highlighter. As a general rule, never paste into Blogger from a web page or formatted document--it will mess up all sorts of things. I have to make sure I'm pasting raw ASCII text if I want a nice clean block of code. So going back to the original code, pasting it first into a text editor, and then into the highlighter, and editing the default text color:

function(args)
{
    if ( maybe ) do_something();
}

So there's just one change, but to be sure I remember it and to make it easy, I'll save a one-line command line for fixing things. I toss in stripping out stray blank lines. Since I want to buffer all the input before printing the output, I'll pipe the output to a tail command. I then add a final comment line and a blank line to keep things easier to follow when editing the page:

sed -e 's/border: *solid gray/&;color:black/' \
    -e 's@<span[^>]*> </span>@ @g'|
    grep -v '^$' |
    tail -n 10000 ; \
    echo '<!-- END HTML generated using hilite.me -->'; \
    echo ''

One simple hint: You're free to leave blank lines in the HTML version of the page outside of the block generated by the web page. This may make it simpler to find and update the blocks as needed, but Blogger sometimes removes them. In any case, they're harmless.

Another thing to watch for is that the Blogger composer sometimes gets confused as to the line breaks around code blocks. Always check the preview--there are often line breaks where the composer doesn't show them between your text and the code block.

For this post I used the "colorful" style. I think I'll use "emacs" in most of my posts. Somehow that color scheme looks natural to me.

Thursday, May 4, 2017

Digital Rights Management (DRM) and Star Wars

Many of my friends have heard me complain (OK, rant) about DRM. DRM officially stands for "digital rights management," but I prefer "digital restrictions management." The most common use of DRM is for video like Blu-ray discs and Netflix. DRM software enforces the restrictions imposed by the seller. You only get to use the media in the way they intended, no matter how much money you spent to buy it. You don't fully own that disc you just bought. Want to skip the previews? Sorry. Want to skip the copyright notice? No way.

Of course, DRM doesn't stop the professional copiers. People in various countries will take one legal disc and press duplicates that are completely identical, bypassing all the copy protections. DRM value: zero. Semi-professional pirates find various ways of bypassing the protection, including ripping from the HDMI output with hacks to remove the copy protection there, and then they share the videos online. DRM value: none. Regular owners who want to copy the movie to their iPad to watch in the car can't. Yes, some discs offer alternatives that may or may not work offline. DRM value: consumers annoyed.

But let's look at the worst impact of DRM ever: Star Wars.

In particular, have you ever wondered why Princess Leia had to send the Death Star plans with R2-D2 instead of transmitting them to another ship? Or why they never made copies and sent them with multiple couriers? Clearly the plans were encumbered with the nastiest DRM ever invented. Want to copy them? No way. Want to transmit them without destroying the original? Nope, not allowed. Want to analyze them to find a design flaw allowing you to destroy the Death Star using small fighters too small to be considered a serious threat? No problem, but only if you watch the Imperial Secret Documents crawl first.

So remember, DRM isn't just about stopping those nasty consumers from watching movies they've legitimately purchased on multiple devices. DRM is also about preserving the Empire. Support DRM to help destroy the last vestiges of the Republic.

Happy Star Wars Day. May the Fourth be with you.