
Implications of pure and constant functions


June 10, 2008

This article was contributed by Diego Pettenò

Introduction

Attributes and why you should use them

Free Software development is often a fun task for developers, and it is its low barrier to entry (on average) that makes it possible to have so much available software for so many different tasks. This low barrier to entry, though, is also probably the cause of the widely varying quality of the code of these projects.

Most of the time, the quality issues one can find are not related to developers' lack of skill, but rather to lack of knowledge of how the tools work, in particular, the compiler. For non-interpreted languages, the compiler is probably the most complex tool developers have to deal with. Because a lot of Free Software is written in C, GCC is often the compiler of choice.

Modern compilers are also expected to do a great job of optimization, taking code that is often written with maintainability and readability in mind and translating it into assembly with a focus on performance. Code analysis for optimization (which is also used to generate warnings about the code) has the task of looking at the code semantically, rather than syntactically, and identifying fragments of algorithms that can be replaced with faster code (or with code that has a smaller memory footprint, if the user desires that instead).

This task is a pretty complex one and relies on the compiler knowing about the functions called by the code. For instance, the compiler might know when to replace a call to a (local, static) function with its body (inlining) by looking at its size, the number of times it is called, and its content (loops, other calls, variables it uses). This is because the compiler can give a semantic value to the code of a function, and can thus assess the costs and benefits of a particular transformation at the point of use.

I specified above that the compiler knows when to inline a function by looking at its content. Almost all optimizations related to function calls work this way: the compiler, knowing the body of a function, can decide when it is worthwhile to replace a call with that body; when it is possible to avoid calling the function at all; and when it is possible to call it just once and thereby avoid multiple calls. This means, though, that these optimizations can be applied only to functions that are defined in the same unit wherein they are used. These are usually limited to static functions (functions that are not declared static can often be overridden both at link time and at runtime, so the compiler cannot safely assume that what it finds in the unit is what the code will be calling).

As this is far from optimal, modern compilers like GCC provide a way for the developer to provide information about the semantics of a function, through the use of attributes attached to declarations of functions and other symbols. These attributes provide information to the compiler on what the function does, even though its body is not available. Consequently, the compiler can optimize at least some of its calls.

This article will focus on two particular attributes that GCC makes available to C developers: pure and const, which can declare a function as either pure or constant. The next section will provide a definition of these two kinds of functions, and after that I'll get into an analysis of some common optimizations that can be performed on the calls of these functions.

As with all the other function attributes supported by GCC and ICC, the pure and const attributes should be attached to the declarative prototype of the function, so that the compiler knows about them when it finds a call to the function, even without its definition. For static functions, the attribute can be attached to the definition by putting it between the return type and the name of the function:

    int extern_pure_function([...])
      __attribute__((pure));
    int extern_const_function([...])
      __attribute__((const));

    int __attribute__((pure)) static_pure_function([...]) {
      [...]
    }

    int __attribute__((const)) static_const_function([...]) {
      [...]
    }
    

Pure and Constant Functions

For the purposes of this article, functions can be divided into three categories, from the most to the least restrictive: constant functions, pure functions, and the remaining functions, which can be called normal functions.

As you can guess, constant functions are also pure functions, but not all pure functions are constant functions. In many ways, constant functions are a special case of pure functions. It is, therefore, best to first define pure functions and how they differ from all the rest of the functions.

A pure function is a function with essentially no side effects. This means that a pure function returns a value that is calculated based on its given parameters and on global memory, but it cannot affect the value of any other global variable. Pure functions cannot reasonably lack a return value (i.e. have a void return type), as the return value is the only thing they produce.

GCC documentation provides strlen() as an example of a pure function. Indeed, this function takes a pointer as a parameter, and accesses the memory it points to in order to find the string's length. This function reads global memory (the memory pointed to by a parameter is not itself considered a parameter), but does not change it, and the value returned derives from the global memory accessed.
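As a minimal sketch (the function and the table here are hypothetical examples of mine, not from any library), a user-written pure function along the same lines might look like this:

    /* Reads the global table, but never writes to it; the return
       value derives from the parameter and from global memory, so
       the function is pure, but not constant. */
    extern int table[256];

    int count_matches(int val) __attribute__((pure));

    int count_matches(int val) {
      int i, n = 0;
      for (i = 0; i < 256; i++)
        if (table[i] == val)
          n++;
      return n;
    }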

A counter-example of a non-pure function is the strcpy() function. This function takes two pointers as parameters. It accesses the latter to read the source string, and the former to write to the destination string. As I said, the memory areas pointed to by the parameters are not parameters on their own, but are considered global memory and, in that function, global memory is not only accessed for reading, but also for writing. The return value derives directly from the parameters (it is the same as the first parameter), but global memory is affected by the side effect of strcpy(), making it not pure.

As long as the global memory state remains untouched between them, two consecutive calls to the same pure function with the same parameters will have to return the same value. As we'll see, this is a very important assumption that the compiler is allowed to make.

A special case of pure functions is constant functions. A pure function that does not access global memory, but only its parameters, is called a constant function. This is because the function, being unrelated to the state of global memory, will always return the same value when given the same parameters. The return value is thus derived directly and exclusively from the values of the parameters given.

The way a constant function "consumes" pointers is very different from the way other functions do: it can handle them, as both parameters and return values, only if they are never dereferenced, since accessing the memory they reference would be a global memory access, which breaks the requirements of constant functions.
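A matching sketch of a constant function (again, a hypothetical example of mine): it touches nothing but its integer parameters, so the same inputs always produce the same output:

    /* The result depends only on the two parameters (boundary is
       assumed to be a power of two); no global memory is read or
       written, so the function can be declared constant. */
    int align_up(int value, int boundary) __attribute__((const));

    int align_up(int value, int boundary) {
      return (value + boundary - 1) & ~(boundary - 1);
    }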

Of course these requirements apply not only to the operations in the given function, but also, recursively, to all the functions it calls. A function can be, at best, of the same kind as the least restrictive function it calls: if it calls a normal function, it can only be a normal function itself; if it calls only pure functions, it can be either pure or normal, but not constant; and if it calls only constant functions, it can be constant.

As with inlining, the compiler will be able to decide on its own whether a function is pure or constant, in case no attribute is attached to it, only if the function is static (with the exception of special cases for freestanding code and other advanced options). When a function is not static, even if it's local, the compiler will assume that the function can be overridden at link or run time, so it will not make any assumptions based on the body of whatever definition it may find.
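For instance, GCC's own analysis (the -fipa-pure-const pass, enabled together with optimization in recent versions) can discover that a static helper like the following sketch is constant, with no attribute attached:

    /* Static, with its body visible and no way to override it at
       link time, so the compiler is free to classify it itself. */
    static int scale(int x) {
      return x * 3 + 1;
    }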

Optimizing Function Calls

Why should developers bother with marking functions pure or constant, though? As I said, these two attributes give the compiler some of the semantic meaning of a function call, so that it can apply stronger optimizations than it could to normal functions.

There are two main optimizations that can be applied to these kinds of functions: CSE (Common Sub-expression Elimination) and DCE (Dead Code Elimination). We'll soon see in detail, with the help of the compiler itself, what these two consist of. Their names, however, are already rather explicit: CSE is used to avoid duplicating the same code inside a function, usually factoring out the code before branching or storing the results of common operations in temporary variables (registers or stack), while DCE will remove code that would never be executed or that would be executed but never used.

These are both optimizations that can be implemented in the source code, to an extent, reducing the usefulness of declaring functions pure or constant. On the other hand, as I'll demonstrate, doing so often reduces the readability of the code by obscuring the actual algorithm in favor of making it faster. This does not apply to all cases though, sometimes, doing the optimization "manually", directly in the source code, makes it more readable, and makes the code resemble the output of the compiler more.

About Assemblers and Examples

When talking about optimization, it's quite difficult to visualize the task of the compiler, and the way the code morphs from what you read in the C source code into what the CPU is really going to execute. For this reason, the best way to write about it is to use examples, showing what the compiler generates starting from the source code.

Given the way in which GCC works, this is actually quite easy. You just need to enable optimization and append the -S switch to the gcc command line. This switch stops the compiler after the transformation of C source code into assembly, before the result is passed to the assembler program to produce the object file.
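For instance, a hypothetical invocation (the compiler name depends on your cross-toolchain; bfin-elf-gcc is one common spelling for a Blackfin target) would look like this:

    $ bfin-elf-gcc -O1 -S test.c    # writes the Blackfin assembly to test.s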

Although I suspect a good fraction of the people reading this article would be comfortable reading IA-32 or x86-64 assembly code, I decided to use the Blackfin [1] assembly language, which should be readable for people who have never studied a particular assembly language.

The Blackfin assembler is more symbolic than IA-32: instead of having operations named movl and addq, the operations are identified by their algebraic operators (=, +), while the registers are merely called R1, R2 and so on.

Calling conventions are also quite easy to understand: for all the cases we'll look through in the article (at most four parameters, integers or pointers), the parameters are passed through the registers, starting in order from R0. The return value of the function call is also stored in the R0 register.

To clarify the examples which will appear later on, let's see how the following C source code is translated by GCC into Blackfin code:

    int somefunction(int a, int b, int c);
    void somestringfunction(char *pA, char *pB);

    int globalvar;

    void test() {
      somestringfunction("foo", "bar");
      globalvar = somefunction(11, 22, 33);
    }
      

becomes:

	    .section	.rodata
	    .align 4
    L$LC$0:
	    .string	"foo"
	    .align 4
    L$LC$1:
	    .string	"bar"
    .text;
	    .align 4
    .global _test;
    .type _test, STT_FUNC;
    _test:
	    LINK 12;
	    R0.H = L$LC$0; /* 1 */
	    R0.L = L$LC$0;
	    R1.H = L$LC$1;
	    R1.L = L$LC$1;
	    call _somestringfunction; /* 2 */
	    R0 = 11 (X); /* 3 */
	    R1 = 22 (X);
	    R2 = 33 (X);
	    call _somefunction;
	    P2.H = _globalvar; /* 4 */
	    P2.L = _globalvar;
	    [P2] = R0; /* 5 */
	    UNLINK;
	    rts; /* 6 */
	    .size	_test, .-_test
      
1. As the Blackfin does not have 32-bit immediate loads, the high and low halves of an address must be loaded separately (in whichever order); the assembler will take care of loading the high 16 bits of the label into the upper part of the register, and the low 16 bits into the lower part.

2. Once the parameters are loaded, the function is called almost identically to any other call operation on other architectures; note the underscore prefixed to symbol names.

3. Integer values, whether constants, parameters, or variables, are also loaded into registers for calls. Blackfin has no 32-bit immediate load here either, but a constant that fits into 16 bits can be loaded through sign extension by appending the (X) suffix.

4. When accessing a global memory location, the P2 pointer is set to the address of the memory location...

5. ... and then dereferenced to store to that memory location. Being a RISC architecture, Blackfin does not have direct memory operations. The return value of a function is loaded into the R0 register, and can be accessed from there.

6. The rts instruction is the return from subroutine; it usually marks the end of the function but, like the return statement in C, it may appear anywhere in the routine.

In the following examples, the preambles with declarations and data will be omitted whenever these are not useful to the discussion.

Concerning optimization levels, the code will almost always be compiled with at least the first optimization level enabled (-O1). This is both because it makes the code cleaner to read (parameters are passed with register-to-register copies, instead of being saved to the stack and then restored from it) and because optimization must be enabled to see how it is applied.

Also, most of the time I'll refer to the fastest alternative. Most of what I say, though, also applies to the smaller alternative produced by the -Os optimization level. In any case, the compiler always weighs the cost-to-benefit ratio between the optimized and the unoptimized version, or between different optimized versions. If you want to know the exact route the compiler takes for your code, you can always use the -S switch to find out.

DCE and Unused Variables

One area where DCE is useful is in avoiding operations that produce unused data. It's not that uncommon for a variable to be defined by an operation, complex or not, and then never used by the code, either because it is intended for future expansion or because it's a remnant of older code that has been removed or replaced. While the best thing would be to get rid of the definition entirely, users expect the compiler to produce good results from sloppy code too, so the useless operation should not be emitted.

The DCE pass can remove any code that has no side effects, when its result is not used. This includes all mathematical operations and calls to functions known to be pure or constant (as neither is allowed to change global state). If a function is not known to be at least pure, it may change the global state, and its call will not be eliminated, as shown in the following code:

    int someimpurefunction(int a, int b);
    int somepurefunction(int a, int b)
      __attribute__((pure));

    int testfunction(int a, int b, int c) {
      int res1 = someimpurefunction(c, b);
      int res2 = somepurefunction(b, c);
      int res3 = a + b - c;

      return a;
    }
      

Which, once compiled with -O1, [2] produces the following Blackfin assembler:

    _testfunction:
	    [--sp] = ( r7:7 );

	    LINK 12;
	    R7 = R0;
	    R0 = R2;
	    call _someimpurefunction;
	    R0 = R7;
	    UNLINK;
	    ( r7:7 ) = [sp++];

	    rts;
      

As you can see, the call to the pure function has been eliminated (the res2 variable was not being used), together with the algebraic operation; the impure function, albeit having its return value discarded, is still called. This is because the compiler must emit the call, not knowing whether the function has side effects on the global memory state.

This is equivalent to the following code (which produces the same assembler code):

    int someimpurefunction(int a, int b);

    int testfunction(int a, int b, int c) {
      someimpurefunction(c, b);

      return a;
    }
      

The Dead Code Elimination optimization can be very helpful in reducing the overhead caused by code written to conform to the C89 standard, where you couldn't mix variable (and constant) declarations with executable code.

In those sources, you had to declare variables at the top of the function, and then start to check for prerequisites. If you wanted to make it explicit that some variable had to keep its value, by making it constant, you would often have to fill it in before the prerequisites could be checked.
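A minimal sketch of that pattern (the functions here are hypothetical): when the initializers are known to be pure, the compiler is free to move or drop those calls on the paths where their results are never used:

    int compute_base(int v) __attribute__((pure));
    int compute_scale(int v) __attribute__((pure));

    int process(int v) {
      /* C89 style: the constants are filled in before the
         prerequisite check but, both calls being pure, the
         compiler can avoid them on the early-return path, where
         the results are unused. */
      const int base = compute_base(v);
      const int scale = compute_scale(v);

      if (v < 0)
        return -1;

      return base + v * scale;
    }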

Legacy code aside, DCE is also useful when writing debug code, sparing it from looking out of place under lots of #ifdef directives. Take for instance the following code:

    #include <string.h> /* for strncmp() and strlen() */

    #ifdef NDEBUG
    # define assert_se(x) (x)
    #else
    void assert_se(int boolean);
    #endif

    char *getsomestring(int i) __attribute__((pure));
    int dosomethinginternal(void *ctx, int code, int val);

    int dosomething(void *ctx, int code, int val) {
      char *string = getsomestring(code);

      // returning string might be a sub-string of "something"
      // like "some" or "so"
      assert_se(strncmp(string, "something", strlen(string)) == 0);

      return dosomethinginternal(ctx, code, val);
    }
      

The assert_se macro behaves differently from the standard assert: it preserves side effects, which basically means that the expression passed to the assertion is evaluated even when the compiler is told to disable debugging. This is a somewhat common trick, although its effects on readability are debatable.

With getsomestring() pure, when compiling without debugging, the DCE pass will remove the calls to all three functions: getsomestring(), strncmp() and strlen() (the latter two are usually declared pure both by the C library and by GCC's built-in replacements). This is because none of these functions has side effects, resulting in a very short function:

    _dosomething:
	    LINK 0;
	    UNLINK;
	    jump.l _dosomethinginternal;
      

If our getsomestring() function weren't pure, even though its return value is not going to be used, the compiler would have to emit the call, resulting in rather more complex (albeit still simple, compared with most real-world functions) assembler code:

    _dosomething:
	    [--sp] = ( r7:5 );

	    LINK 12;
	    R7 = R0;
	    R0 = R1;
	    R6 = R1;
	    R5 = R2;
	    call _getsomestring;
	    UNLINK;
	    R0 = R7;
	    R1 = R6;
	    R2 = R5;
	    ( r7:5 ) = [sp++];

	    jump.l _dosomethinginternal;
      

Common Sub-expression Elimination

The Common Sub-expression Elimination optimization is one of the most important optimizations performed by the compiler, because it's the one that, for instance, replaces multiple indexed accesses to an array so that the actual memory address is calculated just once.

What this optimization does is find operations executed more than once on the same operands (even when those are not known at compile time), decide whether recomputing them costs more than saving the result in a temporary (a register or a stack slot), and then rearrange the code to take the cheaper course.

While its uses are quite varied, one of the easiest ways to see CSE at work is to look at the code generated when using the ternary conditional operator. Let's take the following code:

    int someimpurefunction(int a);
    int somepurefunction(int a)
      __attribute__((pure));

    int testfunction(int a, int b, int c, int d) {

      int res1 = someimpurefunction(a) ? someimpurefunction(a) : b;
      int res2 = somepurefunction(a) ? somepurefunction(a) : c;
      int res3 = a+b ? a+b : d;

      return res1+res2+res3;
    }
      

The compiler will optimize the code as:

    _testfunction:
	    [--sp] = ( r7:4 );

	    LINK 12;
	    R7 = R0;
	    R5 = R1;
	    R4 = R2;
	    call _someimpurefunction;
	    cc =R0==0;
	    if !cc jump L$L$2;
	    R6 = R5;
	    jump.s L$L$4;
    L$L$2:
	    R0 = R7;
	    call _someimpurefunction;
	    R6 = R0;
    L$L$4:
	    R0 = R7;
	    call _somepurefunction;
	    R1 = R0;
	    cc =R0==0;
	    if cc R1 =R4; /* movsicc-1b */
	    R0 = R5 + R7;
	    cc =R0==0;
	    R2 = [FP+36];
	    if cc R0 =R2; /* movsicc-1b */
	    R1 = R1 + R6;
	    R0 = R1 + R0;
	    UNLINK;
	    ( r7:4 ) = [sp++];

	    rts;
      

As you can see, the pure function is called just once, because the two references inside the ternary operator are equivalent, while the impure one is called twice. This is because no change to global memory known to the compiler could happen between the two calls of the pure function (the function itself couldn't change it – note that the compiler will never take multi-threading into account, even when it is requested explicitly through the -pthread flag), while the non-pure function is allowed to change global memory or perform I/O operations.

The equivalent code in C would be something along the following lines (it differs a bit because the compiler will use different registers):

    int someimpurefunction(int a);
    int somepurefunction(int a)
      __attribute__((pure));

    int testfunction(int a, int b, int c, int d) {

      int res1 = someimpurefunction(a) ? someimpurefunction(a) : b;
      const int tmp1 = somepurefunction(a);
      int res2 = tmp1 ? tmp1 : c;
      const int tmp2 = a+b;
      int res3 = tmp2 ? tmp2 : d;

      return res1+res2+res3;
    }
      

The Common Sub-expression Elimination optimization is very useful when writing long and complex mathematical operations. The compiler can find common calculations even though they don't look common to the naked eye, and act on those.

Although sometimes you can get away with using multiple constants or variables to carry out temporary operations so that they can be re-used in the following calculations, leaving the formulae entirely explicit is usually more readable, as long as the formulae are not intended to change.

Like with other algorithms, there are some advantages to reducing the source code used to calculate the same thing; for instance you can easily make a change directly to the definition of a constant and get the change propagated to all the uses of that constant. On the other hand, this can be quite a problem if the meaning of two calculations is very different (and thus can vary in different ways with the evolution of the code), and just happen to be calculated in the same way at a given time.
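As a small illustrative sketch (an example of mine, not from the article's test cases), a formula can be left entirely explicit, with CSE computing the repeated sub-expression only once:

    /* Both uses of (a*a + b*b) are written out for readability;
       CSE will calculate the sub-expression a single time. */
    double norm_ratio(double a, double b, double c) {
      return (a*a + b*b) / ((a*a + b*b) + c*c);
    }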

Another rather useful place where the compiler can further optimize code with CSE, where it wouldn't be so nice or simple to do manually in the source code, is where you deal with static functions that are inlined by the compiler.

Let's examine the following code for instance:

    extern int a;
    extern int b;

    static inline int somefunc1(int p) {
      return (p * 16) + (3 << a);
    }

    static inline int somefunc2(int p) {
      return (p * 16) + (4 << b);
    }

    extern int res1;
    extern int res2;
    extern int res3;
    extern int res4;

    void testfunc(int p1, int p2)
    {
      res1 = somefunc1(p1);
      res2 = somefunc2(p1);
      res3 = somefunc1(p2);
      res4 = somefunc2(p2);
    }
      

In this code, you can find four basic expressions: (p1 * 16), (p2 * 16), (3 << a) and (4 << b). After inlining, each of these four expressions is used twice within testfunc(). Thanks to CSE, though, the code will calculate each of them only once, even though they cross the boundaries of the inlined functions, producing the following code:

    _testfunc:
	    [--sp] = ( r7:7 );

	    LINK 0;

	    R0 <<= 4;
	    R1 <<= 4;

	    P2.H = _a;
	    P2.L = _a;
	    R2 = [P2];
	    R7 = 3 (X);
	    R7 <<= R2;

	    P2.H = _b;
	    P2.L = _b;
	    R2 = [P2];
	    R3 = 4 (X);
	    R3 <<= R2;

	    R2 = R0 + R7;
	    P2.H = _res1;
	    P2.L = _res1;
	    [P2] = R2;
	    P2.H = _res2;
	    P2.L = _res2;
	    R0 = R0 + R3;
	    [P2] = R0;

	    R7 = R1 + R7;
	    P2.H = _res3;
	    P2.L = _res3;
	    [P2] = R7;
	    R1 = R1 + R3;
	    P2.H = _res4;
	    P2.L = _res4;
	    [P2] = R1;
	    UNLINK;
	    ( r7:7 ) = [sp++];

	    rts;
      

As you can easily see (the assembly was modified a bit to improve its readability; the compiler had re-ordered the register loads to avoid pipeline stalls, making the point harder to see), the four expressions are calculated first, and stored respectively in the registers R0, R1, R7 and R3.

These kinds of sub-expressions are usually harder to spot in the source, and harder to factor out by hand. Sometimes they get factored out into a parameter of their own, but that can be more expensive during execution, depending on the calling conventions of the architecture.
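A sketch of what that manual factoring might look like for the earlier somefunc1() (the _pre variant is my own naming): the shared term is computed by the caller and passed in explicitly, which may or may not be cheaper than a value kept in a register by CSE:

    /* The caller computes (3 << a) once and passes it in; depending
       on the calling convention, passing the extra parameter can
       cost more than it saves. */
    static inline int somefunc1_pre(int p, int shifted) {
      return (p * 16) + shifted;
    }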

Cheats

As I wrote above, there are some requirements that apply to functions declared pure or constant, related to not changing or accessing global memory; not executing I/O operations; and, of course, not calling further impure functions. The reason for this is that the compiler will accept whatever the user declares the function to be, regardless of its body (which is usually unknown to the compiler at the call site).

Sometimes, though, it's possible to fool the compiler so that it treats impure functions as pure or even constant functions. Although this is a risky endeavor, as it might truly cause bad code generation by the compiler, it can sometimes be used to force optimization for particular functions.

An example of this can be a lookup function that scans through a global table to return a value. While it is accessing global memory, you might want the compiler to promote it to a constant function, rather than simply to a pure one.

Let's take for instance the following code:


    #include <string.h> /* for strlen() and the NULL definition */

    const struct {
      const char *str;
      int val;
    } strings[] = {
      { "foo", 31 },
      { "bar", 34 },
      { "baz", -24 }
    };

    const char *lookup(int val) {
      int i;
      for(i = 0; i < sizeof(strings)/sizeof(*strings); i++)
	if ( strings[i].val == val )
	  return strings[i].str;

      return NULL;
    }

    void testfunction(int val, const char **str, unsigned long *len) {
      if ( lookup(val) ) {
	*str = lookup(val);
	*len = strlen(lookup(val));
      }
    }

      

If the lookup() function is only considered a pure function, as it is, adhering to the rules we talked about at the start of the article, it will be called three times in testfunction(), like this:

    _testfunction:
	    [--sp] = ( r7:7, p5:4 );

	    LINK 12;
	    R7 = R0;
	    P5 = R1;
	    P4 = R2;
	    call _lookup;
	    cc =R0==0;
	    if cc jump L$L$17;
	    R0 = R7;
	    call _lookup;
	    [P5] = R0;
	    R0 = R7;
	    call _lookup;
	    call _strlen;
	    [P4] = R0;
    L$L$17:
	    UNLINK;
	    ( r7:7, p5:4 ) = [sp++];

	    rts;
      

Instead, we can trick the compiler by declaring the lookup() function as constant (the data it is reading is constant, after all, so for a given parameter it will always return the same result). If we do that, the three calls will have to return the same value, and the compiler will be able to optimize them into a single call:

    _testfunction:
	    [--sp] = ( p5:4 );

	    LINK 12;
	    P5 = R1;
	    P4 = R2;
	    call _lookup;
	    cc =R0==0;
	    if cc jump L$L$17;
	    [P5] = R0;
	    call _strlen;
	    [P4] = R0;
    L$L$17:
	    UNLINK;
	    ( p5:4 ) = [sp++];

	    rts;
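
The promotion itself is just a matter of attaching the attribute to the prototype, following the placement shown at the start of the article:

    const char *lookup(int val) __attribute__((const));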
      

In addition to lookup functions on constant tables, this trick is useful with functions which read data from files or other volatile data, and cache it in a memory variable.

Take for instance the following function that reads an environment variable:

    #include <stdlib.h> /* for getenv() */
    #include <string.h> /* for strdup() */

    char *get_testval() {
      static char *cachedval = NULL;

      if ( cachedval == NULL ) {
	cachedval = getenv("TESTVAL");
	if ( cachedval == NULL )
	  cachedval = "";
	else
	  cachedval = strdup(cachedval);
      }

      return cachedval;
    }
      

This is not truly a constant function, as its return value depends on the environment. Even so, assuming that the environment of the process is left untouched, its return value will never change between calls. And even though it does affect the global state of the program (the cachedval static variable is filled in the first time the function is called), it can be assumed to always return the same value.

Tricking the compiler into thinking that a function is constant even though it has to load data through I/O operations is, as I said, risky, as the compiler will assume there is no I/O operation going on. On the other hand, this trick might make a difference sometimes, as it allows functions to be expressed in more semantic ways, leaving it up to the compiler to optimize the code with temporaries where needed.

One example can be the following code:

    #include <stdlib.h> /* for getenv() */
    #include <string.h> /* for strdup() and strcmp() */

    char *get_testval() {
      static char *cachedval = NULL;

      if ( cachedval == NULL ) {
	cachedval = getenv("TESTVAL");
	if ( cachedval == NULL )
	  cachedval = "";
	else
	  cachedval = strdup(cachedval);
      }

      return cachedval;
    }

    extern int a;
    extern int b;
    extern int c;
    extern int d;

    static int testfunc1() {
      if ( strcmp(get_testval(), "FOO") == 0 )
	return a;
      else
	return b;
    }

    static int testfunc2() {
      if ( strcmp(get_testval(), "BAR") == 0 )
	return c;
      else
	return d;
    }

    int testfunction() {
      return testfunc1() + testfunc2();
    }
      
Note: To make sure that the compiler won't reduce the function calls to their return values right away, the static sub-functions return values taken from global variables; the meaning of those variables is not important.

Considering the above source code, if get_testval() is impure, as the compiler will automatically find it to be, it will be compiled into:

    _testfunction:
	    [--sp] = ( r7:7 );

	    LINK 12;
	    call _get_testval;
	    R1.H = L$LC$2;
	    R1.L = L$LC$2;
	    call _strcmp;
	    cc =R0==0;
	    if !cc jump L$L$11 (bp);
	    P2.H = _a;
	    P2.L = _a;
	    R7 = [P2];
    L$L$13:
	    call _get_testval;
	    R1.H = L$LC$3;
	    R1.L = L$LC$3;
	    call _strcmp;
	    cc =R0==0;
	    if !cc jump L$L$14 (bp);
	    P2.H = _c;
	    P2.L = _c;
	    R0 = [P2];
	    UNLINK;
	    R0 = R0 + R7;
	    ( r7:7 ) = [sp++];

	    rts;
    L$L$11:
	    P2.H = _b;
	    P2.L = _b;
	    R7 = [P2];
	    jump.s L$L$13;
    L$L$14:
	    P2.H = _d;
	    P2.L = _d;
	    R0 = [P2];
	    UNLINK;
	    R0 = R0 + R7;
	    ( r7:7 ) = [sp++];

	    rts;
      

As you can see, get_testval() is called twice, even though its result will be identical. If we declare it constant, instead, the code of our test function will be the following:

    _testfunction:
	    [--sp] = ( r7:6 );

	    LINK 12;
	    call _get_testval;
	    R1.H = L$LC$2;
	    R1.L = L$LC$2;
	    R7 = R0;
	    call _strcmp;
	    cc =R0==0;
	    if !cc jump L$L$11 (bp);
	    P2.H = _a;
	    P2.L = _a;
	    R6 = [P2];
    L$L$13:
	    R1.H = L$LC$3;
	    R0 = R7;
	    R1.L = L$LC$3;
	    call _strcmp;
	    cc =R0==0;
	    if !cc jump L$L$14 (bp);
	    P2.H = _c;
	    P2.L = _c;
	    R0 = [P2];
	    UNLINK;
	    R0 = R0 + R6;
	    ( r7:6 ) = [sp++];

	    rts;
    L$L$11:
	    P2.H = _b;
	    P2.L = _b;
	    R6 = [P2];
	    jump.s L$L$13;
    L$L$14:
	    P2.H = _d;
	    P2.L = _d;
	    R0 = [P2];
	    UNLINK;
	    R0 = R0 + R6;
	    ( r7:6 ) = [sp++];

	    rts;
      

The CSE pass combines the two calls to get_testval() into one. Again, this is one of the optimizations that are harder to achieve by manually changing the source code, since the compiler has a wider view of how the value is used. A common way to handle this manually is to use a global variable, but that might require one more load from memory, while CSE can take care of keeping the value in a register or on the stack.

Conclusions

After what you have read about pure and constant functions, you might wonder how useful they are in everyday code. Indeed, in a lot of cases, these two attributes merely allow the compiler to do something you could easily achieve by writing better code.

There are two objectives to keep in mind that are related to the use of these (and other) attributes. The first is code readability, because sometimes the manually optimized functions are harder to read than what the compiler can produce on its own. The second is allowing the compiler to optimize legacy or external code.

While you might not be too concerned with letting legacy code, or code written by someone else, get away with slower execution, a pragmatic view of the current Free Software world should take into consideration the fact that there are probably thousands of lines of legacy code around. Some of that code, written with pre-C99 declarations, might even be using libraries still developed around their older interfaces, which could be improved by providing some extra semantic information to the compiler through the use of attributes.

Also, it's unfortunately true that extensive use of these attributes might be seen by neophytes as an easy solution to let sloppy code run at a decent speed. On the other hand, the same attributes could be used to identify such sloppy code through analysis of the source code.

Although GCC does not issue warnings for all of these cases, it already warns for some of them, like unused variables, or statements without effect (both triggered by the DCE). In the future more warnings might be reported if pure and constant functions get misused.

In general, like with many other GCC function attributes, their use is tightly related to how programmers perceive their task. Most pragmatic programmers would probably like these tools, while purists will probably dislike the way these attributes help sloppy code to run almost as fast as properly written code.

My hope is that, in the future, better tools will make good use of these and other attributes at levels other than the compiler, such as static and dynamic analyzers.


[1] The Blackfin architecture is a RISC architecture developed by Analog Devices, supported by both GCC and Binutils (and Linux, but I'm not interested in that here).

[2] I have chosen -O1 rather than -O2 because in the latter case the compiler performs extra optimization passes that I do not wish to discuss within the scope of this article.





Implications of pure and constant functions

Posted Jun 10, 2008 21:31 UTC (Tue) by nix (subscriber, #2304) [Link]

Good article.

Just one point. You say

[...] sometimes, doing the optimization "manually", directly in the source code, makes it more readable, and makes the code resemble the output of the compiler more.

I don't really see any advantage of making the code resemble the output of the compiler. This seems more like a disadvantage to me. (Also, the output of the compiler necessarily differs between compiler versions and targets, so any resemblance will be transient at best.)

Possible advantage

Posted Jun 11, 2008 0:38 UTC (Wed) by tialaramex (subscriber, #21167) [Link]

I'd say, from a little experience that it's an advantage during debugging. The closer the
resemblance between source and executable, the more chance you have of understanding what
you're seeing in the debugger.

If, for example, you used an unnecessary temporary, the debugger cannot show you the value of
that temporary. If you call a side-effect free function like strlen() several times the actual
code may call it just once, meaning that breaking on entry to strlen() will not do what you
expect. I recently deleted some code which read something like as follows...

int cache[CACHE_WIDTH];
if (!cache) {
  log_critical("Could not allocate cache");
}

A naive programmer might be quite surprised to see his debugger skip the last three of those
lines during single stepping, but in reality not a single byte of code was emitted or executed
for them due to a trivial optimisation.

Possible advantage

Posted Jun 13, 2008 16:17 UTC (Fri) by giraffedata (guest, #1954) [Link]

I don't really see any advantage of making the code resemble the output of the compiler.

Hear, hear. There are two distinct ways to look at a program: 1) instructions to a computer; 2) description of the solution of a computational problem. The primary audience for (1) is a computer; for (2) it's a human. In view (2), a compiler's job is to produce a machine program that computes the solution described by the source code. A lot of programmers like to do that work themselves, but I think that is an inefficient use of brain power (for everyone who works on that code).

The closer the resemblance between source and executable, the more chance you have of understanding what you're seeing in the debugger.

That's definitely my experience. But there is a middle ground. I write human-oriented code and let the compiler do its job normally. But when I debug at the machine level I add -O0 to the compile options. That's usually described as "don't optimize", but as I consider optimization to be an integral part of the compiler's job, I view it as, "Make the machine program track the source code as closely as possible."

Possible advantage

Posted Jun 17, 2008 21:38 UTC (Tue) by roelofs (guest, #2599) [Link]

But when I debug at the machine level I add -O0 to the compile options. That's usually described as "don't optimize", but as I consider optimization to be an integral part of the compiler's job, I view it as, "Make the machine program track the source code as closely as possible."

"...and watch your bug go away." :-)

That seems to happen to me more often than not. Most recently it turned out to be a gcc code-generation bug, 32-bit only, 3.4.x only. (Pretty sure not related to const, though C++ is a different beast, so I'm not 100% certain.)

Greg

Possible advantage

Posted Jun 18, 2008 2:00 UTC (Wed) by giraffedata (guest, #1954) [Link]

Of all the bugs I've analyzed with GDB, I'd say about 5% stop manifesting when I disable optimization. One was a compiler optimizer bug, one was me lying to the compiler (unintentionally, of course), and the rest were wild pointers causing stack corruption.

Of course, this is mostly with older compilers. (The new ones are too slow for me).

C89 includes blocks you know...

Posted Jun 10, 2008 21:36 UTC (Tue) by ballombe (subscriber, #9523) [Link]

The Dead Code Elimination optimization can be very helpful to reduce the overhead caused by code written to conform to C89 standard, where you couldn't mix variables (and constant) declarations with executable code. In those sources, you had to declare variables at the top of the function, and then start to check for prerequisites. If you wanted to make it explicit that some variable had to keep its value, by making it constant, you would often have to fill them before the prerequisites could be checked.

Actually, C89 only requires variables to be declared at the top of a block, and you can nest blocks arbitrarily, so the limitation you mention did not exist; you just had to create a new block:

fun(int n)
{
   int is0 = (n==0);
   if (is0)
     n=1;
   {
      const double z=1/((double) n);

      ....
   }
}
C99 allows to omit the extra braces, but that's about it.

Omitting extra braces is a boon

Posted Jun 10, 2008 21:56 UTC (Tue) by jreiser (subscriber, #11027) [Link]

C99 allows to omit the extra braces, but that's about it.

Allowing the braces to be omitted is a significant advantage. It's some syntactic sugar that is really worth it. The combination of requiring braces and coding style conventions can cause swift migration towards the right margin. This impedes readability. I favor placing declarations so as to include the smallest possible span of lines; this eases maintenance by clearly delimiting dataflow. If I must surround every new group of declarations with braces, then the code is often two or three indentations farther to the right. Either that, or I "cheat" on the indentation rules, by placing the braces farther to the left so as to keep the same indentation for the statements which use the new variables.

Omitting extra braces is a boon

Posted Jun 11, 2008 7:08 UTC (Wed) by jengelh (subscriber, #33263) [Link]

Allowing mixing of declarations and code the C99/C++ way I consider a disadvantage, because you have to scan the entire function to get to know what variables it will use. And some people write horribly long functions without breaking them up — asterisk comes to mind. Having the code go too far to the right is a sign that you should probably break the function (see CodingStyle).

Omitting extra braces is a boon

Posted Jun 11, 2008 18:43 UTC (Wed) by bronson (subscriber, #4806) [Link]

Doesn't your editor or IDE color the variable declarations?

Omitting extra braces is a boon

Posted Jun 11, 2008 19:11 UTC (Wed) by jengelh (subscriber, #33263) [Link]

That's not the point.

Omitting extra braces is a boon

Posted Jun 12, 2008 0:08 UTC (Thu) by dododge (guest, #2870) [Link]

If your point is that people write horrible over-long functions, no argument there. The C89 rules certainly didn't stop them doing it, though, and frankly if I'm dealing with some 1000-line monstrosity of a function I'd rather be able to put the declarations on the same screen as the code using them, so that 1) I don't have to keep skipping back and forth to remember the types, and 2) I don't have to worry about what the 200 lines of code between the start of the function and the spot I'm looking at might have done with those variables.

But the main point, as already stated, is that from a practical standpoint you can't easily make your local variables const unless you either use mixed declarations or futz about with bracing and deep nesting. As someone who marks as much const as I can, and tries to limit scope as much as possible, I do make heavy use of mixed declarations and it's one of the C99 features that would be hardest to give up.

Omitting extra braces is a boon

Posted Jun 12, 2008 13:41 UTC (Thu) by IkeTo (subscriber, #2122) [Link]

> you have to scan the entire function to get to know what variables it will use.

I don't use any IDE other than plain Emacs.  But there is a search function, so the "scan"
that you mentioned is really by the editor program, not by myself.  On the other hand, I don't
find it overly useful to know what variables a function will use.  It is much more useful to
know what the function will do.  Since I don't usually trust comments in code, this is a
difficult task by itself, and it is much easier if there is not a dozen of variables springing
up at the same point of code which are not used for a dozen of lines (and I have to repeatedly
use the editor's search function to know about it, and then keep forgetting it 5 seconds after
doing that).  With C99/C++ style "declare variable on first use", I can tell easily which
assignment is an initialization and which is a modification to previously assigned value, the
latter I'll need to take much more care than the former.

> Having the code go too far to the right is a sign that you should probably break the
function

Of course.  But nothing beats a function that has just one or two levels of indentation: it
doesn't overload me mentally.  Indentation is a necessary evil, it loads my mental
capability, but if you have a loop body you have to somehow differentiate it from the code
outside the loop, so it is still the best choice.  Using braces also for just variable
declaration leads to quite a few more levels than what I usually want to see in the code,
especially in code written by somebody else.

And "break the function" is not always practicible.  Yes it is usually good, and when it is
good it is the best choice.  But at the same time usually it is not good: at times it creates
a whole brunch of functions that has little meaning by itself and has to be called from a
particular context to be useful at all, and the context is in a form of very complex
preconditions involving its arguments, sometimes involving additional structures to hold the
results, the structures, of course, are of little meaning by themselves as well.  A language
feature that provides an alternative when such "golden rule" is not practicible can only be a
good thing.

C89 includes blocks you know...

Posted Jun 10, 2008 22:01 UTC (Tue) by nix (subscriber, #2304) [Link]

Er, that variable declaration included a non-constant. That's not valid 
C89 :)

The only real advantage of intermingling variable declarations like that 
is that their declarations get moved textually closer to their uses. 
That's a significant readability advantage, surely, but C89 isn't C++ (or 
indeed C99): you can't use non-constants in the initializations, so 
there's no efficiency gain from moving declarations around like that.

(GNU C has always allowed arbitrary function calls there, but this is an 
extension to C89.)

C89 includes blocks you know...

Posted Jun 11, 2008 10:07 UTC (Wed) by Yorick (guest, #19241) [Link]

No, local (automatic) variables may have non-const initialisers in C89.

C89 includes blocks you know...

Posted Jun 11, 2008 10:23 UTC (Wed) by nix (subscriber, #2304) [Link]

Really? Everything I've found, including my reading of the text of C89, seems to say that
initializers of automatic variables must be constant (and I use a number of compilers on a
daily basis that reject non-constant initializers in C89 mode).

A cite from C89 that supports your interpretation would be nice.

C89 includes blocks you know...

Posted Jun 11, 2008 11:31 UTC (Wed) by ballombe (subscriber, #9523) [Link]

I cannot provide you with one, but the way info gcc documents this extension seems to imply that the C89 restriction only applies to aggregate initializers.
5.18 Non-Constant Initializers
==============================

As in standard C++ and ISO C99, the elements of an aggregate
initializer for an automatic variable are not required to be constant
expressions in GNU C.  Here is an example of an initializer with
run-time varying elements:

     foo (float f, float g)
     {
       float beat_freqs[2] = { f-g, f+g };
       /* ... */
     }

C89 includes blocks you know...

Posted Jun 12, 2008 5:17 UTC (Thu) by eru (subscriber, #2753) [Link]

Really? Everything I've found, including my reading of the text of C89, seems to say that initializers of automatic variables must be constant (and I use a number of compilers on a daily basis that reject non-constant initializers in C89 mode).

Well, they are broken compilers. Complain to the vendor.

A cite from C89 that supports your interpretation would be nice.

From my photocopy of the original X3.159-1989:
3.5.7 Initialization
[syntax omitted]
Constraints
[...]
All the expressions in an initializer for an object that has static storage duration or in an initializer list for an object that has aggregate or union type shall be constant expressions.

Since the standard does not specify such constantness constraint for all objects with automatic storage duration, they can be runtime values.

The Rationale section for 3.5.7 makes this even more clear: according to it, the committee even considered allowing automatic aggregate initializers to consist of a series of arbitrary runtime expressions, but did not go that far in the end. The rationale also mentions that a function call that returns a structure is permitted as an initializer for an automatic variable with structure type. I have used some old compilers that had problems with this kind of structure initialization, but they still allowed automatic scalars to be initialized with runtime values. I think even K&R C allowed this.

C89 includes blocks you know...

Posted Jun 12, 2008 10:54 UTC (Thu) by nix (subscriber, #2304) [Link]

OK, broken compilers it is. (It's surprising that a language as apparently simple as C can
still trip one up with unexpected corners like this after so long.)

C89 includes blocks you know...

Posted Jun 12, 2008 16:08 UTC (Thu) by davecb (subscriber, #1574) [Link]

Interestingly, allowing the internal block structure
can increase the difficulty of doing logic on
the code to the point where building proof tools becomes
NP-complete (or at least insanely hard).

This is relevant if you're trying to use static analysis
tools, not just if you're trying to prove theorems (;-))

--dave (who proveth not, but analyzeth a lot) c-b

C89 includes blocks you know...

Posted Jun 19, 2008 17:23 UTC (Thu) by jlokier (guest, #52227) [Link]

Hardly.  C code with blocks and variables declared inside blocks is trivially convertible to a
flat function with all variables at the start, and in trivial time.  The same goes for the control
flow blocks: they are trivially replaceable with if (variable) gotos.

So the proof difficulty is identical with/without blocks and local declarations in the
language.

Interesting reading... Thanks!

Posted Jun 11, 2008 0:26 UTC (Wed) by pr1268 (subscriber, #24648) [Link]

Thanks to Diego for contributing an interesting and enlightening article.

Implications of pure and constant functions

Posted Jun 11, 2008 1:18 UTC (Wed) by MisterIO (guest, #36192) [Link]

"As you can guess, constant functions are also pure functions, but pure functions cannot be
constant functions."

Shouldn't it be :
"As you can guess, constant functions are also pure functions, but pure functions can be not
constant functions." ?

Implications of pure and constant functions

Posted Jun 11, 2008 3:00 UTC (Wed) by rriggs (guest, #11598) [Link]

Try this for clarity: "but not all pure functions are constant functions."

Implications of pure and constant functions

Posted Jun 11, 2008 6:42 UTC (Wed) by nix (subscriber, #2304) [Link]

Or "functions declared 'const' are implicitly 'pure', but not vice versa."

Implications of pure and constant functions

Posted Jun 11, 2008 1:40 UTC (Wed) by rsidd (subscriber, #2582) [Link]

As MisterIO points out, the statement "pure functions cannot be constant functions" is obviously wrong.

Another point: in a language like C, surely the pureness of a function depends on all the other functions in a program? For example, you say (accurately, as far as the definition of "pure" goes): "Because the global memory state remains untouched, two calls to the same pure function with the same parameters will have to return the same value." Your example is the strlen() function. But what if some other function has tampered with the contents addressed by your pointer in between the two calls, without modifying the pointer itself?

You hint at this issue later when you say of an example: "(The pure function is called just once) because there was no change to global memory known to the compiler between the two calls of the pure function." So the compiler needs to determine whether the memory addressed could have changed. This is not always possible, unless the compiler decides that any intervening non-pure function call could have changed the memory addressed -- a drastic assumption since in practice most such calls are probably harmless.

My take is, if you care about pure functions, use Haskell :)

Implications of pure and constant functions

Posted Jun 11, 2008 2:21 UTC (Wed) by droundy (subscriber, #4559) [Link]

Indeed, the article should have stated that two consecutive calls to the same pure function
with the same arguments will give the same result, that's the difference between pure and
constant functions (constant functions always return the same for the same input).

With regard to your suggestion to just use Haskell, I'd point out that ghc doesn't support
CSE.  I love Haskell, and it's a great language, but this sort of optimization is not one of
its strengths.  Laziness interferes with its strengths as a pure functional language in this
case.  The trouble is that it's hard to determine when using the memory to store the result of
a computation is worth paying to avoid recomputing it.  For primitive data types the answer is
obvious, and we should do CSE, but ghc doesn't perform that analysis, and even if it did, I
wouldn't be happy with a CSE pass that only worked on functions returning Int/Double/Bool etc.

Of course, dead code elimination comes for free in Haskell, so that does gain us something.
But as the article points out, it's really only useful for things like conditional compilation,
which is much less of a gain than CSE.

Implications of pure and constant functions

Posted Jun 11, 2008 6:45 UTC (Wed) by nix (subscriber, #2304) [Link]

A classic example of a common class of functions that can't be optimized 
over even though you'd think it could be is *printf(). Standard printf() 
without %n can be freely optimized over, but alas glibc has an extension 
(user-defined format masks) that means printf() can call arbitrary 
functions for you.

That nobody ever uses this extension is irrelevant: the compiler must 
assume that you might be.

Std C

Posted Jun 11, 2008 8:16 UTC (Wed) by cate (subscriber, #1359) [Link]

FYI, the attribute concept, along with the "pure" and "const" attributes, should enter the next C1x
standard, but possibly under another name.
See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1273.pdf

Std C

Posted Jun 11, 2008 11:44 UTC (Wed) by Yorick (guest, #19241) [Link]

That document is not quite consistent. It defines the attribute stdc_pure to indicate that a function "has no side effects and depends only on the values of the arguments passed". It then gives strlen as an example of such a pure function, contradicting the definition. It is unclear whether stdc_pure would correspond to GCC's const or pure attributes.

Std C

Posted Jun 11, 2008 12:16 UTC (Wed) by cate (subscriber, #1359) [Link]

It is a working document (and a subsequent document describes "const", giving better hints that
pure is allowed to read memory).
Anyway, the version in the standard should behave as in gcc (or possibly as in the Microsoft C
compiler). They really don't want (in the next revision) to create alternate behaviors.

Std C

Posted Jun 11, 2008 16:51 UTC (Wed) by madscientist (subscriber, #16861) [Link]

I really hope you're right, but in previous versions of the standard there was no interest by
the standards committee in how GCC does things.  Basically their attitude was "if GCC wants
their compiler considered, they'll send someone to the standards committee to represent it.
If not, too bad." (I'm greatly paraphrasing comments by committee members posted on
comp.std.c).  This might be understandable if GCC was a proprietary compiler that you had to
pay for... but it isn't of course.  At the time GCC's support was confusing and no company had
the time and money to support a member at committee meetings.

Hopefully things will be better this time.

Times are changing

Posted Jun 12, 2008 8:35 UTC (Thu) by khim (subscriber, #9252) [Link]

At the time GCC's support was confusing and no company had the time and money to support a member at committee meetings.

The biggest problem at the time was RMS's position: GCC was supposed to be GNU C Compiler - and all other frontends were second-class citizens. Including C++. State-of-the-art GCC (the venerable 2.7.2.3) had awful C++ support and few developers used it as C++-compiler-of-choice. Eventually EGCS split happened, C++ support was improved to the point that now GCC is one of the C++ compilers - but all this happened after standard was finished. Today WG pays very serious attention to GCC: if some feature is flat-out rejected by GCC team it'll need A LOT OF supporters to be even considered. Thankfully GCC too pays close attention to what WG is doing so it's not a problem.

To the WG, the difference between a free compiler and a proprietary one is NOT important. The difference between an obscure one and a popular one is. As GCC's C++ compiler moved from obscurity to compiler-of-choice, its importance to the WG grew similarly...

Times are changing

Posted Jun 12, 2008 13:00 UTC (Thu) by madscientist (subscriber, #16861) [Link]

Since we're talking about the C standard here, not the C++ standard, I'm not sure I understand your concern about the capabilities of C++ in GCC back "in the day". Although, you are definitely correct in your assessments, IMO.

What difference would it make to the C99 WG, what RMS's position was on C vs. C++, or the state of C++ in GCC at the time? There's no question that GCC was, even then, one of the most widely used C compilers around. My reference to proprietary compilers was meant to say that WG members couldn't/shouldn't be expected to pay for proprietary compiler licenses so they could learn about them: it's quite reasonable to expect those companies to pony up some $$ to send someone to the WG. For free software, there's nothing stopping the existing members from examining the docs and even the implementations directly, and/or asking questions on the mailing list to get details. Engaging the community that way, rather than requiring WG attendance or else ignoring the compiler, would have worked better for all concerned.

Implications of pure and constant functions

Posted Jun 12, 2008 11:57 UTC (Thu) by ekj (guest, #1524) [Link]

> This low barrier to entry, though, is also probably the cause of the widely varying quality of the code of these projects.

I hope you're not suggesting that projects where entry is more closed, such as proprietary development, do not experience "widely varying quality" of code.

Implications of pure and constant functions

Posted Jun 12, 2008 14:26 UTC (Thu) by Flameeyes (subscriber, #51238) [Link]

> I hope you're not suggesting that projects where entry is more closed,
> such as proprietary development do not experience "widely varying
> quality" of code.

In my experience, it doesn't vary much... it usually is so bad it's not
even funny to think about it twice.

On the other hand, proprietary software that is written badly is more
easily discarded, as in most cases it fails to work at all - unless it's
so proprietary that you can't switch to a different one. With Free Software,
badly written software can easily be patched up so that it works, and is
kept in circulation for months and years.

I think the problem is in the big numbers: there is probably way more
Free Software than proprietary software being developed by amateurs.

Implications of pure and constant functions

Posted Jun 12, 2008 20:04 UTC (Thu) by mtorni (guest, #3618) [Link]

Are there any security implications of labeling a function pure or constant? Can a malicious user introduce a security bug by checking in a change which changes a function to const? Say ssl_verify_certificate() or sasl_authenticate()? Both are fictitious functions.
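
To make the worry concrete, a hypothetical sketch (the function names are the
fictitious ones from the comment): if the const attribute sneaks onto a verification
function, the compiler is free to reuse the first result instead of re-checking.

    /* the attribute falsely promises the result depends only on cert */
    extern int ssl_verify_certificate(const void *cert)
            __attribute__((const));

    int handle(const void *cert)
    {
        if (!ssl_verify_certificate(cert))
            return -1;
        /* ... other code runs; the certificate state could change ... */
        return ssl_verify_certificate(cert);  /* may be folded into the
                                                 first call's result */
    }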

Implications of pure and constant functions

Posted Jun 12, 2008 20:38 UTC (Thu) by quotemstr (subscriber, #45331) [Link]

The key here is "check in". If somebody can check a change in, he can wreak all sorts of
havoc, much of it even more subtle than function attribute changes.

Besides, using function attributes like this should be a sort of post-benchmarking magic
that undergoes strict code review.

Constant function declaration: is it cheating?

Posted Jun 13, 2008 16:31 UTC (Fri) by giraffedata (guest, #1954) [Link]

Is it really cheating or trickery to declare a function constant even though it accesses the larger environment? Or is it just helping the compiler properly classify the function when it doesn't have enough information to do it itself?

All programs make assumptions about their environment. For example, that there isn't another thread running around overwriting the memory in which the program's variables are stored. Where the user doesn't assure that assumed environment, the program's behavior is arbitrary.

So isn't it legitimate to make the environmental assumption that a certain file's contents don't change while the program runs? With that assumption, a function whose result depends upon the contents of the file is constant.
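
A sketch of the scenario being described (file path and function name invented): the
file is read once and assumed not to change for the life of the process. Whether
attaching const to such a function is accurate or cheating is exactly the question
here, since GCC's strict definition says a const function does not read memory at all.

    #include <stdio.h>

    /* constant under the assumption that /etc/myapp.conf never
       changes while the program runs */
    int config_magic(void) __attribute__((const));

    int config_magic(void)
    {
        static int cached = -1;
        if (cached < 0) {
            FILE *f = fopen("/etc/myapp.conf", "r");
            if (f) {
                if (fscanf(f, "%d", &cached) != 1)
                    cached = 0;
                fclose(f);
            } else {
                cached = 0;
            }
        }
        return cached;
    }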

Constant function declaration: is it cheating?

Posted Jun 14, 2008 6:52 UTC (Sat) by jimparis (guest, #38647) [Link]

> Is it really cheating or trickery to declare a function constant even though it accesses the
> larger environment? Or is it just helping the compiler properly classify the function when it
> doesn't have enough information to do it itself?

Yes!  Well, maybe it's not cheating, but it's definitely lying, and you'd better expect your
code to start failing in mysterious ways when you add more stuff to your program or switch to
a new or different compiler.
Really, if you want to help the compiler, help it directly; don't try to trick the optimizer
into doing what you want.  If you're concerned about
  if (count(foo))  bar(count(foo));
where count() isn't really constant, but in this one case you know for a fact that the result
won't change, just write
  tmp = count(foo); if (tmp) bar(tmp);
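
Spelled out as a self-contained unit (count() and bar() are the hypothetical functions
from the example above), the rewrite guarantees a single call with no attribute trickery:

    extern int count(void *foo);   /* deliberately not declared const */
    extern void bar(int n);

    void process(void *foo)
    {
        int tmp = count(foo);  /* called exactly once, by construction */
        if (tmp)
            bar(tmp);
    }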

> All programs make assumptions about their environment. For example, that there isn't another
> thread running around overwriting the memory in which the program's variables are stored.

There's a difference between valid documented assumptions (that my kernel separates processes
properly) and non-guaranteed observations (hey, if I lie to the compiler here then it seems to
generate better code).  In the first case, a problem is clearly a bug in the kernel and so it
would be fixed if it showed up.  In the second, it's your own bug when the assumption proves
wrong.

> So isn't it legitimate to make the environmental assumption that a certain file's contents
> don't change while the program runs? With that assumption, a function whose result depends
> upon the contents of the file is constant.

Aah, now that's different.  If you can look at a function, and the way it behaves in your
program, and guarantee "at any time in my program, during proper usage, this function will
always return the same answer" then sure, call it constant, and you're not cheating or lying
to the compiler, you're just being 100% accurate and using the attribute properly.

My concern with the article is that it suggests you might declare a function constant, even
when it's not(!), if it helps generate better code.  And that's definitely cheating and just
asking for problems.

Constant function declaration: is it cheating?

Posted Jun 14, 2008 16:34 UTC (Sat) by giraffedata (guest, #1954) [Link]

> If you can look at a function, and the way it behaves in your program, and guarantee "at any time in my program, during proper usage, this function will always return the same answer" then sure, call it constant, and you're not cheating or lying to the compiler, you're just being 100% accurate and using the attribute properly.

That's all I'm suggesting. The "constant" attribute must always mean the function is constant, not that you want the compiler to treat it as if it were constant. (It's none of your business what the compiler does with the information).

The only question is the definition of constant. The article's use of the term "cheating" suggests to me a definition of constant that is exactly what the compiler would determine on its own, e.g. if the function accesses global memory, it's not constant. But the examples given are of cases where the function really is constant under the broader definition.

Pure and constant external functions

Posted Jun 13, 2008 16:38 UTC (Fri) by giraffedata (guest, #1954) [Link]

Has anyone thought of having the compiler generate header files?

In addition to saving one from having to code every function declaration twice, it would let the compiler detect attributes like pureness that I'm too lazy to declare manually.

Alternatively, the compiler might warn when the declaration doesn't have all the apparently applicable attributes.
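
A sketch of how such a diagnostic might look; the option name -Wsuggest-attribute=const
is an assumption here, not something the article describes:

    $ cat add.c
    int add(int a, int b) { return a + b; }
    $ gcc -O2 -c -Wsuggest-attribute=const add.c
    add.c:1:5: warning: function might be candidate for attribute 'const'
    [-Wsuggest-attribute=const]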

Pure and constant external functions

Posted Jun 19, 2008 19:38 UTC (Thu) by olecom (guest, #42886) [Link]

> Has anyone thought of having the compiler generate header files?

http://kerneltrap.org/node/16293

Pure and constant external functions

Posted Jun 21, 2008 3:28 UTC (Sat) by giraffedata (guest, #1954) [Link]

> Has anyone thought of having the compiler generate header files?
>
> http://kerneltrap.org/node/16293

The reference is to a posting about kbuild and dependencies and build issues. I don't see any mention of having the compiler generate header files.

Implications of pure and constant functions

Posted Jun 13, 2008 16:44 UTC (Fri) by etienne_lorrain@yahoo.fr (guest, #38022) [Link]

Sometimes I would also like an attribute to say that no contents of parameter pointers are
changed by a function; in fact, that nothing important in memory is changed by the function.
The main example would be printf(), or any function used to print a debugging string, with a
variable number of arguments that cannot be prefixed by "const"
(we cannot compile "void dbgprintf (const char *, const ...);").
Then, if the compiler sees some code like:
if (error)
    dbgprintf ("here is the context address %p %s %s\n", &s1, s1, s2);
it still has to store to memory the contents of all the pointers passed, but it would not
have to reload all those values into registers after the call.
The main problem with those dbgprintf() calls is that there may be a lot of them, and it
would be nice if they did not interfere too much with execution time or other optimizations.
True, on x86 there are not enough registers to notice the problem, but it is sad to see a lot
of unused registers on other architectures because the compiler has to keep memory and
registers in sync too often.
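
A minimal sketch of the pattern being described (dbgprintf() is the commenter's
hypothetical logging function): with such an attribute the compiler would still have to
store s1 and s2 to memory before the call, but could keep using the register copies
afterwards.

    extern void dbgprintf(const char *fmt, ...);

    void check(int error, char *s1, char *s2)
    {
        if (error)
            dbgprintf("here is the context address %p %s %s\n",
                      (void *)&s1, s1, s2);
        /* without the attribute, the compiler must assume *s1 and *s2
           may have changed here and reload any cached values */
    }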

Implications of pure and constant functions

Posted Jun 15, 2008 12:34 UTC (Sun) by nix (subscriber, #2304) [Link]

Thanks to rarely-used glibc extensions, printf() and friends must be
assumed by optimizers to be able to change *anything*, both because the
information could be sent to a custom stream (-> calling arbitrary
functions), and because they can contain specifiers for which custom
conversions have been defined (-> calling arbitrary functions).
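
A minimal sketch of the second mechanism (glibc's register_printf_function(), a real
but rarely-used extension; the handler names are invented): once a custom specifier is
registered, an ordinary-looking printf() call dispatches into arbitrary user code.

    #include <printf.h>
    #include <stdio.h>

    /* handler invoked from inside printf() for the %W specifier */
    static int print_widget(FILE *stream, const struct printf_info *info,
                            const void *const *args)
    {
        const char *w = *(const char **) args[0];
        return fprintf(stream, "<%s>", w);
    }

    /* tells printf() how many arguments %W consumes and their types */
    static int widget_arginfo(const struct printf_info *info, size_t n,
                              int *argtypes)
    {
        if (n > 0)
            argtypes[0] = PA_POINTER;
        return 1;
    }

    int main(void)
    {
        register_printf_function('W', print_widget, widget_arginfo);
        printf("%W\n", "hello");  /* prints "<hello>" via print_widget() */
        return 0;
    }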


Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds