Q. is it better to use a pointer to navigate an array of values, or is it better to use a subscripted array name?
It's easier for a C compiler to generate good code for pointers than for subscripts.
Say that you have this:
/* X is some type */
X a[ MAX ];
/* array */
X *p;
/* pointer */
X x;
/* element */
inti;
/* index */
Here's one way to loop through all elements:
/* version (a) */
for( i =
0; i < MAX; ++i )
{
x = a[ i ];
/* do something with x */
}
On the other hand, you could write the loop this way:
/* version (b) */
for ( p = a; p < & a[ MAX ]; ++p )
{
x = *p;
/* do something with x */
}
What's different between these two versions? The initialization and increment in the loop are the same. The comparison is about the same; more on that in a moment. The difference is between x=a[i] and x=*p. The first has to find the address of a[i]; to do that, it needs to multiply i by the size of an X and add it to the address of the first element of a. The second just has to go indirect on the p pointer. Indirection is fast; multiplication is relatively slow.
This is "micro efficiency." It might matter, it might not. If you're adding the elements of an array, or simply moving information from one place to another, much of the time in the loop will be spent just using the array index. If you do any I/O, or even call a function, each time through the loop, the relative cost of indexing will be insignificant.
Some multiplications are less expensive than others. If the size of an X is 1, the multiplication can be optimized away (1 times anything is the original anything). If the size of an X is a power of 2 (and it usually is if X is any of the built-in types), the multiplication can be optimized into a left shift. (It's like multiplying by 10 in base 10.)
What about computing &a[MAX] every time though the loop? That's part of the comparison in the pointer version. Isn't it as expensive computing a[i] each time? It's not, because &a[MAX] doesn't change during the loop. Any decent compiler will compute that, once, at the beginning of the loop, and use the same value each time. It's as if you had written this:
/* how the compiler implements version (b) */
X *temp = & a[ MAX ];
/* optimization */
for ( p = a; p < temp; ++p )
{
x = *p;
/* do something with x */
}
This works only if the compiler can tell that a and MAX can't change in the middle of the loop. There are two other versions; both count down rather than up. That's no help for a task such as printing the elements of an array in order. It's fine for adding the values or something similar. The index version presumes that it's cheaper to compare a value with zero than to compare it with some arbitrary value:
/* version (c) */
for( i = MAX -
1; i >=
0; --i )
{
x = a[ i ];
/* do something with x */
}
The pointer version makes the comparison simpler:
/* version (d) */
for( p = & a[ MAX -
1]; p >= a; --p )
{
x = *p;
/* do something with x */
}
Code similar to that in version (d) is common, but not necessarily right. The loop ends only when p is less than a. That might not be possible.
The common wisdom would finish by saying, "Any decent optimizing compiler would generate the same code for all four versions." Unfortunately, there seems to be a lack of decent optimizing compilers in the world. A test program (in which the size of an X was not a power of 2 and in which the "do something" was trivial) was built with four very different compilers. Version (b) always ran much faster than version (a), sometimes twice as fast. Using pointers rather than indices made a big difference. (Clearly, all four compilers optimize &a[MAX] out of the loop.)
How about counting down rather than counting up? With two compilers, versions (c) and (d) were about the same as version (a); version (b) was the clear winner. (Maybe the comparison is cheaper, but decrementing is slower than incrementing?) With the other two compilers, version (c) was about the same as version (a) (indices are slow), but version (d) was slightly faster than version (b).
So if you want to write portable efficient code to navigate an array of values, using a pointer is faster than using subscripts. Use version (b); version (d) might not work, and even if it does, it might be compiled into slower code.
Most of the time, though, this is micro-optimizing. The "do something" in the loop is where most of the time is spent, usually. Too many C programmers are like half-sloppy carpenters; they sweep up the sawdust but leave a bunch of two-by-fours lying around.
No comments:
Post a Comment