E cient Detection of All Pointer and Array Access Errors
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
Ecient Detection of All Pointer and Array Access Errors Todd M. Austin Scott E. Breach Gurindar S. Sohi Computer Sciences Department University of Wisconsin-Madison 1210 W. Dayton Street Madison, WI 53706 faustin, breach, sohig@cs.wisc.edu Abstract [MFS90], Miller et. al. injected random inputs (a.k.a \fuzz") into a number of Unix utilities. On systems from six di erent We present a pointer and array access checking technique vendors, nearly all of the seemingly mature programs could that provides complete error coverage through a simple set be coaxed into dumping core. The most prevalent errors de- of program transformations. Our technique, based on an ex- tected were memory access errors. In [SC91], Sullivan and tended safe pointer representation, has a number of novel Chillarege examined IBM MVS software error reports over a aspects. Foremost, it is the rst technique that detects all four year period. Nearly 50% of all reported software errors spatial and temporal access errors. Its use is not limited examined were due to pointer and array access errors. Fur- by the expressiveness of the language; that is, it can be ap- thermore, of these errors, 25% were temporal access errors { plied successfully to compiled or interpreted languages with an error our checking methodology is particularly adept at subscripted and mutable pointers, local references, and ex- catching. plicit and typeless dynamic storage management, e.g., C. Memory access errors are possible in languages with ar- Because it is a source level transformation, it is amenable to rays, pointers, local references, or explicit dynamic storage both compile- and run-time optimization. Finally, its per- management. Such errors are particularly dicult to detect formance, even without compile-time optimization, is quite and x because: good. We implemented a prototype translator for the C lan- guage and analyzed the checking overheads of six non-trivial, The e ects of a memory access error may not manifest pointer intensive programs. Execution overheads range from themselves except under exceptional conditions. 130% to 540%; with text and data size overheads typically The exceptional conditions which lead to the program below 100%. error may be very dicult to reproduce. 1 Introduction Once the error is reproduced, it may be very dicult to correlate the program error to the memory access error. It is not dicult to convince programmers (or employers of Consider the erroneous C function in Figure 1. This func- programmers) that programming errors are costly, both in tion can create a memory access error in the return state- terms of time and money. Memory access errors are particu- ment expression. The function will reference the word im- larly troublesome. A memory access error is any dereference mediately following the array referenced by the pointer data of a pointer or subscripted array reference which reads or if the array does not contain the token. writes storage outside of the referent. This access can either The function illustrates the three diculties in nding and be outside of the address bounds of the referent, causing a xing memory access errors. First, FindToken() will only spatial access error, or outside of the lifetime of the referent, produce an incorrect result if the word following the array causing a temporal access error. Indexing past the end of an referenced by data contains the same value as token (or is array is a typical example of a spatial access error. A typical inaccessible storage). This event is unlikely if the word con- temporal access error is assigning to a heap allocation after tains an arbitrary value. Second, when (or if) FindToken() it has been freed. creates an incorrect result, it will be dicult to recreate dur- Our own experiences as programmers as well as published ing debugging. The programmer will have to condition the evidence lead us to believe that memory access errors are an inputs of the program such that the word following the ar- important class of errors to reliably detect. For example, in ray referenced by data once again contains the same value This work was supported by grants from the National Science as token. If the value of the illegally accessed word is inde- Foundation (grant CCR-9303030) and Oce of Naval Research pendent of the value of token, the probability of success will (grant N00014-93-1-0465). be very low. Third, correlating the visible errors of the pro- gram to the incorrect actions of FindToken() may be very dicult. This connection may be very subtle and may not be visible for a long period of time. Many execution environments do provide some level of pro- tection against memory access errors. For example, in most Unix based systems, a store to the program text will cause the operating system to terminate execution of the program (usually with a core dump). Unix typically provides storage To appear in: \Proceedings of the SIGPLAN '94 Conference on Programming Language Design and Implementation"
int FindToken(int *data, int count, int token) f typedef f int i = 0, *p = data; *value; while ((i < count) && (*p != token)) f *base; p++; i++; unsigned size; g f g enum Heap=0, Local, Global storageClass; return (*p == token); int capability; /* plus FOREVER and NEVER */ g g SafePtr; Figure 1: A C function with a (spatial) memory access error. Figure 2: Safe pointer de nition. This C-like type de nition is parameterized by , the type of the pointer referent. protection on a segment granularity { the segments are the program text, data, and stack. Other, more hostile environ- 2 Safe Pointers ments such as MS-DOS, do not o er such luxuries, and stores to the program text may or may not manifest themselves as To enforce access protection, we must extend the notion of a program error. If a program error does occur, correlating a pointer value to include information about the referent. it to a fault may be dicult, if not impossible. The idea is similar to tagged pointers used in many Lisp implementations [Lee91]. Figure 2 shows our safe pointer Ideally, we would like the language execution environment representation. The de nitions of the contained elds follow: to support memory access protection at the variable level, that is, an access to a variable should only be valid if the value: The value of the safe pointer; it may contain any access is within the range (for both time and space) of the expressible address. intended variable { all other accesses should immediately ag base and size: The base address of the referent and its size an error. We call any program that supports these execution in bytes. In languages where pointers are immutable, semantics a safe program. base is redundant and may be omitted. With this in- Our solution to the memory access error problem is simple formation, we can detect all spatial access errors with a and provides ecient and immediate detection of all memory range check. access errors. We transform programs, at compile-time, to storageClass: The storage class of the allocation, either use an extended pointer representation which we call a safe Heap, Local, or Global. Using this value, it is possible pointer. A safe pointer contains the value of the pointer as to detect errant storage deallocations, e.g., it is illegal well as object attributes. The object attributes describe the to free a global or local variable. location, size and lifetime of the pointer referent. When a safe pointer value is created, either through the use of the capability: A capability to the referent. When dynamic reference operator (e.g., `&' in C) or through explicit storage variables are created, either through explicit storage al- allocation, we attach to it the appropriate object attributes. location (e.g., calls to malloc()) or through procedure As the value is manipulated, through the use of pointer op- invocations (i.e., a procedure call creates the local vari- erators, the object attributes are transferred to any new safe ables in the stack frame of the procedure), a unique ca- pointer values. Detecting a memory access error involves pability is issued to that storage allocation. The unique simply validating dereferences against the object attributes capability is also inserted into an associative store called { if the access is within the space and time bounds of the the capability store and deleted from that store when object, it is permitted, otherwise an error is agged and the the dynamic storage allocation is freed or when the pro- access error is detected immediately. cedure invocation returns (the exact mechanics of this process are discussed in a following section). Thus, the We implemented a prototype source-to-source translator collection of capabilities in the capability store represent for the C language and examined the performance of six non- all active dynamic storage. Temporal access errors oc- trivial, pointer intensive programs. The performance is quite cur whenever a reference is made through a stale pointer, good. Instruction execution overheads range from 130% to i.e., a pointer which references storage whose capability 540%, and text and data size overheads are typically below is no longer in the capability store. Two capabilities 100%. We also benchmarked our prototype system against are prede ned. FOREVER is unique and always ex- two commercially available tools that support memory access ists in the capability store; this capability is assigned to checking (Purify [HJ92] and CodeCenter [KLP88]) and found all global objects. NEVER is unique and never exists that our checking technique consistently uses less resources, in the capability store; this capability can be assigned even while providing better error coverage for memory access to invalid pointers to ensure any dereference causes an errors. error. This paper is organized as follows. Section 2 introduces our The value attribute is the only safe pointer member that extended safe pointer representation. Section 3 details the can be manipulated by the program source; all other mem- program transformations required to create safe programs, bers are inaccessible. base and size are the spatial at- and in Section 4 we discuss the translation and performance tributes. storageClass and capability are the temporal implications of providing complete error coverage. In Sec- attributes. tion 5, we present compile- and run-time optimization frame- Safe pointers can exist in three states: unsafe, invalid, and works. Section 6 describes our prototype implementation valid. If the object attributes are incorrect, we say that the and presents results of our analyses of six programs. Sec- pointer has become unsafe; dereferencing this pointer may tion 7 compares our checking technique to other published cause an undetected memory access error. It is the goal techniques. Section 8 concludes the paper. of this work to ensure that a safe pointer never becomes
unsafe. If the safe pointer is not unsafe, it is either invalid or valid, depending on whether a dereference would ag an void ValidateAccess( *addr) f error. Languages with mutable pointers allow the program if (storageClass != Global && to legally create invalid pointers; for example, iterating a !ValidCapability(capability)) pointer across all the elements of an array exits the loop FlagTemporalError(); with the pointer pointing to the memory location following if ((unsigned)(addr-base) > size-sizeof()) the last object. If the invalid pointer is never dereferenced, FlagSpatialError(); /* valid access! */ the program would not be in error. This behavior illustrates g precisely why we only place error checks at dereferences; it is not illegal to have an invalid pointer { only to use it. The initial value of a safe pointer, if not speci ed by an Figure 3: Memory access check. ValidCapability() indicates initialization expression, must be invalid. This condition en- whether or not the passed capability is currently active, i.e., in sures that a dereference before the initial assignment is de- the capability store. The Flag functions performs system speci c tected. A simple way to invalidate a pointer value is to assign handling of an access error. it the unique capability NEVER. the bounds check. If the accessed address is prior to the 3 Program Transformations start of the array, the unsigned subtraction under ows and creates a very large number, causing the test to fail. The2 Creating a safe program from its unsafe counterpart involves advantage of this expression over traditional bounds checks three transformations: pointer conversion, check insertion, is that it only requires one conditional branch to implement. and operator conversion. The rst, pointer conversion, ex- This simpli cation reduces the additional control complexity tends all pointer de nitions and declarations to include space introduced by dereference checks, which can result in better for object attributes. Check insertion instruments the pro- optimization results and better dynamic executions. gram to detect all memory access errors. Operator conver- sion generates and maintains object attributes. In this sec- 3.3 Operator Conversion tion, we also describe the run-time support. Pointer operators must interact properly with the compos- 3.1 Pointer Conversion ite safe pointer structure. When applied, they must reach into the safe pointer to access the pointer value. If the op- All pointer de nitions and declarations must be extended erator creates a new pointer value, it must include an un- to include object attributes. To make this transformation modi ed copy of the pointer operand's object attributes. For transparent, the composite safe pointer must mimic the rst example, in the C statement q = p + 6, the application of class value semantics of scalar pointers. That is, when passed the `+' operator on the pointer p creates a new safe pointer to a function, the safe pointer must be passed by value, and which is assigned to q. The new pointer value in q shares the when operators are applied to a safe pointer, the result, if a same object attributes as p. Operators which manipulate pointer, must be a new safe pointer. pointer values never modify the copied object attributes be- There is no need to add object attributes to array vari- cause changing the value of the pointer does not change the ables. Array variables (in the C sense) are merely address attributes of the storage it references. This property holds constants, and thus only exist as statically allocated objects even for pointers to aggregate structures. In this case, the or within structure de nitions; as a result, the spatial at- object attributes refer to the entire aggregate. tributes can be generated from the address constant and its The assignment operator requires special handling if the type size, and the temporal attributes can be taken from right hand side is a constant. Two common pointer constants the safe pointer to the containing object or derived from the are the NULL value and pointers to string constants (for C). array name. If the assignment value is NULL, the NULL value can be replaced by an invalid safe pointer value, e.g., one with the 3.2 Check Insertion capability NEVER. For string constants, we can generate the needed object attributes at compile-time. If the right hand Assuming the safe pointer object attributes are correct (how side of the assignment is a pointer expression, the result- to ensure this property is detailed in the following sections), ing pointer value (and its object attributes) is copied to the complete safety for all pointer and array accesses is provided pointer named on the left hand side of the assignment. by inserting1 an access check before each pointer or array Casting between pointer types does not require any spe- dereference. cial program transformations. This operation only alerts the The dereference check rst veri es that the referent is alive compiler that future pointer arithmetic or dereferences of a by performing an associative search for the referent's capa- particular pointer value should be made with respect to the bility. If the referent has been freed, the capability would no new type size. Casting to a non-pointer type requires that longer exist in the capability store and the check would fail. the object attributes be dropped (if only pointers carry ob- Because capabilities are never re-used, the temporal check ject attributes) and the cast be carried out as de ned by fails even if the storage has been reallocated. Once the stor- the language. Casting from a non-pointer type to a pointer age is known to be alive, a bounds check is applied to verify type is problematic if non-pointer types do not carry object that the entire extent of the access ts into the referent. attributes. We address this problem in Section 4. Our access check, shown in Figure 3, takes advantage of the wrap-around property of unsigned arithmetic to simplify 2 Our check is functionally equivalent to: 1 We use the term dereference as a blanket term for any indirect (addr < base || addr > base+size-sizeof()) access { either through application of the dereference operator (e.g., `*' or `->' in C) or through indexing an array or pointer which requires two conditional branches (or extra instructions to variable (e.g., `[ ]' in C). combine the boolean terms).
Handling of the reference operator, e.g., the `&' operator in the C statement q = &p->b[10], is slightly more complex void *malloc(unsigned size) f as it must generate object attributes. The reference opera- void *p; tor is applied to an expression (p->b[10], in our example) p.base = p.value = unsafe malloc(size); which names some storage. We call this expression the ac- p.size = size; cess path. The result of the operation is a new safe pointer p.storageClass = Heap; to the referent named by the expression. p.capability = NextCapability(); InsertCapability(p.capability); To generate object attributes for a reference operation bzero(p.value, size); /* capability NEVER is 0 */ (e.g., `&'), we decompose access paths into two parts, a pre x g return p; and a sux. The access path pre x is a non-empty sequence of variable names, dereferences, subscripts, eld selectors, f and pointer expressions leading to the memory object being void *calloc(unsigned nelem, unsigned elsize) referenced. The remaining part of the access path, the access g return malloc(nelem*elsize); path sux, is a possibly-empty sequence of eld selectors and subscripts (on array variables only) indicating the extent of void *realloc(void *p, unsigned size) f the memory object being referenced. void *new; We further classify access paths as direct or indirect. A new = malloc(size); direct access path refers to an object in the global or local bcopy(p.base, new.base, min(size, p.size)); space by name. An indirect access path contains at least one free(p); pointer traversal. g return new; Given a reference operator expression, we can parse the access path pre x by traversing the expression tree starting void free(void *p) f with the left-most, lowest precedence operator. The part if (p.storageClass != Heap) of the expression up to but not including the last pointer FlagNonHeapFree(); traversal is the access path pre x; the remainder of the ex- if (!ValidCapability(p.capability)) pression becomes the access path sux. If the access path FlagDuplicateFree(); does not contain any pointer traversals, the access path pre- if (p.value != p.base) FlagNonOriginalFree(); x is merely the name of the referenced variable. DestroyCapability(p.capability); To illustrate this decomposition, consider the C expression g unsafe free(p.value); &f->g[4].i[6], where g is a pointer and i is an array within a structure. The access path pre x is the sub-expression f->g[4]. The access path sux is the remainder of the ex- pression, i[6]. The access path pre x is indirect. Figure 4: Safe malloc implementation with additional checking. InsertCapability(), ValidCapability(), and The temporal attributes of the new safe pointer are de- DestroyCapability() insert, locate, and delete capabilities, re- rived from the access path pre x. If the pre x is direct, the spectively. NextCapability() returns the next unique capability. referenced object is either a global or a local variable. If unsafe malloc() andunsafe free() are interfaces to the system- global, the capability FOREVER is assigned to the new safe de ned storage allocator. pointer. If local, the capability allocated to the local vari- able's stack frame is assigned to the new safe pointer (frame 3.4 Run-Time Support capability allocation is discussed in the following section). If the access path pre x is indirect, the temporal attributes are The explicit storage allocation mechanism must be extended taken from the safe pointer named by the access path pre x. to create safe pointers. During allocation, a capability must The spatial attributes are derived from both the access be allocated for the storage, and any contained pointers must path pre x and sux. The base of the safe pointer is taken be invalidated. At deallocation, the capability given to the from the object referred to by the access path pre x, namely storage must be destroyed. the address of the named variable for a direct pre x or the Figure 4 shows how this support would be provided for malloc(), the storage allocator provided under Unix. Dur- corresponding spatial attributes of the referenced safe pointer ing allocation, malloc() generates a safe pointer using the for an indirect pre x. The value and size of the safe pointer size and location of the allocation request. The call to are computed from the access path sux. Because all mem- NextCapability() returns the next available and unused ca- bers of the referenced object (i.e., the member of any con- pability. NextCapability() can be implemented with an in- tained structure) are of a known size, the spatial attributes crementing counter or a pseudo-random number generator. of the reference can be computed at compile-time from type The capability is inserted into the capability store via the information. In the event the nal term of the sux is a call to InsertCapability(). The call to bzero() clears the subscript, the spatial attributes are set to the extent of the entire storage allocation. This action ensures that any point- entire array. This technique allows the safe pointer to be ers in the untyped allocation are initially invalid (assuming subsequently manipulated to point to other members of the the storage class of Heap and capability NEVER are both array. assigned the value of 0). The use of the access path pre x and sux to produce The implementation of realloc() is slightly more subtle. a safe pointer via the reference operator cannot subvert the This function takes an existing storage allocation and resizes checking framework. In order to maintain safe semantics, it to the requested size. The reallocated storage may move for any pointers traversed within the access path pre x must any request, either larger or smaller. If moved, the contents be validated using the techniques described in the previous of the new allocation will be unchanged up to the lesser of the subsection. new and old sizes. In our safe programming environment, we
the capabilities of all active memory. It can be implemented void Func(int a) f as a hash table with the capability as the hash key. Accesses /* procedure prologue */ to the capability store exhibit a great deal of temporal lo- unsigned frameCapability = NextCapability(); cality, so moving accessed elements to the head of the hash InsertCapability(frameCapability); table bucket chains is likely to decrease average access time. ZeroFramePointers(); /* cap. NEVER == 0 */ We close this section with two examples. Figure 6(a) . . shows a spatial access error, and Figure 6(b) demon- /* procedure epilogue, common exit point */ strates a temporal access error. Safe pointer values DestroyCapability(frameCapability); are speci ed as a 5-tuple with the following format: return; [value,base,size,storageClass,capability]. x indicates a don't g care value. In the rst example, a spatial access error is agged when the program dereferences a safe pointer whose value is less than the base of the referent. In the second Figure 5: Function frame allocation and deallocation. example, a stale pointer, q, is dereferenced. Even though ZeroFramePointers() is a system speci c function which clears the same storage has been reallocated to p, the capability all pointers in the newly allocated stack frame. originally assigned to q has been destroyed during the call to free(); thus, the temporal access error is detected. must move the storage in all cases, otherwise, there may exist safe pointers (which we cannot locate and change) whose object attributes have incorrect records of the referent size. 4 Implications of Complete Error If dereferenced, these pointers may ag errors even though Coverage the access was valid in the reallocated storage, or worse, the reallocation may have shrunk the referent, creating unsafe Our safe programming technique can detect all memory ac- pointers whose referent sizes are too large. We can solve cess errors provided that the following conditions hold: both these problems by always moving the storage. This i. Storage management must be apparent to the transla- action will force the program to update any old pointers to tor. the previous allocation. Because the reallocated storage is allocated under a new capability, any stale pointers to the ii. The referents of all pointer constants must have a known previous allocation will ag errors if dereferenced. We need location, size, and lifetime. not clear the remaining storage in the reallocation if it is larger, as the call to malloc() returns cleared storage. iii. The program must not manipulate the object attributes At calls to free(), the capability of the allocation (con- of any pointer value. tained in the safe pointer object attributes) is deleted from the capability store by the call to DestroyCapability(). Our claim to complete error coverage must be limited Our implementation also veri es that the freed storage is to storage management controlled by the safe programming indeed a heap allocation, has not been previously freed, and run-time system. If a program implements a domain speci c points to the head of the allocation (as this condition is re- allocator at the user level, some memory access errors, as quired by free()). viewed by the programmer, can be missed. The same allocation mechanism is applied to the dynamic Consider, for example, a xed size storage allocator. If a storage allocated in procedure stack frames. When a function program relies heavily on a xed size structure, storage re- is invoked, a capability must be allocated for the entire frame quirements and allocation overheads can be greatly reduced if it contains any referenced locals. Any pointers contained by applying a xed size allocation strategy. At the program in the frame must be set to an invalid state. level, the xed size allocator calls the system allocator, e.g., malloc() or sbrk(), to allocate a large memory allocation. Figure 5 shows how this rewriting would be done for a The xed size allocator then slices the system allocation into C function. The function ZeroFramePointers() serves the xed size pieces with a zero overhead for each allocation. same purpose as the call to bzero() in malloc(); it ensures Under this scheme, our safe programming technique would that any pointers in the procedure stack frame are initially ensure that no accesses to a xed size allocations are out- invalid by clearing the frame storage. Because stack frame side of the space and time bounds of the block from which allocations are strongly typed, ZeroFramePointers() could the xed size allocation was derived. This imprecision oc- be replaced by NULL assignments to all the frame pointers. curs because the translator can not distinguish the user level If the language supports non-local jumps, e.g., longjmp() storage allocation actions from other pointer related program in C, the run-time support must delete the frame capabilities activities. of any elided function frames. This operation can be simply With some programmer intervention this problem can be and portably implemented if the local capability space and overcome. Any useful safe compiler implementation will have heap capability space are kept disjoint, and function frame to include an application programmers interface, or API, capabilities are allocated using an incrementing counter. The through which systems programmers can construct and ma- allocation of frame capabilities then becomes a depth- rst nipulate the object attributes of safe pointers. In the case of numbering [ASU86] of the dynamic call graph. When a non- the xed size storage allocator, the programmer would spec- local jump occurs, all elided frame capabilities between the ify the base and size of the xed size allocation. The storage source frame and destination frame are deleted by removing class and capability would be generated from the safe pointer all frame capabilities in the capability store that are larger to the block from which the xed size allocation was derived. than the frame capability of the destination frame. This Without the second quali cation, the compiler may not mechanism only works if the source and destination frames be able to generate correct object attributes for a pointer are on the same call stack { this stipulation may not be true constant. For example, device driver code typically creates in all cases, e.g., coroutine jumps. pointers to device bu ers and registers by recasting an inte- The capability store is an associative memory containing ger to a pointer value. The translator has no way of knowing
struct { char a; p q capability store char b[100]; } x, *p; char *q; [x,x,x,x,NEVER] [x,x,x,x,NEVER] { } p = &x; [1000,1000,101,Global,FOREVER] [x,x,x,x,NEVER] { } *p; /* no error */ " " " q = &p−>b[10]; " [1011,1001,100,Global,FOREVER] " q−−; " [1010,1001,100,Global,FOREVER] " *q; " " " p −= 2; [798,1000,101,Global,FOREVER] " " *p; /* error!!! */ " " " a) p q capability store char *p, *q; [x,x,x,x,NEVER] [x,x,x,x,NEVER] { } p = malloc(10); [2000,2000,10,Heap,1] [x,x,x,x,NEVER] { 1 } q = p+6; " [2006,2000,10,Heap,1] " *q; /* no error */ " " " free(p); " " { } p = malloc(10); [2000,2000,10,Heap,2] " { 2 } *q; /* error!!! */ " " " b) Figure 6: Memory access checking examples. Figure a) is an example of a spatial access error, Figure b) is an example of a temporal access error. Safe pointer values, shown after each line is executed, are speci ed as a 5-tuple with the following format: [value,base,size,storageClass,capability]. An occurrence of x indicates a don't care value. the size and lifetime of the referent; thus, program safety one in which pointer values are never created from or manip- cannot be maintained. In C, the only well de ned pointer ulated as non-pointer values. If a program violates this rule constants are NULL, strings, and functions. For all other intentionally (e.g., through a recast), a safe compiler which cases, this problem can be avoided by supplying the pro- makes a conservative approximation as to the intended ref- grammer with an API suitable for specifying the size and erent of the new 3pointer value allows the pointer to access lifetime of problematic pointer constants. any live storage. If the rule is broken unintentionally (e.g., The second quali cation does not, however, preclude the through incorrect use of a union), the error will likely be use of recasts from non-pointer variables to pointer variables. caught because it is dicult to manufacture, accidentally, To successfully support these operations, object attributes an unsafe pointer. must be attached to all variables. In general, to provide complete safety, we need to attach object attributes to any storage that could hold a pointer value. It is our contention 5 Optimizing Dereference Checks that most \well behaved" programs will only require pointer variables to carry object attributes. In the interest of performance, it may be possible to elide The nal quali cation protects object attributes. If a pro- dereference checks and still provide complete program safety. gram can arbitrarily manipulate the object attributes of a If we can determine that the following invariant holds, the pointer value, then safety can always be subverted. For ex- check may be elided. ample, changing the storage class of a pointer from Global to Heap and then freeing the pointer would likely cause dis- A check at a dereference of pointer value v may be astrous e ects under our storage allocation scheme. elided at program point p if the previous, equivalent If object attributes are only attached to pointer values, the check executed on v has not been invalidated by some danger exists of manipulation through the use of recasts or program action. unions. With a recast, it is possible to type storage in the We can implement this check optimization either at run- referent rst as a non-pointer value, manipulate the storage time or at compile-time. Run-time check optimization has arbitrarily, and then recast the referent storage to a (possibly the advantage of being more exible. We only need to ex- unsafe) pointer. Using a union, it is possible to create a ecute the checks absolutely required to maintain program pointer value under one eld and then manipulate the object safety. However, the cost for this precision is extra safe attributes of the pointer value through another overlaid, non- pointer state which must copied, maintained, and checked at pointer eld of the union. each dereference. Compile-time check optimization, on the The only solution that we can conceive to prevent this other hand, is less exible because we must constrain the kind of manipulation is to attach object attributes to each decision to elide a check to all previous possible executions byte of allocated storage. For types larger than one byte, the leading to a program point. The advantage of compile-time object attributes would be copied to all other storage holding the allocation. In this way, any arbitrary overlaying of types 3 Note that in this case, safety can no longer be guaranteed would still not allow the object attributes to be manipulated because the intended referent is not known. Hence, we cannot at the program level. bind the object attributes of a live variable to the new pointer In reality, we can provide a high margin of safety for \well because the program may have manipulated the pointer value to behaved" programs by attaching object attributes only to point outside of the intended referent prior to recasting it to a pointer values. We consider a well behaved program to be non-pointer value.
void ValidateAccess( *addr) f : A ow graph G with blocks B with gen[Bi ] and Input if (freeCount != currentFreeCount) f kill[Bi ] computed for each block Bi 2 B. gen[Bi] is the if (storageClass != Global && set of check expressions generated in Bi . kill[Bi ] is the !ValidCapability(capability)) set of check expressions killed in Bi . The entry block is B1 . FlagTemporalError(); Output: A ow graph G with redundant checks deleted. freeCount = currentFreeCount; Method : The following procedure is executed twice, once g for spatial check optimization and again for temporal if (lastDerefAddr != addr) f check optimization. if ((unsigned)(addr-base) > size-sizeof()) FlagSpatialError(); /* initialize out sets */ lastDerefAddr = addr; in[B1 ] = ;; g out[B1 ][= gen[B1 ]; /* valid access! */ U = Bi 8B gen[Bi ]; g for Bi 2 B ; B1 do out[Bi ] = U ; kill[Bi ]; /* compute availability of checks, in sets */ Figure 7: Memory access check with run-time check opti- change = true; mization. The variable currentFreeCount is a global counter while change do begin incremented each time storage is deallocated. change = false; T for Bi 2 B ; B1 do begin check optimization is that no additional overhead is required in[Bi ] = P 2Pred[B] out[P ]; at run-time to determine if a check may be elided. oldout = out[Bi ]; out[Bi ] = gen[Bi] [ (in[Bi ] ; kill[Bi]); 5.1 Run-Time Check Optimization 6 oldout then if out[Bi ] = change = true; We have designed and implemented a framework for dynami- end cally eliding spatial and temporal checks. Spatial checks have end no side e ects, thus we can employ memoization [FH88] (or /* elide redundant checks */ function caching) to elide their evaluation. We store the for Bi 2 B ; B1 do begin operands to the last check in the safe pointer object at- for c 2 gen[Bi ] do begin tributes, which amounts to the e ective address of the last if c 2 in[Bi ] then dereference. At any dereference, the spatial check may be elide check c; end elided if the e ective address since the last check has not end changed. This test is shown in Figure 7 in the if statement surrounding the bounds check. It may be useful to memoize more than one set of operands. In our implementation, we Figure 8: Compile-time check optimization algorithm. memoize both the e ective address of the last dereference, i.e., use of the C operator `*', and the e ective address of In the rst phase, the algorithm seeds the data- ow anal- the last subscript operation, i.e., use of `[ ]'. Changes in ysis by approximating all out sets. For all blocks except the the former can be tracked with only a single \dirty" bit, set entry block, the value of out[Bi ] is set to all check expressions when the pointer value is changed. Changes in the latter are less those killed by the block Bi , i.e., U ; kill[Bi ]. For the tracked by retaining a copy of the last index applied to the program entry block, B1 , we must assume that no checks are pointer value. available, hence, in[B1 ] is set to empty and out[B1 ] is set to To elide temporal checks, we keep a copy of a global the checks generated in the entry block B1 . counter, incremented when storage is deallocated, in the safe In the second phase, the data- ow framework is solved to pointer. If this counter, which we call the free counter, has determine where check expressions reach in the program. For not changed since the last temporal check, the referent has a check expression to reach a node Bi , it must be available at not been freed and the temporal check can be safely elided. Bi for all executions, that is, it must be available in the out In our implementation, we keep separate counters for heap sets of all predecessors to block Bi . This requirement is pre- and stack deallocations. cisely why the con uence operator is intersection. After the data- ow computation converges on a solution, i.e., change 5.2 Compile-Time Check Optimization == false, the set in[Bi ] contains all checks that reach block Bi . We have also designed (and are currently implementing) a In the third phase, the in sets are used to elide redundant compile-time optimization framework like that proposed by checks. Checks may be elided wherever a lexically identical Gupta [Gup90]. Our algorithm implements a forward data- or equivalent (if more powerful tests are applied) check is ow framework similar to that used by common subexpres- available in the block (i.e., the same check is in the in set of sion elimination [ASU86]. However, our algorithm extends the block). previous work to include eliding of temporal error checks, The de ning feature for each analysis (spatial and tem- and because of our simpli ed bounds check, there is no need poral) is the speci cation of what constitutes a kill. A spa- to split the optimization into upper and lower bounds check tial check is killed by any assignment to a check operand, elimination. which includes assignment to the pointer variable or any of Our optimization algorithm is shown in Figure 8. The the operands of the index expression (if the pointer was in- algorithm is run twice, once for optimization of spatial checks dexed in the check expression). A temporal check is killed and again for temporal checks. The algorithm executes in by any deallocation of the referent storage. If the referent three phases. of a free can be determined to be di erent than the check
referent (e.g., through alias analysis), the free need not kill the check. Program Instructions Insts per Dereference While performing these analyses, we must also be wary Static Dynamic Static Dynamic of kills that may occur through function calls or aliases. In 3 (x10 ) 6 (x10 ) either case, we must make a conservative approximation if insucient information is available and assume that a kill Anagram 10.0 19.4 106.3 7.6 does occur. Backprop 10.8 122.4 148.5 8.9 6 Experimental Evaluation GNU BC 19.5 12.2 15.5 7.6 Min-Span 11.9 13.3 48.7 5.9 We evaluated our safe programming methodology by imple- Partition 13.5 21.1 62.4 3.7 menting a semi-automatic source-to-source translator and YACR-2 18.5 546.2 37.1 14.0 examining the run-time, code and data size overheads for six non-trivial programs. For each program, we analyzed its performance without optimization and with run-time re- Table 1: Analyzed programs. solved optimizations; we did not consider compile-time opti- mizations (our current implementation does not support this base, 4 byte size, a 1 byte storage class speci er, and a 2 byte technique, though work in this direction is in progress). capability. For run-time check optimization, we added a 1 byte dirty ag, a 4 byte last index, and a 2 byte free counter 6.1 Experimental Framework for a total size of 22 bytes (450% overhead). Due to a bug in the C++ compiler, we could not use sizeof() in the safe We translated C programs to their safe counterparts by rst pointer implementation if the referent referred to itself; as rewriting all pointer and array declarations, calls to malloc() a result, BC, Min-Span, and Partition all required the size and free(), and references (use of the `&' operator) to use of the referent to be stored in the safe pointer, which added our Safe-C macros. These macros, when passed through the a 4 byte overhead for these programs. There were no space C preprocessor (CPP), produce either the original C program overheads for array variables, as all required object attributes or a Safe-C program. A Safe-C program has all pointer and are known at compile-time. We only rewrote the actual pro- array declarations changed to type parameterized C++ class gram code, all system library routines remained unchecked. declarations. Using operator overloading in the C++ class We did, however, perform interface checking. Whenever a de nition, we implement the extended safe pointer and array system library is called, any pointer arguments are validated semantics as described in Section 3. against the time and space bounds expected by the library All explicit storage allocation, i.e., calls to malloc() and routine. For example, if a call were made to fread(), the free(), call wrapper functions which create safe pointers interface check would ensure that the destination of the read from the standard library routines. Our malloc() implemen- was live storage and that the entire length of the read oper- tation clears all allocated storage, so any contained pointers ation would t into the referent. start in the invalid state. If a local in a function is used as a pointer referent, we also rewrite the function to allocate a 6.3 Results capability for the frame. Any pointer in the stack frame of a function is initialized to an invalid state in the constructor Figure 9 shows the execution overheads for the analyzed of the C++ safe pointer class. Application of the reference programs. The Unopt columns show total dynamic instruc- operator calls a function which creates a safe pointer from tion counts for executions with no optimization, and the Opt the decomposed access path. columns show instruction counts with run-time resolved op- timization. 6.2 Analyzed Programs For the run-time optimized executions, the normalized in- struction counts range from 2.3 (YACR-2) to 6.4 (BC). This We analyzed six programs, selected because each exhibits overhead re ects program performance without any compile- a high frequency of indirect references. The programs in- time optimization. While this performance degradation will clude an anagram generator (Anagram), a neural net sim- likely be acceptable for the development cycle of short or ulator (Backprop), an arbitrary precision calculator (GNU medium length program executions, it may still be pro- BC), a minimum spanning tree generator (Min-Span), a hibitively expensive for very long running programs, and it graph partitioning tool (Partition), and a VLSI channel is certainly too costly a price to pay for in- eld instrumen- router (YACR-2). Table 1 details the programs that we ana- tation of a program. Examining more closely the breakdown lyzed. For each, we show the code size (Instructions/Static), of the execution overheads yields much insight into how the the number of instruction executed without checking (In- performance of our checking methodology could be improved. structions/Dynamic), the frequency of dereferences in the For each program, we break down the overhead costs into program text (Insts per Dereference/Static), and the dy- ve categories. We measured this cost by compiling and run- namic frequency of dereferences executed (Insts per Deref- ning the program repeatedly with incrementally more func- erence/Dynamic). tionality in the safe pointer implementation. Original Pro- All programs were compiled and executed on a DECsta- gram is the instruction count for the unchecked program, tion 3100 using AT&T USL cfront version 3.0.1. The output always normalized to one. of cfront (C code) was compiled using MIPS cc version 2.1 User De ned Ptr is the cost in our framework for imple- at optimization level `-O2'. All instruction counts were ob- menting all pointers as structures at the user level. The pri- tained with QPT [Lar93]. mary factors a ecting performance here are increased loads, For all analyses, object attributes were only attached to stores, and function calls. The rst factor is due to the MIPS pointer values. We used a 15 byte safe pointer (275% over- cc compiler's handling of structure variables; once wrapped head) in the unoptimized case: 4 byte pointer value, 4 byte in a structure, the eld variables are no longer eligible for reg-
12 4.5 4 10 3.5 Normalized Instruction Count Normalized Text Size 8 3 2.5 6 2 4 1.5 1 2 0.5 0 0 Unopt Opt Unopt Opt Unopt Opt Unopt Opt UnoptOpt Unopt Opt Unopt Opt Unopt Opt Unopt Opt Unopt Opt UnoptOpt Unopt Opt Anagram Backprop BC Min-Span Partition YACR-2 Anagram Backprop BC Min-Span Partition YACR-2 Original Program User Defined Ptr Spatial Data Original Program User Defined Ptr Spatial Data Spatial Checks Temporal Data Temporal Checks Spatial Checks Temporal Data Temporal Checks Figure 9: Execution overheads. Figure 10: Text overheads. ister allocation. Wrapping attributes around pointers also in- performance advantage. Second, our run-time resolved op- creases the costs of procedure call parameter passing. MIPS timization of temporal checks is very e ective. Temporal cc compiler passes most scalar arguments through registers; checks are rarely required, even for BC and Min-Span, both however, composite structures are always passed through of which free storage often. In the case of Backprop, adding memory (on the stack). The second major factor a ecting run-time optimization for temporal checks resulted in an in- performance is an increased number of function calls. The creased execution overhead. Backprop has only one dynamic AT&T C++ compiler simpli es complex expressions created object, an array, so temporal checking is relatively cheap during template instantiation by extracting portions of the without any optimization (the capability is always at the expression into static functions. This cost is only a side-e ect head of the hash bucket chain). In this case, the cost of of our implementation. maintaining the extra storage required for the free counter Spatial Data is the cost of maintaining and copying spatial outweighs the cost of executing all temporal checks. object attributes. For the optimized executions, this over- Adding checking code reduces the e ectiveness of many head includes the cost of maintaining the pointer dirty bits traditional compiler optimizations. We inline all check code and previous index values. Spatial Checks is the cost of per- except for calls to ValidCapability() and abort(). These forming spatial checks. Temporal Data is the cost of main- functions are both externally de ned, so the compiler must taining and copying temporal object attributes. For the op- make conservative assumptions as to what actions they take. timized executions, this overhead includes the cost of main- This conservative approximation has the e ect of limiting the taining the additional counter variable. Temporal Checks is e ectiveness of many optimizations such as invariant code the cost of performing temporal checks. motion, register allocation, copy propagation, and common For BC, Min-Span, and Partition, run-time resolved opti- subexpression elimination. Neither of these functions pro- mization paid o with a slightly lower execution cost for spa- duce any side-e ects for normal executions. Hence, better tial checking. For Anagram, Backprop, and YACR-2, adding compiler integration, i.e., providing a special channel of com- run-time checks resulted in a higher cost for spatial access munication between the safe program generator and the com- checking; and in the case of Backprop, a higher overall exe- piler optimizer, would certainly increase the performance of cution overhead. our safe executions.4 These programs demonstrate the trade-o s involved in Text size overheads are shown in Figure 10. All check- providing run-time resolved optimization. Run-time opti- ing code, except the capability routines and what the C++ mization adds the extra overhead of copying, maintaining, compiler extracts for expression simpli cation, is inlined into and checking the extra safe pointer state. If this added over- the original program text. Surprisingly, the text overheads head, plus the overhead of the required checks, is greater than are quite small; 35% to 300% for the unoptimized executa- doing all the checks, there is no advantage to run-time check bles and 41% to 340% for the run-time optimized programs. optimization. With faster checks, compile-time optimization, The text sizes for the run-time optimized programs are larger and spatially complex programs, this trade-o becomes even due to additional code required for maintaining, copying, and more acute. Since Anagram, Backprop, and YACR-2 must checking the extra object attributes. As shown by comparing execute many of their checks (39%, 67%, and 86% respec- Table 1 and Figure 10, there is a strong correlation between tively), they do not bene t from the run-time optimizations. static dereference density and the resulting text overhead. For YACR-2, the e ects are much less pronounced because The data size overheads, shown in Figure 11, are mea- dereferences are much less frequent (as shown in Table 1). sured as the total size of initialized (.data) and uninitialized The second e ect to observe when comparing the opti- (.bss) data segments plus the size of the heap segment when mized to unoptimized execution costs is that the greatest the program terminates execution. The data size overheads bene t of run-time check optimization always comes from on the stack were not measured. All programs, except Min- eliding temporal checks. In fact, adding run-time optimiza- tion for temporal checks caused a signi cant decrease in all 4 Many compilers, e.g. GNU gcc, already understand the spe- execution overheads except Backprop. There are two facets cial semantics of abort() and use this inter-procedural information to this result. First, temporal checks are very expensive to improve optimizations. We should be able to achieve the same (requiring an associative search), so eliding one has a great results for ValidCapability().
4.5 7 Related Work Our rst attempt at creating a safe programming environ- 4 3.5 ment for C employed reference chaining. The technique is similar to that used by many \smart pointer" implementa- Normalized Data Size 3 tions [EP91, Gin92]. The idea is to insert any pointer which 2.5 is generated either through use of explicit storage allocation, 2 e.g., malloc(), the reference operator (`&'), or assignment into a reference chain rooted at the referent. When a pointer 1.5 value is destroyed, e.g., through assignment, storage deallo- 1 cation, or at procedure returns, the pointer is removed from the reference chain. This technique has a number of useful 0.5 properties. First, it is possible to ensure temporal safety by 0 destroying all pointer values when a referent is freed { just march down the reference chain assigning NULL to all point- Unopt Opt Unopt Opt Unopt Opt Unopt Opt UnoptOpt Unopt Opt Anagram Backprop BC Min-Span Partition YACR-2 ers. Second, if a destructed pointer value is the last value in Original Program User Defined Ptr Spatial Data Temporal Data the referent's reference chain, a storage leak has occurred and it is detected immediately. Unfortunately, this tech- nique cannot be made to work reliably in C. It is relatively Figure 11: Data overheads. easy for the programmer to subvert the checking mechanism through recasting and typeless calls to free(). Storage leak Span, have data size overhead below 100%. Backprop has the detection also fails in the presence of circular references. The lowest overhead (less than 5%) because most of its storage safe programming technique described in this paper is signif- is large global arrays which do not require any object at- icantly more reliable because its correctness does not rely on tributes. Min-Span has the highest overhead (330%), which tracking pointer values. stems from the high density of pointers in its heap alloca- Some researchers have recently proposed providing com- tions, most of which contain eight pointers and three inte- plete program safety through limiting the constructs allowed gers. Some of the run-time optimized programs have slightly in the language. The main thrust of this work is the de- larger overheads due to the additional object attributes. sign of languages that support garbage collection reliably To summarize the main points of our results: and portably. In [ED93], a safe subset of C++ is de ned. Execution overheads, even without compile-time opti- The safe subset does not permit any invalid pointers to be mization, are low enough to make our methodology use- created; this restriction, for example, precludes the use of ful during program development. However, the over- any explicit pointer arithmetic. The safe subset also requires heads are not likely low enough that programmers would some checking, but much less than our checking technique re- release software with checking enabled. We are currently quires. Languages which can easily be made totally safe have exploring the use of compile-time optimization and bet- existed for a long time; for example, many FORTRAN imple- ter compiler integration as means of increasing the per- mentations provide complete safety through range checking. formance of our approach. However, these languages tend to be less expressive than in- trinsically unsafe languages such as C or C++. We felt that The largest contributing factors to execution overhead it was important not to restrict the expressiveness available are 1) safe pointer structures are not register allocated, to the programmer. Our checking technique is not limited and 2) many traditional optimizations fail with the addi- by the language upon which it is applied, it can be applied tion of checks. Other performance losses are attributed successfully to compiled or interpreted languages with sub- to the C++ compiler simplifying expressions through scripted and mutable pointers, local references, and explicit the use of static functions, and, due to a bug in the and type-less dynamic storage management. C++ compiler, the need to include the type size of the Table 2 details our work (Safe-C) and ve other published referent in the object attributes. None of these dicul- systems that support memory access checking. ties are without recourse, however. Better integration Hastings' and Joyce's Purify [HJ92] is a commercially between the safe compiler and the optimizer could x available memory access checking tool. It is particularly con- most problems. venient to use because it does not require program source Dynamically eliding spatial checks is generally ine ec- { all transformations are applied to the object code. Pu- tive, primarily because maintaining the extra state, and rify supports both spatial and temporal access error check- checking it, quickly outweighs the cost of executing all ing to heap storage only through the use of a memory state checks. Our spatial check is very cheap to execute, and map which is consulted at each load and store the program pointer intensive programs tend to execute most of the executes. Purify also provides uninitialized read detection, checks anyway. and storage leak detection through a conservative collector [Boe93, BW88]. Spatial access errors are detected by brack- Temporal checks, on the other hand, are very expensive eting both ends of any heap allocation with a \red zone". to perform and are rarely required, so run-time opti- These zones are marked in the memory state map as inac- mization shows to be bene cial in most cases. cessible. When a load or store touches a red zone a memory access error is agged. Temporal access errors are detected The text and data size overhead are generally quite low. by setting the memory state of freed storage to inaccessible. The text overheads for all programs with run-time op- Purify cannot detect all memory access errors. For exam- timization, range from 41% to 340%, with all but two ple, accessing past the end of an array into the region of the below 100%. Data overheads range from 5% to 330%, next variable, or accessing freed storage that has been real- with all but one below 100%. Run-time optimized exe- located cannot be detected. These limitations occur because cutions have slightly larger text and data sizes. Purify does not determine the intended referent of memory
You can also read