Lightning Talks

**Resolving the almost decade-old checker dependency issue in the Clang Static Analyzer**

Kristóf Umann

As the number of checkers in the Static Analyzer grew, it was inevitable that some checkers would come to depend on one another. For example, MallocChecker, which despite its name performs all sorts of checks related to memory allocation, deallocation and reallocation, depends on CStringChecker to model calls to strcmp. While these two checkers are completely separate entities, the Static Analyzer also contains large checker classes that in fact expose multiple checkers to the user: IteratorChecker, for example, has a modeling part and exposes three iterator-related checkers, and enabling any of the three also enables the unexposed modeling part. With both kinds of structure present, it is difficult to find a solution that lets the developer (or the experienced user) easily see which checkers are enabled, as these dependencies are expressed only in the implementation.

This talk discusses elegant solutions for preserving these rather fragile checker structures by declaring the dependencies in TableGen files, and shows how checker developers (and users) can ensure that only the requested checkers are enabled when the analyzer is invoked. It also takes a very brief look at the other features the analyzer gained once these issues were resolved.
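To make the idea of declared dependencies concrete, here is a minimal sketch of how an enabled set could be computed from a dependency table. It is not the analyzer's implementation: the table is a plain map rather than TableGen, and `DependencyTable`, `resolveEnabled` and any checker names other than MallocChecker and CStringChecker are hypothetical.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical dependency table: checker name -> checkers it depends on.
// In the analyzer such facts would be declared in TableGen; this is a plain map.
using DependencyTable = std::map<std::string, std::vector<std::string>>;

// Return the requested checkers plus everything they transitively depend on.
std::set<std::string> resolveEnabled(const DependencyTable &deps,
                                     const std::vector<std::string> &requested) {
  std::set<std::string> enabled;
  std::vector<std::string> worklist(requested.begin(), requested.end());
  while (!worklist.empty()) {
    std::string name = worklist.back();
    worklist.pop_back();
    if (!enabled.insert(name).second)
      continue; // already visited; this also guards against cycles
    auto it = deps.find(name);
    if (it == deps.end())
      continue; // no declared dependencies for this checker
    for (const std::string &dep : it->second)
      worklist.push_back(dep);
  }
  return enabled;
}
```

With the dependencies written down as data, the full set of enabled checkers can be listed before the analysis runs instead of being discovered in the implementation.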

**Adopting LLVM Binary Utilities in Toolchains**

Jordan Rupprecht

Although many projects have migrated from GCC-based toolchains to Clang-based ones, tools from the GNU Binutils collection are still widely used despite having equivalents in the LLVM project. The problems faced when attempting to use LLVM tools range anywhere from simple command line syntax differences to unimplemented or buggy features. In this talk, I will describe some of the types of challenges we faced when adopting LLVM tools, as well as some of the strategies we used to test the toolchain.

**Multiplication and Division in the Range-Based Constraint Manager**

Ádám Balogh

The default constraint manager of the Clang Static Analyzer is a simple range-based constraint manager: it stores and manages the valid ranges for the values of symbolic expressions. Upon new assumptions it further constrains these ranges, which often results in an empty range that tells the analyzer the assumption is impossible. Until now the constraint manager could handle only basic assumptions: A <rel> m, A + n <rel> m and A - n <rel> m, where A is a symbolic expression, n and m are integer constants and <rel> is a relational operator. In the latter two cases, where a constant is added to or subtracted from the symbolic expression, the range of the additive expression is calculated by adjusting the range circularly by the constant. However, it could not cope with division and multiplication, so not even the range for A*2 could be deduced from the range of A. This shortcoming led to both false positives and missed true positives.
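The circular adjustment for additive expressions can be illustrated with a small sketch for signed 8-bit values. This is an illustration under stated assumptions, not the constraint manager's code: `Range`, `wrapSub`, `operandRange` and `contains` are hypothetical names, and a range with `lo > hi` is taken to mean one that wraps around the end of the type.

```cpp
#include <cstdint>

// A closed range [lo, hi] over signed 8-bit values. If lo > hi the range
// wraps around: it covers [lo, 127] together with [-128, hi].
struct Range {
  std::int8_t lo;
  std::int8_t hi;
};

// Subtract modulo 2^8. The final conversion to int8_t is modular on all
// mainstream compilers (and guaranteed to be so since C++20).
std::int8_t wrapSub(std::int8_t a, std::int8_t b) {
  const unsigned diff = static_cast<unsigned>(a) - static_cast<unsigned>(b);
  return static_cast<std::int8_t>(static_cast<std::uint8_t>(diff));
}

// Given the range of A + n, return the range of A by shifting both
// endpoints circularly by n.
Range operandRange(Range sum, std::int8_t n) {
  return Range{wrapSub(sum.lo, n), wrapSub(sum.hi, n)};
}

// Membership test that understands wrapped ranges.
bool contains(Range r, std::int8_t v) {
  if (r.lo <= r.hi)
    return r.lo <= v && v <= r.hi;
  return v >= r.lo || v <= r.hi;
}
```

For instance, if A + 100 is known to lie in [-128, -120], shifting by 100 gives A in [28, 36], since 28 + 100 overflows to -128 in 8-bit arithmetic.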

To improve the true positive/false positive ratio of the analyzer, we extended the range-based constraint manager to handle expressions of the form A <mul> k <add> n <rel> m, where A is a symbolic expression, k, m and n are integer constants, <mul> is a multiplicative operator (* or /), <add> is an additive operator (+ or -) and <rel> is a relational operator. The main challenge in our work was to scale the ranges correctly in circular arithmetic: for example, for signed 8-bit types, in A * 2 == 56 the value of A could be not only 28 but also -100. Similarly, in A / 3 == 4 the value of A is not necessarily 12, but anything in the range [12..14]. To ensure full correctness we also verified our solution exhaustively: first we generated every range for every constant in both 8-bit signed and unsigned arithmetic, then we tested whether the scaling algorithm calculates exactly the same ranges. Finally, we extrapolated this algorithm to wider integer types and ported it to the range-based constraint manager. According to our measurements there is no significant change in performance, and in the talk we will present the numbers of eliminated false positives and new true positives.
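The two 8-bit examples can be reproduced by brute force, in the spirit of the exhaustive check described above. The sketch below is not the scaling algorithm itself; it only enumerates every signed 8-bit value and models the arithmetic as wrapping at 8 bits, which is the assumption the examples rely on. `wrap8`, `solveMul` and `solveDiv` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reduce an int to the signed 8-bit range using modular arithmetic.
std::int8_t wrap8(int v) {
  return static_cast<std::int8_t>(static_cast<std::uint8_t>(v));
}

// All signed 8-bit values A for which A * k == m in wrapping 8-bit arithmetic.
std::vector<int> solveMul(int k, int m) {
  std::vector<int> solutions;
  for (int a = -128; a <= 127; ++a)
    if (wrap8(a * k) == m)
      solutions.push_back(a);
  return solutions;
}

// All signed 8-bit values A for which A / k == m (truncating division).
std::vector<int> solveDiv(int k, int m) {
  assert(k != 0 && "division by zero has no solutions to enumerate");
  std::vector<int> solutions;
  for (int a = -128; a <= 127; ++a)
    if (wrap8(a / k) == m)
      solutions.push_back(a);
  return solutions;
}
```

Calling `solveMul(2, 56)` yields -100 and 28, and `solveDiv(3, 4)` yields 12, 13 and 14. An enumeration like this gives the reference answers against which a range-scaling algorithm can be compared.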

**Statistics Based Checkers in the Clang Static Analyzer**

Ádám Balogh

In almost every development project there are conventions that the return value of certain functions in an external library must be compared to some extremal value, such as zero. For example, many integer functions return a negative number in case of error, just as pointer functions return null pointers. In a large project with many external functions it is virtually impossible to formalize all these rules explicitly: they are either unwritten or exist only in natural language. To help enforce these rules, we created checkers in the Clang Static Analyzer that discover them on a statistical basis and check the code against them. We currently support two kinds of extremal values: negative numbers for functions returning integers and null pointers for functions returning pointers.

Example:

int i = may_return_negative();

v[i]; // error: negative indexing

Exploration and checking of these rules happen in two phases. In the first phase we examine every function call and create a summary for each function, recording the percentage of calls whose return value is checked for negativeness (integer functions) or nullness (pointer functions). If this percentage is above a defined threshold (85% by default), we assume that the rule exists for the function. The second phase is the usual execution of the analyzer, where a checker checks the code for violations of the rule: it splits the execution path into two branches at each call of the listed functions, where the return value is an extremal value (negative for integers or null for pointers) on one branch and a non-extremal value on the other. Other checkers (e.g. the null-pointer dereference checker) are expected to find errors on the extremal-value branch if that branch is not terminated in the code by a check for the extremal value. The performance impact of the state split is low: in at least 85% of the cases the extremal-value branch is terminated quickly, and in the remaining cases we expect another checker to create a sink node because of an error. The new checker is under evaluation on open-source projects. We found some false positives, but their number can be reduced by including the arguments in the statistics.
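The first, statistical phase can be sketched as a simple threshold test over per-function counters. This is a minimal illustration, assuming the counts have already been collected; `CallStats`, `inferRules` and the function names used with it are hypothetical, and only the 85% default comes from the description above.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical per-function summary gathered in the first phase.
struct CallStats {
  unsigned long long checked; // calls whose return value was checked
  unsigned long long total;   // all calls seen
};

// Return the functions whose checked percentage is strictly above the
// threshold. Integer arithmetic avoids floating-point rounding at the boundary.
std::set<std::string> inferRules(const std::map<std::string, CallStats> &stats,
                                 unsigned thresholdPercent = 85) {
  std::set<std::string> rules;
  for (const auto &entry : stats) {
    const CallStats &s = entry.second;
    if (s.total == 0)
      continue; // no calls observed, so nothing can be inferred
    if (s.checked * 100 > s.total * thresholdPercent)
      rules.insert(entry.first);
  }
  return rules;
}
```

A function checked in 9 of 10 calls would be treated as having a rule, while one checked in exactly 85 of 100 would not, because the percentage has to be above the threshold.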

**Flang Update**

Steve Scalpone

An update about the current state of Flang, including a report on OpenMP 4.5 target offload, Fortran performance and the new f18 front end.

**Speakers**

Kristóf Umann

Jordan Rupprecht

Ádám Balogh

Steve Scalpone

Monday April 8, 2019 2:35pm - 3:05pm CEST

Theatre
