How to optimize the content and the scope of the analysis for Kiuwan Local Analyzer (KLA)

Select the right source code to analyze

Before you start an analysis with KLA, you must provide a source code directory. Every file in this directory will be analyzed. Analysis time and memory use grow in proportion to the size of the source code analyzed.

Excluding code that does not need to be analyzed is the first way to reduce time and memory use.

See our guide on Setting Source Code Filters with KLA.

As a rule of thumb, large source files are good candidates for exclusion from the analysis, for example:

Auto-generated code;
Library components;
Database exports.

To identify these large files, find the discovery.diagnosis.txt file in the temp directory of your analysis. It shows:

The number of files to analyze for every technology;
Any files larger than a preconfigured threshold (200 KB).
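If you want a similar list before running any analysis, a short script can walk the source directory and report oversized files. The following is only a minimal sketch: the function name find_large_files is hypothetical, the 200 KB default simply mirrors the threshold mentioned above, and the script does not read or reproduce discovery.diagnosis.txt.

```python
import os


def find_large_files(root, threshold_bytes=200 * 1024):
    """Return (path, size_in_bytes) pairs for files under root that are
    larger than threshold_bytes, sorted from largest to smallest.

    Hypothetical helper: it is not part of KLA and only approximates
    the large-file report described in the text.
    """
    large = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # skip files that vanished or cannot be read
            if size > threshold_bytes:
                large.append((path, size))
    large.sort(key=lambda item: item[1], reverse=True)
    return large
```

Calling find_large_files on your source directory returns the candidates to review for exclusion, such as auto-generated code, library components or database exports.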

Execute a ruleset according to your needs

The rules analysis step executes all of your model’s active rules on every file.

The default model (CQM) contains approximately 900 rules, of which approximately 700 are active.

This means that about 700 rules are executed on the source code of every file.

Are all active rules needed?

In a large analysis (for example, one with thousands of files), you will most probably be interested only in “important” defects.

A large set of rules generates defects for high-importance rules as well as for very low-importance ones.

Low-priority rules generate thousands of unimportant defects that increase the resources your analyses need.

Focus on your analysis needs. Avoid generating more defects than you need.

Use the model that best suits your needs, activating only the rules that are really important to you.

Activating only important rules is the most efficient way both to execute the analyses and to “consume” the results they produce.

Mute vs deactivate a rule

Defects are muted for different reasons, the most common being to hide defects that are considered false positives.

But muting a defect is meant to be occasional.

Bear in mind that muting only “hides” a rule’s defects; the rule is still executed.

If you are muting too many false positives, contact Kiuwan Technical Support immediately (and deactivate that rule).
If you are muting a rule because its defects do not apply to your application or are not of interest to you, deactivate the rule.

You will speed up the analysis process and make your analyses more manageable.

Please visit https://www.kiuwan.com/docs/display/K5/Models+Manager+User+Guide to learn how to deactivate rules and manage Kiuwan models.

Process JSP in Java analyses

If you are analyzing Java, there’s a configuration option that has a considerable impact on analysis performance and memory needs:

process JSP as Java servlets?

If this option is set to true (the default value), Kiuwan internally generates the Java servlet code for every JSP and executes the Java rules on it.

This servlet code generation consumes a considerable amount of time and memory.

The advantage of generating it is higher precision in detecting Code Security vulnerabilities that span JSPs and Java files (mainly XSS).
If this is not a concern for you, set this property to false; the analysis will run faster and need less memory.

Pay attention to SQL analyses

Kiuwan associates source files and technologies through file extensions.

KLA uses this association to run the appropriate engine on the source files.

See https://www.kiuwan.com/docs/display/K5/Kiuwan+Supported+Technologies for full details on extensions and technologies.

However, some extensions are commonly associated with more than one technology:

.sql is a typical example: it matches PL_SQL, Transact and Informix;
.c and .h are shared by C, C++ and Objective-C.

When running in GUI mode, KLA detects such ambiguous situations and asks the user to resolve them by selecting the appropriate technology. For example, a user who knows that the application is an Oracle application might select plsql.

In CLI mode, by contrast, KLA by default executes (in the sql case) all three available sql engines, wasting time and resources and producing confusing results (because it generates defect information from all those engines and their corresponding rules).

An easy way to avoid unnecessary processing is to specify the supported.technologies parameter with only the relevant technologies when invoking KLA in CLI mode.

If you know that you are analyzing PL_SQL, be sure to remove Transact and Informix from the list of supported technologies.

For further information, see Command Line Interface - SupportedTechnologies.

As another example, it is quite common to analyze applications that include export/import SQL scripts.

These scripts are usually huge files. If you neither exclude them nor change the default sql configuration, Kiuwan will analyze them with all the sql engines.

The result is a considerable waste of time and resources.

As general rules:

Specify only the appropriate sql engine in the supported.technologies parameter;
Exclude export/import script files from the analysis.
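To check in advance whether a source tree is exposed to this problem, you can count the files whose extension maps to more than one technology. The sketch below is illustrative only: the function name count_ambiguous_files and the AMBIGUOUS_EXTENSIONS table are hypothetical, and the table contains just the extensions named in this section, not the full Kiuwan mapping.

```python
import os
from collections import Counter

# Only the ambiguous extensions mentioned in this section; the complete
# extension-to-technology mapping is in the Kiuwan Supported Technologies page.
AMBIGUOUS_EXTENSIONS = {
    ".sql": ["PL_SQL", "Transact", "Informix"],
    ".c": ["C", "C++", "Objective-C"],
    ".h": ["C", "C++", "Objective-C"],
}


def count_ambiguous_files(root):
    """Count files under root per ambiguous extension (case-insensitive)."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext in AMBIGUOUS_EXTENSIONS:
                counts[ext] += 1
    return counts
```

A non-zero count for .sql, for instance, suggests restricting supported.technologies to the single sql engine that matches your database before running KLA in CLI mode.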

Duplicated code analysis

Duplicated code analysis (also known as clone detection) is also a memory- and CPU-intensive task.

However, it can be configured to reduce its time and memory requirements.

Two aspects affect resource consumption (mainly memory and execution time):

how to manage literals and identifiers
the minimum number of tokens a clone must have

This article (https://www.kiuwan.com/blog/avoid-duplicated-code-with-clone-detector/) explains how the clone detector works and the different ways of configuring it.

As the article explains, ignoring literals and identifiers is a “smart” way to find clones, but in many circumstances the results are not easy to understand.

Most of the time, we want duplicated code to mean “identical” code.

You can select this behavior (that is, detecting only identical code blocks) by setting the following properties:

{language}.ignore.literals=false
{language}.ignore.identifiers=false
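To make the effect of these two settings concrete, here is a toy illustration of token-based comparison. It is not Kiuwan’s clone detector: the tokenizer, the KEYWORDS set and the function names tokenize and is_clone are all assumptions made for this example, and it compares two whole snippets rather than searching for blocks with a minimum number of tokens.

```python
import re

# Toy tokenizer: numeric/string literals, words, and any other symbol.
TOKEN_RE = re.compile(
    r'(?P<literal>\d+(?:\.\d+)?|"[^"]*")'
    r'|(?P<word>[A-Za-z_]\w*)'
    r'|(?P<other>\S)'
)

# Hypothetical keyword set; keywords are never treated as identifiers.
KEYWORDS = {"if", "else", "for", "while", "return", "int"}


def tokenize(code, ignore_literals=False, ignore_identifiers=False):
    """Split code into tokens, optionally replacing literals and
    identifiers with placeholders so that their values are ignored."""
    tokens = []
    for match in TOKEN_RE.finditer(code):
        text = match.group()
        if match.lastgroup == "literal" and ignore_literals:
            text = "<LIT>"
        elif (match.lastgroup == "word" and text not in KEYWORDS
              and ignore_identifiers):
            text = "<ID>"
        tokens.append(text)
    return tokens


def is_clone(a, b, **options):
    """Two snippets are clones if their token sequences are equal."""
    return tokenize(a, **options) == tokenize(b, **options)


first = "total = price * 2;"
second = "sum = cost * 3;"
print(is_clone(first, second))  # identical-only comparison
print(is_clone(first, second, ignore_literals=True, ignore_identifiers=True))
```

With both options off, the two snippets differ in names and in the numeric literal, so they are not reported as clones; with both options on, they reduce to the same token sequence and are. This is why setting both properties to false yields fewer results that are easier to interpret.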

The minimum number of tokens in a clone (200 by default) can also be changed.

If the clone detector reports many duplicated blocks, increase the number of tokens.

There will then be fewer clones, which reduces the memory needed to run the clone detection process.

If you are not interested in duplicated code analysis at all, you can tell Kiuwan not to execute it.

To do so, in KLA CLI mode, specify ignore=clones on the command line.
