Viewing domain data

The system provides three views of the domain data:

Overview
A high-level summary of the data collected for the domain, including Top 10 URLs by Hit, Violations Triggered by Anomalies, HMM Learning Progress, and Event Dashboard.
Tree View
Displays the entire URL directory of the domain in a tree view. You can click a URL path to view its violation statistics.
Parameter View
Displays statistics related to parameters, such as HMM learning stages, boxplots, and the distribution of anomalies. You can also rebuild parameters or set the strictness level for anomalies.

To view the collected domain data:

Click Machine Learning > Anomaly Detection.
Double-click a server policy that contains the desired anomaly detection profile.
Scroll down to the bottom of the Edit Anomaly Detection Configuration page.
In the Action column, click (View Domain).

Overview

The Overview tab provides a summary of data collected for the domain through the use of the anomaly detection profile. It reports information about the entire domain, including the domain overview, Top 10 URLs by Hit, HMM Learning Progress, Violations Triggered by Anomalies, and Events Dashboard.

Domain overview

The top of the Overview page provides a high-level summary of the data that the machine-learning module has learned about the domain.

Parameters	Description
Access Frequency	Indicates how frequently this application is accessed.
Start Time	The date and time when the machine-learning module started to learn about the domain.
URL Number	The total number of URLs that the machine-learning module has learned.
Action (Alert/Block)	The total number of alerts, including both the Alert action and the Alert & Deny action, that have been issued from the start time up to the present moment, as well as the percentage of each in the total number of requests.
Service(HTTP/HTTPS)	The total amount of the HTTP and the HTTPS traffic from the start time up to now.
Page Charset	The charset of URLs in the domain, such as UTF-8.

Top 10 URLs by Hit

The Top 10 URLs by Hit chart displays the 10 URLs with the highest page hit counts.

HMM Learning Progress

This chart displays the statistics of HMM learning states of all parameters in the domain.

Parameters	Description
Collecting	Indicates that the learning progress of parameters is in the sample collecting stage.
Building	Indicates that, after successfully collecting the samples, the anomaly detection module has begun to build all the mathematical models needed for the parameters. This is the model-building stage.
Testing	Indicates that, after the mathematical models have been built successfully, the models are being tested. All models must be tested against a certain number of samples until they have proved to be stable.
Running	Indicates that the mathematical models of the parameters are stable, and the anomaly detection model is running. Requests triggering an anomaly will move into the second anomaly detection layer to check whether they are actual threats.
Discarded	Indicates that FortiWeb has determined that it cannot build a mathematical model for these parameters, and therefore will not use anomaly detection to protect them.

Violations Triggered by Anomalies

This chart displays the total number of potential anomalies and definite anomalies found by the anomaly detection profile.

Event Dashboard

This chart displays the anomaly detection events, such as sample collection, model running, building and testing, along with the time periods when these events take place.

Tree View

The Tree View displays the entire URL directory of the domain as a tree. You can select any URL to view its violation statistics.

Web site directory

The left panel of the Tree View page shows the directory structure of the website. The / (forward slash) indicates the root of the site. When you click a URL in the directory tree, the violation statistics of this URL are displayed on the right side of the Tree View page. You can also click a directory and then click Rebuild Directory to rebuild the anomaly detection models for all the URLs under the selected directory.

URL-specific data

This part of the Tree View page shows the statistics of a specific URL.

Parameters	Description
Access Frequency	The frequency at which this URL was accessed in the last 24 hours. The frequency is divided into 7 levels: Level 1 (over 500 requests), Level 2 (over 1000 requests), Level 3 (over 1500 requests), Level 4 (over 2000 requests), Level 5 (over 2500 requests), Level 6 (over 3000 requests), Level 7 (over 3500 requests).
Model Initialization Date	The date and time when the mathematical model of this URL was initialized. It shows when FortiWeb began to learn about the data of this URL.
Action (Alert/Block)	The actions taken on all requests for this URL in the last 24 hours, including the number of requests alerted and blocked.
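
The Access Frequency levels above can be sketched as a simple threshold lookup. This is only an illustration of the table, not FortiWeb's implementation: the function name is hypothetical, and returning 0 for a URL that has not exceeded 500 requests is an assumption, since the table does not say how such URLs are reported.

```python
# Lower bounds from the Access Frequency table; a URL must exceed a bound
# to reach the corresponding level.
LEVEL_THRESHOLDS = [500, 1000, 1500, 2000, 2500, 3000, 3500]


def access_frequency_level(requests_last_24h: int) -> int:
    """Return the level (1-7) reached by a URL.

    Assumption: 0 is returned when no threshold is exceeded; the source
    table does not define this case.
    """
    return sum(1 for bound in LEVEL_THRESHOLDS if requests_last_24h > bound)
```

For example, a URL with 1001 requests in the last 24 hours exceeds the first two bounds and therefore maps to Level 2.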

Violation Trend

This chart shows the trend of violations in the last 24 hours, including the number of violations alerted and blocked.

Triggered Violations Based on Anomaly Type

This chart shows the number of violations triggered by anomaly type in the last 24 hours.

Rebuild URL and Import buttons

The Tree View page also provides two control buttons: Rebuild URL and Import.

Rebuild URL: Click this button to discard all existing learning results for the URL, including the method of the URL and the mathematical models of all its arguments, and then relearn all data about the URL.
Import: Click this button to import an existing mathematical model of a specific parameter. For information on exporting the data of a parameter, see Actions you can take on any parameter.

Parameters

The Parameters tab shows the HMM learning states of all the parameters attached to the URL. For example, if the URL is http://www.demo.com/1.php?user_name=jack, then user_name is the parameter. A URL can contain multiple parameters. Click the (View HMM Details) icon to view details on a parameter.

Column	Description
Parameter Name	The name of the parameter attached to this URL.
HMM Learning Stage	The stage that the HMM learning process is in. It can be one of the following. Collecting: The system is collecting data samples. Building: Sample collection is complete, and the system is building the mathematical models. Note: This phase lasts only a few seconds. Testing: The system collects 500 inputs for this argument and tests them against the mathematical model. If 5% of the inputs for this argument are recognized as anomalies, the mathematical model is considered invalid, and the system discards the learning results and rebuilds the model. Running: The system enters this stage after testing has completed successfully. FortiWeb uses the mathematical model to evaluate all new inputs for this argument. If an input is an anomaly, the system employs the second anomaly detection layer to verify whether the anomaly is an attack and takes the corresponding action. Discarded: FortiWeb has determined that it cannot build a mathematical model for these parameters, and therefore will not use anomaly detection to protect them.
HMM Details	Click the (View HMM Details) icon to view the probability boxplots and the distribution of anomalies triggered by HMM. Note: The boxplots and the anomaly distribution chart are available only when the parameter is in the testing or running stage. See the discussions below.
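
The outcome of the Testing stage described in the table can be sketched as follows. The stage names, the 500-input test size, and the 5% limit come from the table; everything else is an assumption made for illustration, in particular the function name, treating exactly 5% as a failure, and modelling a rebuild as a return to the Collecting stage.

```python
from enum import Enum


class Stage(Enum):
    COLLECTING = "Collecting"
    BUILDING = "Building"
    TESTING = "Testing"
    RUNNING = "Running"
    DISCARDED = "Discarded"


TEST_INPUT_COUNT = 500     # inputs collected during Testing (from the table)
MAX_ANOMALY_PERCENT = 5    # anomaly share that invalidates the model (from the table)


def stage_after_testing(anomaly_count: int, tested: int = TEST_INPUT_COUNT) -> Stage:
    """Return the stage that follows Testing for one parameter."""
    if tested < TEST_INPUT_COUNT:
        # Not enough inputs yet: the parameter stays in Testing.
        return Stage.TESTING
    # Integer arithmetic avoids floating-point rounding at the 5% boundary.
    if anomaly_count * 100 >= MAX_ANOMALY_PERCENT * tested:
        # Model considered invalid. Assumption: the rebuild restarts
        # with sample collection.
        return Stage.COLLECTING
    return Stage.RUNNING
```

With 500 tested inputs, 24 anomalies lets the model move to Running, while 25 anomalies sends it back for a rebuild.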

Allow Method

You can set the HTTP request methods that are allowed to access the URL.

There are two ways to set the allowed methods: By Machine Learning and Customized.

Method	Description
By Machine Learning	The system automatically sets the HTTP request methods in the Allow Method Settings based on the results of machine learning. It collects samples of HTTP requests for this URL, referring to the Trust or Black IP list configured in the Anomaly Detection profile to decide whether to collect samples from a given client. If the content type of the request is HTML or Text, the system collects 1024 samples for this URL; for other content types, it collects 256 samples. You can set the sample collection time period with the CLI command sequence "config waf machine-learning-policy", "edit <policy-id>", "set method-learning-time", "next", "end". The system stops collecting samples only when the expected number of samples has been collected and the collection has lasted for the specified time period. If an HTTP request method is used by more than 1% of all requests, the anomaly detection model allows this method in the Allow Method Settings. If the methods learned by the machine learning model do not look reasonable, click the Rebuild Method button to rebuild them.
Customized	This approach allows you to customize the allow methods.
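
The method-learning rules in the table can be sketched as below. The sample counts (1024 and 256) and the 1% rule come from the table; the function names are hypothetical, and matching "HTML or Text" against the MIME types text/html and text/plain is an assumption, since the source does not say how content types are compared.

```python
from collections import Counter


def required_samples(content_type: str) -> int:
    """Number of samples to collect for a URL, per the table above."""
    # Assumption: "HTML or Text" corresponds to these two MIME types.
    return 1024 if content_type.lower() in ("text/html", "text/plain") else 256


def learned_methods(sampled_methods: list[str]) -> set[str]:
    """Return the methods used by more than 1% of the sampled requests."""
    counts = Counter(method.upper() for method in sampled_methods)
    total = sum(counts.values())
    # count / total > 1%  is rewritten as  count * 100 > total  to stay in integers.
    return {method for method, count in counts.items() if count * 100 > total}
```

In a sample of 1024 requests, a method seen only 10 times (under 1%) would be left out of the allowed set under this sketch.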

To set a custom allowed method:

Click the Customized tab.
Select any method(s) of interest.
Click Apply.

To switch back to the default allowed method (machine learning):

Click the By Machine Learning tab.
Click Apply.

Parameter View

Parameter View displays anomaly detection statistics for all the parameters. You can click Add Filter at the top left of the page, and filter the parameters by name or learning status.

Probability Boxplots

Applications change frequently as new URLs are added and existing parameters take on new functions. This means that the mathematical model of a parameter might differ from what FortiWeb originally observed during the collection phase. In this case, FortiWeb needs to re-learn the parameter and then update its mathematical model.

First of all, FortiWeb needs to determine that the functions of the parameter have changed. To do that, it uses boxplots to depict numerical data and the probability distribution of a certain number of parameter values.

Every time the system observes 500 valid parameter values, it generates one boxplot to display the probability distribution of these values. During the sample collection period, the system generates 2 or 4 boxplots (sample boxplots). After the anomaly detection model is built, the system keeps generating new boxplots to display the probability distribution of the new inputs. The following is an example of the boxplot diagram. New boxplots are shown in blue, whereas the sample boxplots are brown. The system displays at most five new boxplots. As new inputs arrive and a new boxplot is generated, the system removes the oldest one on the left to make room for the new boxplot.
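
The rolling behavior described above (one boxplot per 500 values, at most five new boxplots kept) can be sketched with a bounded queue. This is an illustrative sketch only: the class name is hypothetical, and summarizing each boxplot as minimum, quartiles, and maximum is an assumption about what the chart plots.

```python
from collections import deque
from statistics import quantiles

VALUES_PER_BOXPLOT = 500  # from the text
MAX_NEW_BOXPLOTS = 5      # from the text


class BoxplotWindow:
    """Keeps summaries of the most recent boxplots, oldest first."""

    def __init__(self) -> None:
        self._pending: list[float] = []
        # A deque with maxlen drops the oldest entry automatically.
        self.boxplots: deque = deque(maxlen=MAX_NEW_BOXPLOTS)

    def observe(self, probability: float) -> None:
        """Record one valid parameter value; emit a boxplot every 500 values."""
        self._pending.append(probability)
        if len(self._pending) == VALUES_PER_BOXPLOT:
            q1, median, q3 = quantiles(self._pending, n=4)
            self.boxplots.append(
                (min(self._pending), q1, median, q3, max(self._pending))
            )
            self._pending = []
```

After 3000 observed values, six boxplots have been generated, but only the latest five remain in `boxplots`.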

In the boxplot diagram, the rectangular area in the middle of a boxplot, where most of the data is located, is called the notch area, whereas the area containing all the data, from the minimum value to the maximum value, is called the entire data distribution area. FortiWeb compares how much the new boxplot overlaps with the sample boxplots. Depending on the Application Change Sensitivity that you set in the anomaly detection profile, a lack of overlap leads the system to determine that the functions of the parameter have changed, and it then updates the mathematical model for this parameter (that is, it re-collects samples and rebuilds the model).

Low: The system triggers a model update only when the entire data distribution area of the new boxplot does not overlap at all with that of the sample boxplots.
Medium: The system triggers a model update if the notch area of the new boxplot does not overlap at all with the entire data distribution areas of the sample boxplots.
High: The system triggers a model update as soon as the notch area of the new boxplot does not overlap at all with that of the sample boxplots.

The "The number of boxplots do not overlap" setting in the anomaly detection profile is also a key factor to consider. For example, if you set this option to 2, the system triggers a model update when 2 new boxplots do not overlap with the sample boxplots.
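
The three sensitivity levels and the non-overlap count can be sketched as interval comparisons. This is a minimal sketch under stated assumptions, not FortiWeb's algorithm: the names are hypothetical, each boxplot is reduced to four numbers, and a new boxplot is assumed to count as non-overlapping only when it overlaps none of the sample boxplots.

```python
from typing import NamedTuple


class Boxplot(NamedTuple):
    low: float         # minimum value (start of the entire data distribution area)
    notch_low: float   # lower edge of the notch area
    notch_high: float  # upper edge of the notch area
    high: float        # maximum value (end of the entire data distribution area)


def _overlaps(a_lo: float, a_hi: float, b_lo: float, b_hi: float) -> bool:
    """True if the closed intervals [a_lo, a_hi] and [b_lo, b_hi] intersect."""
    return a_lo <= b_hi and b_lo <= a_hi


def is_separated(new: Boxplot, samples: list[Boxplot], sensitivity: str) -> bool:
    """True if `new` overlaps none of the sample boxplots at this sensitivity."""
    for s in samples:
        if sensitivity == "low":
            hit = _overlaps(new.low, new.high, s.low, s.high)
        elif sensitivity == "medium":
            hit = _overlaps(new.notch_low, new.notch_high, s.low, s.high)
        elif sensitivity == "high":
            hit = _overlaps(new.notch_low, new.notch_high, s.notch_low, s.notch_high)
        else:
            raise ValueError(f"unknown sensitivity: {sensitivity!r}")
        if hit:
            return False
    return True


def should_update_model(new_boxplots: list[Boxplot], samples: list[Boxplot],
                        sensitivity: str, required_count: int) -> bool:
    """Trigger an update once enough new boxplots are separated from the samples."""
    separated = sum(is_separated(b, samples, sensitivity) for b in new_boxplots)
    return separated >= required_count
```

The sketch makes the ordering of the levels visible: a new boxplot whose whiskers still touch the sample data but whose notch area has moved away triggers an update at Medium or High, but not at Low.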

Distribution of Anomalies triggered by HMM

The Distribution of Anomalies triggered by HMM chart displays the potential or definite anomalies in red and the normal requests collected during the sample collection phase in blue. The system judges whether a request is normal based on its probability and the length of the parameter value.

Manage anomaly-detecting settings

This section of the page shows the settings the system uses to detect anomalies. You can either click the Inherit global setting tab to use the anomaly detection settings for this anomaly detection profile, or click the Custom settings tab to define settings specific to this parameter. Both tabs use the same settings to detect anomalies:

Strictness Level for Potential Anomaly
Strictness Level for Definite Anomaly

These two settings control how strictly the system detects anomalies. The value can range from 0.1 to 1.0. The higher the value, the stricter the detection. For example, 0.1 means that the 0.1% of all samples with the largest HMM probability and length are treated as anomalies.

Changing the value of strictness here will cause changes in the Distribution of Anomalies triggered by HMM chart.

Definite anomalies are far more serious than potential anomalies. Therefore, the Strictness Level for Definite Anomaly must be lower than the Strictness Level for Potential Anomaly.
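
The constraints on the two strictness levels, and the way a level translates into a share of samples, can be sketched as follows. The 0.1 to 1.0 range, the percentage interpretation, and the rule that the definite level must be lower than the potential level come from the text; the function names and the rounding to a whole number of samples are assumptions.

```python
def validate_strictness(potential: float, definite: float) -> None:
    """Raise ValueError if the two strictness levels break the documented rules."""
    for name, value in (("potential", potential), ("definite", definite)):
        if not 0.1 <= value <= 1.0:
            raise ValueError(f"{name} strictness must be between 0.1 and 1.0")
    if not definite < potential:
        raise ValueError("definite strictness must be lower than potential strictness")


def flagged_sample_count(total_samples: int, strictness: float) -> int:
    """Number of samples treated as anomalies at a given strictness level.

    The level is read as a percentage: 0.1 flags 0.1% of all samples.
    Assumption: the result is rounded to the nearest whole sample.
    """
    return round(total_samples * strictness / 100)
```

For 10,000 samples, a level of 0.1 flags about 10 samples, and a level of 1.0 flags about 100.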

To set the anomaly detection settings:

Click either the Inherit global setting or the Custom settings tab.
Set the Strictness Level for Potential Anomaly.
Set the Strictness Level for Definite Anomaly.
Click Apply.

Actions you can take on any parameter

Clicking the configuration button opens a drop-down menu with three options, as illustrated below.

Menu option	Description
Rebuild Parameter	Clears the existing mathematical model for the parameter, and then begins to collect samples and build the models again. Use this option when you think that the current model cannot meet your needs, for example, when it produces false positives or fails to detect some attacks.
Discard	Discards this parameter and does not rebuild it. This disables learning for this parameter and bypasses anomaly detection altogether for this parameter.
Export	Exports the mathematical model for this parameter to a file. You can import the model to any URL. See Import under Rebuild URL and Import buttons.

Anomaly Samples

This list shows the samples that have been recognized as potential anomalies and definite anomalies. The list may change as new strictness settings are applied.

Additional Samples

These are the samples manually added from the attack logs. For more information, see Add additional sample from attack logs.

Top 10 Source IPs

This list shows the top 10 source IPs of the samples used for building the anomaly detection model. The percentage in the Percent column equals the sample count from this source IP divided by the total sample count.
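
The Percent column can be reproduced with a short calculation. This sketch only illustrates the formula stated above (sample count from the source IP divided by the total sample count); the function name and the sample addresses are hypothetical.

```python
from collections import Counter


def top_source_ips(sample_ips: list[str], limit: int = 10) -> list[tuple[str, int, float]]:
    """Return (ip, sample count, percent of all samples) for the most common IPs."""
    counts = Counter(sample_ips)
    total = len(sample_ips)
    # most_common() returns an empty list for empty input, so no division occurs.
    return [(ip, n, 100.0 * n / total) for ip, n in counts.most_common(limit)]
```

For example, if three of four samples come from 10.0.0.1, that address is listed first with 75%.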

If Server Objects > X-Forwarded-For is configured and referenced in a Web Protection Profile, the system records the source IPs based on the X-Forwarded-For policy configuration. If FortiWeb is deployed behind a proxy or load balancer that applies NAT, enabling X-Forwarded-For is recommended; otherwise, the source IPs recorded by the anomaly detection models will be the IP address of the proxy or load balancer, not that of the original client.

Events

This section lists the anomaly detection events, such as sample collection, model running, building, and testing, along with the time periods when these events took place. These events are also displayed in the anomaly detection Event Dashboard on the Overview tab.