The system provides three views of the domain data:
Overview: A high-level summary of the data collected for the domain, including Top 10 URLs by Hit, HMM Learning Progress, Violations Triggered by Anomalies, and Events Dashboard.
Tree View: Displays the entire URL directory of the domain in a tree view. You can click a URL path to view its violation statistics.
Parameter View: Displays statistics related to parameters, such as HMM learning stages, boxplots, and the distribution of anomalies. You can also rebuild parameters or set the strictness level for anomalies.
To view the collected domain data:
- Click Machine Learning > Anomaly Detection.
- Double-click a server policy that contains the desired anomaly detection profile.
- Scroll down to the bottom of the Edit Anomaly Detection Configuration page.
- In the Action column, click (View Domain).
The Overview tab provides a summary of data collected for the domain through the use of the anomaly detection profile. It reports information about the entire domain, including the domain overview, Top 10 URLs by Hit, HMM Learning Progress, Violations Triggered by Anomalies, and Events Dashboard.
The top of the Overview page provides a high-level summary of the data that the machine-learning model has learned about the domain.
Indicates how frequently this application is accessed.
The date and time when the machine-learning module started to learn about the domain.
The total number of URLs that the machine-learning module has learned.
The total number of alerts, including both the Alert action and the Alert & Deny action, that have been issued from the start time up to the present moment, as well as the percentage of each in the total number of requests.
The total amount of HTTP and HTTPS traffic from the start time up to now.
The charset of URLs in the domain, such as UTF-8.
The Top 10 URLs by Hit chart displays the 10 URLs with the highest page hit counts.
This chart displays the statistics of HMM learning states of all parameters in the domain.
Collecting: Indicates that the learning progress of the parameters is in the sample collection stage.
Building: Indicates that, after successfully collecting the samples, the anomaly detection module has begun to build all the mathematical models needed for the parameters. This is the model-building stage.
Testing: Indicates that, after the mathematical models have been built successfully, the models are being tested. All models must be tested against a certain number of samples until they prove to be stable.
Running: Indicates that the mathematical models of the parameters are stable and the anomaly detection model is running. Requests that trigger an anomaly move into the second anomaly detection layer, which checks whether they are actual threats.
Discarded: Indicates that FortiWeb has determined that it cannot build a mathematical model for these parameters, and therefore will not use anomaly detection to protect them.
This chart displays the total number of potential anomalies and definite anomalies found by the anomaly detection profile.
This chart displays the anomaly detection events, such as sample collection, model running, building and testing, along with the time periods when these events take place.
The Tree View displays the entire URL directory of the domain in a tree view. You can select any URL to view its violation statistics. Note that only URLs with parameters are included in the Tree View directory.
The left panel of the Tree View page shows the directory structure of the website. The / (forward slash) indicates the root of the site. Click a URL in the directory tree to display its violation statistics on the right side of the Tree View page. You can also click a directory, and then click Relearn Directory or Rebuild Directory to relearn or rebuild the anomaly detection models for all the URLs under the selected directory.
This part of the Tree View page shows the statistics of a specific URL.
The frequency at which this URL was accessed in the last 24 hours. The frequency is divided into 7 levels, as defined below:
Model Initialization Date: The date and time when the mathematical model of this URL was initialized. It shows when FortiWeb began to learn about the data of this URL.
The actions taken on all requests for this URL in the last 24 hours, including the number of requests alerted and blocked.
The anomalies detected by the machine learning model.
This chart shows the trend of violations in the last 24 hours, including the number of violations alerted and blocked.
This chart shows the number of violations triggered by anomaly type in the last 24 hours.
The Tree View page also provides the following control buttons: Rebuild URL, Relearn URL, and Import.
- Rebuild URL—Click this button to clear the existing mathematical model for the parameters in this URL, and then collect new samples and build the model again. The samples collected for the previous model are discarded.
- Relearn URL—Click this button to clear the existing mathematical model for the parameters in this URL, and then collect more samples to build the model. The samples collected for the previous model are not discarded. They are reused to build the new model.
- Import—Click this button to import an existing mathematical model of a specific parameter. For information on exporting data of a parameter, see Actions you can take on any parameter.
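The difference between Rebuild URL and Relearn URL comes down to what happens to the previously collected samples. The following Python sketch illustrates that difference only. The `UrlModel` class and its fields are hypothetical and do not correspond to any FortiWeb API.

```python
from dataclasses import dataclass, field


@dataclass
class UrlModel:
    """Hypothetical stand-in for the learned state of one URL."""
    samples: list[str] = field(default_factory=list)
    model_built: bool = True

    def rebuild(self) -> None:
        # Rebuild: clear the model and discard the samples it was built from.
        self.model_built = False
        self.samples.clear()

    def relearn(self) -> None:
        # Relearn: clear the model but keep the old samples so they can be reused.
        self.model_built = False
```

In both cases the model is cleared and sample collection resumes; only Relearn carries the old samples forward.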
The Parameters tab shows the HMM learning states of all the parameters attached to the URL. For example, if the URL is http://www.demo.com/1.php?user_name=jack, then user_name is the parameter. A URL can contain multiple parameters. Click the (View HMM Details) icon to view details on a parameter.
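To make the notion of a URL parameter concrete, the following Python sketch lists the parameter names in a URL's query string using the standard library. It only illustrates what "parameter" means here and is not a description of how FortiWeb parses requests. The function name `parameter_names` is made up for this example.

```python
from urllib.parse import parse_qs, urlsplit


def parameter_names(url: str) -> list[str]:
    """Return the query-string parameter names of a URL, in order of first appearance."""
    query = urlsplit(url).query
    # keep_blank_values=True so that a parameter with an empty value is still listed.
    return list(parse_qs(query, keep_blank_values=True).keys())


print(parameter_names("http://www.demo.com/1.php?user_name=jack"))
```

For the example URL above, the only name reported is user_name.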
Parameter View displays anomaly detection statistics for all the parameters. Click the parameter name in the left-side navigation bar to see details for this parameter.
Parameter Name: The name of the parameter.
HMM Learning Stage: The stage which the HMM learning process is in. It can be one of the following:
- Collecting—The system is collecting data samples.
- Building—Sample collection is complete, and the system is building the mathematical models. Note: This phase lasts only a few seconds.
- Testing—In this phase, the system collects 500 samples for this argument, and tests them against the mathematical model. If 5% of the samples for this argument are recognized as anomalies, this mathematical model is considered invalid. The system will discard the learning results and rebuild the mathematical model.
- Running—The system enters this stage after the testing has completed successfully. FortiWeb will use this mathematical model to evaluate all new samples for this argument. If the samples are anomalies, the system will employ the second anomaly detection layer to verify whether the anomaly is an attack and take the corresponding action.
- Discarded—FortiWeb has determined that it cannot build a mathematical model for these parameters, and therefore will not use anomaly detection to protect them.
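The pass/fail rule of the Testing stage can be sketched as follows. This is a minimal illustration under stated assumptions, not FortiWeb's implementation: `is_anomaly` is a hypothetical callback standing in for the mathematical model, and the sketch assumes that reaching exactly 5% counts as a failure, which the description above does not specify.

```python
from typing import Callable, Sequence

TEST_SAMPLE_COUNT = 500   # from the text: samples collected in the Testing stage
MAX_ANOMALY_PERCENT = 5   # from the text: 5% anomalies invalidates the model


def model_passes_testing(
    samples: Sequence[str],
    is_anomaly: Callable[[str], bool],
) -> bool:
    """Return True if the model may move to Running, False if it must be rebuilt."""
    if len(samples) < TEST_SAMPLE_COUNT:
        raise ValueError("the Testing stage needs 500 samples")
    tested = samples[:TEST_SAMPLE_COUNT]
    anomalies = sum(1 for value in tested if is_anomaly(value))
    # Integer comparison avoids floating-point rounding at the 5% boundary.
    # Assumption: exactly 5% is treated as a failure.
    return anomalies * 100 < MAX_ANOMALY_PERCENT * len(tested)
```

With 20 anomalous samples out of 500 (4%) the model passes; with 25 (5%) it is treated as invalid under the assumption above.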
Collected Samples: The number of samples collected during the sample collection period.
Note that the diagrams introduced below are available only when the parameter is in the Testing or Running stage.
Applications change frequently as new URLs are added and existing parameters take on new functions. This means that the mathematical model of a parameter might no longer match what FortiWeb originally observed during the collection phase. In this case, FortiWeb needs to relearn the parameter and update its mathematical model.
First, FortiWeb needs to determine that the functions of the parameter have changed. To do that, it uses boxplots to depict numerical data and the probability distribution of a certain number of parameter values.
Every time the system observes 500 valid parameter values, it generates one boxplot to display the probability distribution of these values. During the sample collection period, the system generates 2 or 4 boxplots (sample boxplots). After the anomaly detection model is built, the system keeps generating new boxplots to display the probability distribution of the new inputs. The following is an example of the boxplot diagram. New boxplots are shown in blue, and the sample boxplots are shown in brown. The system displays at most five new boxplots. When new inputs arrive and a new boxplot is generated, the system removes the oldest one on the left to make room for it.
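The "one boxplot per 500 values, at most five new boxplots" behavior can be sketched with a bounded queue. This is an illustrative sketch only: the class `BoxplotHistory`, the function `boxplot_summary`, and the choice of a five-number summary computed with Python's `statistics.quantiles` are assumptions, and the input values are assumed to be numeric scores.

```python
import statistics
from collections import deque

VALUES_PER_BOXPLOT = 500  # from the text: one boxplot per 500 valid values
MAX_NEW_BOXPLOTS = 5      # from the text: at most five new boxplots are shown


def boxplot_summary(values):
    """Five-number summary: (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return (min(values), q1, median, q3, max(values))


class BoxplotHistory:
    """Hypothetical container for the most recent new boxplots."""

    def __init__(self):
        self._pending = []
        # A deque with maxlen drops the oldest entry when a new one is appended.
        self.new_boxplots = deque(maxlen=MAX_NEW_BOXPLOTS)

    def observe(self, value: float) -> None:
        self._pending.append(value)
        if len(self._pending) == VALUES_PER_BOXPLOT:
            self.new_boxplots.append(boxplot_summary(self._pending))
            self._pending = []
```

After 3,000 observed values, six boxplots have been generated, but only the five most recent remain in `new_boxplots`.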
In the boxplot diagram, the central rectangular area of a boxplot, where most of the data is located, is called the notch area. The area that spans all of the data, from the minimum value to the maximum value, is called the entire data distribution area. The system compares how much each new boxplot overlaps with the sample boxplots. Depending on the Application Change Sensitivity that you set in the anomaly detection profile, a lack of overlap leads the system to determine that the functions of the parameter have changed and to update the mathematical model for this parameter (that is, to re-collect samples and rebuild the model).
- Low—The system triggers model update only when the entire data distribution area of the new boxplot doesn't have any overlapping part with that of the sample boxplots.
- Medium—The system triggers model update if the notch area of the new boxplot doesn't have any overlapping part with the entire data distribution areas of the sample boxplots.
- High—The system triggers model update as long as the notch area of the new boxplot doesn't have any overlapping part with that of the sample boxplots.
The "number of boxplots do not overlap" setting in the anomaly detection profile is also a key factor to consider. For example, if you set this option to 2, the system triggers a model update when 2 new boxplots don't overlap with the sample boxplots.
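The three sensitivity levels and the non-overlap count can be expressed as interval checks. The sketch below is one possible reading of the rules above, not FortiWeb's code. It assumes that the notch area is the box between the first and third quartiles, and that a new boxplot counts as non-overlapping only if it overlaps none of the sample boxplots. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boxplot:
    minimum: float
    q1: float       # assumed lower edge of the notch area
    q3: float       # assumed upper edge of the notch area
    maximum: float


def _overlaps(a_low, a_high, b_low, b_high) -> bool:
    """True if the closed intervals [a_low, a_high] and [b_low, b_high] intersect."""
    return a_low <= b_high and b_low <= a_high


def differs(new: Boxplot, sample: Boxplot, sensitivity: str) -> bool:
    """True if `new` has no overlap with `sample` at the given sensitivity."""
    if sensitivity == "low":      # entire area vs. entire area
        return not _overlaps(new.minimum, new.maximum, sample.minimum, sample.maximum)
    if sensitivity == "medium":   # notch area vs. entire area
        return not _overlaps(new.q1, new.q3, sample.minimum, sample.maximum)
    if sensitivity == "high":     # notch area vs. notch area
        return not _overlaps(new.q1, new.q3, sample.q1, sample.q3)
    raise ValueError(f"unknown sensitivity: {sensitivity!r}")


def should_update_model(new_boxplots, sample_boxplots, sensitivity, required_count) -> bool:
    """True if at least `required_count` new boxplots overlap no sample boxplot."""
    non_overlapping = sum(
        1
        for new in new_boxplots
        if all(differs(new, sample, sensitivity) for sample in sample_boxplots)
    )
    return non_overlapping >= required_count
```

A new boxplot whose notch area lies outside the sample notch area but inside the sample's overall range triggers an update only at the High level, which is why High reacts to the smallest shifts.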
This diagram displays the potential or definite anomalies in red and the normal requests collected during the sample collection phase in blue. The system judges whether a request is normal based on its probability and the length of the parameter value.
The system uses the following formula to calculate whether a sample is an anomaly:
The probability of the sample > μ + the strictness level * σ
If the probability of the sample is larger than the value of "μ + the strictness level * σ", the sample is identified as an anomaly.
μ and σ are calculated from the probabilities of all the samples collected during the sample collection period, where μ is the mean of these probabilities and σ is their standard deviation. Both are fixed values, so the value of "μ + the strictness level * σ" varies only with the strictness level you set. As shown in the following diagram, the dotted red line (that is, the value of "μ + the strictness level * σ") sits at the position where the strictness level is set to 3, as in μ + 3σ. If the strictness level is set to a smaller value, the dotted red line moves closer to the center, which may cause more samples to be detected as anomalies. In short, the smaller the strictness level, the stricter the anomaly detection model.
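The threshold formula can be tried out with a few lines of Python. This is a sketch under assumptions rather than FortiWeb's implementation: the function names are invented, the input is a plain list of per-sample values called "probabilities" as in the text, and σ is computed as the population standard deviation, which the text does not specify.

```python
import statistics


def anomaly_threshold(sample_probabilities, strictness_level: float) -> float:
    """Return mu + strictness_level * sigma for the collected samples."""
    mu = statistics.fmean(sample_probabilities)
    # Assumption: population standard deviation over the collected samples.
    sigma = statistics.pstdev(sample_probabilities)
    return mu + strictness_level * sigma


def is_anomaly(probability: float, threshold: float) -> bool:
    """A sample is an anomaly when its value exceeds the threshold."""
    return probability > threshold
```

Because μ and σ are fixed once the samples are collected, lowering `strictness_level` lowers the threshold, so a value that is normal at level 3 can be flagged at level 1.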
You can use the following options to experiment on the strictness levels.
Inherit global settings: Select this option if you want this parameter to inherit the strictness level you have set for the domains in the anomaly detection policy.
Custom settings: Select this option if you want a different strictness level for this parameter. Specify different values and observe the movement of the dotted red line in the Anomaly Strictness Level Details diagram. Choose a value that gives the best detection accuracy without causing normal samples to be falsely detected as anomalies.
Test Sample: Click Test Sample, then enter a parameter value to verify whether it will be detected as an anomaly at the current strictness level.
There is a configuration button which, when clicked, will open a drop-down menu with the following options.
Rebuild Parameter: Clear the existing mathematical model for the parameter, and then collect new samples and build the model again. The samples collected for the previous model are discarded.
Relearn Parameter: Clear the existing mathematical model for the parameter, and then collect more samples to build the model. The samples collected for the previous model are not discarded. They are reused to build the new model.
Discard: Discard this parameter and do not rebuild it. This disables learning for this parameter and bypasses anomaly detection altogether for this parameter.
Export: Export the mathematical model for this parameter to a file. You can import the model to any URL. See Import under Rebuild URL and Import buttons.
The abnormal samples detected during the sample collection period. They are excluded from the samples used to build the anomaly detection model.
The samples which have been recognized as anomalies. The list may change as new strictness settings are applied.
These are the samples manually added from the attack logs. For more information, see Add additional sample from attack logs.
The anomaly detection events, such as sample collection, model running, building, and testing, along with the time periods when these events take place. These events are also displayed in the anomaly detection Events Dashboard on the Overview tab.