Administration Guide

Diagnosing memory leak issues

If memory usage is very high and rises quickly over a short period, a memory leak may be the cause. Use the following steps to analyze it.

Note that a memory increase does not always indicate a memory leak. A memory leak usually shows these symptoms:

  • Very fast, abnormal memory growth (usually under normal or low traffic)

  • Continuous memory growth with no memory being deallocated

  • Used memory is not deallocated even after traffic drops or stops

The most important step in troubleshooting a memory leak is to identify which module, process, or function causes the memory increase.

  1. Check history logs to see memory resource status:

    Log&Report > Event > Filter > Action > check-resource

    failure msg="mem usage raise too high,mem(67)

  2. Check whether any memory-related messages are printed to the console.
  3. Check the connection counts to see whether the memory increase may be caused by too many concurrent connections.

    /# netstat -nat | awk '{print $6}' | sort | uniq -c | sort -r

        319800 ESTABLISHED

        330 FIN_WAIT2

        251 LISTEN

        7 TIME_WAIT

        1 established)

        1 SYN_SENT

        1 Foreign

    A large number of TIME_WAIT or FIN_WAIT2 connections may be abnormal, because it suggests that connections are not being closed properly.

    If memory usage still does not decrease after the TIME_WAIT or FIN_WAIT2 connections are released, it may indicate a memory leak.
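The connection check in step 3 can also be scripted when the netstat output has been saved to a file. The following is a minimal sketch, not a FortiWeb tool: the function names `count_states` and `lingering_ratio` are hypothetical, and it assumes the standard `netstat -nat` layout in which the state is the sixth column. It also drops the two header lines, which is where the stray `established)` and `Foreign` entries in the listing above come from.

```python
from collections import Counter

# TCP states that netstat can report. Anything else in column 6 (for example
# the header words "Foreign" and "established)") is ignored.
TCP_STATES = {
    "ESTABLISHED", "SYN_SENT", "SYN_RECV", "FIN_WAIT1", "FIN_WAIT2",
    "TIME_WAIT", "CLOSE", "CLOSE_WAIT", "LAST_ACK", "LISTEN", "CLOSING",
}


def count_states(netstat_output: str) -> Counter:
    """Count connections per TCP state from saved `netstat -nat` output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[5] in TCP_STATES:
            counts[fields[5]] += 1
    return counts


def lingering_ratio(counts: Counter) -> float:
    """Share of connections that are in TIME_WAIT or FIN_WAIT2."""
    total = sum(counts.values())
    lingering = counts["TIME_WAIT"] + counts["FIN_WAIT2"]
    return lingering / total if total else 0.0
```

What counts as "too many" lingering connections depends on the deployment, so the ratio is only a value to track over time, not a pass/fail threshold.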

  4. Execute “diagnose debug memory” several times, then compare the outputs to find which part, module, or process has grown the most.

    Adjust the interval between runs according to how quickly memory is increasing.
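If each run of the command is saved as text, the comparison in step 4 can be done with a plain line diff. The sketch below assumes nothing about the layout of the command output and simply reports lines that differ between two captures; `changed_lines` is a hypothetical helper name.

```python
import difflib


def changed_lines(earlier: str, later: str) -> list[str]:
    """Return the removed ('-') and added ('+') lines between two captures."""
    diff = list(
        difflib.unified_diff(
            earlier.splitlines(),
            later.splitlines(),
            fromfile="earlier",
            tofile="later",
            lineterm="",
            n=0,  # no context lines, only the lines that changed
        )
    )
    # The first two entries are the '---' and '+++' file headers; hunk
    # markers start with '@@'. Everything else is a changed line.
    return [line for line in diff[2:] if not line.startswith("@@")]
```

Lines that keep changing across several captures point to the part, module, or process worth a closer look.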

  5. Use diagnose debug jemalloc-heap and diagnose system jeprof to trace and analyze memory occupation and the cause of memory usage over a period of time.
    • If the jemalloc profile is activated and memory usage exceeds the configured threshold, a heap file is generated in the directory /var/log/gui_upload.
    • You can use jemalloc-heap to show or clear the heap files. At most 10 heap files are kept on the device.
    • You can use jeprof to parse the heap files.
    • The jemalloc commands do not provide useful information when memory is not increasing.

      1) Enable the jemalloc profile:

          FortiWeb# diagnose debug jemalloc-conf proxyd enable

      2) If memory increases quickly, execute the command below to generate dump files.

          For example, you can wait for memory usage to increase by 10% and then execute the command. It is better to repeat the command several times, once for each further 10% increase:

          FortiWeb# diagnose debug jemalloc proxyd dump

      3) Check the dump heap file generated:

          FortiWeb # diagnose debug jemalloc-heap show

          jeprof.out.28279.1641342474.heap

          jeprof.out.4973.1641276249.heap

      4) After collecting a few heap files, execute the command below to parse them:

         FortiWeb # diagnose system jeprof proxyd

         Using local file /bin/proxyd

         Using local file /var/log/gui_upload/jeprof.out.28279.1641342474.heap

         Total: 124422365 B

         34403589  27.7%  27.7% 34403589  27.7% ssl3_setup_write_buffer

         34262011  27.5%  55.2% 34262011  27.5% ssl3_setup_read_buffer

         18062121  14.5%  69.7% 18062121  14.5% CRYPTO_zalloc

         17011023  13.7%  83.4% 17011023  13.7% _http_init

         9905760   8.0%  91.3%  9905760   8.0% BUF_MEM_grow

         3195135   2.6%  93.9%  3195135   2.6% buffer_new

         1583640   1.3%  95.2% 18857320  15.2% http_substream_process_ctx_create

         …

         Using local file /bin/proxyd

         Using local file /var/log/gui_upload/jeprof.out.4973.1641276249.heap

         Total: 576387295 B

         175840569  30.5%  30.5% 175840569  30.5% ssl3_setup_write_buffer

         175415833  30.4%  60.9% 175415833  30.4% ssl3_setup_read_buffer

         81823328  14.2%  75.1% 81823328  14.2% CRYPTO_zalloc

         72087699  12.5%  87.6% 72612307  12.6% _http_init

         8578052   1.5%  89.1% 84473564  14.7% http_substream_process_ctx_create

         7654262   1.3%  90.5%  7654262   1.3% asn1_enc_save

         7311586   1.3%  91.7%  7311586   1.3% http_get_modify_value_by_name

         6855757   1.2%  92.9%  6855757   1.2% pt_stream_create_svrinfo

         5851046   1.0%  93.9%  5851046   1.0% _hlp_parse_cookie

         5136808   0.9%  94.8%  5136808   0.9% http_process_ctx_create
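To see which functions differ most between two parsed profiles, the text output above can be processed offline. This is a minimal sketch that assumes the column layout shown in the listings (flat bytes, flat %, running sum %, cumulative bytes, cumulative %, function name); `parse_profile` and `largest_differences` are hypothetical names. Two profiles are only meaningfully comparable when they come from the same process, so check the PID in the heap file names first.

```python
import re

# One profile row: flat bytes, flat %, sum %, cumulative bytes, cumulative %, name.
ROW = re.compile(
    r"^\s*(\d+)\s+[\d.]+%\s+[\d.]+%\s+(\d+)\s+[\d.]+%\s+(\S+)\s*$"
)


def parse_profile(text: str) -> dict[str, int]:
    """Map function name to flat bytes; non-row lines are skipped."""
    flat = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            flat[match.group(3)] = int(match.group(1))
    return flat


def largest_differences(older: dict[str, int], newer: dict[str, int], n: int = 3):
    """Return the n functions whose flat bytes grew most from older to newer."""
    names = set(older) | set(newer)
    delta = {name: newer.get(name, 0) - older.get(name, 0) for name in names}
    return sorted(delta.items(), key=lambda item: item[1], reverse=True)[:n]
```

Functions that stay at the top of this list across several dumps are the ones to report or investigate.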

      5) Use a graph tool to analyze the function call relationships in the .heap files

          This tool is intended for internal R&D investigation only and is described here for reference.

    • Generate a .dot file on FortiWeb backend shell:

      jeprof --dot /bin/proxyd jeprof.out.4973.1641276249.heap > 1641276249.dot

    • Copy 1641276249.dot to an Ubuntu host.
    • Install graphviz on Ubuntu:

      apt install graphviz

    • Generate a PNG image:

      dot -Tpng 1641276249.dot -o 1641276249.png

          A PNG image is generated that shows the functions using the most memory and their call relationships. In this example, one can check whether the HTTPS traffic load has increased or whether the related configuration has changed.

      6) You can also download the jeprof.out files and provide them to the support team for further investigation:

      /var/log/gui_upload# ls jeprof.out* -l

      -rw-r--r--    1 root     0           109251 Sep 27 18:30 jeprof.out.11164.1632789019.heap

      -rw-r--r--    1 root     0           111975 Dec 22 12:22 jeprof.out.3777.1640200954.heap

      Note: In jeprof.out.3777.1640200954.heap:

      3777 is the PID of proxyd.

      1640200954 is the UNIX timestamp. You can use an online tool to convert it to a human-readable date so that you can focus on recent dump files. This helps identify the most recent dump files when there are many.

      E.g.:

      Epoch Converter - Unix Timestamp Converter
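The same conversion can be done offline. The sketch below splits a heap file name of the form shown above (jeprof.out.<PID>.<timestamp>.heap) into the PID and a UTC datetime; `parse_heap_name` is a hypothetical helper, not part of FortiWeb.

```python
from datetime import datetime, timezone


def parse_heap_name(name: str) -> tuple[int, datetime]:
    """Split 'jeprof.out.<pid>.<unix timestamp>.heap' into PID and UTC time."""
    parts = name.split(".")
    if len(parts) != 5 or parts[:2] != ["jeprof", "out"] or parts[4] != "heap":
        raise ValueError(f"unexpected heap file name: {name}")
    pid = int(parts[2])
    taken = datetime.fromtimestamp(int(parts[3]), tz=timezone.utc)
    return pid, taken
```

For jeprof.out.3777.1640200954.heap this yields PID 3777 and 2021-12-22 19:22:34 UTC. The listing above shows local time, so the hour there can differ by the device's UTC offset.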

  6. As stated in point 2, after the 6.4.0 GA release, a regular monitoring file is generated at /var/log/gui_upload/debug_memory.txt. You can set a memory boundary for it: if memory usage reaches the boundary and proxyd or ml_daemon is among the top 10 processes by memory usage, their jemalloc debug function is enabled automatically.

    FortiWeb # show full system global

    config system global

      set debug-memory-boundary 70    #memory usage percentage, 1%-100%

    end
