System commander Script for troubleshooting Fortinet Infrastructure AP reboot/radio reset issues

Its sometime get tough troubleshooting an AP Reboot issue. While troubleshooting Fortinet WiFi APs, You might need to capture the AP flash event log to identify the cause of reboot. However, some time AP flash space ran out  already storing Maximum logs files(This could happen if AP is Up and running log time in production or there are many events happening in AP that filled up the space soon). In this case you might need to clear those flash log space, so it can store the latest/ fresh log files.

This is needed because you making sure that those are latest logs of the crash/reboot happened.

#You run the following command on AP to clear the old flash logs:

file areaerase logk0 

file areaerase logk1

#You run the following command to read the flashlogs in the AP:

flashcmds show logk0

flashcmds show logk1

However, Incase of big deployments where you cannot run this on each AP separately and  also without causing WIFI service disruption you have to run this at-least on a bunch of AP to save your time. We have something called “system commander” where you will be executing a script to clear those AP flash logs.

Step1:

To use the system commander tool you have to enable telnet on the controller.
FortiWLC# configure terminal

FortiWLC(config)# telnet enable

Step2:

Please create a folder on the c drive of your computer and copy these two files shared on the link below:
https://www.dropbox.com/s/82ir713iegxj0tt/SystemCommandExecutor_Ver14.exe?dl=0

https://www.dropbox.com/s/otrkiobxmhnhq4y/SystemCommands_AP.txt?dl=0

Step3:
Run the following commands from your computer command prompt window.

SYNTAX: cfolder_name SystemCommandExecutor_Ver14.exe -c ipaddressofcontroller-username-password -o 4 -a APMODEL -l 1

Example for AP822s: 

C:\PRAVEEN>SystemCommandExecutor_Ver14.exe -c 192.168.10.3-admin-admin -o 4 -a AP822 -l 1

Explanation:

The folder name is the name of the folder in which you have saved the files(SystemCommands_AP.txt) on c drive. The username and password are the login credentials of the controller. The logs will be saved in the same folder on c drive once the loop is completed.

If you look at the syntax it has AP model mentioned in that so please choose the AP model accordingly, depending on for which AP you want to collect the logs.
* For any and all AP reboots we need to run this to gather AP reboot logs (file areaerase logk0/logk1)
* After collecting Disable Telnet

1. On the command prompt we need to specify AP model. Since we can only collect with particular AP type. If customer has mixed AP environment we need to collect separately for each AP model.
2. If we have any AP which is disable offline. Make sure to delete that entry before running System commander, since loop will end at that point and we don’t have option to start again from there. (Ex: If we have 100 AP on network and AP ID 44 is offline, loop will end at AP Id 44)

Addon :You can add more commands on the script file(SystemCommands_AP.txt) to execute and get to collect a dump of those command output’s too.

 

 

How to check why FortiAP got Offline from FortiGate

If the AP lost its channel connection with FortiGate you can check to see if the AP has just lost the contact with firewall missing the heartbeat or if has got rebooted for any reason.

Points to remember:

*Forti AP reboot only if has any power issue.

*FortiAP had any Software crash or kernal panic.

>>Following command on FortiGate can give you an idea why the AP is offline in FortiGate:

 

#Scene 1 :If the AP is offline because of any operation initiated from controller/FortiGate (changes that needs a AP reboot)

——————————-WTP 1—————————-
WTP vd : root
vfid : 0
id : FP221B3X12007124
mgmt_vlanid : 0
region code : N
regcode status : invalid
refcnt : 2 own(1) wtpprof(1)
plain_ctl : disabled
deleted : no
admin : enable
cfg-wtp-profile : praveen_wifi_integrated
override-profile : enabled
oper-wtp-profile : resv-dflt-FP221B3X12007124
wtp-mode : normal
wtp-group :
name :
location :
led-state : enabled
ip-frag-prevent : TCP_MSS
tun-mtu : 0,0
split-tunneling-local-ap-subnet : disabled
active sw ver : FP221B-v5.2-build0254
local IPv4 addr : 192.168.242.63
board mac : 00:09:0f:7c:1a:70
join_time : Tue Jan 17 13:51:56 2017
mesh-uplink : ethernet
mesh hop count : 0
parent wtp id :
connection state : Disconnected
image download progress: 0
last failure : 8 — AC daemon reset timer expired –<change caused AP reboot>     
last failure param: N/A
last failure time: Tue Jan 17 13:52:01 2017
station info : 0/0

 

#Scene 2: On the other hand FortiGate is reporting that the Heatbeat timed out and so the AP went offline.

#diagnose  wireless-controller  wlac -c wtp

——————————-WTP 1—————————-
WTP vd : root
vfid : 0
id : FP221B3X12007124
mgmt_vlanid : 0
region code : N
regcode status : invalid
refcnt : 3 own(1) wtpprof(1) ws(1)
plain_ctl : disabled
deleted : no
admin : enable
cfg-wtp-profile : praveen_wifi_integrated
override-profile : enabled
oper-wtp-profile : resv-dflt-FP221B3X12007124
wtp-mode : normal
wtp-group :
name :
location :
led-state : enabled
ip-frag-prevent : TCP_MSS
tun-mtu : 0,0
split-tunneling-local-ap-subnet : disabled
active sw ver : FP221B-v5.2-build0254
local IPv4 addr : 192.168.242.63
board mac : 00:09:0f:7c:1a:70
join_time : Tue Jan 17 13:41:18 2017
mesh-uplink : ethernet
mesh hop count : 0
parent wtp id :
connection state : Connected
image download progress: 0
last failure : 14 — ECHO REQ is missing        … <heatbeat missed>
last failure param: N/A
last failure time: Tue Jan 17 13:40:39 2017     …<Failure time>
station info : 0/0
geo : World (0)
LLDP : disabled
Radio 1 : AP

So FortiGate just reported its a heatbeat miss from AP that cause AP go offline and Wifi service interrupted.

*Here we need to find the reason if its the network or the AP itself didn’t sent out the heatbeat.

*Log into the AP and check to see if the AP got rebooted or even AP reports that WTP is  what its has  has to reconnect.

*To TELNET from FortiGate into the AP,Command ## execute telnet <dest>    IP address.

*Check the Uptime on AP,#cw_diag uptime

Log1:

FP221B3XXXXXXXXX # cw_diag uptime
Could not open fsm RUN uptime file /tmp/uptime_fsm_run.
Current uptime : 1567338
WTP daemon start uptime : 1565549                                         <Ap never got rebooted>
WTP daemon RUN uptime : 1567338
Time since WTP daemon started : 1789   
Time since WTP daemon connected : 0                        <Did loose the contact with FGT>

Watchdog timer triggered : 0
Watchdog timer action : 3
Watchdog timer time : 27

Log2:

FP221B3XXXXXXXX # cw_diag uptime
Could not open fsm RUN uptime file /tmp/uptime_fsm_run.
Current uptime : 78                                                                     <AP got rebooted>
WTP daemon start uptime : 31
WTP daemon RUN uptime : 78
Time since WTP daemon started : 47
Time since WTP daemon connected : 0

Watchdog timer triggered : 0
Watchdog timer action : 3
Watchdog timer time : 29

*By this way you could narrow down the issues and so next time could help to find Route Cause of the issue.

Other Handy AP commands:

>cfg -s
>fap-get-status
>cw_diag uptime
>cw_diag sys-performance
>iwconfig
>diag_debug_crashlog read
>cw_diag -c wtp-cfg
>cw_diag -c radio-cfg
>cw_diag -c vap-cfg
>cw_diag kernel-panic
>dmesg
>rcfg
>klog

 

 

 

First universal APs @fortinet

These U- FAPs can be managed by Infrastructure controllers or Fortigate Firewalls or via Cloud based management.

Like other Fortinet infrastructure Access points it support  Single channel Architecture/ Virtual Cell or Micro cell/Native cell too.

#802.11AC#Wave2#4*4#universal

Configuring your Fortigate for Higher cipher and SSL/TLS protocol

From version 5.4 onwords you  can control on setting  Encryption and Decryption to Highest Cipher for SSLVPN

FG08XXXXXXXXXX # config vpn ssl settings
FG080XXXXXXXXX (settings) #
FG080XXXXXXXXX (settings) # set banned-cipher
RSA         Ban the use of cipher suites using RSA key.
DH          Ban the use of cipher suites using DH.
DHE         Ban the use of cipher suites using authenticated ephemeral DH key agreement.
ECDH        Ban the use of cipher suites using ECDH key exchange.
ECDHE       Ban the use of cipher suites using authenticated ephemeral ECDH key agreement.
DSS         Ban the use of cipher suites using DSS authentication.
ECDSA       Ban the use of cipher suites using ECDSA authentication.
AES         Ban the use of cipher suites using either 128 or 256 bit AES.
AESGCM      Ban the use of cipher suites AES in Galois Counter Mode (GCM).
CAMELLIA    Ban the use of cipher suites using either 128 or 256 bit CAMELLIA.
3DES        Ban the use of cipher suites using triple DES
SHA1        Ban the use of cipher suites using SHA1.
SHA256      Ban the use of cipher suites using SHA256.
SHA384      Ban the use of cipher suites using SHA384.

#To set the SSL/TLS protocol versions for ADMIN and SSL VPN

 

>>Allow only TLS 1.2:
       # config system global
       # set admin-https-ssl-versions tlsv1-2
       # end
>>Disable everything except TLS 1.2 as go to high algorithm:
       # config vpn ssl settings
       # set tlsv1-0 disable
       # set tlsv1-1 disable
       # set sslv3 disable
       # set algorithm high
       # end
>>Whats with setting the algorithm on HIGH/LOW/MEDIUM:
The default option of Medium at RC4 (128 bits) is acceptable, but the High option, AES (128/256 bits) and 3DES is more secure. The Low option, RC4 (64 bits), DES and higher does not meet PCI DSS requirements
>>Configure the system to use strong crypto:
   # config system global
       # set strong-crypto enable
       # end
Note: Enabling strong crypto will disable using SSLV3 and TLSv1.0. So its  TLSv1.1 and TLSv1.2

Fortigate Firewall HA master election

 There are 2 conditions for MASTER election.
1) (ha-override disable) :
Master is selected in the following sequence
1. Number of “UP” monitored port
2. HA-uptime
3. HA priority
4. FGT serial number
2) (ha-override enable) :
Master is selected in the following sequence
1. Number of “UP” monitored port
2. HA priority
3. HA-uptime
4. FGT serial number
If ha overried is disabled then:
# you can just do fail-over  by reseting the HA uptime on the Master
So, no reboot (or) unplugging cable for a HA failover
Reference link:

Fortinet FAP discovery issues and troubleshooting process(For integrated Wi-Fi AP models only)

>By default Fortigate supported AP models  uses CAPWAP for discovery and tunnel traffic

UDP port 5246 for discovery
UDP port 5247 for control/data traffic

>In cases where APs not able to discover the fortigate wifi controller, you need to look port 5246 to see if really any traffic hitting your foritgate wifi controller
Note:
>After you confirmed that foriAPs are running on supported version of firmware for fortiOS
>AP is able to get an DHCP IP address, then you could check for the following:

step1:
Try running a capture on fortigate CAPWAP  enabled interface

#command:
diag sniffer packet lan “port 5246 ” 4 0 a

2016-11-01 09:16:13.754033 lan — 192.168.242.59.5246 -> 192.168.242.63.5246: udp 81
2016-11-01 09:16:14.759783 lan — 192.168.242.63.5246 -> 192.168.242.59.5246: udp 177
2016-11-01 09:16:14.762178 lan — 192.168.242.59.5246 -> 192.168.242.63.5246: udp 81
2016-11-01 09:16:15.751710 lan — 192.168.242.63.5246 -> 192.168.242.59.5246: udp 177
2016-11-01 09:16:15.753937 lan — 192.168.242.59.5246 -> 192.168.242.63.5246: udp 81
2016-11-01 09:16:16.751617 lan — 192.168.242.63.5246 -> 192.168.242.59.5246: udp 177

Step2 :
Try to run a diagnose on application to see if the discovery is hitting and whether fortigate wifi controller is not responding or if responding with an error.

#commands:
di debug application cw_acd 4
diagnose debug console tim enable
diagnose debug enable

>In our case your see a working discovery log below:

2016-11-01 17:33:47 32067.065 CAPWAP Control Header Dump:
2016-11-01 17:33:47 32067.065 msgType : 9 WTP_EVENT_REQ 192.168.242.63:5246
2016-11-01 17:33:47 32067.066 seqNum : 87
2016-11-01 17:33:47 32067.066 msgElemLen : 25
2016-11-01 17:33:47 32067.066 flags : 0
2016-11-01 17:33:47 32067.066 ws (0-192.168.242.63:5246) cwAcProcPlainCtlMsg: received WTP_EVENT_REQ from ws (0-192.168.242.63:5246)
2016-11-01 17:33:47 32067.066 ws (0-192.168.242.63:5246) <msg> WTP_EVENT_REQ (87) <== ws (0-192.168.242.63:5246)
2016-11-01 17:33:47 32067.066 ws (0-192.168.242.63:5246) <aev> – CWAE_WTP_EVENT_REQ_RECV ws (0-192.168.242.63:5246)
2016-11-01 17:33:47 32067.067 ws (0-192.168.242.63:5246) acWtpSessionThread: SSL_read() returned -1 ssl_err 2
2016-11-01 17:33:47 32067.067 ws (0-192.168.242.63:5246) =====================cwAcFsmThread ws (0-192.168.242.63:5246) 16 1====================
2016-11-01 17:33:47 32067.067 ws (0-192.168.242.63:5246) CWAS_RUN_enter: sending WTP_EVENT RESP msg.
2016-11-01 17:33:47 32067.067 ws (0-192.168.242.63:5246) acDtlsWrite: sending 24 bytes to ws (0-192.168.242.63:5246)
2016-11-01 17:33:47 32067.068 CAPWAP Hdr: P/T=0/0 len=2 RID=0 WBID=1 T=0 F=0 L=0 W=0 M=0 K=0 resv=0 frag=0/0 resv=0
2016-11-01 17:33:47 32067.068 CAPWAP Control Header Dump:
2016-11-01 17:33:47 32067.068 msgType : 10 WTP_EVENT_RESP 192.168.242.63:5246
2016-11-01 17:33:47 32067.068 seqNum : 87
2016-11-01 17:33:47 32067.068 msgElemLen : 11
2016-11-01 17:33:47 32067.068 flags : 0
2016-11-01 17:33:47 32067.069 ws (0-192.168.242.63:5246) acDtlsWrite: SSL_write() was successful
2016-11-01 17:33:47 32067.069 ws (0-192.168.242.63:5246) =====================cwAcFsmThread ws (0-192.168.242.63:5246) 16 2====================
2016-11-01 17:33:47 32067.069 ws (0-192.168.242.63:5246) acProcOutCipherCtrlMsg: sent msg (81 bytes) to ws (0-192.168.242.63:5246)
2016-11-01 17:33:47 32067.314 ws (0-192.168.242.63:5246) cwAcProcCipherCtlMsg: wrote 93 to ws (0-192.168.242.63:5246) socket 27, rc: 93 (success)
2016-11-01 17:33:47 32067.314 ws (0-192.168.242.63:5246) acWtpSessionThread: SSL_read() returned 38 ssl_err 0
2016-11-01 17:33:47 32067.314 ws (0-192.168.242.63:5246) cwAcProcPlainCtlMsg: meInfo->log_disable 0
2016-11-01 17:33:47 32067.315 CAPWAP Control Header Dump:
2016-11-01 17:33:47 32067.315 msgType : 9 WTP_EVENT_REQ 192.168.242.63:5246
2016-11-01 17:33:47 32067.315 seqNum : 88
2016-11-01 17:33:47 32067.315 msgElemLen : 25
2016-11-01 17:33:47 32067.315 flags : 0
2016-11-01 17:33:47 32067.315 ws (0-192.168.242.63:5246) cwAcProcPlainCtlMsg: received WTP_EVENT_REQ from ws (0-192.168.242.63:5246)
2016-11-01 17:33:47 32067.316 ws (0-192.168.242.63:5246) <msg> WTP_EVENT_REQ (88) <== ws (0-192.168.242.63:5246)
2016-11-01 17:33:47 32067.316 ws (0-192.168.242.63:5246) <aev> – CWAE_WTP_EVENT_REQ_RECV ws (0-192.168.242.63:5246)
2016-11-01 17:33:47 32067.316 ws (0-192.168.242.63:5246) acWtpSessionThread: SSL_read() returned -1 ssl_err 2
>In cases were AP lost the communication with foritgate wifi controller, you would start seeing the following timout in the debug.
2016-11-01 17:34:56 32137.455 ws (0-192.168.242.63:5246) FSM AC (1) -> WTP State: CWAS_RUN (12) accept 3 live 120 dbg 00000004 pkts 5740 0
2016-11-01 17:34:56 32137.455 ws (0-192.168.242.63:5246) <aev> – CWAE_RPT_INTERVAL_EXPIRE ws (0-192.168.242.63:5246)
2016-11-01 17:34:56 32137.456 ws (0-192.168.242.63:5246) <aev> – CWAE_10_SEC_EXPIRE ws (0-192.168.242.63:5246)
2016-11-01 17:34:56 32137.456 ws (0-192.168.242.63:5246) =====================cwAcFsmThread ws (0-192.168.242.63:5246) 16 1====================
2016-11-01 17:34:56 32137.456 ws (0-192.168.242.63:5246) =====================cwAcFsmThread ws (0-192.168.242.63:5246) 16 2====================
2016-11-01 17:34:56 32137.457 ws (0-192.168.242.63:5246) =====================cwAcFsmThread ws (0-192.168.242.63:5246) 16 1====================
2016-11-01 17:34:56 32137.457 ws (0-192.168.242.63:5246) =====================cwAcFsmThread ws (0-192.168.242.63:5246) 16 2====================
2016-11-01 17:34:57 32138.454 ws (0-192.168.242.63:5246) FSM AC (1) -> WTP State: CWAS_RUN (12) accept 3 live 121 dbg 00000004 pkts 5740 0
2016-11-01 17:34:58 32139.454 ws (0-192.168.242.63:5246) FSM AC (1) -> WTP State: CWAS_RUN (12) accept 3 live 122 dbg 00000004 pkts 5740 0
2016-11-01 17:34:59 32140.454 ws (0-192.168.242.63:5246) FSM AC (1) -> WTP State: CWAS_RUN (12) accept 3 live 123 dbg 00000004 pkts 5740 0
2016-11-01 17:35:00 32141.454 ws (0-192.168.242.63:5246) FSM AC (1) -> WTP State: CWAS_RUN (12) accept 3 live 124 dbg 00000004 pkts 5740 0
2016-11-01 17:35:01 32142.454 ws (0-192.168.242.63:5246) FSM AC (1) -> WTP State: CWAS_RUN (12) accept 3 live 125 dbg 00000004 pkts 5740 0
2016-11-01 17:35:02 32143.454 ws (0-192.168.242.63:5246) FSM AC (1) -> WTP State: CWAS_RUN (12) accept 3 live 126 dbg 00000004 pkts 5740 0
2016-11-01 17:35:03 32144.454 ws (0-192.168.242.63:5246) FSM AC (1) -> WTP State: CWAS_RUN (12) accept 3 live 127 dbg 00000004 pkts 5740 0
2016-11-01 17:35:04 32145.454 ws (0-192.168.242.63:5246) FSM AC (1) -> WTP State: CWAS_RUN (12) accept 3 live 128 dbg 00000004 pkts 5740 0
2016-11-01 17:35:05 32146.454 ws (0-192.168.242.63:5246) FSM AC (1) -> WTP State: CWAS_RUN (12) accept 3 live 129 dbg 00000004 pkts 5740 0
2016-11-01 17:35:06 32147.454 ws (0-192.168.242.63:5246) FSM AC (1) -> WTP State: CWAS_RUN (12) accept 3 live 130 dbg 00000004 pkts 5740 0
2016-11-01 17:35:06 32147.454 ws (0-192.168.242.63:5246) <aev> – CWAE_AC_ECHO_INTV_TMR_EXPIRE ws (0-192.168.242.63:5246)
2016-11-01 17:35:06 32147.455 ws (0-192.168.242.63:5246) <aev> – CWAE_10_SEC_EXPIRE ws (0-192.168.242.63:5246)
2016-11-01 17:35:06 32147.455 ws (0-192.168.242.63:5246) =====================cwAcFsmThread ws (0-192.168.242.63:5246) 16 1====================
2016-11-01 17:35:06 32147.455 ws (0-192.168.242.63:5246) <aev> – CWAE_WTP_ECHO_REQ_FAIL ws (0-192.168.242.63:5246)
2016-11-01 17:35:06 32147.455 ws (0-192.168.242.63:5246) =====================cwAcFsmThread ws (0-192.168.242.63:5246) 16 2====================
2016-11-01 17:35:06 32147.455 ws (0-192.168.242.63:5246) =====================cwAcFsmThread ws (0-192.168.242.63:5246) 16 1====================
2016-11-01 17:35:06 32147.456 ws (0-192.168.242.63:5246) =====================cwAcFsmThread ws (0-192.168.242.63:5246) 16 2====================
2016-11-01 17:35:06 32147.456 ws (0-192.168.242.63:5246) =====================cwAcFsmThread ws (0-192.168.242.63:5246) 16 1====================

>And its entry gets flushed from online AP table

2016-11-01 17:35:06 32147.456 ws (0-192.168.242.63:5246) cwAcStaFlush_ws: free 0 sta from cw_sta_tree for ws (ws (0-192.168.242.63:5246))
2016-11-01 17:35:06 32147.457 ws (0-192.168.242.63:5246) cwAcIhapdDelWtp remove ifaces 0x5662750 for ws (0-192.168.242.63:5246) rId 0
2016-11-01 17:35:06 32147.457 ws (0-192.168.242.63:5246) cwAcStaFlush_ws: free 0 sta from cw_sta_tree for ws (ws (0-192.168.242.63:5246))
2016-11-01 17:35:06 32147.457 ws (0-192.168.242.63:5246) cwAcIhapdDelWtp remove ifaces 0x5663ce0 for ws (0-192.168.242.63:5246) rId 1
2016-11-01 17:35:06 32147.457 ws (0-192.168.242.63:5246) cwAcStaFlush_ws: free 0 sta from cw_sta_tree for ws (ws (0-192.168.242.63:5246))
2016-11-01 17:35:06 32147.458 ws (0-192.168.242.63:5246) cwAcPendingMsgClr: Clear pending msg queue Cnt 0 H 20 T 20 last txMsg unknown
2016-11-01 17:35:06 32147.458 ws (0-192.168.242.63:5246) cwAcDtlsSessionStop_chan: data channel dtls ctx does not exist
2016-11-01 17:35:06 32147.458 ws (0-192.168.242.63:5246) cwAcDtlsSessionStop_chan free ssl 0x55abfe0
2016-11-01 17:35:06 32147.459 ws (0-192.168.242.63:5246) cwAcDtlsSessionStop_chan: deregistering acProcOutCipherCtrlMsg
2016-11-01 17:35:06 32147.459 ws (0-192.168.242.63:5246) cwAcDtlsSessionStop_chan: deregistering acWtpSessionThread
2016-11-01 17:35:06 32147.459 ws (0-192.168.242.63:5246) <aev> – CWAE_WTP_SESSION_RELEASED ws (0-192.168.242.63:5246)
2016-11-01 17:35:06 32147.459 ws (0-192.168.242.63:5246) acRemoveIPv4SessionHashEntry: removing ws 0x567c1e8 ws (0-192.168.242.63:5246) at index2 233
2016-11-01 17:35:06 32147.459 ws (0-192.168.242.63:5246) acRemoveIPv4SessionHashEntry: removing 0x567c1e8 from 0’th entry (next:(nil))
2016-11-01 17:35:06 32147.460 ws (0-192.168.242.63:5246) acRemoveIPv4SessionHashEntry: removing ws 0x567c1e8 ws (0-192.168.242.63:5246) at index 231
2016-11-01 17:35:06 32147.460 ws (0-192.168.242.63:5246) acRemoveIPv4SessionHashEntry: removing 0x567c1e8 from 0’th entry (next:(nil))