EdgeRouter / EdgeOS IPsec Site-to-Site Troubleshooting
I have a client set up with multiple EdgeRouters in an IPsec site-to-site configuration. I set up the policy-based IPsec site-to-site tunnels using this guide here. However, sometimes they just refuse to connect, with no obvious reason why. EdgeRouters use strongSwan for their VPN, so some of its troubleshooting information should be useful to us. Below are the troubleshooting steps I go through whenever an issue pops up.
Reboot Often!
For some strange reason, rebooting both sides can sometimes fix the issue. Rebooting after configuration changes is also a good idea, as the commands to restart the VPN seem either not to work or to take a long time. I can’t tell you how many times I’ve sat and waited for a tunnel to come up after a config change, only to give up and reboot, and have it magically work.
Check open ports
Even though you have the open ports box ticked, it is a good idea to check that the ports IPsec needs are actually open. They are UDP 500 (IKE) and UDP 4500 (NAT traversal). You will also want to allow ESP traffic (IP protocol 50) through the firewall. You can test the UDP ports with the nmap command “nmap $IP_OR_FQDN -Pn -sU -p $port”, substituting your own address and port. If they are not open, add the rules manually.
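To save retyping the nmap command for each port, here is a minimal sketch of a helper that prints one scan command per IPsec UDP port. The function name `ipsec_scan_cmds` and the example hostname are my own placeholders, and I have added `sudo` because nmap's UDP scan (`-sU`) normally needs root.

```shell
#!/bin/sh
# Hypothetical helper (not part of EdgeOS or nmap): print the nmap command
# for each UDP port IPsec uses, for the peer given as the first argument.
ipsec_scan_cmds() {
    target="$1"
    for port in 500 4500; do
        # -Pn: skip host discovery, -sU: UDP scan, -p: port to probe
        echo "sudo nmap $target -Pn -sU -p $port"
    done
}

# Example (placeholder hostname): review the commands, then pipe to sh to run them
#   ipsec_scan_cmds vpn.example.com
#   ipsec_scan_cmds vpn.example.com | sh
```

Run it from a machine outside the network you are testing. UDP scans often come back as "open|filtered" because a silent port and a dropped probe look the same to nmap, so treat the result as a hint and not as proof. This check does not cover ESP, which is not a UDP port.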
Check the Config
There are a few ways to check the config on EdgeRouter devices. One is through the GUI, in the VPN > IPsec Site-to-Site tab. I recommend starting there, but it does not tell you everything. After that, go to the config tree and check the contents of the ipsec section. Make sure the config is correct. You can also manually delete it from the config and configure it again from scratch. There is also a CLI command that is useful for checking the configuration:
See the IPsec site-to-site routing policy: show vpn ipsec policy
Sometimes removing the VPN from the config tree, rebooting, and then setting the VPN up again gets it to connect. However, that does not fix whatever caused the issue. Check the Read the Logs section and the Other things to look out for section for hints on what could be causing it.
Downloading the backup file and looking at the config.boot is also very helpful.
Some security settings you can change without affecting performance too much: set the DH group to 16, which makes the key size 4096 bits instead of the 2048 bits of group 14. If your connection is not too high speed, also change the encryption to AES-256, as it is a bit harder to break. See HERE for throughput expectations.
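As a rough sketch, the DH group and encryption changes described above would look something like this from the CLI. The group name FOO0 and proposal number 1 are assumptions on my part (they are what I would expect from a GUI-created tunnel); run show vpn ipsec in configure mode first and substitute the names your config actually uses.

```shell
configure
# Assumed names: ike-group FOO0 / esp-group FOO0, proposal 1. Check yours first.
set vpn ipsec ike-group FOO0 proposal 1 dh-group 16
set vpn ipsec ike-group FOO0 proposal 1 encryption aes256
set vpn ipsec esp-group FOO0 proposal 1 encryption aes256
commit ; save
exit
```

Both routers need matching proposals, so make the same change on each side or the tunnel will not come up.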
On firmware 2.x and above, you need to log in to the CLI and run the commands below:
    configure
    set vpn ipsec allow-access-to-local-interface enable
    commit ; save
Keep in mind that any GUI config change to the VPN will overwrite any CLI changes, like the one set above.
Read the Logs
There are a few logs you can get at through the CLI. For some reason they don’t seem to update reliably. I’m not sure why, but don’t take them as definitive. I’ve seen one site’s logs say the connection was established while the other site’s logs said nothing. The commands are:
Show the swanctl log (the actual IPsec package): sudo swanctl --log
Show all VPN logs (includes L2TP and OpenVPN): show vpn log
Make sure to read the logs at both ends, as entries sometimes only show up on one side. SSH is your friend for this; just make sure your terminal lets you scroll back.
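If scrolling through two SSH sessions gets tedious, one option is to pull the VPN log from each router into a local file and compare them side by side. This is only a sketch: the function name `collect_vpn_logs`, the output file names, and the `SSH_CMD` variable are mine, and it assumes the operational-command wrapper lives at /opt/vyatta/bin/vyatta-op-cmd-wrapper on your firmware, which you should verify before relying on it.

```shell
#!/bin/sh
# Override to add options, e.g. SSH_CMD="ssh -p 2222"
SSH_CMD="${SSH_CMD:-ssh}"

# Hypothetical helper: save "show vpn log" from each host to vpn-log-<host>.txt
collect_vpn_logs() {
    for host in "$@"; do
        # Assumption: op-mode commands need this wrapper in a non-interactive session
        $SSH_CMD "$host" '/opt/vyatta/bin/vyatta-op-cmd-wrapper show vpn log' \
            > "vpn-log-$host.txt"
    done
}

# Example (placeholder hosts):
#   collect_vpn_logs admin@site-a.example.com admin@site-b.example.com
```

With both files saved locally you can search them for the same timestamp and see which side logged what.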
Verifying tunnel states
You can use the following commands to see if anything is connected and how it is connected.
See the active IPsec tunnels: show vpn ipsec status
See the connected peer information: show vpn ipsec sa
See connected ESP tunnels: show vpn ipsec state
Other things to look out for
Firmware: I’ve noticed that older firmware versions can sometimes conflict with newer ones. This might be nothing, but keeping the firmware the same on all devices sometimes helps.
DNS issues: Sometimes DNS records can be slow to propagate to services like Cloudflare and Google. I would set your DDNS provider’s DNS servers as the first system DNS resolver in the System tab in EdgeOS.
IPsec tunnel not actually restarting on config change: As mentioned above, the only real official way to fix this is rebooting. However, as it is just a Linux machine, you can drop to a root shell and restart it manually. It did not really work for me, but I could be doing it wrong.
Firewall rules not taking: There are two ways to fix this issue. Rebooting sometimes makes it work; otherwise, setting the rules manually via the config tree and then rebooting works for me. The GUI is hit or miss.
L2TP server: I’ve noticed that running the L2TP server on one of the routers causes occasional disconnects and sites refusing to connect. I have no idea why, but my guess is that they use the same daemon and something conflicts. I’d recommend disabling the L2TP server and removing its config from the config tree. Use OpenVPN on a Raspberry Pi or a VM to get into the network; it’s more secure and faster anyway.
Vyatta: EdgeRouters use strongSwan for their VPN, so questions about log output can be directed at the strongSwan project as well as at EdgeOS resources. EdgeOS is a fork of the open source Vyatta 6.3, so some of the questions and configs should overlap. However, Ubiquiti has customized and updated packages, so your mileage may vary.
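For the "IPsec tunnel not actually restarting" item above, this is roughly what a manual restart from the shell looks like. These are standard strongSwan control commands and not anything EdgeOS-specific; as noted, it did not reliably bring tunnels back for me, so treat it as something to try before a reboot and not as a guaranteed fix.

```shell
# Run on the router itself over SSH.
sudo ipsec restart      # stop and start the strongSwan daemon
sudo ipsec statusall    # show daemon status and any established connections
```

Restarting the daemon drops every tunnel on that router, so expect a short outage on all sites connected to it.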