Building And Integrating Virtual Private Networks With Openswan (2006).pdf

Debugging and Troubleshooting

Host Issues

Most problems on the host itself become apparent when installing or starting Openswan. Openswan logs precise details of anything that goes wrong. If Openswan suddenly fails to start, check whether some automated update has changed something.

For example, a new kernel could have been installed that does not support KLIPS, where you had previously compiled KLIPS for the kernel. Perhaps someone changed a few firewall rules, some IP address on the host changed, or you are using a nexthop value, and your ISP has changed your default gateway. These problems should all have very clear error messages in the log files, and should be very obvious, though fixing them might involve some work.

A more subtle error can be related to the host clock. Accurate time becomes especially important when X.509 certificates are used, because they have a time-limited validity. A clock skew can suddenly cause something to fail that had worked for months. The best defense is to set up an NTP service on the machine, in combination with an ntpdate command executed during the host's startup sequence. Be aware that you cannot run the ntpdate command while the NTP daemon is running. The NTP daemon also will not allow a large time correction (where large means more than a few seconds). So if your clock is off by a few hours, it will take a very long time for the NTP daemon to drift towards the proper time, and you might need to expedite things by stopping the NTP daemon, running ntpdate once, and then restarting the NTP daemon.

If your new certificate does not work and complains about validity despite an accurate clock, you may be the victim of localization. If you live in a time zone that has a negative GMT offset (for example North America), and the host on which your certificate is generated lives in GMT or GMT+x (for example Europe or Japan), then your certificate will not yet be valid. The easiest solution for this is to just go to bed and try again tomorrow.
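The effect can be illustrated with a short Python sketch. It assumes one plausible mechanism that is not spelled out in the text: the issuing host writes its local wall-clock time into the certificate's start-of-validity field as if it were GMT. The function name, the sample date, and the one-year lifetime are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def validity_status(not_before, not_after, now):
    """Classify `now` against a validity window (all values timezone-aware)."""
    if now < not_before:
        return "not yet valid"
    if now > not_after:
        return "expired"
    return "valid"


# Hypothetical issuing host in a GMT+9 time zone (for example Japan).
TOKYO = timezone(timedelta(hours=9))
issued = datetime(2006, 3, 2, 9, 0, tzinfo=TOKYO)  # 09:00 local = 00:00 GMT

# Correct stamping: the issuing instant converted to GMT/UTC.
correct_not_before = issued.astimezone(timezone.utc)

# Assumed faulty stamping: the local wall-clock digits relabelled as GMT/UTC,
# which pushes the start of validity nine hours into the future.
skewed_not_before = issued.replace(tzinfo=timezone.utc)

lifetime = timedelta(days=365)

# A verifier with an accurate clock, five minutes after the certificate was made.
now = issued.astimezone(timezone.utc) + timedelta(minutes=5)

print(validity_status(correct_not_before, correct_not_before + lifetime, now))
print(validity_status(skewed_not_before, skewed_not_before + lifetime, now))

# Nine hours later the skewed certificate has become valid as well.
later = now + timedelta(hours=9)
print(validity_status(skewed_not_before, skewed_not_before + lifetime, later))
```

Under these assumptions the certificate is rejected even though the verifying clock is accurate, and it starts working by itself once the offset has elapsed, which is why waiting until the next day helps.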

Configuration Problems

Most of the time, when Openswan fails to correctly set up a tunnel, you will see IKE errors in the system logs, and a configuration error is at fault.

A configuration error in this context does not necessarily mean an error in the Openswan configuration files; it could simply be that the two IPsec endpoints do not agree. If the connection has Openswan at both ends, this can be verified fairly easily, since most if not all parameters will be the same on each side. If you control both ends, which is often the case for Openswan-to-Openswan connections, then configuration errors are fairly easily spotted and corrected by reading and comparing the log entries on both ends.
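When both ends run Openswan, the parameter comparison can also be done mechanically. The following Python sketch is one possible way to do it, not a tool that ships with Openswan: it reads the key=value lines of one conn section from ipsec.conf-style text and lists the parameters that differ. The connection name office, the addresses, and the helper names parse_conn and diff_conns are hypothetical. The parser is deliberately simplified: it treats any unindented line as the start of a new section and ignores include directives and conn %default.

```python
def parse_conn(text, name):
    """Return the key=value parameters of `conn <name>` as a dict."""
    params = {}
    in_section = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments (simplification)
        if not line.strip():
            continue
        if not raw[0].isspace():
            # An unindented line starts a new section.
            in_section = line.split() == ["conn", name]
            continue
        if in_section and "=" in line:
            key, value = line.strip().split("=", 1)
            params[key.strip()] = value.strip()
    return params


def diff_conns(a, b):
    """Return {key: (value_in_a, value_in_b)} for every differing parameter."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Hypothetical configurations taken from the two endpoints.
EAST = """\
conn office
    left=192.0.2.1
    leftsubnet=10.0.1.0/24
    right=198.51.100.1
    rightsubnet=10.0.2.0/24
    authby=secret
    pfs=yes
    auto=add
"""

WEST = EAST.replace("pfs=yes", "pfs=no")

print(diff_conns(parse_conn(EAST, "office"), parse_conn(WEST, "office")))
# prints {'pfs': ('yes', 'no')}
```

An empty result only shows that the files agree; it says nothing about whether the running daemons have actually loaded them.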

Do not enable debugging in ipsec.conf to debug configuration issues.

Connection Names

When the first packet of an IKE exchange is received, it is not always clear which connection definition applies to it. Pluto will pick a connection name that could match. Once it knows more about the connection being attempted, it could change the connection name, which can look confusing in the logs.

Interoperability

When the other end of the IPsec connection is not an Openswan machine, things are a little bit more complex. First of all, it is likely that you have no control over the other endpoint. You are also likely to be crossing an administrative domain, where you will need to talk to another system administrator, one who likely favors the IPsec vendor used by their end, and might even look down on this free software thing. Try to avoid getting drawn into a debate about why one is better than another, but of course do not hide the fact that you are using Openswan either. Remember that without a proper interop between system administrators, two devices will never achieve interoperability themselves. Put some effort into getting along with the other system administrator.

Often, there is also confusion about terminology. No one outside the Openswan world will know what left or right signifies. In other products, these are usually called local and remote. Some do not name the local part explicitly, and call the remote the domain or the security domain.

Some products call Authentication Header 'Medium' security, some call Perfect Forward Secrecy anti-replay protection. The shared secret (PSK) probably has the largest number of different names, such as password, passphrase, secret, netgroup, groupsecret, or combinations of these words. When talking to the other system administrator, try to use the same terms. It might even be worth checking the Openswan website or Wiki, or using a search engine, to find out what terms are used by the vendor you are trying to interoperate with.

Hunting Ghosts

Another common mistake is failing to realize that the current configuration files no longer represent the current state of Openswan. This can happen if someone has edited a file, but did not restart the connection or the Openswan subsystem. If there is any doubt, restart Openswan before trying to debug the problem.

Remember that if a change is made in the 'config setup' section, Openswan must be restarted completely. Changes in the conn definitions, ca definitions, or in any of the other files do not require a complete restart. A changed connection can be reloaded using ipsec auto --replace connname. A changed secret can be re-read using the ipsec secrets command. See the relevant chapters for a full list of commands.
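The rule above can be written down as a small lookup, which may be handy in a deployment script. This Python sketch only builds the command line and does not execute anything. The function name reload_command and the change labels are hypothetical. It also assumes that a complete restart is done with ipsec setup restart, which the text above does not name explicitly.

```python
def reload_command(changed, conn_name=None):
    """Return the command (as an argument list) that picks up a change.

    `changed` is one of "config setup", "conn" or "secrets".
    """
    if changed == "config setup":
        # Assumption: a complete restart is performed via the setup script.
        return ["ipsec", "setup", "restart"]
    if changed == "conn":
        if not conn_name:
            raise ValueError("a connection name is required for a conn change")
        return ["ipsec", "auto", "--replace", conn_name]
    if changed == "secrets":
        return ["ipsec", "secrets"]
    raise ValueError("unknown kind of change: %r" % (changed,))


# Dry run: show what would be executed for each kind of change.
for kind, name in [("config setup", None), ("conn", "office"), ("secrets", None)]:
    print(" ".join(reload_command(kind, name)))
```

Keeping the helper as a dry run makes it safe to try on any machine; the returned list could later be passed to subprocess.run on the gateway itself.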

Most hardware routers never reset their Phase 1 if you change their configuration, and still try to re-use the current ISAKMP SA. Using ipsec auto --delete connname may or may not successfully terminate the Phase 1 connection. If possible though, always reboot these hardware routers after making any changes in their IPsec configuration, to prevent accidentally re-using an old ISAKMP SA (Phase 1).

Another ghost hunt could result from the other end changing its configuration, or some router somewhere on the path between the two endpoints suddenly behaving differently, for instance due to a new firewall rule, or a change in maximum packet size of some intermediate router. A phone call to the other administrator can save you a lot of time in debugging these problems.
