Ticket ID: SIXXS #752441 Ticket Status: User PoP: (not applicable)
nlams04 replying with ICMP port unreachable to heartbeat
Shadow Hawkins on Tuesday, 24 June 2008 11:22:07
Since this morning, nlams04.sixxs.net has been replying with ICMP port unreachable messages to every heartbeat:
11:03:51.863721 IP 213.204.193.2 > 192.168.0.1: ICMP 213.204.193.2 udp port 3740 unreachable, length 121
0x0000: 4580 008d 129b 0000 2e01 21dd d5cc c102 E.........!.....
0x0010: c0a8 0001 0303 a1d2 0000 0000 4500 0071 ............E..q
0x0020: 0ee2 4000 3311 e121 c0a8 0001 d5cc c102 ..@.3..!........
0x0030: 807f 0e9c 005d 968c 4845 4152 5442 4541 .....]..HEARTBEA
0x0040: 5420 5455 4e4e 454c 2032 3030 313a 3936 T.TUNNEL.2001:96
0x0050: 303a 323a 3561 303a 3a32 2073 656e 6465 0:2:5a0::2.sende
0x0060: 7220 3132 3134 3239 3832 3331 2036 3438 r.1214298231.648
0x0070: 3765 6538 3762 3836 6439 3063 3735 6362 7ee87b86d90c75cb
0x0080: 6338 3265 3966 6137 3131 6535 66 c82e9fa711e5f
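For readers who want to check the quoted packet, here is a minimal sketch that splits the heartbeat payload from the dump above into its fields. The payload string is read off the ASCII column (the dots between words are 0x20 spaces in the hex). The names `Heartbeat` and `parse_heartbeat` are hypothetical, and the field meanings are inferred from this capture only, not taken from a protocol specification.

```python
from datetime import datetime, timezone
from typing import NamedTuple

# Payload as it appears in the ICMP-quoted UDP datagram above.
PAYLOAD = (
    "HEARTBEAT TUNNEL 2001:960:2:5a0::2 sender "
    "1214298231 6487ee87b86d90c75cbc82e9fa711e5f"
)


class Heartbeat(NamedTuple):
    tunnel: str        # IPv6 address named in the message
    source: str        # the literal word "sender" in this capture
    sent_at: datetime  # epoch field converted to UTC
    digest: str        # trailing 32 hex characters


def parse_heartbeat(payload: str) -> Heartbeat:
    """Split a heartbeat message into fields (layout assumed from the capture)."""
    fields = payload.split()
    if len(fields) != 6 or fields[:2] != ["HEARTBEAT", "TUNNEL"]:
        raise ValueError("not a heartbeat message")
    tunnel, source, epoch, digest = fields[2:]
    sent_at = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return Heartbeat(tunnel, source, sent_at, digest)


if __name__ == "__main__":
    hb = parse_heartbeat(PAYLOAD)
    print(hb.tunnel, hb.sent_at.isoformat(), hb.digest)
```

The timestamp field decodes to 09:03:51 UTC on 24 June 2008, which is 11:03:51 CEST and matches the tcpdump capture time, so the client clock does not look skewed in this packet.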
Setup:
NIC: BPR1-SIXXS
Unchanged since ticket #606528, except that after a slight clock adjustment the heartbeat messages were accepted by nlams04 until yesterday.
Is the heartbeat server down on nlams04, or has something changed on that side?
State change: user
Jeroen Massar on Tuesday, 24 June 2008 11:24:29
The state of this ticket has been changed to user
nlams04 replying with ICMP port unreachable to heartbeat
Jeroen Massar on Tuesday, 24 June 2008 11:25:07
How many heartbeat clients are you trying to run? It seems you overran the rate limit by a wide margin.
nlams04 replying with ICMP port unreachable to heartbeat
Shadow Hawkins on Tuesday, 24 June 2008 11:55:06
There is just one, which sends at a 60s interval. When the heartbeat client daemon is restarted, though, a few messages might be sent within a short time frame (due to the way init respawns dying processes).
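One way to keep a respawning daemon from producing such a burst is to remember the time of the last send across restarts. The following is only a sketch under assumptions: the state-file approach, the function names, and the 60-second minimum are illustrative and are not part of the actual heartbeat client.

```python
from pathlib import Path
from typing import Optional

MIN_INTERVAL = 60.0  # seconds; the sending interval mentioned above


def seconds_to_wait(last_sent: Optional[float], now: float,
                    min_interval: float = MIN_INTERVAL) -> float:
    """Return how long to sleep before the next heartbeat may be sent."""
    if last_sent is None:
        return 0.0  # nothing recorded yet: send immediately
    remaining = min_interval - (now - last_sent)
    # Clamp to [0, min_interval]; the upper bound covers a clock that
    # was stepped backwards since the last send.
    return min(min_interval, max(0.0, remaining))


def read_last_sent(path: Path) -> Optional[float]:
    """Read the last send time from a state file, or None if unusable."""
    try:
        return float(path.read_text())
    except (OSError, ValueError):
        return None


def record_send(path: Path, now: float) -> None:
    """Persist the send time so a respawned process can see it."""
    path.write_text(repr(now))
```

A freshly respawned process would call `seconds_to_wait(read_last_sent(path), time.time())`, sleep for the result, send, and then call `record_send`. A process that dies and is respawned several times within a minute would then still emit at most one heartbeat per interval.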
Around 10:30 CEST this morning I manually restarted the heartbeat daemon to check why the tunnel did not come up, as I saw the ICMP port unreachable errors.
Alternatively, some buffering on the Cisco router when it thinks its ADSL link has to be retrained could also cause a burst (retraining takes about 1 minute, but the PPPoE session remains alive). I am not sure the router buffers in this case.
Though, as I did not change anything in my setup recently, I'm surprised to have hit a limit.
The system booted this morning around 8 CEST. Do you know when the rate limit was overrun?
Looking up my logs I see:
Jun 23 11:47:41 aphrodite ntpd[2029]: adjusting local clock by -0.274020s
Jun 23 11:50:23 aphrodite ntpd[2029]: adjusting local clock by -0.203187s
Jun 23 12:45:29 aphrodite ntpd[2029]: adjusting local clock by -0.150385s
Jun 23 12:55:57 aphrodite ntpd[2029]: adjusting local clock by -0.360274s
Jun 23 13:01:13 aphrodite ntpd[2029]: adjusting local clock by -0.333676s
Jun 23 13:04:09 aphrodite ntpd[2029]: adjusting local clock by -0.264828s
Jun 23 13:07:10 aphrodite ntpd[2029]: adjusting local clock by -0.195618s
Jun 23 14:16:32 aphrodite ntpd[2029]: adjusting local clock by -0.407195s
Jun 23 14:22:26 aphrodite ntpd[2029]: adjusting local clock by -0.385334s
Jun 23 14:25:32 aphrodite ntpd[2029]: adjusting local clock by -0.323251s
Jun 23 14:29:14 aphrodite ntpd[2029]: adjusting local clock by -0.276485s
Jun 23 14:31:55 aphrodite ntpd[2029]: adjusting local clock by -0.202399s
Jun 23 14:37:18 aphrodite ntpd[2029]: adjusting local clock by -0.148212s
Jun 23 15:32:10 aphrodite ntpd[2029]: adjusting local clock by -0.329207s
Jun 23 15:39:52 aphrodite ntpd[2029]: adjusting local clock by -0.339359s
Jun 23 15:43:53 aphrodite ntpd[2029]: adjusting local clock by -0.270805s
Jun 23 15:47:35 aphrodite ntpd[2029]: adjusting local clock by -0.195101s
Jun 23 15:52:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused
Jun 23 15:53:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused
Jun 23 15:54:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused
Jun 23 15:55:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused
Jun 23 15:56:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused
Jun 23 15:57:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused
Jun 23 15:58:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused
Jun 23 15:59:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused
Jun 23 16:00:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused
So it looks like the burst must have happened yesterday afternoon, though the tunnel remained alive until late that evening...
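To date the onset from a longer syslog file without reading it by eye, a small script can report the first line where the heartbeat client logged a send failure. This is a sketch that assumes the syslog line layout shown in the excerpt above; `first_failure` is a hypothetical helper name.

```python
import re
from typing import Iterable, Optional

# Matches lines such as:
#   Jun 23 15:52:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused
# The match stops at "Failed sending", so the spelling of the rest is irrelevant.
FAILURE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ heartbeat: Failed sending"
)


def first_failure(lines: Iterable[str]) -> Optional[str]:
    """Return the timestamp of the first failed heartbeat, or None."""
    for line in lines:
        match = FAILURE.match(line)
        if match:
            return match.group(1)
    return None


SAMPLE = [
    "Jun 23 15:47:35 aphrodite ntpd[2029]: adjusting local clock by -0.195101s",
    "Jun 23 15:52:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused",
    "Jun 23 15:53:01 aphrodite heartbeat: Failed sending heatbeat: Connection refused",
]

if __name__ == "__main__":
    print(first_failure(SAMPLE))
```

On the excerpt above, the first failure is logged at 15:52:01 on 23 June, which is consistent with the port unreachable replies having started that afternoon.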