
IPv6 packet loss @ Telenet.be

Telenet is one of the largest cable providers in Belgium, and they offer native IPv6 support. Their IPv4 connectivity is very stable: packet loss and outages are rare. IPv6, however, is less stable.

What’s the problem?

I’m using IPv6 for my ssh connections. A stable connection is critical for ssh, because you notice packet loss immediately, especially when packets are dropped for stretches of more than 10 seconds. It then looks like your ssh connection is stalling.

Normal situation at Hetzner

To test where things went wrong, I started by selecting a target for my tests. In this case, I took the IP address of ipv6.google.com.

rivy@spdy:~$ dig +short aaaa ipv6.google.com
ipv6.l.google.com.
2a00:1450:400e:800::200e
rivy@spdy:~$

Testing from my server in a datacenter at Hetzner gave very good results.

ping from Hetzner to Google

# ping6 -i 2 -c 150 2a00:1450:400e:800::200e
PING 2a00:1450:400e:800::200e(2a00:1450:400e:800::200e) 56 data bytes
...
--- 2a00:1450:400e:800::200e ping statistics ---
150 packets transmitted, 150 received, 0% packet loss, time 298332ms
rtt min/avg/max/mdev = 11.271/11.359/11.794/0.114 ms

No packet loss, an 11 ms average and almost no deviation from the average. These are all signs of a good connection.
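For longer test runs it can help to pull the average and the deviation out of the summary automatically. The following is a minimal sketch, assuming the Linux ping6 summary format shown above; rtt_summary is a helper name made up for this illustration.

```shell
# rtt_summary: hypothetical helper. Reads ping/ping6 output on stdin and
# prints the average and mdev from the "rtt min/avg/max/mdev" summary line.
rtt_summary() {
    awk -F' = ' '/^rtt / {
        split($2, v, "[/ ]")        # v[1]=min v[2]=avg v[3]=max v[4]=mdev
        print "avg=" v[2] " mdev=" v[4]
    }'
}

# Summary line copied from the Hetzner run above
echo 'rtt min/avg/max/mdev = 11.271/11.359/11.794/0.114 ms' | rtt_summary
```

In practice the function would sit at the end of a pipeline such as `ping6 -i 2 -c 150 <target> | rtt_summary`.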

mtr from Hetzner to Google

mtr works similarly to traceroute. For this report, it traced the path 1000 times. Every router between my server and the destination responded every time. All values are very healthy.

# mtr -n6 -s 1000 -r -c 1000 2a00:1450:400e:800::200e
Start: Fri Dec 23 10:10:32 2016
HOST: flipflop                    Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 2a01:4f8::a:16:b           0.0%  1000    0.7   0.6   0.6  20.9   0.8
  2.|-- 2a01:4f8:0:3::95           0.0%  1000    0.7   0.9   0.6  41.4   2.9
  3.|-- 2a01:4f8:0:3::115          0.0%  1000    5.3   5.3   5.2  18.2   0.8
  4.|-- 2a01:4f8:0:3::16e          0.0%  1000    5.3   5.3   5.2  16.7   0.8
  5.|-- 2001:4860:1:1:0:1:0:68     0.0%  1000    5.4   8.7   5.2  64.9   8.1
  6.|-- 2001:4860::1:0:d0d9        0.0%  1000    5.7   8.9   5.7  61.2   7.6
  7.|-- 2001:4860::8:0:cb93        0.0%  1000    5.8   9.4   5.6  40.6   3.7
  8.|-- 2001:4860::8:0:8f91        0.0%  1000    8.6  14.0   8.3  51.3   7.8
  9.|-- 2001:4860::8:0:87b8        0.0%  1000   11.7  11.9  11.5  29.6   1.8
 10.|-- 2001:4860::1:0:cd13        0.0%  1000   11.8  11.8  11.7  21.7   1.0
 11.|-- 2001:4860:0:f8b::1         0.0%  1000   11.8  11.8  11.6  19.5   0.5
 12.|-- 2001:4860:0:1::155b        0.0%  1000   11.7  11.6  11.4  16.5   0.3
 13.|-- 2a00:1450:400e:800::200e   0.0%  1000   12.3  11.5  11.4  16.5   0.3

Bad situation at Telenet

To show the differences, I’ll run the exact same test from the PC which is directly connected to my cable modem. I’m also using the same destination.

ping from Telenet to Google

$ ping6 -i 2 -c 150 2a00:1450:400e:800::200e
...
--- 2a00:1450:400e:800::200e ping statistics ---
150 packets transmitted, 109 received, 27% packet loss, time 298490ms
rtt min/avg/max/mdev = 19.291/21.341/30.382/1.334 ms

Looking at the amount of packet loss, we clearly seem to have a problem somewhere along the path.

mtr from Telenet to Google

$ mtr -n6 -s 1000 -r -c 1000 2a00:1450:400e:800::200e
HOST: rtr                         Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 2a02:181f:0:e1::1          1.9%  1000   10.6  13.3   7.6 137.3  12.6
  2.|-- 2a02:1800:2:20c0::2       17.9%  1000   96.4  24.7   8.7 461.9  54.1
  3.|-- 2a02:1800:2:20c0::2       17.7%  1000   11.3  16.3   8.7 268.6  25.0
    |  `|-- 2a02:1800:0:1:2104:201:0:3
  4.|-- 2a02:1800:0:1:2104:201:0: 18.0%  1000   12.8  13.0  10.3 256.6   8.9
    |  `|-- 2001:2000:3080:772::1
  5.|-- 2001:2000:3080:772::1     17.7%  1000   21.0  17.5  10.3  66.2   4.8
    |  `|-- 2001:2000:3018:4c::1
  6.|-- 2001:2000:3018:4c::1      17.7%  1000   20.5  19.9  17.3  51.0   2.7
    |  `|-- 2001:2000:3080:5af::2
  7.|-- 2001:2000:3080:5af::2     17.6%  1000   22.4  22.1  17.7  51.6   3.1
    |  `|-- 2001:4860::9:4000:cd8a
  8.|-- 2001:4860::9:4000:cd8a    17.8%  1000   24.4  22.0  18.9  40.5   2.3
    |  `|-- 2001:4860::8:4000:ce26
    |   |-- 2001:4860::c:4000:d9af
  9.|-- 2001:4860::8:4000:ce26    16.6%   980   22.2  21.5  18.9  49.6   2.3
    |  `|-- 2001:4860::8:0:cc3f
    |   |-- 2001:4860::8:4000:d325
    |   |-- 2001:4860::c:4000:d9af
 10.|-- 2001:4860::8:0:cc3f       16.7%   980   28.1  21.9  19.0  53.4   2.5
    |  `|-- 2001:4860::8:0:87b8
    |   |-- 2001:4860::8:0:87b0
    |   |-- 2001:4860::8:4000:d325
 11.|-- 2001:4860::8:0:87b8       16.5%   980   22.8  22.1  19.3  58.8   2.8
    |  `|-- 2001:4860::1:0:cd13
    |   |-- 2001:4860::8:0:87b0
 12.|-- 2001:4860::1:0:cd13       16.6%   980   23.6  22.0  19.6  47.2   2.2
    |  `|-- 2001:4860:0:f8b::1
    |   |-- 2001:4860:0:f8a::1
 13.|-- 2001:4860:0:f8b::1        16.4%   980   20.3  21.6  19.1  37.1   1.6
    |  `|-- 2001:4860:0:1::155b
    |   |-- 2001:4860:0:1::155f
    |   |-- 2001:4860:0:f8a::1
 14.|-- 2001:4860:0:1::155b       16.3%   980   19.8  21.3  19.0  33.0   1.6
    |  `|-- 2a00:1450:400e:800::200e
    |   |-- 2001:4860:0:1::155f
 15.|-- 2a00:1450:400e:800::200e  16.1%   980   21.3  21.5  19.3  41.3   1.7

Hmmmm…

Conclusion

The packet loss starts at router 2a02:1800:2:20c0::2. This router is still within the network 2a02:1800::/24, which is part of the Telenet AS.
Message to @Telenet: would it be possible to ask one of your engineers to have a look at this router? It could be overloaded or incorrectly configured. Thanks.
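The way I read the report can be sketched as a small awk filter: find the first hop from which every later hop, including the destination, shows loss at or above a threshold. This is only a heuristic, since routers may rate-limit their ICMP replies. The name loss_starts_at and the 5% default threshold are assumptions made for this illustration.

```shell
# loss_starts_at [threshold]: hypothetical helper. Reads an "mtr -r" report on
# stdin and prints the address of the first hop from which all following hops
# show at least <threshold> percent loss (default 5).
loss_starts_at() {
    awk -v thr="${1:-5}" '
        $1 ~ /^[0-9]+\.\|--$/ {              # numbered hop lines only
            loss = $3; sub(/%/, "", loss)
            if (loss + 0 < thr + 0) start = ""     # loss dropped below threshold: reset
            else if (start == "")   start = $2     # first hop of a lossy stretch
        }
        END { if (start != "") print start }'
}

# Three hop lines copied from the Telenet report above
loss_starts_at 5 <<'EOF'
  1.|-- 2a02:181f:0:e1::1          1.9%  1000   10.6  13.3   7.6 137.3  12.6
  2.|-- 2a02:1800:2:20c0::2       17.9%  1000   96.4  24.7   8.7 461.9  54.1
 15.|-- 2a00:1450:400e:800::200e  16.1%   980   21.3  21.5  19.3  41.3   1.7
EOF
```

For these lines it prints 2a02:1800:2:20c0::2, the router named in the conclusion.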

Update on 20170103

I’ve been in contact with my ISP Telenet. Thanks Telenet!! They’ve investigated the issue and didn’t find any signs of packet loss on their routers. For that reason, they proposed to replace the Motorola CV6181E cable modem. At first, I thought that the modem was not to blame. But then I ran a traceroute from a server on the internet to my firewall behind the modem. This is the result. Note that I launched the traceroute when ping6 was failing.

# mtr -n -c 10 -r 2a02:181f:0:e1:68b6:5f63:60c0:92d7
Start: Tue Jan  3 16:33:00 2017
HOST: flipflop                    Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 2a01:4f8::a:16:b           0.0%    10    7.8   1.5   0.4   7.8   2.4
  2.|-- 2a01:4f8:0:3::95           0.0%    10    0.5   0.5   0.5   0.6   0.0
  3.|-- 2a01:4f8:0:3::19           0.0%    10    3.1   3.1   3.1   3.2   0.0
  4.|-- 2a01:4f8:0:3::22e          0.0%    10    3.1   3.1   3.1   3.2   0.0
  5.|-- 2001:978:2:a::1           30.0%    10    3.8   3.8   3.7   3.9   0.0
  6.|-- 2001:550:0:1000::9a19:d    0.0%    10    4.1   4.1   3.9   4.2   0.0
  7.|-- 2001:550:0:1000::9a36:25d  0.0%    10    6.2   6.1   6.0   6.2   0.0
  8.|-- 2001:550:0:1000::9a36:24f 80.0%    10   11.5  11.5  11.4  11.5   0.0
  9.|-- 2001:550:0:1000::8275:337 90.0%    10   11.3  11.3  11.3  11.3   0.0
 10.|-- 2001:978:3::c2             0.0%    10   10.9  10.3   8.2  23.7   4.8
 11.|-- 2001:2000:3018:78::1       0.0%    10   24.0  27.3  23.9  55.0   9.8
 12.|-- 2001:2000:3080:778::2      0.0%    10   22.6  22.7  22.6  22.7   0.0
 13.|-- 2a02:1800:0:1:2104:201:0:  0.0%    10   23.2  23.2  23.2  23.3   0.0
 14.|-- 2a02:1800:2:20c1::4        0.0%    10   26.3  27.0  25.4  36.2   3.2
 15.|-- 2a02:181f:0:e1:68b6:5f63: 50.0%    10   33.0  33.1  32.2  33.7   0.5

Hop 15 is my firewall. Hop 14 is the default gateway for my firewall. That router was always reachable. The DOCSIS 3.0 cable modem sits between hops 14 and 15. It’s not visible because it doesn’t operate on layer 3, and obviously I can only test devices that operate on layer 3 of the OSI model. There may be other layer 2 devices in between, but the modem could certainly be dropping the packets. Next step: replace the Motorola CV6181E DOCSIS 3.0 cable modem.

Update on 20170106

Yesterday, Telenet contacted me to arrange an on-site visit by an engineer to do some tests and try to fix the issue. The engineer came before noon. These are the actions taken.

Visit by Telenet Engineer

  • Replaced the coax cable amplifier/splitter with a new model.
  • Connected his laptop directly to my modem to test if he also experienced packet loss while pinging to Google over IPv6.
  • The engineer confirmed that there is indeed around 10% packet loss over IPv6. No packet loss over IPv4.
  • The CV6181E cable modem is replaced by a CV7160E cable modem. Type is 24*8 DOC 3 EMTA(DOCSIS).
  • It took quite some time for the device to start. I guess it downloaded and installed the latest firmware and configuration. During this time, the engineer inspected the external equipment in my street.
  • After the initial start of the new device, the internet connectivity was restored.
  • At this point, the engineer re-ran the ping test. For a few minutes everything looked fine, and the engineer took off to the next customer. Apparently the old modem did cause the issues.

@Telenet : Big thank you for the engineer. He was really friendly and he understood the problem well. I’m glad he was able to replicate the issue with his own laptop, and I’m also glad that the Telenet devices in my home were replaced. That rules out problems with those devices.

But….

After the engineer left, I was also convinced that the problem was fixed until I started experiencing those hanging SSH sessions again. Immediately, I ran a more extended ping6 and mtr and came back with the following results.

$ mtr -n -6 -r -c 1200 ipv6.google.com
HOST: rtr                         Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 2a02:181f:0:e1::1          1.1%  1200   17.3  18.5   6.7 182.1  16.3
  2.|-- 2a02:1800:2:20c0::2       25.3%  1200  6615. 6823. 6158. 7429. 148.3
  3.|-- 2a02:1800:0:1:2104:201:0: 12.8%  1200  6648. 2146.  11.6 7456. 3160.3
    |  `|-- 2a02:1800:2:20c0::2
  4.|-- 2a02:1800:0:1:2104:201:0: 11.7%  1200   14.4  17.8  11.3 110.2   8.2
    |  `|-- 2001:2000:3080:772::1
  5.|-- 2001:2000:3080:772::1     11.8%  1200   23.7  22.8  11.3 164.3  10.2
    |  `|-- 2001:2000:3018:4c::1
  6.|-- 2001:2000:3018:4c::1      11.8%  1200   24.5  24.6  17.9 119.0   8.5
    |  `|-- 2001:2000:3080:5af::2
  7.|-- 2001:2000:3080:5af::2     44.8%  1200   28.2  26.7  18.1 136.2   9.3
    |  `|-- 2001:4860::9:4000:cda9
  8.|-- 2001:4860::8:4000:ce26    18.8%  1200   24.4  27.3  21.4 123.9   8.1
    |  `|-- 2001:4860::9:4000:cda9
  9.|-- 2001:4860::8:4000:ce26    11.7%  1200   21.9  27.3  20.7 117.7   9.3
    |  `|-- 2001:4860::8:0:cc3f
 10.|-- 2001:4860::8:0:cc3f       11.6%  1200   25.3  27.9  21.4 116.6   9.2
    |  `|-- 2001:4860::8:0:87b8
 11.|-- 2001:4860::8:0:87b8       31.9%  1200   24.2  27.4  20.6 108.2   9.0
    |  `|-- 2001:4860::1:0:cd12
 12.|-- 2001:4860::1:0:cd12       14.6%  1200   22.3  38.3  20.7 155.2  21.4
    |  `|-- 2001:4860:0:1::15ab
 13.|-- 2001:4860:0:1::15ab       11.8%  1200   25.8  32.5  19.8 113.9  17.3
    |  `|-- 2a00:1450:400e:803::200e
 14.|-- 2a00:1450:400e:803::200e  11.7%  1200   29.8  26.6  19.9 113.2   9.0

A regular ping6 also confirmed around 10% packet loss, the same as before. Swapping the devices didn’t change a thing. The problem is not solved.

Some further debugging that might help

Close look at hop 3

Looking at the high round-trip times of the 2nd and 3rd hops, I tested this manually using nping. I captured the related packets on the outside interface of my router. That’s the physical interface that’s directly connected to the modem.

16:02:54.470453 IP6 (flowlabel 0xb317c, hlim 2, next-header TCP (6) payload length: 20) 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611.26119 > 2a00:1450:400e:800::200e.80: Flags [S], cksum 0x5b2e (correct), seq 786760318, win 1480, length 0
16:02:55.470522 IP6 (flowlabel 0xb317c, hlim 2, next-header TCP (6) payload length: 20) 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611.26119 > 2a00:1450:400e:800::200e.80: Flags [S], cksum 0x5b2e (correct), seq 786760318, win 1480, length 0
16:02:56.471619 IP6 (flowlabel 0xb317c, hlim 2, next-header TCP (6) payload length: 20) 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611.26119 > 2a00:1450:400e:800::200e.80: Flags [S], cksum 0x5b2e (correct), seq 786760318, win 1480, length 0
16:03:01.262625 IP6 (hlim 63, next-header ICMPv6 (58) payload length: 68) 2a02:1800:2:20c0::2 > 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611: [icmp6 sum ok] ICMP6, time exceeded in-transit for 2a00:1450:400e:800::200e
16:03:02.262661 IP6 (hlim 63, next-header ICMPv6 (58) payload length: 68) 2a02:1800:2:20c0::2 > 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611: [icmp6 sum ok] ICMP6, time exceeded in-transit for 2a00:1450:400e:800::200e
16:03:03.263520 IP6 (hlim 63, next-header ICMPv6 (58) payload length: 68) 2a02:1800:2:20c0::2 > 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611: [icmp6 sum ok] ICMP6, time exceeded in-transit for 2a00:1450:400e:800::200e

What do we learn? When sending out packets to Google with a hop limit of 2, router 2a02:1800:2:20c0::2 answers with an ICMP time exceeded. Strangely, the router answers only after approximately 7 full seconds. Normally routers tend to reply within tens of milliseconds, not seconds.
Let’s now send the same packets with a hop limit of 3 and capture those.

16:15:53.143299 IP6 (flowlabel 0x82c3e, hlim 3, next-header TCP (6) payload length: 20) 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611.27168 > 2a00:1450:400e:800::200e.80: Flags [S], cksum 0x1a16 (correct), seq 1505367208, win 1480, length 0
16:15:54.143394 IP6 (flowlabel 0x82c3e, hlim 3, next-header TCP (6) payload length: 20) 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611.27168 > 2a00:1450:400e:800::200e.80: Flags [S], cksum 0x1a16 (correct), seq 1505367208, win 1480, length 0
16:15:54.158081 IP6 (hlim 62, next-header ICMPv6 (58) payload length: 68) 2a02:1800:0:1:2104:201:0:3 > 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611: [icmp6 sum ok] ICMP6, time exceeded in-transit for 2a00:1450:400e:800::200e
16:15:55.144550 IP6 (flowlabel 0x82c3e, hlim 3, next-header TCP (6) payload length: 20) 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611.27168 > 2a00:1450:400e:800::200e.80: Flags [S], cksum 0x1a16 (correct), seq 1505367208, win 1480, length 0
16:15:55.166919 IP6 (hlim 62, next-header ICMPv6 (58) payload length: 68) 2a02:1800:0:1:2104:201:0:3 > 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611: [icmp6 sum ok] ICMP6, time exceeded in-transit for 2a00:1450:400e:800::200e
16:15:59.863624 IP6 (hlim 63, next-header ICMPv6 (58) payload length: 68) 2a02:1800:2:20c0::2 > 2a02:1810:1c87:be01:cc:6c7d:a0cf:e611: [icmp6 sum ok] ICMP6, time exceeded in-transit for 2a00:1450:400e:800::200e

What do we learn? On 2 of the 3 packets, I get an almost immediate response from 2a02:1800:0:1:2104:201:0:3 with a hop limit of 62. That’s expected. The strange thing is that one packet was answered by router 2a02:1800:2:20c0::2, which is only 2 hops away, not 3. And again, it responds only after about 7 full seconds.
If I were a Telenet engineer, I would investigate why the router with IP 2a02:1800:2:20c0::2 displays this strange behaviour. In my opinion, there are 2 problems with this router.

  • The router shouldn’t wait 7 seconds before answering
  • The router shouldn’t have answered at all to the packets with a hop limit of 3.

There might be a good reason for this behavior, but it looks strange to me. Also, it might be totally unrelated to the packet loss, but mtr reveals that the packet loss starts on that router.
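The roughly 7 second delay can be read directly from the capture timestamps. As a worked example, here is a small sketch that subtracts two tcpdump-style HH:MM:SS.ffffff timestamps. It assumes that replies pair up with probes in the order they were sent, which the capture suggests but does not prove; reply_delay is a name invented for this illustration.

```shell
# reply_delay <sent> <received>: hypothetical helper. Prints the difference in
# seconds between two HH:MM:SS.ffffff timestamps taken on the same day.
reply_delay() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, ":"); split(b, y, ":")
        printf "%.3f\n", (y[1]*3600 + y[2]*60 + y[3]) - (x[1]*3600 + x[2]*60 + x[3])
    }'
}

# First probe with hop limit 2 and the first time-exceeded reply
reply_delay 16:02:54.470453 16:03:01.262625

# Probe with hop limit 3 and the late reply from 2a02:1800:2:20c0::2
reply_delay 16:15:53.143299 16:15:59.863624
```

This prints 6.792 and 6.720, in the same range as the 6 to 7 second averages mtr reports for hop 2.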

Ping6 to Telenet first router
ping6 -O 2a02:181f:0:e1::1
PING 2a02:181f:0:e1::1(2a02:181f:0:e1::1) 56 data bytes
64 bytes from 2a02:181f:0:e1::1: icmp_seq=1 ttl=63 time=17.8 ms
64 bytes from 2a02:181f:0:e1::1: icmp_seq=2 ttl=63 time=14.4 ms
... blabla ...
--- 2a02:181f:0:e1::1 ping statistics ---
5603 packets transmitted, 5602 received, 0% packet loss, time 5610117ms
rtt min/avg/max/mdev = 8.485/20.496/184.892/18.344 ms

I ran ping for a little over an hour and a half and had practically no packet loss going to the first router on the Telenet network (1 packet lost out of 5603). This rules out issues with the cable modem.

Reachability of the first router after the Telenet network.

This is a ping to hop 5 in the mtr output.

$ ping6 -O -c 300  2001:2000:3080:772::1
PING 2001:2000:3080:772::1(2001:2000:3080:772::1) 56 data bytes
64 bytes from 2001:2000:3080:772::1: icmp_seq=1 ttl=60 time=14.1 ms
64 bytes from 2001:2000:3080:772::1: icmp_seq=2 ttl=60 time=14.7 ms
....
--- 2001:2000:3080:772::1 ping statistics ---
300 packets transmitted, 272 received, 9% packet loss, time 299542ms
rtt min/avg/max/mdev = 12.060/18.531/98.966/10.146 ms

A reverse lookup of 2001:2000:3080:772::1 gives brx-b2-link.telia.net. Looking at the name, we can see this is a Telia router located in Brussels. It’s directly connected to the Telenet network. Note that while my first hop was always reachable, the Telia router wasn’t reachable for 10% of the time. That means that the source of the packet loss should be located on or between these 2 routers.

Update on 20170112

Today, I received a few calls from Telenet. The people I spoke with were very helpful and knew their stuff. Around 3PM, an engineer called me to inform me that they identified the issue. The same person called me back an hour later to inform me that the problem was fixed. Unfortunately, I couldn’t test immediately.

Final test

With ping6
--- 2a00:1450:400e:800::200e ping statistics ---
4633 packets transmitted, 4622 received, 0% packet loss, time 4632420ms
rtt min/avg/max/mdev = 17.496/61.750/170.214/22.175 ms
rivy@spdy:~$ 

Very little packet loss… it really seems to be fixed.
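The summary says 0% although 11 of the 4633 packets were lost, so this ping6 evidently prints a whole-number percentage. To see the exact figure, the loss can be recomputed from the two counters. A minimal sketch, assuming the summary format shown above; exact_loss is a made-up helper name.

```shell
# exact_loss: hypothetical helper. Reads ping/ping6 output on stdin and prints
# the loss percentage with two decimals, computed from the packet counters.
exact_loss() {
    # Summary line fields: $1 = packets transmitted, $4 = packets received
    awk '/packets transmitted/ { printf "%.2f\n", ($1 - $4) * 100 / $1 }'
}

# Summary line copied from the final test above
echo '4633 packets transmitted, 4622 received, 0% packet loss, time 4632420ms' | exact_loss
```

For the final test this gives 0.24, compared with 27.33 for the first ping6 run from Telenet.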

With mtr
$ mtr -n -r -c 1200 2a00:1450:400e:800::200e
HOST: rtr                         Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 2a02:181f:0:e1::1          0.0%  1200   11.2  16.6   6.1 168.7  15.8
  2.|-- 2a02:1800:2:20c0::2       16.2%  1200  6607. 6845. 6388. 7597. 177.2
  3.|-- 2a02:1800:0:1:2104:201:0:  9.7%  1200   14.9 1687.   9.5 7588. 2943.4
    |  `|-- 2a02:1800:2:20c0::2
  4.|-- 2a02:1800:0:1:2104:201:0:  0.0%  1200   14.5  15.0   9.6 109.9   7.5
    |  `|-- 2001:2000:3080:772::1
  5.|-- 2001:2000:3080:772::1      0.0%  1200   19.9  19.7  10.6 111.7   7.6
    |  `|-- 2001:2000:3018:4c::1
  6.|-- 2001:2000:3018:4c::1       0.1%  1200   19.7  22.3  17.2 111.7   7.8
    |  `|-- 2001:2000:3080:5af::2
  7.|-- 2001:2000:3080:5af::2     24.1%  1200   23.6  25.0  17.5 106.5   7.5
    |  `|-- 2001:4860::9:4000:cda9
  8.|-- 2001:4860::9:4000:cda9     4.8%  1200   26.0  26.1  19.9 110.8   6.3
    |  `|-- 2001:4860::8:4000:ce26
  9.|-- 2001:4860::8:4000:ce26     0.0%  1200   27.0  26.8  20.0 122.7   7.9
    |  `|-- 2001:4860::8:0:cc3f
 10.|-- 2001:4860::8:0:cc3f        0.0%  1200   24.5  26.6  21.4 113.5   7.8
    |  `|-- 2001:4860::8:0:87b8
 11.|-- 2001:4860::8:0:87b8       22.3%  1200   31.2  25.0  19.9 106.0   6.5
    |  `|-- 2001:4860::1:0:cd12
 12.|-- 2001:4860::1:0:cd12        4.6%  1200   74.3  42.4  19.9 158.4  21.8
    |  `|-- 2001:4860:0:1::155b
 13.|-- 2001:4860:0:1::155b        0.0%  1200   21.7  31.9  19.3 133.7  18.2
    |  `|-- 2a00:1450:400e:800::200e
 14.|-- 2a00:1450:400e:800::200e   0.0%  1200   20.6  23.5  19.3 125.9   7.5

The strange issue with hop 2 still exists, but it’s not related to the previous IPv6 packet loss. mtr now also confirms that the IPv6 packet loss problem is fixed: the loss on the last hop is 0.0%.
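Since only the last hop tells you whether traffic actually reaches the destination, that single line is the one worth extracting from a long report. A small sketch, assuming the `mtr -r` report layout shown above; last_hop_loss is a name chosen for this illustration.

```shell
# last_hop_loss: hypothetical helper. Reads an "mtr -r" report on stdin and
# prints the address and loss percentage of the final numbered hop.
last_hop_loss() {
    awk '$1 ~ /^[0-9]+\.\|--$/ { hop = $2; loss = $3 }   # remember the latest hop line
         END { if (hop != "") print hop, loss }'
}

# Hop 2 and the last hop, copied from the final report above
last_hop_loss <<'EOF'
  2.|-- 2a02:1800:2:20c0::2       16.2%  1200  6607. 6845. 6388. 7597. 177.2
 14.|-- 2a00:1450:400e:800::200e   0.0%  1200   20.6  23.5  19.3 125.9   7.5
EOF
```

Loss shown on intermediate hops such as hop 2 is ignored here on purpose: those routers may simply be slow or reluctant to answer the probes themselves while still forwarding traffic.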

Final conclusion

@Telenet : Thanks a lot Telenet. The IPv6 packet loss is fixed.

Zimbra : Update SpamAssassin using proxy – corrected

In this previous post I explained what to configure in order to update SpamAssassin using a proxy server. While the steps resulted in a successful update of the SpamAssassin rules, it also resulted in the following error in auth.log.

Error message

In /var/log/auth.log

Sep 26 12:52:41 zimbra saslauthd[20344]: zmauth: authenticating against elected url 'https://mail.rivy.org:7073/service/admin/soap/' ...
Sep 26 12:52:41 zimbra saslauthd[20344]: authentication against url 'https://mail.rivy.org:7073/service/admin/soap/' caused error 'curl_easy_perform: error(56): Received HTTP code 403 from proxy after CONNECT'
Sep 26 12:52:41 zimbra saslauthd[20344]: url 'https://mail.rivy.org:7073/service/admin/soap/' will not be used for (at least) 600 seconds
Sep 26 12:52:41 zimbra saslauthd[20344]: Authentication cycle re-elected url https://mail.rivy.org:7073/service/admin/soap/, giving up ...
Sep 26 12:52:41 zimbra saslauthd[20344]: auth_zimbra: rivy auth failed: curl_easy_perform: error(56): Received HTTP code 403 from proxy after CONNECT
Sep 26 12:52:41 zimbra saslauthd[20344]: do_auth         : auth failure: [user=rivy] [service=smtp] [realm=] [mech=zimbra] [reason=Unknown]

Roll back

Remove the last 2 lines which were added to /opt/zimbra/.bashrc

export https_proxy="http://proxy.example.org:3128"
export http_proxy="http://proxy.example.org:3128"

Create a new bashrc for spamassassin

Create a new file and add the 2 lines that contain your outbound proxy settings.

vim /opt/zimbra/.bashrc_for_spamassassin

Add the 2 lines in that single file, save the file and set the correct owner and permissions.

chown zimbra:zimbra /opt/zimbra/.bashrc_for_spamassassin
chmod 444 /opt/zimbra/.bashrc_for_spamassassin

Test

Testing can be done by executing this command and checking the return code.

zimbra@zimbra:~$ . /opt/zimbra/.bashrc;. /opt/zimbra/.bashrc_for_spamassassin; /opt/zimbra/libexec/zmsaupdate
zimbra@zimbra:~$ echo $?
0
zimbra@zimbra:~$

Update crontab

Execute this command as zimbra user.

crontab -e

Look for the following block of text.

#
# Spam rule updates
#
45 0 * * * . /opt/zimbra/.bashrc; /opt/zimbra/libexec/zmsaupdate

Change it as follows to include your new .bashrc_for_spamassassin:

#
# Spam rule updates
#
45 0 * * * . /opt/zimbra/.bashrc; . /opt/zimbra/.bashrc_for_spamassassin; /opt/zimbra/libexec/zmsaupdate
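The point of the separate file is that the proxy variables only exist for the update job, and not for everything else that runs as the zimbra user. The error in auth.log suggests that the authentication request picked up the proxy from the environment, although that is my reading of the log. The same scoping can be shown with env. This is just a sketch of the idea, not something I tested against Zimbra; with_proxy is a made-up name and the proxy URL is the example value from above.

```shell
# with_proxy: hypothetical helper. Runs a single command with the proxy
# variables set for that command only. The calling shell stays untouched.
with_proxy() {
    env https_proxy="http://proxy.example.org:3128" \
        http_proxy="http://proxy.example.org:3128" \
        "$@"
}

# The child process sees the proxy, the calling shell does not gain it
with_proxy sh -c 'echo "inside:  $https_proxy"'
echo "outside: ${https_proxy:-not set}"
```

Used like this, `with_proxy /opt/zimbra/libexec/zmsaupdate` would give only the update script the proxy settings.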

All done now…

Zimbra : Update SpamAssassin using proxy – broken

Please don’t follow these steps, as you’ll get authentication problems when authenticating over SMTP. Have a look at this post.

How to configure Zimbra to download SpamAssassin antispam updates using a proxy.

Make configuration change

Start by editing this file.

/opt/zimbra/.bashrc

Then add the following 2 lines. Change them to the hostname or ip address and port of your proxy server.

export https_proxy="http://proxy.example.org:3128"
export http_proxy="http://proxy.example.org:3128"

And save the file.

Test SpamAssassin update

Run the following command as zimbra user. If it doesn’t display an error, you’re good.

. /opt/zimbra/.bashrc; /opt/zimbra/libexec/zmsaupdate

Automatic daily updates

Each night, your zimbra installation will attempt to update the spamassassin definitions at midnight + 45 minutes. The update is triggered by the following crontab entry.

#
# Spam rule updates
#
45 0 * * * . /opt/zimbra/.bashrc; /opt/zimbra/libexec/zmsaupdate

There is no need to create this entry. It should already exist as part of the default zimbra installation.

You want to update ClamAV using a proxy?