
> ...linux, in particular made it really hard to reliably disable

Section 10.1 of that Arch Wiki page says that adding 'ipv6.disable=1' to the kernel command line disables IPv6 entirely, and 'ipv6.disable_ipv6=1' keeps IPv6 running, but doesn't assign any addresses to any interfaces. If you don't like editing your bootloader config files, you can also use sysctl to do what it looks like 'ipv6.disable_ipv6=1' does by setting the 'net.ipv6.conf.all.disable_ipv6' sysctl knob to '1'.
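
A minimal sketch of checking that knob, for anyone who wants to see what state a box is actually in. The function name `ipv6_knob_state` and its optional path argument are made up for illustration; the default path is the procfs view of 'net.ipv6.conf.all.disable_ipv6'. The "file is missing" branch assumes that booting with 'ipv6.disable=1' leaves no IPv6 sysctl tree at all, which is worth verifying on your own kernel.

```shell
# Report the state of the disable_ipv6 sysctl by reading its procfs file.
# The function name and the optional path argument are illustrative only.
ipv6_knob_state() {
    knob="${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}"
    if [ ! -r "$knob" ]; then
        # Assumption: the file is absent when the IPv6 stack was disabled
        # at boot (ipv6.disable=1), so there is no knob to read.
        echo "no-ipv6-stack"
        return 0
    fi
    case "$(cat "$knob")" in
        1) echo "addresses-disabled" ;;
        0) echo "enabled" ;;
        *) echo "unknown" ;;
    esac
}

# To flip the knob at runtime (needs root):
#   sysctl -w net.ipv6.conf.all.disable_ipv6=1
# To persist it, put this line in a file under /etc/sysctl.d/:
#   net.ipv6.conf.all.disable_ipv6 = 1
```

Calling `ipv6_knob_state` with no argument reads the live value; passing a path is only there so the function can be tried against a scratch file.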

> You aren't running it during an external transitive failure...

I'll assume you meant "transient". Given that I've already demonstrated that the only relevant traffic that is generated is IPv4 traffic, let's see what happens when we cut off that traffic on the machine we were using earlier, restored to its state prior to the updates.

We start off with empty firewall rules:

  root@ubuntu-server:~# iptables-save
  root@ubuntu-server:~# ip6tables-save
  root@ubuntu-server:~# nft list ruleset
  root@ubuntu-server:~# 
We prep to permit DNS queries and ICMP and reject all other IPv4 traffic:

  root@ubuntu-server:~# iptables -A OUTPUT -o enp0s3 -p udp --dport 53 -j ACCEPT
  root@ubuntu-server:~# iptables -A OUTPUT -o enp0s3 -p tcp --dport 53 -j ACCEPT
  root@ubuntu-server:~# iptables -A OUTPUT -o enp0s3 -p icmp -j ACCEPT
  root@ubuntu-server:~# iptables -A INPUT  -i enp0s3 -p udp --sport 53 -j ACCEPT
  root@ubuntu-server:~# iptables -A INPUT  -i enp0s3 -p tcp --sport 53 -j ACCEPT
  root@ubuntu-server:~# iptables -A INPUT  -i enp0s3 -p icmp -j ACCEPT
  root@ubuntu-server:~# iptables -A OUTPUT -o enp0s3 -j REJECT
  root@ubuntu-server:~# iptables -A INPUT  -i enp0s3 -j REJECT
  root@ubuntu-server:~#
And we do an apt-get update, which fails in less than ten seconds:

  root@ubuntu-server:~# apt-get update
  Ign:1 http://security.ubuntu.com/ubuntu questing-security InRelease
  Ign:2 http://us.archive.ubuntu.com/ubuntu questing InRelease
  <snip>
  Could not connect to security.ubuntu.com:80 (91.189.92.23). - connect (111: Connection refused) Cannot initiate the connection to security.ubuntu.com:80 (2620:2d:4000:1::102). - connect (101: Network is unreachable) <long line snipped>
  <snip>
  W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/questing-security/InRelease  Cannot initiate the connection to security.ubuntu.com:80 (2620:2d:4000:1::102). - connect (101: Network is unreachable) <long line snipped>
  W: Some index files failed to download. They have been ignored, or old ones used instead.
  root@ubuntu-server:~# 
In this case, the IPv6 traffic I see is... an unanswered router solicitation, and the multicast querier chatter that I saw before. [0] What happens when we change those REJECTs into DROPs

  root@ubuntu-server:~# iptables -D OUTPUT -o enp0s3 -j REJECT
  root@ubuntu-server:~# iptables -D INPUT  -i enp0s3 -j REJECT
  root@ubuntu-server:~# iptables -A OUTPUT -o enp0s3 -j DROP
  root@ubuntu-server:~# iptables -A INPUT  -i enp0s3 -j DROP
  root@ubuntu-server:~# 
...and then re-run 'apt-get update'?

  root@ubuntu-server:~# apt-get update
  Ign:1 http://security.ubuntu.com/ubuntu questing-security InRelease
  Ign:1 http://security.ubuntu.com/ubuntu questing-security InRelease
  Ign:1 http://security.ubuntu.com/ubuntu questing-security InRelease
  Err:1 http://security.ubuntu.com/ubuntu questing-security InRelease
  Cannot initiate the connection to security.ubuntu.com:80 (2620:2d:4002:1::103). - connect (101: Network is unreachable) <v6 addrs snipped> Could not connect to security.ubuntu.com:80 (91.189.92.24), connection timed out <long line snipped>
  <redundant output snipped>
  W: Some index files failed to download. They have been ignored, or old ones used instead.
  root@ubuntu-server:~#
Exactly the same thing, except it takes like two minutes to fail, rather than ~ten seconds, and the error for IPv4 hosts is "connection timed out", rather than "Connection refused". Other than the usual RS and multicast querier traffic, absolutely no IPv6 traffic is generated.
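
The REJECT-versus-DROP difference can be reproduced without apt at all. This is a rough sketch, not what was run above: `probe_tcp` and `classify_connect_rc` are hypothetical names, and it assumes bash (for /dev/tcp) and GNU coreutils `timeout`, which exits with status 124 when it has to kill the command.

```shell
# Map the exit status of a time-limited connect attempt to a label.
# 124 is the status GNU timeout returns when the time limit was hit.
classify_connect_rc() {
    case "$1" in
        0)   echo "connected" ;;
        124) echo "timed-out" ;;     # nothing came back: DROP-like
        *)   echo "failed-fast" ;;   # an error came back: REJECT-like or unreachable
    esac
}

# Try one TCP connect through bash's /dev/tcp and classify the result.
# Arguments: host, port, time limit in seconds (default 5).
probe_tcp() {
    host="$1"; port="$2"; limit="${3:-5}"
    timeout "$limit" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
    classify_connect_rc "$?"
}
```

With the REJECT rules loaded, something like `probe_tcp 91.189.92.23 80` should come back "failed-fast" almost immediately; with the DROP rules it should sit for the full limit and then report "timed-out".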

However. The output of 'apt-get' sure makes it seem like an IPv6 connection is what's hanging, because the last thing that its "Connecting to..." line prints is the IPv6 address of the host that it's trying to contact... despite the fact that it immediately got a "Network is unreachable" back from the IPv6 stack.

To be certain that my tcpdump filter wasn't excluding IPv6 traffic of a type that I should have accounted for but did not, I re-ran tcpdump with no filter and kicked off another 'apt-get update'. I -again- got exactly zero IPv6 traffic other than unanswered router solicitations and multicast group membership querier chatter.

I'm pretty damn sure that what you were seeing was misleading output from apt-get, rather than IPv6 troubles. Why? When you combine these facts:

* REJECTing all non-DNS IPv4 traffic caused apt-get to fail within ten seconds

* DROPping all non-DNS IPv4 traffic caused apt-get to fail after like two minutes.

* In both cases, no relevant IPv6 traffic was generated.

the conclusion seems pretty clear.

But, did I miss something? If so, please do let me know.

[0] I can't tell you why the last line in the 'apt-get update' output is only IPv6 hosts. But everywhere there were IPv6 hosts, the reported error was "Network is unreachable" and for IPv4 the error was "Connection refused".




This part is exactly the problem I was talking about:

  root@ubuntu-server:~# apt-get update
  ...
  Could not connect to security.ubuntu.com:80 (91.189.92.23). - connect (111: Connection refused) Cannot initiate the connection to security.ubuntu.com:80 (2620:2d:4000:1::102). - connect (101: Network is unreachable) <long line snipped>
  <snip>
  W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/questing-security/InRelease  Cannot initiate the connection to security.ubuntu.com:80 (2620:2d:4000:1::102). - connect (101: Network is unreachable) <long line snipped>
  W: Some index files failed to download. They have been ignored, or old ones used instead.
Well... in this case the output does show the failure to connect to 91.189.92.23, but that looks like a different kind of message to the "W:" lines, so maybe it doesn't show up on all setups or didn't make it into the logs on disk, or got buried under other output.

If you look at just the W: lines, it mentions a v6 address but the machine doesn't have v6 and the actual problem is the Connection Refused to the v4 address. The output is understandably misleading but ultimately the problem here has nothing to do with v6.
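
One way to stop the W: lines from misleading you is to pull every (address, error) pair out of the full output and group them by address family. A rough sketch follows; `apt_errors_by_family` is a made-up name, and the regex is written against the message shapes pasted in this thread, not against any documented apt format, so treat it as an assumption that other apt versions print the same thing.

```shell
# Read apt-get output on stdin and print one "family address error" line
# per distinct (address, error) pair found in it.
apt_errors_by_family() {
    # 1. grab "(addr). - connect (NNN: message)" or "(addr), connection timed out"
    grep -oE '\([0-9A-Fa-f:.]+\)(\. - connect \([0-9]+: [^)]+\)|, connection timed out)' |
    # 2. reduce each match to "addr message"
    sed -E 's/^\(([^)]+)\)(\. - connect \([0-9]+: ([^)]+)\)|, (connection timed out))$/\1 \3\4/' |
    # 3. tag the line v6 if the address contains a colon, v4 otherwise
    awk '{ fam = index($1, ":") ? "v6" : "v4"; print fam, $0 }' |
    sort -u
}
```

Usage would be along the lines of `apt-get update 2>&1 | apt_errors_by_family`. On the output quoted above, the v4 lines carry "Connection refused" and the v6 lines carry "Network is unreachable", which is the whole story in two lines.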


> ...ultimately the problem here has nothing to do with v6.

I agree... more or less. The remainder of this message is a reply to nyrikki, but I'm sticking it under your comment because you might also appreciate how weird this guy's setup looks.

nyrikki: The rest of this message is directed at you:

============================

Actually, what's up with your link-local addresses? They have really odd flags on them.

The only way I can figure that you got into that configuration was to remove the kernel-generated link-local address and add a new one with the arguments 'scope link noprefixroute'. Even if a router on your network advertised a fe80::/64 prefix, that does nothing at all, as hosts are supposed to [0] ignore advertised prefixes that are link-local.

Yeah. After playing around with this for a bit, I can see that your network is either at least as misconfigured as one would be if -say- your DHCP server was giving leases with an invalid default gateway, or it is very, very specially configured for very special reasons.

Starting with the ubuntu-server host in the "IPv4 traffic is REJECTed" configuration from my last comment, we do this on the host to delete the kernel-supplied link-local address and instruct the OS to create an address in the link-local address space that can be used for global addresses.

  root@ubuntu-server:~# ip addr del fe80::5054:98ff:fe00:64a9/64 dev enp0s3
  root@ubuntu-server:~# ip addr add fe80::5054:98ff:fe00:64aa/64 noprefixroute dev enp0s3
  root@ubuntu-server:~# 
We then configure our upstream router to either

* Send RAs on the local link without a prefix

or

* Send RAs on the local link with a link-local prefix (so they're ignored by the Ubuntu host)

or we hard-code the address of a next-hop router on our host. One (or more) of these three things sets up the host with a default route. If you do none of them, you don't get a default route, and global traffic goes nowhere.

Then -because either you or something running on the host deleted the kernel-provisioned link-local address, and then explicitly instructed the kernel to create a link-local address that can be used to reach global addresses- the local host starts emitting IPv6 traffic with a link-local source address and a global destination address.

When presented with this sort of traffic, my router immediately sends back an ICMPv6 "destination unreachable, beyond scope", which immediately terminates the connection attempt on the host, so the behavior ends up being exactly the same as when the host didn't have a misconfigured link-local address. But. You claim to be having trouble.

So, there are one or more things that might be going on that explain your trouble.

1) You have a firewall on this host that is dropping important ICMP6 traffic, causing it to miss the "this destination address is beyond your scope" message from the router. Do. Not. Do. This. ICMP is network-management traffic which tells you important things. Dropping important ICMP traffic is how you have mysterious and annoying failures.

2) Your router is configured to ignore link-local traffic with non-link-local destination addresses, rather than replying that the destination is out of scope. On the one hand, this seems stupid to me, but on the other hand, we got here through a misconfiguration that seems very unlikely to me to happen often, [1] so the router admin might not have thought about it when making "locked down" firewall rules.

3) There's some middlebox on the path to the router that's dropping your traffic because not all that many folks would expect to see link-local source and global destination, and middleboxes are widely known for dropping stuff that's even a little bit abnormal.
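
For possibility 1, a minimal sketch of rules that let the ICMPv6 error messages through ahead of whatever else the firewall does. The function name and the `IP6T` override are hypothetical conveniences; the rule syntax is ordinary ip6tables. The list of four error types is a starting assumption for this particular failure, not a complete ICMPv6 policy (neighbor discovery and friends need their own rules if the default policy is DROP).

```shell
# Insert ACCEPT rules for the ICMPv6 error messages at the top of INPUT.
# Set IP6T=echo to print the commands instead of running them (dry run).
allow_icmp6_errors() {
    for t in destination-unreachable packet-too-big time-exceeded parameter-problem; do
        ${IP6T:-ip6tables} -I INPUT -p ipv6-icmp --icmpv6-type "$t" -j ACCEPT
    done
}
```

`IP6T=echo allow_icmp6_errors` shows the four commands for review; running it bare (as root) applies them. Using -I rather than -A puts them ahead of any existing DROP rule in the chain.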

Investigating your misconfigured host (and maybe also connected network) has been interesting. I'd love to try to figure out if SystemD can be misconfigured to produce the host configuration that we're seeing (or if this misconfiguration is 100% bespoke), but I hear a hot burrito calling my name. Maybe I'll get bored and do more investigation later.

Also, you might object to my conclusion with "But this couldn't happen on IPv4! Clearly IPv6 is too complicated!". I would reply with "What would happen if your host couldn't get a lease from a DHCPv4 server, autoconfigured an address in the IPv4 link-local (169.254.0.0/16) address range, and the network's upstream router was configured to silently drop traffic from that subnet? At least the IPv6 link-local address range is prohibited from sending traffic off the local link [2] and fails the transmission attempt immediately."

[0] ...and Ubuntu questing does ignore such prefixes...

[1] ...that is, a link-local address that has been configured to handle global traffic...

[2] ...unless -as we've discovered- you specifically tell the OS otherwise...


> Actually, what's up with your link-local addresses? They have really odd flags on them.

They were probably configured by one of the fancy network config daemons (systemd-networkd, dhcpcd or similar). They like to take over RA processing, and they add IPs with "noprefixroute" so they can add the route themselves separately.

RAs have nothing to do with link-locals, but I bet one or the other of those daemons also takes over configuring link-local addresses and does the same thing there. If you looked in the routing table, there'll be a prefix route for fe80::/64 that was added by the daemon.
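
A quick way to check both halves of that guess is to look at the flags on the link-local address and at who owns the fe80::/64 route. The two helpers below are a sketch with made-up names; they just parse the usual `ip -6 addr show` and `ip -6 route show` text layouts. Note the assumption baked in: the "proto" field is only a hint, since a daemon is free to label its routes however it likes.

```shell
# Read `ip -6 addr show` output on stdin; for each link-local address,
# print the address and whether it carries the noprefixroute flag.
ll_addr_flags() {
    awk '$1 == "inet6" && $2 ~ /^fe80:/ {
        print $2, (/noprefixroute/ ? "noprefixroute" : "default-flags")
    }'
}

# Read `ip -6 route show` output on stdin; for each fe80::/64 route,
# print its device and its proto field (or "unspecified" if none shown).
ll_route_origin() {
    awk '
        $1 == "fe80::/64" {
            dev = "?"; proto = "unspecified"
            for (i = 2; i < NF; i++) {
                if ($i == "dev")   dev = $(i + 1)
                if ($i == "proto") proto = $(i + 1)
            }
            print dev, proto
            found = 1
        }
        END { if (!found) print "no-fe80-route" }
    '
}
```

Usage: `ip -6 addr show dev enp0s3 | ll_addr_flags` and `ip -6 route show | ll_route_origin`. A "noprefixroute" address paired with an fe80::/64 route that isn't "proto kernel" would point at a daemon; "no-fe80-route" would mean nobody added the route back.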

This wouldn't affect how DNS replies are sorted though. On machines without non-link-local v6, AAAA records aren't handled by trying them first and then expecting them to quickly fail. They're handled by pushing them to the bottom of the list so that the A records are tried first.
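
The ordering is easy to look at directly, since `getent ahosts` resolves through getaddrinfo() and prints addresses in the order it got them back. A small sketch, with a made-up helper name, that reports which family ends up first; it assumes the usual "address STREAM name" layout of getent's output.

```shell
# Read `getent ahosts NAME` output on stdin and print the address family
# ("v4" or "v6") of the first STREAM entry, i.e. the first one a typical
# client walking the list in order would try.
first_family() {
    awk '$2 == "STREAM" { print (index($1, ":") ? "v6" : "v4"); exit }'
}
```

Usage: `getent ahosts security.ubuntu.com | first_family`. On a machine with only link-local v6, the expectation from the paragraph above is "v4".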


> They were probably configured by one of the fancy network config daemons (systemd-networkd, dhcpcd or similar). They like to take over RA processing, and they add IPs with "noprefixroute" so they can add the route themselves separately.

Makes sense, yeah.

While I don't see a way to do this with dhcpcd, I have no clue what Lovecraftian horrors systemd-networkd generates, so maybe it's the culprit. And whatever is doing this, this behavior is not configured by default on Ubuntu Server version Questing. Out of the box, I get regular kernel-assigned link-local addresses.

But I don't understand why you'd want to do this for link-local addresses... not automatically, anyway. It looks like doing this has the disadvantage that it erases the baked-in "This shouldn't be used for global-scope transmissions. Send back 'Network is unreachable' in those cases." rule that you get for free with the kernel-generated address. Sheesh. I wonder if there's some additional logic in a stupid daemon somewhere that manages a firewall rule that restores the "Network is unreachable" ICMPv6 response to outbound global-scope packets that come from the link-local address... just to add more moving parts that can get out-of-sync.

> This wouldn't affect how DNS replies are sorted though.

Yeah.

It's a pity that I don't work with OP. I'd rather like to take a look at this system and the network it's hooked to.


> It looks like doing this has the disadvantage that it erases the baked-in "This shouldn't be used for global-scope transmissions.

I tried with the kernel-generated LL and my kernel does attempt to use a link-local source when connecting to GUA addresses if it has no other address to connect from. And it works:

  # ssh 2001:db8::1 env | grep CLIENT
  SSH_CLIENT=fe80::f0b3:20ff:fe3d:d4cf%eth0 54456 22
(...so long as the destination is on the local network. In this case I assigned 2001:db8::1 to the router, but the router will issue an ICMPv6 redirect for other IPs on the network, which is awkward for me to test but should also work.)

I note that you didn't run `ip route add fe80::/64 dev enp0s3` after adding the LL with noprefixroute, which... seems to break surprisingly little? Because the packet gets sent to the router, which does still have a route for fe80::/64 to the same network, so it issues an ICMPv6 redirect and the client ends up doing NDP anyway.


> So, there are one or more things that might be going on that explain your trouble.

Ah, there's secret option #4:

4) This rather weird configuration has been deliberately set up by the sysadmin that manages this system and network and ordinarily works fine, but the "external transitive failure that happened on April 15th." affected both IPv4 and IPv6 traffic (which, duh, that happens frequently)... but it was an intermittent failure so unrelated changes made by OP caused him to come to the wrong conclusions and point the blame cannon at the wrong part of the system.

Okay. Burrito time!



