Multiple Connections Inbound Access Challenge

In “High Availability Dual WAN Remote Industrial Connectivity” I discussed some of the challenges of achieving non-stop connectivity. One specific challenge is having multiple Internet connections from unlike providers: different IP, different MTU, different latency, different firewall, and so on. Point solutions like DDNS struggle here. Let’s discuss the Multiple Connections Inbound Access Challenge.

Several years ago I wrote about assisting a friend in installing Starlink. Previously I had helped him bond multiple DSL lines together, a solution which gave a single IP and somewhat more bandwidth than any one link, but was not that great otherwise, DSL being DSL. In the new solution, I deployed a Mikrotik router, overwrote it with OpenWRT, and set up MWAN3 for multiple WAN access.

MWAN3 is an open-source load-balancing/failover package that deploys on an OpenWRT router. It health-checks each link and switches between them. In the outbound direction, it can use more than one link simultaneously. In the inbound direction, it’s at the mercy of what you send it.
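The health-check-and-switch idea can be sketched in a few lines of Python. This is not MWAN3's code or configuration, just a toy model: the `Link` class, the `metric` priority, and the `down_after` threshold are hypothetical names I am using for illustration, and recovery on a single good probe is a simplification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Link:
    name: str
    metric: int        # hypothetical priority: lower is preferred
    failures: int = 0  # consecutive failed probes
    up: bool = True


def record_probe(link: Link, ok: bool, down_after: int = 3) -> None:
    """Update a link's state from one health-check result."""
    if ok:
        # Simplification: a single good probe brings the link back.
        link.failures = 0
        link.up = True
    else:
        link.failures += 1
        if link.failures >= down_after:
            link.up = False


def pick_outbound(links: List[Link]) -> Optional[Link]:
    """Return the preferred link that is currently up, or None."""
    candidates = [link for link in links if link.up]
    return min(candidates, key=lambda link: link.metric) if candidates else None


starlink = Link("starlink", metric=1)
dsl = Link("dsl", metric=2)

for _ in range(3):
    record_probe(starlink, ok=False)  # three failed probes in a row

print(pick_outbound([starlink, dsl]).name)  # falls back to the DSL link
```

Note that everything here is a decision the router makes about traffic it originates or forwards outward. Nothing in it can steer which address a remote user dials inbound.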

In that article, I wrote about some of the challenges of unlike speed and MTU. He had very minimal ‘remote access needs’, so I used our Agilicus AnyX to make the Starlink satellite statistics available remotely, from anywhere.

At the time, I had not given a lot of thought to this challenge, but subsequently, as Agilicus has evolved, it has become more evident that our strategy of “outbound only connections” has multiple advantages: functionality, security, simplicity. In this case, let’s re-imagine that ‘remote access’ case without it.

This house has 2 IPs. One is ‘public’, on DSL. The other is shared, NAT, private, on Starlink. Thus we could theoretically use DDNS and announce the DSL one. We could just accept that when that link is down, the site is unavailable. Would this be acceptable in a non-stop industrial environment? No.

Let’s say that Starlink would magnanimously provide my friend a public, routable IP. What would we do then? We might put the Starlink one in DDNS and, when it fails, change it. This would take time to propagate: the TTL of the DNS record might be 1 hour or more. Or, we could just phone all the people using it and tell them to reconfigure their equipment. Would this be acceptable in a non-stop industrial environment? No.
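To put a rough number on the DDNS delay, here is a back-of-the-envelope sketch. The function name and the detection and update times are assumptions for illustration; only the 1 hour TTL comes from the text above. The worst case is a resolver that cached the old record just before the update, so it keeps serving the dead address for a full TTL afterwards.

```python
def worst_case_stale_seconds(detect_s: int, update_s: int, ttl_s: int) -> int:
    """Upper bound on how long a client may keep dialing the dead address.

    detect_s: time to notice the link is down (assumed value)
    update_s: time for the DDNS provider to publish the change (assumed value)
    ttl_s:    TTL of the DNS record
    """
    return detect_s + update_s + ttl_s


print(worst_case_stale_seconds(detect_s=30, update_s=10, ttl_s=3600))  # 3640
```

Shortening the TTL shrinks the last term, but the detection and update terms remain, and any resolver or client that holds records longer than the TTL is outside our control.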

Let’s say we could get a 3rd IP address and somehow magically NAT it back and forth across these. Would that resolve it? Well, we would now run into an issue with the unlike MTU and unlike bandwidth-delay product, which is kept on a per route-pair basis: there would be a significant glitch. Would this be acceptable in a non-stop industrial environment? No.
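To see why swapping the path under a live connection glitches, compare what a sender would have tuned itself to on each link. The bandwidth, latency, and MTU figures below are illustrative guesses, not measurements of my friend's links, and the helper names are my own.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bandwidth_bps * rtt_ms // 8000  # bits/s * ms -> bytes


def mss_bytes(mtu: int) -> int:
    """TCP payload per packet for IPv4 with no IP or TCP options."""
    return mtu - 40  # 20-byte IPv4 header + 20-byte TCP header


# Illustrative numbers only (assumptions, not measured values).
links = {
    "dsl": {"bandwidth_bps": 10_000_000, "rtt_ms": 40, "mtu": 1492},
    "starlink": {"bandwidth_bps": 100_000_000, "rtt_ms": 50, "mtu": 1500},
}

for name, link in links.items():
    bdp = bdp_bytes(link["bandwidth_bps"], link["rtt_ms"])
    print(name, bdp, "bytes in flight,", mss_bytes(link["mtu"]), "byte MSS")
```

With these made-up figures the two paths differ by more than a factor of ten in bytes in flight, and the segment size that fits one path does not fit the other. A connection tuned for one and silently moved to the other has to rediscover both.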

SSH Animated Data Flow

It turns out that by using an outbound-only connection, we have inadvertently solved this key problem. The IP is on our public cloud, where we have access to tools like Anycast, load balancers, regional availability zones, Kubernetes, Istio, etc., without any real additional cost. The end users see an always-on connection, and the distinction of which link it runs down is hidden. Individual HTTP transactions might split across the links. Seamless to the end user, no firewall changes, works across all links, non-stop? What’s not to like? We’ve resolved the Multiple Connections Inbound Access Challenge.
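The shape of the idea can be shown with a toy model. This is not how Agilicus AnyX is implemented; the `CloudEndpoint` class, its method names, and the `site.example` name are all hypothetical. The point is only that the public name stays fixed in the cloud, while the site dials out over whichever links are alive.

```python
class CloudEndpoint:
    """Toy stand-in for a stable, cloud-hosted public address."""

    def __init__(self, public_name: str) -> None:
        self.public_name = public_name
        self._tunnels = {}  # link name -> True if that outbound tunnel is alive

    def tunnel_opened(self, link: str) -> None:
        """The site dialed out over this link and the tunnel is up."""
        self._tunnels[link] = True

    def tunnel_closed(self, link: str) -> None:
        """The tunnel over this link dropped."""
        self._tunnels[link] = False

    def route(self, request: str) -> str:
        """Send a user's request down the first live tunnel."""
        for link, alive in self._tunnels.items():
            if alive:
                return f"{request} -> {self.public_name} via {link}"
        raise ConnectionError("no live outbound tunnel from the site")


endpoint = CloudEndpoint("site.example")
endpoint.tunnel_opened("starlink")
endpoint.tunnel_opened("dsl")

print(endpoint.route("GET /stats"))
endpoint.tunnel_closed("starlink")  # Starlink drops; users keep the same name
print(endpoint.route("GET /stats"))
```

The user-facing name never changes in this model, so there is no DNS record to update and no inbound firewall rule at the site.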