Creating a decentralized shared network…?
Let’s rethink VPNs…
As discussed previously, if we strip away all the monetary and technical details of the existing VPN services, the problem everyone is trying to solve at the bottom is simply how to route network traffic in and out through the firewall. This seems like a problem that we can model with graphs. Let’s see if we can reduce the problem ourselves and maybe come up with an alternative way to surf the web.
Before we start thinking about solving it, we need to identify the characteristic differences between different stakeholders of this problem:
End Users (Circles)
- End users are usually mobile.
- What they care about is simplicity and functionality. They just want a magic app where they press a button and everything just works.
- They are usually hidden behind a NAT (Network Address Translation), which means they have no public IP Address of their own and cannot be reached directly from the outside, so they usually have to initiate the connection from their end. (An established connection carries traffic in both directions, so as long as one end can reach the other, the two can talk.)
Web Service Providers (Squares)
- Web service providers can provide virtual private computing instances that have a public IP Address (and can be directly accessed from the outside).
- They are also bound by regulations of the firewall.
- Their instances can be always running (24/7). In fact, it’s actually cheaper to keep them running for a month than pay per use.
The Firewall (Arc)
- The firewall limits traffic to and from a list of known malicious web services.
- We can assume the firewall actively reads all the unencrypted traffic and blocks traffic that contains sensitive keywords that are related to those destinations.
Facebook, Google, and Instagram (Stars)
- They are on the list.
- People in the States or any other country outside the firewall can access their services.
Now let’s add these stakeholders to the graph.
The very first graph represents the initial blocked state of internet users in China. The firewall by default blocks all connections that are routed to foreign target websites. This is not quite accurate to the actual situation, because all of our individual network traffic has to go through our ISP first (such as AT&T, Verizon, etc.). In order to gain access to the foreign web services, we can add in foreign VPN servers.
In this graph, all the user traffic will go through the VPN server outside the firewall first, and then get redirected to the true destination. A typical example of this server could be an Amazon AWS instance running a VPN server application. However, VPN traffic can be easily detected from evidence such as the port numbers the connections use and the gibberish content that gets transmitted in the connection. Eventually, when the firewall figures out that the green-flagged foreign IP Address is actually a VPN server, it will ban that IP Address for future access.
Therefore, I think it is not sustainable for us to relay network traffic outside the country. Although there are countless IP Addresses yet to be discovered by the firewall, it’s just not a clean solution. What if we can find the solution inside the firewall?
In this graph, I moved the intermediate server nodes inside the firewall. What does this mean? It means that we create VPS instances in China, and we have all the clients connect to it. This seems like a trivial task to do and doesn’t really solve the problem because the VPS instances cannot connect to Facebook either. But wait… This reminds me of something else.
If we flip the clients and the servers in this graph, it looks almost exactly like those reverse VPNs that help international students in the US access Chinese web content that is served exclusively to Chinese IP Addresses. A lot of these reverse VPNs (Transocks, for example) allocate VPN servers in China and allow users from the US to connect to the service.
If users from the US can connect to the web services in China, that means that these individual US IP Addresses are not blocked by the firewall. In fact, I believe that the firewall only blocks traffic to those large web service providers. If we can have individual computers in the US connect to the VPN server in China, and have Chinese users connect to that server, we have essentially built an end-to-end connection from the Mainland to the outside world.
This graph shows how we can have a network of individual computers from both the US and China. If we get them into the same network, we can basically ask the computers from the US to relay our request to those large web services. The computers in the US can also ask Chinese clients for local access to Mainland exclusive web content. This looks like a promising shared network structure that is beneficial for both sides.
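To make the graph reduction concrete, here is a minimal sketch of the last graph as a reachability problem. The node names, the `SIDE` and `CAN_DIAL` tables, and the blocking rule are all my own simplifying assumptions (the firewall only drops connections that cross it and touch a listed service), not a description of how the real firewall decides.

```python
from collections import deque

# Hypothetical nodes; "cn"/"us" marks which side of the firewall each one is on.
SIDE = {"user_cn": "cn", "vps_cn": "cn", "peer_us": "us", "facebook": "us"}
BLOCKED = {"facebook"}  # assumption: destinations on the firewall's list

# Who tries to open a connection to whom. NAT-ed peers can only dial out.
CAN_DIAL = {
    "user_cn": ["vps_cn", "facebook"],
    "peer_us": ["vps_cn", "facebook"],
    "vps_cn": ["facebook"],
}


def allowed(a, b):
    """Assumed rule: a dial fails only if it crosses the firewall and touches a blocked node."""
    crosses = SIDE[a] != SIDE[b]
    return not (crosses and (a in BLOCKED or b in BLOCKED))


def build_links():
    """Established connections carry traffic both ways, so links are undirected."""
    links = {node: set() for node in SIDE}
    for src, targets in CAN_DIAL.items():
        for dst in targets:
            if allowed(src, dst):
                links[src].add(dst)
                links[dst].add(src)
    return links


def find_route(start, goal):
    """Breadth-first search for the shortest relay path, or None if unreachable."""
    links = build_links()
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(links[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(find_route("user_cn", "facebook"))
```

Under these assumptions the only route from the Mainland user to the blocked service runs through the VPS inside the firewall and then through the volunteer peer in the US, which is exactly the shape of the shared network described above.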
But… Actual implementation details are hard to figure out.
As I said in the previous article, I don’t really have enough systems and networking knowledge to build a network like this. I have to piece together my fragmented knowledge of the Internet to MacGyver an actual solution.
I searched for “VPN servers” on Google. Most of the results pointed me to a few VPN options such as L2TP, IPSec, and PPTP. There are also fancy ones such as ShadowSocks or VPN Gate. No matter which one I choose, all these buzzwords are just the underlying networking layer that glues the clients and the VPN server together. I just need to look for one that has the most documentation.
After some research, I found that L2TP and IPSec look most promising, as there is an existing Docker image that runs an L2TP/IPSec server in a container.
For testing purposes, I created a DigitalOcean Droplet instance. I chose the cheapest option, which is $5 a month (minimal storage and computing power, pretty much just a public instance that I can SSH into). On the instance, I followed the instructions in the repository to set up a container that runs the VPN server. I also had to tweak the ufw settings, which are the firewall settings for the instance, to allow UDP traffic on ports 500 and 4500. I also created my own login credentials for the VPN server and set up the VPN connection on my iPhone.
Everything was pretty straightforward. I was able to connect to the VPN server right away. I also checked my IP Address using myip.com to make sure that it was from DigitalOcean (the VPS instance) rather than Webpass (my local ISP). I also set up the VPN connection on my laptop so my laptop could surf the web using DigitalOcean’s IP Address.
Right after I had two ongoing connections to the server (my phone and laptop), I checked the network interfaces on my laptop. It seemed that the connection had created a ppp0 device on my MacBook and assigned it an internal IP Address, 192.168.42.10. I assume my phone got 192.168.42.11, since the VPN server probably supports DHCP, according to the run.sh.
Next, I wanted to see if my phone could reach an endpoint on my laptop.
P2P Connection Inside the L2TP tunnel
The first thing I tried was to start an HTTP server on my laptop.
python -m SimpleHTTPServer 5000
I created a server on port 5000 on my laptop. Since Python by default binds this server to 0.0.0.0, it should be reachable through the ppp0 interface at 192.168.42.10:5000. I typed that into my phone’s browser, and it worked!
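A side note for anyone repeating this today: SimpleHTTPServer is the Python 2 module name, and on Python 3 the same one-liner is python3 -m http.server 5000. Below is a minimal sketch of the same test written as a script. The helper names start_file_server and fetch_status are mine, and I bypass any system proxy settings so the check really talks to the server directly.

```python
import threading
import urllib.request
from http.server import HTTPServer, SimpleHTTPRequestHandler


def start_file_server(port=5000, bind="0.0.0.0"):
    """Serve the current directory on every interface from a background thread."""
    server = HTTPServer((bind, port), SimpleHTTPRequestHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_status(host, port):
    """Return the HTTP status code of GET / on host:port, ignoring proxy settings."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://{host}:{port}/", timeout=5) as resp:
        return resp.status
```

With the VPN connected, calling start_file_server() on the laptop and fetch_status("192.168.42.10", 5000) from another machine in the tunnel would repeat the experiment above, assuming the laptop was assigned that address.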
If my phone and the VPS instance were both in China, and my laptop in the US, this would basically mean we can access content across the firewall! However, so far we are only able to access local resources that are hosted in the tunnel. What about public resources such as Google?
The next thing I tried was to run an HTTP proxy server on my laptop.
pip install proxy.py
proxy.py --host 0.0.0.0 --port 12345
After some Googling, proxy.py seemed like the perfect candidate for our use case: a lightweight HTTP proxy server written in Python. I used the commands above to install the proxy server and run it on port 12345. Basically, this means any device in that L2TP tunnel should be able to use 192.168.42.10:12345 as an HTTP proxy for browsing web content.
I configured the corresponding HTTP proxy settings in my phone’s VPN settings page, reconnected the VPN and checked my IP Address. For some reason, my IP Address was still from DigitalOcean, which meant that my network traffic was not routed through my local computer.
Soon I realized that my connection settings were recursive: both my phone and my computer had established a full tunnel to the remote VPS, so all of their network traffic was routed through the VPS. Even if I set up an HTTP proxy on my computer, whatever came out of that proxy would still go through the tunnel and exit from DigitalOcean. This means that if I want to use my laptop to provide an HTTP proxy service, the proxy’s outgoing traffic must not be routed back into the L2TP tunnel.
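The recursion is easier to see with a toy routing table. This is only a sketch of longest-prefix route selection using Python’s ipaddress module. Both tables are hypothetical (en0 and the 10.0.0.0/24 home LAN are made-up stand-ins for my laptop’s real setup), and real operating systems add details such as metrics and interface ordering that I ignore here.

```python
import ipaddress


def pick_route(routes, destination):
    """Return the interface of the most specific route that contains destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), iface)
        for prefix, iface in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical laptop routes while the full-tunnel VPN is up.
FULL_TUNNEL = [
    ("0.0.0.0/0", "ppp0"),        # default route points into the tunnel
    ("192.168.42.0/24", "ppp0"),  # VPN subnet
    ("10.0.0.0/24", "en0"),       # made-up home LAN
]

# Hypothetical alternative: only the VPN subnet goes through the tunnel.
SPLIT_TUNNEL = [
    ("0.0.0.0/0", "en0"),
    ("192.168.42.0/24", "ppp0"),
    ("10.0.0.0/24", "en0"),
]

for name, table in (("full", FULL_TUNNEL), ("split", SPLIT_TUNNEL)):
    print(name, pick_route(table, "8.8.8.8"), pick_route(table, "192.168.42.1"))
```

In the full-tunnel table, a request the proxy makes to a public address leaves through ppp0 and therefore exits from the VPS again. In the split table the same request leaves through the local interface, which is the behavior the proxy needs.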
I spent a few hours learning about iptables and how to set up forwarding and routing rules so we could reroute the proxy traffic. However, I didn’t get a working solution, and it just seemed very non-trivial to do. So I started to think about whether there was a better way to do this. What if the VPS establishes a different type of connection to the proxy computers, so we don’t run into this recursive problem?
The next day, I learned about SSH port forwarding at work. Basically, whenever you open an SSH connection to a host, you can also bind or reverse-bind a few ports so they are shared between the host and the client. If I want my laptop to access a web server running on the remote host at port 1234, I can forward that port to port 5000 on the client:
ssh -L 5000:host:1234 user@host
This should bind host:1234 to localhost:5000 on my local laptop. Or we can do it the other way around:
ssh -R 5000:localhost:1234 user@host
This should bind localhost:1234 to host:5000 on the host, so the host can access content on my local laptop.
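If port forwarding feels abstract, the core idea is just a relay: listen on one address and copy bytes to and from another. Here is a minimal plain-TCP sketch of that idea. It is not how ssh implements forwarding (there is no SSH connection, encryption, or authentication here), and the names pipe, handle, and start_forwarder are my own.

```python
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until src closes, then close dst's write side."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client, target):
    """Connect to target and relay traffic in both directions for one client."""
    try:
        upstream = socket.create_connection(target)
    except OSError:
        client.close()
        return
    worker = threading.Thread(target=pipe, args=(client, upstream), daemon=True)
    worker.start()
    pipe(upstream, client)
    worker.join()
    client.close()
    upstream.close()


def start_forwarder(listen_addr, target):
    """Listen on listen_addr and relay every connection to target."""
    listener = socket.create_server(listen_addr)

    def accept_loop():
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                break  # listener was closed
            threading.Thread(target=handle, args=(client, target), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener
```

Roughly speaking, ssh -L runs the listening half on the client and the connecting half on the server, while ssh -R swaps the two, with the bytes in between travelling inside the encrypted SSH connection.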
My plan was that we could reverse-port-forward my laptop’s HTTP proxy to the DigitalOcean instance. Then if my phone can somehow access the forwarded proxy port on the instance and use that as the HTTP proxy port, we should be able to route the phone’s HTTP traffic through my local proxy.
The Real Struggle
When my laptop had the VPN connection to the server, my local network interface showed that the default gateway for the ppp0 network was 192.168.42.1. I thought this must be the remote VPN server host, and that we could use this IP Address to access content on the host.
My very first test was the same as before: I started an HTTP server on the remote VPS at port 5000. Then I tried to access the server at 192.168.42.1:5000. Unfortunately the request failed almost immediately. Then I tried the other way around: I started the server on my laptop and tried to access it from the server at 192.168.42.10:5000. This worked right away.
I suspected that something was preventing me from accessing the host. Since I can get internet access from the VPN connection, my IP packets should be correctly routed. After closely inspecting all the NAT/firewall rules by calling:
iptables -S
iptables -S -t nat
I realized that since ufw is in place, incoming traffic that no rule explicitly allows is dropped by default. I needed to add a new rule to accept new TCP connections arriving at the host from the VPN subnet behind the ppp0 interface. This one-liner did the trick for me.
iptables -I INPUT 4 -p tcp -m state --state NEW --source 192.168.42.0/24 -j ACCEPT
192.168.42.0/24 is the CIDR block for the VPN subnet on my ppp0 interface. After this step, I was able to talk to the host from the ppp0 interface!
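Here is a toy model of why inserting that rule helps. It assumes a first-match rule list with a default DROP policy. The BEFORE list is a made-up simplification that ignores ports and connection state, and it is not the actual chain that ufw generates.

```python
import ipaddress


def decide(rules, proto, source_ip, policy="DROP"):
    """Return the action of the first rule matching proto and source, else the policy."""
    src = ipaddress.ip_address(source_ip)
    for rule_proto, rule_source, action in rules:
        if rule_proto in (proto, "all") and src in ipaddress.ip_network(rule_source):
            return action
    return policy


# Hypothetical state before the fix: only the VPN's own UDP traffic is let in.
BEFORE = [("udp", "0.0.0.0/0", "ACCEPT")]

# The inserted rule: accept TCP from the VPN subnet.
AFTER = [("tcp", "192.168.42.0/24", "ACCEPT")] + BEFORE

print(decide(BEFORE, "tcp", "192.168.42.10"))  # laptop in the tunnel, before the fix
print(decide(AFTER, "tcp", "192.168.42.10"))   # same request, after the fix
print(decide(AFTER, "tcp", "203.0.113.7"))     # an address outside the VPN subnet
```

The point of scoping the rule to 192.168.42.0/24 is that only peers inside the tunnel get through, while TCP from any other source still falls through to the default policy.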
Now, let’s circle back and sum up everything so far. We set up a remote server instance running a Docker IPSec/L2TP VPN server. Then the internet-sharing computer starts an HTTP proxy locally, SSHes into the server, and reverse-forwards a port on the server to that proxy. Finally, I can set up my phone to connect to the L2TP VPN server and specify 192.168.42.1:<proxy-port> as the HTTP proxy for browsing traffic.
I’ve successfully built a cross-firewall internet sharing system!
The next step is to actually land it in China. Since I need to do a little tweaking after running the Docker container, I forked the repo and added in the IP rule in the setup script.
Then I wrote a quick script that silently runs the proxy + SSH connection from the network sharing device.
ssh -fN -gR 12345:localhost:8899 user@host
proxy.py --host 0.0.0.0 --port 8899 &
After this step, I started the Tencent cloud instance, loaded and started the Docker container, and ran the script locally. Also, I had to do a little sshd tweak on the remote machine due to the port forwarding restrictions.
echo "GatewayPorts yes" >> /etc/ssh/sshd_config
service sshd restart
This is because by default sshd will not bind reverse-forwarded ports to all interfaces. Instead it only binds them to localhost, which is inconvenient for us because our network traffic comes in from a different interface, ppp0.
Finally, I configured the VPN connection on my phone by adding a new L2TP VPN service and entering the server’s IP and credentials. Also, I manually set the HTTP proxy to 192.168.42.1:12345.
After successfully connecting to the service, I used myip.com again to verify that my HTTP traffic was proxied: I saw my network-sharing computer’s IP on the webpage. Also, if I removed the HTTP proxy setting, my IP address became the same as the remote machine’s.
Something bigger
My friend Ryan was pretty happy with the “VPN” solution that I built for him. Now he is able to check his Instagram feed again during his lunch break. (Learn more about my friend in this previous post.) However, I felt that I had discovered something bigger than just a VPN service.
What I had made was essentially a virtual bridging service that joins users in different countries into the same LAN. With this, it is possible to scale up and have volunteers connect to this network. We can set up an internet-access sharing protocol so people on the network can share their internet connection with other peers. We can also quantify the network traffic flowing through each peer and reward it with some sort of cryptocurrency, so people have an incentive to share their network. We can truly build a network of freedom and fairness and gain global internet access as peers.
When the network grows large enough, one central hub server cannot satisfy everyone’s needs. Therefore we can break up the central server into decentralized area servers with some sort of routing protocol in between. Then we can host exclusive content just on this network and not on the underlying traditional internet. Then, in order to browse other people’s exclusive content, we need to build a web index: some sort of search engine that searches for that content. This sounds awfully familiar. Isn’t this how the dark web works, by having a private internet on top of the original internet? (And also the Tor project.)
Slightly frightened, I spent hours thinking about how I got to this point: coming up with an idea that’s very similar to an existing example of chaos. I believe that when the dark web was created (and perhaps it shouldn’t be called the dark web at all), people were giving the internet a new definition. They saw the internet as just a layer and a protocol of communication. Therefore we can arbitrarily stack a few of these layers together to create something new that still has essentially the same functionality as the internet.
The Internet is such a malleable thing — you can divide it into even more layers such as Physical, Link, IP, Transport, Application, etc. You can also stack any of those layers up together and it will still be the same, but just with different implementation details.
In fact, I’m surprised that we have only been horizontally extending the network infrastructure, but not really vertically stacking up more layers. Censorship still just works at the very bottom layer. The best way to bypass it is to go upwards.