- Classification of HTTP/HTTPS Forward Proxies
- Classification by whether the client is aware of the proxy
- Classification by whether the proxy decrypts HTTPS
- Why does a forward proxy need special handling for HTTPS traffic?
- NGINX Solution
- HTTP CONNECT Tunnel (Layer 7 Solution)
- NGINX stream (Layer 4 Solution)
NGINX was designed primarily as a reverse proxy server, but as it has evolved it can also serve as a forward proxy. Forward proxying itself is not complicated; the main problem to solve is how to proxy encrypted HTTPS traffic. This article introduces two ways of using NGINX as a forward proxy for HTTPS traffic, along with their usage scenarios and main problems.
Classification of HTTP/HTTPS Forward Proxies
The classification of forward proxies is briefly introduced here as background for the rest of the article:
Classification by whether the client is aware of the proxy
- Ordinary (explicit) proxy: The client must manually set the proxy's address and port in the browser or in system environment variables. For example, with Squid, the client is configured with the Squid server's IP and port 3128.
- Transparent proxy: The client needs no proxy settings; the proxy's role is transparent to the client. For example, web gateway devices on an enterprise network path.
Classification by whether the proxy decrypts HTTPS
- Tunnel proxy: That is, a pass-through proxy. The proxy server only relays HTTPS traffic over TCP and is unaware of the content it carries. The client performs the TLS/SSL exchange directly with the destination server. The NGINX approaches discussed in this article belong to this mode.
- Man-in-the-middle (MITM) proxy: The proxy server decrypts HTTPS traffic. It completes a TLS/SSL handshake with the client using a self-signed certificate and a normal TLS exchange with the destination server, so two TLS/SSL sessions are established on the client-proxy-server path. Charles is one example; a brief description of the principle can be found in the referenced article.
Note: In this case the client actually receives the proxy server's own self-signed certificate during the TLS handshake. Certificate chain verification fails by default, and the client must trust the root CA certificate that signed the proxy's certificate, so the client is aware of the proxy. To make the proxy transparent and unnoticed, the self-built root CA certificate has to be pushed to the clients, which is feasible inside an enterprise environment.
Why does a forward proxy need special handling for HTTPS traffic?
As a reverse proxy, the proxy server usually terminates the encrypted HTTPS traffic and forwards it to back-end instances. Encryption, decryption, and authentication of HTTPS traffic happen between the client and the reverse proxy server.
When the server acts as a forward proxy, the HTTP traffic from the client is encrypted and encapsulated in TLS/SSL, so the proxy server cannot see the domain name the client wants to access in the request URL. Proxying HTTPS traffic therefore needs some special handling compared with HTTP.
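To make the difference concrete, the following Python sketch contrasts the first bytes a forward proxy receives in each case. The helper name `host_from_request_line` and the sample bytes are illustrative assumptions and not part of NGINX: a plain HTTP proxy request carries the full URL in its request line, while an HTTPS connection begins with a TLS record that has no request line to read.

```python
from urllib.parse import urlsplit

TLS_HANDSHAKE = 0x16  # TLS record content type for handshake messages


def host_from_request_line(first_bytes: bytes):
    """Return the host found in an HTTP proxy request line, or None.

    Illustrative helper: it only understands the absolute-URI form that
    clients send to an explicit HTTP proxy ("GET http://host/path HTTP/1.1").
    """
    if first_bytes[:1] == bytes([TLS_HANDSHAKE]):
        # A TLS record: the HTTP request line is inside the encrypted stream.
        return None
    request_line = first_bytes.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = request_line.split(" ")
    if len(parts) != 3:
        return None
    return urlsplit(parts[1]).hostname


# Sample inputs (made up for the demonstration).
plain = b"GET http://www.example.com/index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
tls = bytes([0x16, 0x03, 0x01, 0x02, 0x00, 0x01])  # start of a TLS handshake record

print(host_from_request_line(plain))
print(host_from_request_line(tls))
```

For the plain HTTP request the helper finds the host in the URL; for the TLS bytes it finds nothing, which is the gap the two NGINX approaches below have to fill.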
According to the classification above, NGINX's approach to proxying HTTPS belongs to the tunnel (pass-through) mode: it does not decrypt the traffic and is unaware of the upper-layer content. There are two specific approaches, one at Layer 7 and one at Layer 4.
HTTP CONNECT Tunnel (Layer 7 Solution)
As early as 1998, in the SSL era before TLS was formally born, Netscape, which led the development of the SSL protocol, published an Internet-Draft on tunneling SSL traffic through web proxies. The core idea is to establish an HTTP CONNECT tunnel between the client and the proxy by means of an HTTP CONNECT request, which specifies the destination host and port the client wants to reach. The original diagram in the draft is as follows:
The whole process is illustrated by the diagram in HTTP: The Definitive Guide:
- The client sends an HTTP CONNECT request to the proxy server.
- The proxy server uses the host and port in the CONNECT request to establish a TCP connection to the destination server.
- The proxy server returns an HTTP 200 response to the client.
- The HTTP CONNECT tunnel between the client and the proxy server is now established. HTTPS traffic that arrives at the proxy server is passed directly over TCP to the remote destination server. The proxy only relays the HTTPS traffic and does not need to decrypt it.
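The proxy's part in the steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not how NGINX or any module implements it: `parse_connect` and `CONNECT_OK` are hypothetical names, and the actual relaying of bytes in the last step is only described in a comment.

```python
# Step 3: the reply the proxy sends once the upstream TCP connection is open.
CONNECT_OK = b"HTTP/1.1 200 Connection Established\r\n\r\n"


def parse_connect(request: bytes):
    """Parse 'CONNECT host:port HTTP/1.x' and return (host, port).

    Raises ValueError for anything that is not a well-formed CONNECT request.
    """
    request_line = request.split(b"\r\n", 1)[0].decode("ascii")
    parts = request_line.split(" ")
    if len(parts) != 3 or parts[0] != "CONNECT" or not parts[2].startswith("HTTP/"):
        raise ValueError(f"not a CONNECT request: {request_line!r}")
    host, sep, port = parts[1].rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"bad CONNECT target: {parts[1]!r}")
    return host.strip("[]"), int(port)  # strip brackets of IPv6 literals


# Step 1: what the client sends (sample request for illustration).
request = (
    b"CONNECT www.example.com:443 HTTP/1.1\r\n"
    b"Host: www.example.com:443\r\n\r\n"
)

# Step 2: the proxy learns where to open its TCP connection.
host, port = parse_connect(request)
print(host, port)

# Step 4 (not implemented here): after sending CONNECT_OK, the proxy copies
# bytes in both directions without inspecting them.
```

The point of the sketch is that the destination is stated in clear text in the CONNECT request line, so the proxy never has to look inside the TLS traffic that follows.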
NGINX ngx_http_proxy_connect_module module
NGINX is a reverse proxy server and does not officially support the HTTP CONNECT method. However, thanks to NGINX's modular and extensible design, @chobits of Alibaba provides the ngx_http_proxy_connect_module module, which adds support for the HTTP CONNECT method so that NGINX can be extended into a forward proxy.
The following uses a CentOS 7 environment as an example.
For a fresh installation, follow the normal installation steps together with the module's installation instructions (https://github.com/chobits/ngx_http_proxy_connect_module): apply the patch that matches your NGINX version and add the parameter --add-module=/path/to/ngx_http_proxy_connect_module when running configure. The example is as follows:
For an environment where NGINX has already been compiled and installed, the module above needs to be added. The steps are as follows:
2) nginx.conf file configuration
The Layer 7 solution builds the tunnel with HTTP CONNECT, so it is an ordinary (explicit) proxy mode that the client is aware of: the IP and port of the HTTP(S) proxy server must be configured manually on the client. On the client, access through the proxy with curl's -x parameter is as follows:
From the details printed by the -v parameter, we can see that the client first established an HTTP CONNECT tunnel to the proxy server 184.108.40.206. After the proxy responded with HTTP/1.1 200 Connection Established, the client began the TLS/SSL handshake and exchanged traffic.
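The same client-side behaviour can be sketched with Python's standard `http.client`, whose `set_tunnel` method makes the connection issue a CONNECT request to the proxy before starting TLS. The proxy address `192.0.2.10` (a documentation address) and port `3128` are placeholders assumed for illustration, and `https_via_proxy` is a hypothetical helper name.

```python
import http.client


def https_via_proxy(proxy_host: str, proxy_port: int,
                    dest_host: str, dest_port: int = 443):
    """Prepare an HTTPS connection that is tunnelled through an HTTP proxy.

    No network traffic happens here. The CONNECT request is sent when the
    first request is issued on the returned connection.
    """
    # The TCP connection goes to the proxy ...
    conn = http.client.HTTPSConnection(proxy_host, proxy_port, timeout=10)
    # ... and the destination is announced with CONNECT dest_host:dest_port.
    conn.set_tunnel(dest_host, dest_port)
    return conn


# Placeholder proxy address and port (assumptions, replace with real values).
conn = https_via_proxy("192.0.2.10", 3128, "www.example.com")

# With a reachable proxy, the following would send
# "CONNECT www.example.com:443", wait for the 200 reply, then start TLS:
# conn.request("GET", "/")
# print(conn.getresponse().status)
```

As with curl -x, the client has to know the proxy's address in advance, which is what makes this mode visible to the client.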
NGINX stream (Layer 4 Solution)
Since the approach is to pass the upper-layer traffic through, could NGINX act as a "Layer 4 proxy" and relay everything above TCP/UDP completely transparently? The answer is yes. NGINX has officially supported the ngx_stream_core_module module since version 1.9.0. The module is not built by default; it is enabled by adding the --with-stream option when running configure.
Using NGINX stream to proxy HTTPS traffic at the TCP level inevitably runs into the problem mentioned at the beginning of this article: the proxy server cannot obtain the destination domain name the client wants to access. The information available at the TCP level is limited to IP addresses and ports, so there is no way to get the domain name from it. To obtain the destination domain name, the proxy must be able to inspect the upper-layer message. NGINX stream is therefore not a Layer 4 proxy in the strict sense; it needs a little help from the upper layer.
Without decrypting HTTPS traffic, the only way to obtain the domain name being accessed is the SNI (Server Name Indication) extension in the Client Hello, the first message of the TLS/SSL handshake. Since version 1.11.5, NGINX has officially supported the ngx_stream_ssl_preread_module module, which is mainly used to obtain the SNI and ALPN information from the Client Hello. For a Layer 4 forward proxy, the ability to extract SNI from the Client Hello is critical; without it the NGINX stream solution does not work. This also brings a limitation: every client must include the SNI field in its TLS/SSL handshake, otherwise the NGINX stream proxy cannot know which destination domain the client wants to access.
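What "reading SNI without decrypting" means can be illustrated with a small Python parser. This is a simplified sketch, not NGINX's implementation: it assumes the whole Client Hello fits in one TLS record, and both `extract_sni` and the test-data builder `build_client_hello` are hypothetical helpers written for this example.

```python
import struct


def extract_sni(data: bytes):
    """Return the SNI host name from a TLS Client Hello, or None if absent."""
    try:
        if data[0] != 0x16 or data[5] != 0x01:  # handshake record, Client Hello
            return None
        pos = 5 + 4            # record header (5 bytes) + handshake header (4)
        pos += 2 + 32          # client_version + random
        pos += 1 + data[pos]   # session_id
        (n,) = struct.unpack_from("!H", data, pos)
        pos += 2 + n           # cipher_suites
        pos += 1 + data[pos]   # compression_methods
        (ext_total,) = struct.unpack_from("!H", data, pos)
        pos += 2
        end = min(pos + ext_total, len(data))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", data, pos)
            pos += 4
            if ext_type == 0:  # server_name extension
                # Layout: list length (2), name type (1), name length (2), name
                if data[pos + 2] != 0:  # 0 = host_name
                    return None
                (name_len,) = struct.unpack_from("!H", data, pos + 3)
                return data[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
        return None
    except (IndexError, struct.error, UnicodeDecodeError):
        return None  # truncated or malformed input


def build_client_hello(host=None) -> bytes:
    """Build a minimal Client Hello for the demo (not a complete, valid one)."""
    exts = struct.pack("!HH", 0xFF01, 1) + b"\x00"  # an unrelated extension
    if host is not None:
        name = host.encode("ascii")
        entry = b"\x00" + struct.pack("!H", len(name)) + name
        sni = struct.pack("!H", len(entry)) + entry
        exts += struct.pack("!HH", 0, len(sni)) + sni
    body = (b"\x03\x03" + bytes(32) + b"\x00"      # version, random, session_id
            + struct.pack("!H", 2) + b"\x13\x01"   # one cipher suite
            + b"\x01\x00"                          # null compression
            + struct.pack("!H", len(exts)) + exts)
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


print(extract_sni(build_client_hello("www.example.com")))
print(extract_sni(build_client_hello()))
```

Every field the parser walks over is sent in clear text before any keys are negotiated, which is why a pass-through proxy can read it. When the extension is missing, the function has nothing to return, mirroring the limitation described above.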
For a fresh installation, follow the normal installation steps and add the options --with-stream, --with-stream_ssl_preread_module, and --with-stream_ssl_module directly when running configure. Examples are as follows:
For an environment where NGINX has already been compiled and installed, the three stream-related modules above need to be added. The steps are as follows:
2) nginx.conf file configuration
NGINX stream, unlike HTTP, is configured in a stream block, but the directives are similar to those of the http block. The main configuration is as follows:
As a Layer 4 forward proxy, NGINX basically passes the upper-layer traffic through and does not need HTTP CONNECT to build a tunnel. This suits the transparent proxy mode, for example using DNS to direct the domain name to the proxy server. We can simulate this by binding the domain in /etc/hosts on the client.
On the client side:
1) Access fails when the client manually sets a proxy
The Layer 4 forward proxy passes the upper-layer HTTPS traffic through and does not need HTTP CONNECT to establish a tunnel; in other words, the client does not need to set an HTTP(S) proxy. What happens if we set an HTTP(S) proxy manually on the client anyway? We can use curl -x to set this forward proxy server as the proxy and look at the result:
You can see that the client attempted to establish an HTTP CONNECT tunnel through the forward NGINX, but because NGINX passes traffic through, the CONNECT request is forwarded directly to the destination server. The destination server does not accept the CONNECT method, so "Proxy CONNECT aborted" finally appears and the access fails.
2) Access fails when the client does not send SNI
As mentioned above, one of the key factors is using ngx_stream_ssl_preread_module to extract the SNI field from the Client Hello. If the client does not send the SNI field, the proxy server cannot know the destination domain name, and the access fails.
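Whether a client sends SNI can be observed without any network access. The sketch below uses Python's standard `ssl` module with in-memory BIOs to capture the Client Hello a client would send; `client_hello_bytes` is a hypothetical helper name, and certificate checks are disabled only because no handshake is ever completed here.

```python
import ssl


def client_hello_bytes(server_hostname=None) -> bytes:
    """Return the raw Client Hello that Python's ssl module would send."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False       # allows server_hostname=None in the demo
    ctx.verify_mode = ssl.CERT_NONE  # nothing is verified; no peer is involved
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the Client Hello has been written and the object now
        # waits for a server reply that never comes.
        pass
    return outgoing.read()


name = b"www.example.com"
print(name in client_hello_bytes("www.example.com"))  # SNI requested
print(name in client_hello_bytes(None))               # no SNI requested
```

With a host name supplied, the name appears as plain bytes inside the Client Hello, so the first line is expected to print True; without one there is no server_name extension for the proxy to read, so the second is expected to print False.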
In transparent proxy mode (simulated by manually binding hosts), we can use OpenSSL on the client to simulate this:
By default, the openssl s_client used here does not send SNI (OpenSSL 1.1.1 and later send it by default unless -noservername is given). You can see that the request above ends at the TLS/SSL handshake stage after the Client Hello is sent, because the proxy server does not know which destination domain to forward the Client Hello to.
If SNI is specified with openssl's -servername parameter, the access succeeds. The command is as follows:
This article has summarized the principles, environment setup, usage scenarios, and main problems of using NGINX as an HTTPS forward proxy, both through the HTTP CONNECT tunnel and through NGINX stream. Hopefully it serves as a reference when you build forward proxies for various scenarios.
Author of this article: Huaizhi
This article is original content from the Yunqi Community and may not be reproduced without permission.