Connecting an Arduino to the web via an HTTP proxy
HTTP proxies
There are a variety of reasons to run an HTTP proxy in a corporate environment: security, content filtering, authentication and authorization, as well as logging and traffic optimization.
It also makes a lot of sense not to point the default gateway of a corporate network towards the internet, as this raises the bar for malicious software (and users) trying to bypass network security. But it means that any software that needs to access the internet has to do so through the proxy server.
This example was tested against a Squid proxy with a more or less default configuration.
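From the client's point of view, plain HTTP through a forward proxy differs from a direct request in two ways: the TCP connection is opened to the proxy rather than to the target, and the request line carries the absolute URL (the "absolute-form" in HTTP/1.1 terms) instead of just the path. The following is a minimal sketch of that request format. The helper name buildProxyGet is hypothetical and not part of the Ethernet library; it only uses snprintf, so it should compile on a desktop as well as on an Arduino.

```cpp
#include <stdio.h>
#include <string.h>

// Hypothetical helper (not part of any Arduino library).
// Writes a GET request in the form a forward proxy expects:
// the request line carries the absolute URL, while the Host
// header still names the target server.
// Returns the number of characters written, or -1 if the
// buffer is too small.
int buildProxyGet(char *out, size_t outSize,
                  const char *host, const char *path) {
  int n = snprintf(out, outSize,
                   "GET http://%s%s HTTP/1.1\r\n"
                   "Host: %s\r\n"
                   "Connection: close\r\n"
                   "\r\n",
                   host, path, host);
  if (n < 0 || (size_t)n >= outSize) {
    return -1;  // output was truncated or formatting failed
  }
  return n;
}
```

In a sketch, the filled buffer could be sent with client.print(buffer) once client.connect() to the proxy has succeeded. The target host never appears in the connect call, only inside the request text.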
Arduino HTTP client
The Arduino Ethernet library provides sufficient functionality to get basic tasks done, in this case fetching the BBC weather feed for a specific location. But with a little twist: it only speaks to the proxy server.
Example setup
Modified "Web Client" example
This is just a slightly modified version of the "Examples->Ethernet->WebClient" sketch that comes with the Arduino IDE and an installed Ethernet library. It works as shown with an Arduino MKR Zero, but should work with an UNO just as well.
/*
Web proxy client
This sketch reads an RSS feed from a website
using an Arduino Wiznet Ethernet shield.
Circuit:
* Ethernet shield attached to pins 10, 11, 12, 13
created 18 Dec 2009
by David A. Mellis
modified 9 Apr 2012
by Tom Igoe, based on work by Adrian McEwen
modified 11 Jan 2022
by Andy Reischle for http proxy support
*/
#include <SPI.h>
#include <Ethernet.h>
// Enter a MAC address for your controller below.
// Newer Ethernet shields have a MAC address printed on a sticker on the shield
// Don't forget to change that when you have more than one of these on one network segment
byte mac[] = { 0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xED };
// if you don't want to use DNS (and reduce your sketch size)
// use the numeric IP instead of the name for the server:
//IPAddress server(74,125,232,128); // numeric IP for Google (no DNS)
char server[] = "weather-broker-cdn.api.bbci.co.uk"; // name address for target server (using DNS)
char proxy[] = "proxy.internal.mycorporation.something"; // name of the internal proxy server (using DNS)
// Set the static IP address to use if the DHCP fails to assign
// That will be rather pointless in a corporate environment
IPAddress ip(192, 168, 0, 177);
IPAddress myDns(192, 168, 0, 1);
// Initialize the Ethernet client library.
// The connection itself is opened in setup() and goes to the proxy
// (port 8080 in this example), not to the target server:
EthernetClient client;
// Variables to measure the speed
unsigned long beginMicros, endMicros;
unsigned long byteCount = 0;
bool printWebData = true; // set to false for better speed measurement
void setup() {
  // You can use Ethernet.init(pin) to configure the CS pin
  //Ethernet.init(10); // Most Arduino shields
  Ethernet.init(5); // MKR ETH shield
  //Ethernet.init(0); // Teensy 2.0
  //Ethernet.init(20); // Teensy++ 2.0
  //Ethernet.init(15); // ESP8266 with Adafruit Featherwing Ethernet
  //Ethernet.init(33); // ESP32 with Adafruit Featherwing Ethernet
  // Open serial communications and wait for port to open:
  Serial.begin(9600);
  while (!Serial) {
    ; // wait for serial port to connect. Needed for native USB port only
  }
  // start the Ethernet connection:
  Serial.println("Initialize Ethernet with DHCP:");
  if (Ethernet.begin(mac) == 0) {
    Serial.println("Failed to configure Ethernet using DHCP");
    // Check for Ethernet hardware present
    if (Ethernet.hardwareStatus() == EthernetNoHardware) {
      Serial.println("Ethernet shield was not found. Sorry, can't run without hardware. :(");
      while (true) {
        delay(1); // do nothing, no point running without Ethernet hardware
      }
    }
    if (Ethernet.linkStatus() == LinkOFF) {
      Serial.println("Ethernet cable is not connected.");
    }
    // try to configure using IP address instead of DHCP:
    Ethernet.begin(mac, ip, myDns);
  } else {
    Serial.print(" DHCP assigned IP ");
    Serial.println(Ethernet.localIP());
  }
  // give the Ethernet shield a second to initialize:
  delay(1000);
  Serial.print("connecting to ");
  Serial.print(proxy);
  Serial.println("...");
  // if you get a connection, report back via serial:
  // target the request to the proxy
  if (client.connect(proxy, 8080)) {
    Serial.print("connected to ");
    Serial.println(client.remoteIP());
    // Make a HTTP request to the proxy:
    client.println("GET http://weather-broker-cdn.api.bbci.co.uk/en/observation/rss/2907669 HTTP/1.1");
    client.println("Host: weather-broker-cdn.api.bbci.co.uk");
    client.println("Connection: close");
    client.println();
  } else {
    // if you didn't get a connection to the server:
    Serial.println("connection failed");
  }
  beginMicros = micros();
}
void loop() {
  // if there are incoming bytes available
  // from the server, read them and print them:
  int len = client.available();
  if (len > 0) {
    byte buffer[80];
    if (len > 80) len = 80;
    client.read(buffer, len);
    if (printWebData) {
      Serial.write(buffer, len); // show in the serial monitor (slows some boards)
    }
    byteCount = byteCount + len;
  }
  // if the server's disconnected, stop the client:
  if (!client.connected()) {
    endMicros = micros();
    Serial.println();
    Serial.println("disconnecting.");
    client.stop();
    Serial.print("Received ");
    Serial.print(byteCount);
    Serial.print(" bytes in ");
    float seconds = (float)(endMicros - beginMicros) / 1000000.0;
    Serial.print(seconds, 4);
    float rate = (float)byteCount / seconds / 1000.0;
    Serial.print(", rate = ");
    Serial.print(rate);
    Serial.print(" kbytes/second");
    Serial.println();
    // do nothing forevermore:
    while (true) {
      delay(1);
    }
  }
}
If all goes well, this is the output:
Initialize Ethernet with DHCP:
DHCP assigned IP 1.2.3.4
connecting to proxy.internal.mycorporation.something...
connected to 4.3.2.1
HTTP/1.0 200 OK
Access-Control-Allow-Origin: *
Content-Type: application/rss+xml
ETag: "8b0e69972930952dc99c4d239b1c489402f5940392a0bbe16ed534b0b242787f"
expiry_extended_seconds: 0
Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips
Cache-Control: no-transform, max-age=60
Date: Tue, 11 Jan 2022 18:37:23 GMT
Content-Length: 1660
X-Cache: MISS from proxy.internal.mycorporation.something
X-Cache-Lookup: MISS from proxy.internal.mycorporation.something:8080
Via: 1.0 proxy.internal.mycorporation.something:8080 (squid/2.6.STABLE21)
Proxy-Connection: close
<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" version="2.0">
<channel>
<title>BBC Weather - Observations for Heilbronn, DE</title>
<link>https://www.bbc.co.uk/weather/2907669</link>
<description>Latest observations for Heilbronn from BBC Weather, including weather, temperature and wind information</description>
<language>en</language>
<copyright>Copyright: (C) British Broadcasting Corporation, see http://www.bbc.co.uk/terms/additional_rss.shtml for more details</copyright>
<pubDate>Tue, 11 Jan 2022 17:00:00 GMT</pubDate>
<dc:date>2022-01-11T18:00:00Z</dc:date>
<dc:language>en</dc:language>
<dc:rights>Copyright: (C) British Broadcasting Corporation, see http://www.bbc.co.uk/terms/additional_rss.shtml for more details</dc:rights>
<atom:link href="https://weather-service-thunder-broker.api.bbci.co.uk/en/observation/rss/2907669" type="application/rss+xml" rel="self" />
<item>
<title>Tuesday - 18:00 CET: Not available, 4°C (40°F)</title>
<link>https://www.bbc.co.uk/weather/2907669</link>
<description>Temperature: 4°C (40°F), Wind Direction: Easterly, Wind Speed: 4mph, Humidity: 66%, Pressure: 1033mb, Rising, Visibility: --</description>
<pubDate>Tue, 11 Jan 2022 18:00:00 GMT</pubDate>
<guid isPermaLink="false">https://www.bbc.co.uk/weather/2907669-2022-01-11T18:00:00.000+01:00</guid>
<dc:date>2022-01-11T17:00:00Z</dc:date>
<georss:point>49.1399 9.2205</georss:point>
</item>
</channel>
</rss>
disconnecting.
Received 2181 bytes in 0.4765, rate = 4.58 kbytes/second
Extracting values from that output is no big deal. I've done something similar in the Fritzmeter Project, but with TinyXML you should be able to do that with more style.
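For a feed this simple, plain string searching may already be enough. The following is a minimal sketch of that idea, not the Fritzmeter code and not a TinyXML example: the hypothetical helper extractTag copies the text between the first matching pair of tags. It assumes the whole response has been collected into a zero-terminated buffer, which is realistic on a MKR Zero but not on an UNO with its 2 KB of SRAM, and it ignores attributes, nesting and entities.

```cpp
#include <stdio.h>
#include <string.h>

// Hypothetical helper: copies the text between the first
// <tag> and </tag> found in xml into out.
// Returns true on success, false if the tag is missing
// or the output buffer is too small.
bool extractTag(const char *xml, const char *tag,
                char *out, size_t outSize) {
  char openTag[32];
  char closeTag[32];
  int n1 = snprintf(openTag, sizeof(openTag), "<%s>", tag);
  int n2 = snprintf(closeTag, sizeof(closeTag), "</%s>", tag);
  if (n1 < 0 || (size_t)n1 >= sizeof(openTag)) return false;
  if (n2 < 0 || (size_t)n2 >= sizeof(closeTag)) return false;

  // Find the opening tag and skip past it.
  const char *start = strstr(xml, openTag);
  if (start == NULL) return false;
  start += strlen(openTag);

  // Find the matching closing tag after it.
  const char *end = strstr(start, closeTag);
  if (end == NULL) return false;

  size_t len = (size_t)(end - start);
  if (len >= outSize) return false;  // leave room for the terminator
  memcpy(out, start, len);
  out[len] = '\0';
  return true;
}
```

Because the channel and the item both contain a title element, the search for item values would start at the position of "<item>" (for example via strstr) rather than at the beginning of the document.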
The corresponding entry in the proxy log (access.log) shows:
1.2.3.4 TCP_MISS/200 2181 GET http://weather-broker-cdn.api.bbci.co.uk/en/observation/rss/2907669 - DIRECT/184.86.251.134 application/rss+xml
What about HTTPS?
The Arduino Uno is too anemic for HTTPS, but the experiments on this page were done with a MKR ZERO and the matching MKR ETH Shield. The MKR ZERO can do HTTPS easily.
But:
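HTTPS through an HTTP proxy works differently from plain HTTP: the client requests a tunneled connection from the proxy, using the "CONNECT" method (rather than "GET" as in the HTTP example). Only once the proxy has confirmed the tunnel does the TLS handshake with the target server start, over that same connection.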
The otherwise brilliant SSLClient library does not provide an obvious way to support that:
I have not yet found a way to establish that first part of the connection request without encryption and switch to SSL/TLS after the tunnel is initiated.
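For reference, the unencrypted first step of such a tunnel is small. The sketch below only illustrates what would go over the wire in plain text and how the proxy's answer could be checked; it does not solve the open problem of handing the already connected socket over to SSLClient. Both helper names and the target host used in the usage note are assumptions for illustration.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: writes the plaintext tunnel request
// for host:port into out. Returns the number of characters
// written, or -1 if the buffer is too small.
int buildConnectRequest(char *out, size_t outSize,
                        const char *host, unsigned port) {
  int n = snprintf(out, outSize,
                   "CONNECT %s:%u HTTP/1.1\r\n"
                   "Host: %s:%u\r\n"
                   "\r\n",
                   host, port, host, port);
  if (n < 0 || (size_t)n >= outSize) {
    return -1;
  }
  return n;
}

// Hypothetical helper: extracts the status code from a status
// line such as "HTTP/1.1 200 Connection established".
// Returns -1 if the line does not look like an HTTP status line.
int parseStatusCode(const char *statusLine) {
  if (strncmp(statusLine, "HTTP/", 5) != 0) return -1;
  const char *space = strchr(statusLine, ' ');
  if (space == NULL) return -1;
  int code = atoi(space + 1);
  return (code >= 100 && code <= 599) ? code : -1;
}
```

With an assumed target of www.bbc.co.uk on port 443, the request would read "CONNECT www.bbc.co.uk:443 HTTP/1.1". A 2xx status code in the proxy's reply means the tunnel is open and everything sent afterwards is passed through to the target, which is the point where the TLS handshake would have to begin.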