Recently, I got a question from a user of selenium-query about how to change the proxy after the Chrome instance is launched. Additionally, the solution should support proxy authorization.
Setting a proxy in Chrome with the Selenium driver is not a big deal. There are even two options: through the API (the example here) or with the command-line argument --proxy-server=http://foo.bar. But once the instance is launched, reconfiguring the capabilities has no effect.
So here I'll show a very simple workaround for the issue. We'll go through the theoretical part first, and then a small implementation. The examples contain a couple of Windows-specific things, but on macOS and Linux it works similarly.
1. Use an additional local proxy, like squid
Okay, if we can set a proxy for Chrome only at startup, then we should move the "changing the proxy" requirement out of the Chrome context. For this, we can run an additional local proxy that supports parent proxies: Chrome always talks to the local proxy, and the local proxy forwards to whichever remote proxy is currently configured. An ideal option here is a cross-platform tool, the squid cache (squid-cache.org). Installing the package with choco and editing the configuration with vscode is quite simple:
$ choco install squid
$ code c:/Squid/etc/squid/squid.conf
Here we need to configure the parent proxy with cache_peer, which also supports authorization:
cache_peer PARENT_IP parent PARENT_PORT 0 no-query default login=USERNAME:PASSWORD connect-fail-limit=99999999 proxy-only
never_direct allow all
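The two lines above can be generated from a proxy URL. Here is a minimal sketch, assuming the parent proxy is given as `http://USERNAME:PASSWORD@IP:PORT`. The helper name `buildSquidParentConfig` is hypothetical (it is not part of any library), and leaving out `login=` for a proxy without credentials is my own assumption about how you would want to handle that case:

```typescript
import { URL } from 'url';

// Hypothetical helper: turns a proxy URL into the two squid.conf lines shown above.
export function buildSquidParentConfig(proxyUrl: string): string[] {
    const { protocol, hostname, port, username, password } = new URL(proxyUrl);
    // URL reports an empty port when it equals the scheme default, so restore it
    const peerPort = port || (protocol === 'https:' ? '443' : '80');
    // Assumption: omit login= entirely when the URL carries no credentials
    const login = username ? ` login=${username}:${password}` : '';
    return [
        `cache_peer ${hostname} parent ${peerPort} 0 no-query default${login} connect-fail-limit=99999999 proxy-only`,
        'never_direct allow all',
    ];
}
```

Calling `buildSquidParentConfig(process.env.PROXY1)` would then give the lines to put into squid.conf for that proxy.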
Now, to change the proxy, we need to a) read the configuration, b) edit those two lines, c) write the file back, and d) make squid reload its configuration with the command
$ squid -d 0 -k reconfigure
⚠️ The terminal should be launched with admin privileges.
All of this is easily scriptable. We'll go into the details later.
2. Keep-Alive sessions
So now we know how to change the proxy server at runtime, but there is a caveat ❗❗ Requests can still go through the old proxy, because the agent can reuse connections and keeps the sockets open even when they are idle.
I looked for an API to close all idle connections via Selenium but haven't found one; if anybody has a hint for me, that would be great. There is another way, though: open chrome://net-internals/#sockets and evaluate the following script on that page:
chrome.send('closeIdleSockets')
3. Implementation
Part 1.
To modify the squid config, I will use atma-io for the file access and shellbee to run the squid command:
import { File } from 'atma-io';
import { run } from 'shellbee';
import { URL } from 'url';

export namespace SquidConfig {
    const SQUID_CONFIG_PATH = 'file://c:/Squid/etc/squid/squid.conf';

    export async function setProxy (httpProxy: string) {
        let { hostname, port, username, password } = new URL(httpProxy);
        let content = await File.readAsync<string>(SQUID_CONFIG_PATH);
        // matches an existing parent-proxy line
        let rgxCheckExists = /^cache_peer .+/m;
        let line = `cache_peer ${hostname} parent ${port} 0 no-query default login=${username}:${password} connect-fail-limit=99999999 proxy-only`;
        if (rgxCheckExists.test(content)) {
            // function replacer, so a `$` in the credentials is not read as a replacement pattern
            content = content.replace(rgxCheckExists, () => line);
        } else {
            // first run: append both directives, each on its own line
            content += `\n${line}\nnever_direct allow all\n`;
        }
        await File.writeAsync(SQUID_CONFIG_PATH, content);
        // make squid re-read the configuration
        let { stderr } = await run(`squid -d 0 -k reconfigure`);
        if (stderr.length > 0) {
            throw new Error(stderr.join('\n'));
        }
    }
}
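The file and process calls make `setProxy` awkward to test without a running squid. One option, sketched here as a suggestion rather than as part of the module above, is to pull the text manipulation into a pure function that takes the current config and returns the new one. The name `upsertCachePeer` is hypothetical:

```typescript
// Hypothetical pure helper: replaces the first cache_peer line,
// or appends the parent-proxy directives when none exists yet.
export function upsertCachePeer(content: string, cachePeerLine: string): string {
    const rgxCachePeer = /^cache_peer .+$/m;
    if (rgxCachePeer.test(content)) {
        // The function replacer inserts the line literally
        return content.replace(rgxCachePeer, () => cachePeerLine);
    }
    // Make sure the new directives start on their own line
    const separator = content === '' || content.endsWith('\n') ? '' : '\n';
    return `${content}${separator}${cachePeerLine}\nnever_direct allow all\n`;
}
```

With this split, `setProxy` only has to read the file, call the helper, write the result back, and trigger the reconfigure.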
Part 2.
I'll show how to close the sockets using selenium-query itself.
import SQuery from 'selenium-query';

export namespace ChromeSockets {
    export async function closeIdleSockets () {
        // open the net-internals page and ask Chrome to drop its idle sockets
        let $internalsPage: SQuery = await SQuery.load('chrome://net-internals/#sockets');
        await $internalsPage.eval(`chrome.send('closeIdleSockets')`);
    }
}
4. Testing
I use the atma-utest module to run the test.
Bootstrap the demo project in some folder:
npm i atma -g
# repair default packages and TS runners
atma init
# install additional packages
npm i selenium-query shellbee
After you've created the files with the snippets provided earlier, you can create the test, e.g. ./tests/proxies.spec.ts:
import SQuery from 'selenium-query';
import { ChromeSockets } from '../src/ChromeSockets';
import { SquidConfig } from '../src/SquidConfig';
import { URL } from 'url';

// remote proxy servers
const PROXY1 = process.env.PROXY1;
const PROXY2 = process.env.PROXY2;
const IP_PROVIDER = `https://api.ipify.org?format=json`;

UTest({
    async 'should visit both proxies' () {
        await SquidConfig.setProxy(PROXY1);

        // opens new chrome instance
        let { data: server1 } = await SQuery.fetch(IP_PROVIDER, {
            args: [
                '--log-level=3',
                // the local squid proxy
                '--proxy-server=http://localhost:3128'
            ]
        });
        eq_(server1.ip, new URL(PROXY1).hostname);

        await SquidConfig.setProxy(PROXY2);
        await ChromeSockets.closeIdleSockets();

        // reuses chrome instance
        let { data: server2 } = await SQuery.fetch(IP_PROVIDER);
        eq_(server2.ip, new URL(PROXY2).hostname);
    }
});
Finally, run the test
$ atma test tests/proxies.spec.ts
Both proxy servers should be defined as the environment variables PROXY1 and PROXY2, as strings like:
http://USERNAME:PASSWORD@IP:PORT
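Since values read from `process.env` may be `undefined`, a missing variable surfaces in the test only as a generic URL parsing error. A small guard makes that failure explicit. This is a sketch with a hypothetical `requireProxyEnv` helper, and restricting the scheme to `http:` is an assumption based on the format shown above:

```typescript
import { URL } from 'url';

// Hypothetical helper: reads a proxy URL from the environment and checks its shape.
export function requireProxyEnv(
    name: string,
    env: Record<string, string | undefined> = process.env
): string {
    const value = env[name];
    if (!value) {
        throw new Error(`Environment variable ${name} is not set`);
    }
    // new URL() itself throws on a malformed value
    const { protocol, hostname } = new URL(value);
    // Assumption: only http:// proxies, as in the format above
    if (protocol !== 'http:' || !hostname) {
        throw new Error(`${name} must look like http://USERNAME:PASSWORD@IP:PORT`);
    }
    return value;
}
```

In the test file, `const PROXY1 = requireProxyEnv('PROXY1');` would then also give `PROXY1` the type `string` instead of `string | undefined`.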