How to Use Proxies With Google Colab and Jupyter

Setting Up Proxies in Google Colab and Jupyter

Right, let’s get straight to the business of running proxies in Google Colab or Jupyter—no faffing about. Whether you’re scraping data, bypassing geo-restrictions, or just wanting a bit of privacy, proxies are your go-to mates. There’s a knack to doing it right, though, especially on platforms like Colab and Jupyter that sometimes have their own quirks.


Why Use Proxies with Colab and Jupyter?

Scenario                   | Benefit of Proxy
Web scraping               | Avoiding IP bans
Accessing geo-blocked APIs | Unblocking content
Research with anonymity    | Masking your digital footprint

Choosing Your Proxy Source

Now, before you go bush-bashing through the wilds of the internet looking for proxies, let’s make it easy. ProxyRoller is your mate here: it offers fresh, free proxies ready to go. More on them in a tick.


Types of Proxies

Type          | Description                          | Typical Use
HTTP/HTTPS    | Standard web proxies                 | Web scraping, crawling
SOCKS4/SOCKS5 | Lower-level, supports more protocols | Streaming, P2P, etc.
Rotating      | Changes IP frequently                | Avoiding rate limits
Residential   | Real user IPs, harder to block       | Scraping, automation

For most Colab/Jupyter work, HTTP/HTTPS proxies will do the trick.


Getting Free Proxies from ProxyRoller

  1. Head over to ProxyRoller.
  2. Click on the “Get Free Proxies” button.
  3. Copy the proxy list—looks something like ip:port.

Bit of Aussie advice: test your proxies, because free proxies can be fickle, like Melbourne weather.
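The copied list is just text with one ip:port per line, so a small helper can turn it into proxy URLs ready for requests. What follows is a minimal sketch, not anything ProxyRoller ships: the name parse_proxy_list and the sample addresses (taken from the reserved documentation range) are made up for illustration, and it assumes plain HTTP proxies without authentication.

```python
def parse_proxy_list(raw_text, scheme='http'):
    """Turn pasted 'ip:port' lines into proxy URLs, skipping blank or malformed lines."""
    proxy_urls = []
    for line in raw_text.splitlines():
        entry = line.strip()
        if not entry:
            continue
        # Split on the last colon so the port is always the final piece
        host, sep, port = entry.rpartition(':')
        if not sep or not host or not port.isdigit():
            continue  # not in ip:port form, ignore it
        proxy_urls.append(f'{scheme}://{host}:{port}')
    return proxy_urls


# Hypothetical pasted list; replace with the one you copied
pasted = """
203.0.113.10:8080
203.0.113.11:3128
not-a-proxy
"""

proxy_list = parse_proxy_list(pasted)
print(proxy_list)
```

Each entry in proxy_list can then be dropped into a proxies dictionary for both the 'http' and 'https' keys.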


Configuring Proxies in Google Colab

Colab runs in a virtual machine, so you need to instruct Python (and related libraries) to use a proxy. Here’s how you do it, Arvid-style:

Setting HTTP/HTTPS Proxy for requests

import requests

proxies = {
    'http': 'http://username:password@proxy_ip:proxy_port',
    'https': 'http://username:password@proxy_ip:proxy_port'
}

response = requests.get('https://httpbin.org/ip', proxies=proxies)
print(response.json())
  • If your proxy doesn’t need authentication, leave out username:password@.
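One snag with authenticated proxies: characters such as @ or : inside a username or password break the URL unless they are percent-encoded. The sketch below shows one way to build the URL safely with the standard library. The helper name build_proxy_url, the port 8080, and the credentials are all hypothetical.

```python
from urllib.parse import quote


def build_proxy_url(host, port, username=None, password=None, scheme='http'):
    """Build a proxy URL, percent-encoding credentials when they are supplied."""
    if username is None:
        return f'{scheme}://{host}:{port}'
    # safe='' makes quote() encode every reserved character, including @ and :
    user = quote(username, safe='')
    pwd = quote(password or '', safe='')
    return f'{scheme}://{user}:{pwd}@{host}:{port}'


# Placeholder host and made-up credentials for illustration
proxy_url = build_proxy_url('proxy_ip', 8080, 'user', 'p@ss:word')
proxies = {'http': proxy_url, 'https': proxy_url}
print(proxy_url)
```

The resulting proxies dictionary goes to requests.get exactly as in the example above.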

Setting Proxies Globally in Notebook

Sometimes you want everything to go through the proxy. Set environment variables:

import os

os.environ['http_proxy'] = 'http://proxy_ip:proxy_port'
os.environ['https_proxy'] = 'http://proxy_ip:proxy_port'

Now, any library that respects these environment variables (requests, urllib, etc.) will use the proxy.
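To confirm the variables are actually visible inside the running kernel, you can ask the standard library what it detects. This is a rough check only: port 8080 is a placeholder, and the no_proxy entry (hosts that should skip the proxy) is an optional extra not covered in the text above.

```python
import os
import urllib.request

# Placeholder proxy address; substitute a real ip:port
os.environ['http_proxy'] = 'http://proxy_ip:8080'
os.environ['https_proxy'] = 'http://proxy_ip:8080'
# Optional: hosts that should be reached directly
os.environ['no_proxy'] = 'localhost,127.0.0.1'

# getproxies() reports the proxy settings urllib picks up for this process
detected = urllib.request.getproxies()
print(detected.get('http'), detected.get('https'))

# proxy_bypass() is truthy for hosts covered by no_proxy
print(bool(urllib.request.proxy_bypass('localhost')))
```

If the first print shows None, the variables were not set in this kernel, which is a common cause of a proxy silently being ignored.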

Rotating Proxies

If you’re scraping like a dingo on a chicken farm, rotate your proxies to dodge bans:

import random
import requests

proxy_list = [
    'http://ip1:port1',
    'http://ip2:port2',
    'http://ip3:port3'
]

def get_random_proxy():
    # Pick one proxy and use it for both schemes so the choice is consistent
    proxy = random.choice(proxy_list)
    return {'http': proxy, 'https': proxy}

for i in range(10):
    proxies = get_random_proxy()
    # A timeout stops a dead proxy from hanging the loop
    response = requests.get('https://httpbin.org/ip', proxies=proxies, timeout=10)
    print(response.json())
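Free proxies drop out often, so random rotation alone still leaves you with failed requests. One possible refinement is to move on to the next proxy whenever a request fails. The sketch below is an assumption-laden illustration: fetch_with_rotation is a hypothetical helper, and the fetch argument is any callable you supply that performs the request. It retries on OSError, which is the base class of the exceptions requests raises for connection, proxy, and timeout failures.

```python
import itertools


def fetch_with_rotation(url, proxy_urls, fetch, max_attempts=3):
    """Call fetch(url, proxies) with successive proxies until one succeeds."""
    if not proxy_urls:
        raise ValueError('proxy_urls must not be empty')
    pool = itertools.cycle(proxy_urls)  # wraps around when the list is exhausted
    last_error = None
    for _ in range(max_attempts):
        proxy_url = next(pool)
        proxies = {'http': proxy_url, 'https': proxy_url}
        try:
            return fetch(url, proxies)
        except OSError as exc:
            last_error = exc  # remember the failure and try the next proxy
    raise RuntimeError(f'all {max_attempts} attempts failed') from last_error
```

With requests, the callable could be lambda url, proxies: requests.get(url, proxies=proxies, timeout=10).json(), passed in alongside your own proxy list.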

Configuring Proxies in Jupyter Notebook

Much the same as Colab, mate. Here’s the drill:

For requests and urllib

import requests

proxies = {
    'http': 'http://proxy_ip:proxy_port',
    'https': 'http://proxy_ip:proxy_port'
}

response = requests.get('https://httpbin.org/ip', proxies=proxies)
print(response.json())
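The heading mentions urllib, so here is a rough standard-library counterpart to the requests snippet. It is a sketch under assumptions: build_proxy_opener and fetch_text are hypothetical helper names, and port 8080 is a placeholder.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends http and https traffic through proxy_url."""
    handler = urllib.request.ProxyHandler({'http': proxy_url, 'https': proxy_url})
    return urllib.request.build_opener(handler)


def fetch_text(opener, url, timeout=10):
    """Fetch url through the opener and return the body as text."""
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode('utf-8')


opener = build_proxy_opener('http://proxy_ip:8080')
# Needs a real, reachable proxy, so it is left commented out here:
# print(fetch_text(opener, 'https://httpbin.org/ip'))
```

The opener only affects calls made through it, which keeps the proxy scoped to the requests you choose.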

For a Kernel-wide Proxy (Jupyter Kernel)

import os

os.environ['HTTP_PROXY'] = 'http://proxy_ip:proxy_port'
os.environ['HTTPS_PROXY'] = 'http://proxy_ip:proxy_port'

For Selenium (Headless Browsers)

If you’re running Selenium in Jupyter (bit of a power move):

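from selenium import webdriver

proxy_ip_port = 'proxy_ip:proxy_port'

chrome_options = webdriver.ChromeOptions()
# Colab and most remote Jupyter servers have no display, so run Chrome headless
chrome_options.add_argument('--headless')
chrome_options.add_argument(f'--proxy-server={proxy_ip_port}')

driver = webdriver.Chrome(options=chrome_options)
driver.get('https://httpbin.org/ip')
print(driver.page_source)  # should include the proxy's IP
driver.quit()  # release the browser process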

Comparing Proxy Methods

Method           | Scope              | Tools/Libraries   | Use Case
requests proxies | Per-request        | requests, urllib3 | Scraping, API calls
Env variables    | Global (session)   | Most libraries    | Consistent proxy usage
Selenium proxy   | Browser automation | selenium          | Web automation/scraping

Testing Your Proxy

Always test if your proxy’s working—otherwise, you might be the digital equivalent of bushwhacking in circles.

import requests

proxies = {
    'http': 'http://proxy_ip:proxy_port',
    'https': 'http://proxy_ip:proxy_port'
}

response = requests.get('https://httpbin.org/ip', proxies=proxies)
print("Proxy IP:", response.json())

If the IP matches the proxy, you’re golden.
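A proxy's exit IP is not always the address you connect to, so a sturdier check is to compare what httpbin reports with and without the proxy. The helper below is a sketch with hypothetical names. It assumes the 'origin' field from httpbin.org/ip may hold several comma-separated addresses, and the sample IPs come from the reserved documentation range.

```python
def split_origin(origin):
    """Split an 'origin' string into a set of individual IP addresses."""
    return {part.strip() for part in origin.split(',') if part.strip()}


def proxy_is_masking(direct_origin, proxied_origin):
    """Return True when none of your direct IPs show up in the proxied response."""
    direct = split_origin(direct_origin)
    proxied = split_origin(proxied_origin)
    return bool(proxied) and direct.isdisjoint(proxied)


# Made-up values standing in for two httpbin responses
print(proxy_is_masking('198.51.100.7', '203.0.113.10'))
print(proxy_is_masking('198.51.100.7', '198.51.100.7, 203.0.113.10'))
```

In a notebook, direct_origin would come from requests.get('https://httpbin.org/ip', timeout=10).json()['origin'] and proxied_origin from the same call with proxies=proxies. A False result on the second kind of response suggests the proxy is forwarding your real address.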


Common Pitfalls (and Quick Fixes)

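Issue                      | What it Looks Like                 | How to Fix
Connection timeout         | Requests hang, no response         | Set a timeout and try a different proxy
407 Proxy Auth Required    | Authentication error               | Add username:password@ to the proxy URL
403 Forbidden              | Proxy or site refuses the request  | Try a different proxy
Proxy not working in Colab | No change in IP, errors            | Check environment variables
SSL issues                 | SSL handshake failed               | Use an http:// proxy URL; set verify=False only for throwaway tests, since it disables certificate checks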


And there you go—no need to wrestle a croc to get your proxy game on point in Colab or Jupyter. If you need fresh proxies, remember ProxyRoller’s always open and doesn’t bite.

Arvid Warral

Lead Network Architect

Arvid Warral, a native of the sunburnt country, is the brain behind ProxyRoller's robust and agile network architecture. With over two decades of experience in digital security and proxy management, he has been instrumental in developing systems that cater to the dynamic needs of privacy and data security. Arvid's journey with ProxyRoller began as a passionate technologist, and his innovative solutions have since propelled the company to the forefront of the industry. In his role, he is committed to ensuring that ProxyRoller's offerings remain cutting-edge and reliable for users worldwide.
