6. Web Scraping Client

Below is a simple web scraping client that will extract a public company’s current stock price. The scraper is not meant to be sophisticated; it merely exists to highlight some of the amazing features of using Django with Celery & Redis.

If you’re interested in real-time stock monitoring, the web scraping client below is not for you. Plenty of companies offer real-time APIs for monitoring the stock market.

6.1. Stocks app

In “Create Django App”, we started our stocks app with the following:

python manage.py startapp stocks

Do this now if you haven’t already.

6.2. Additional Installations

In “Install Packages for the web scraping portion of this book”, we already installed these requirements. Install them now if you haven’t already.

pip install requests requests-html

6.3. stocks.scraper.py

Create scraper.py in the stocks app for our scraper client. As you can see below, there are a few service options that were working at the time of writing. The echo service is there only to confirm that the rest of our app works even if the real scrapers do not.

# stocks/scraper.py

import json
import random
import requests
from requests_html import HTML


SERVICES = {
    "business_insider": "https://markets.businessinsider.com/stocks/{ticker}-stock",
    "google_finance": "https://www.google.com/finance/quote/{ticker}:NASDAQ",
    "echo": "https://www.httpbin.org/anything/{ticker}",
}

class StockTickerScraper:
    '''
    Usage:

    StockTickerScraper(service='echo', ticker='AAPL').scrape()
    '''

    service = 'echo'
    url = None
    ticker = "AAPL"
    
    def __init__(self, service='echo', ticker="AAPL"):
        self.service = service
        self.url = SERVICES[service]
        self.ticker = ticker 
    
    def scrape_business_insider(self, url=None):
        '''
        Perform web scraping on markets.businessinsider.com/stocks
        Extract ticker's current price and name.
        '''
        if url is None:
            return None, None
        r = requests.get(url)
        html = HTML(html=r.text)
        # Take the last element matching each CSS class.
        name = html.find(".price-section__label")[-1].text
        price = html.find(".price-section__current-value")[-1].text
        return name, price
    
    def scrape_google_finance(self, url=None):
        '''
        Perform web scraping on google.com/finance
        Extract ticker's current price and name.
        '''
        if url is None:
            return None, None
        r = requests.get(url)
        html = HTML(html=r.text)
        # Take the first element matching each CSS class.
        name = html.find(".KY7mAb")[0].text
        price = html.find(".kf1m0")[0].text
        return name, price
    
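    def scrape_echo(self, url=None):
        '''
        Fallback method if the above two stop working.
        '''
        if url is None:
            return None, None
        # Generate a random price between 0.00 and 120.00, formatted as a string.
        random_price = "%.2f" % (random.randint(0, 12000) / 100.00)
        r = requests.post(url, json={"ticker": self.ticker, "price": random_price})
        # httpbin echoes the raw request body back under the "data" key.
        data = json.loads(r.json()['data'])
        return data.get('ticker'), data.get("price")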
        
    def scrape(self, ticker=None):
        # Fall back to the ticker set on the instance when none is passed in.
        to_scrape_ticker = ticker or self.ticker
        url = self.url.format(ticker=to_scrape_ticker)
        # Look up the matching scrape_<service> method by name.
        func = getattr(self, f"scrape_{self.service}")
        name, price = func(url)
        return name, price
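
To make the lookup inside scrape() concrete, here is a minimal offline sketch of its two steps: filling the {ticker} placeholder in the URL template and building the name of the method that getattr will fetch. The resolve helper is a hypothetical name used only for illustration; it is not part of scraper.py and makes no HTTP requests.

```python
# Same URL templates as in stocks/scraper.py.
SERVICES = {
    "business_insider": "https://markets.businessinsider.com/stocks/{ticker}-stock",
    "google_finance": "https://www.google.com/finance/quote/{ticker}:NASDAQ",
    "echo": "https://www.httpbin.org/anything/{ticker}",
}


def resolve(service, ticker):
    """Hypothetical helper: return the URL and method name scrape() would use."""
    # str.format fills the {ticker} placeholder in the template.
    url = SERVICES[service].format(ticker=ticker)
    # scrape() passes this same string to getattr to find the handler method.
    method_name = f"scrape_{service}"
    return url, method_name


if __name__ == "__main__":
    for service in SERVICES:
        print(resolve(service, "AAPL"))
```

Because the method name is derived from the service key, adding a new service would presumably mean adding both an entry to SERVICES and a matching scrape_<service> method on the class.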

6.3.1. Reference Companies & their Stock Symbols

A few reference companies are:

companies = [
    {'name': "Apple Inc.", "ticker_symbol": "AAPL"},
    {'name': "Alphabet C", "ticker_symbol": "GOOG"},
    {'name': "Amazon", "ticker_symbol": "AMZN"},
    {'name': "Microsoft Corp", "ticker_symbol": "MSFT"},
    {'name': "Tesla", "ticker_symbol": "TSLA"},
]
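
One way to use a list like this is to loop over it and scrape each ticker in turn. The sketch below is only an illustration under stated assumptions: scrape_all and fake_scrape are hypothetical names, and fake_scrape stands in for the real client so the example runs without network access.

```python
companies = [
    {"name": "Apple Inc.", "ticker_symbol": "AAPL"},
    {"name": "Microsoft Corp", "ticker_symbol": "MSFT"},
]


def scrape_all(companies, scrape_func):
    """Call scrape_func(ticker) for each company; map ticker -> price or None."""
    prices = {}
    for company in companies:
        ticker = company["ticker_symbol"]
        try:
            _name, price = scrape_func(ticker)
        except Exception:
            # One failing ticker (network error, changed HTML) should not stop the rest.
            price = None
        prices[ticker] = price
    return prices


def fake_scrape(ticker):
    # Hypothetical stand-in for StockTickerScraper(...).scrape(); no network access.
    if ticker == "MSFT":
        raise IndexError("selector not found")
    return ticker, "123.45"


if __name__ == "__main__":
    print(scrape_all(companies, fake_scrape))
```

With the real client, scrape_func could be something like lambda t: StockTickerScraper(service="echo", ticker=t).scrape(), assuming the class from stocks/scraper.py is imported.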