6. Web Scraping Client
Below is a simple web scraping client that will extract a public company’s current stock price. The scraper is not meant to be sophisticated; it merely exists to highlight some of the amazing features of using Django with Celery & Redis.
If you’re interested in real-time stock monitoring, this web scraping client is not for you. Plenty of companies offer real-time APIs for monitoring the stock market.
6.1. Stocks app
In Create Django App, we started our stocks app with the following:
python manage.py startapp stocks
Do this now if you haven’t already.
6.2. Additional Installations:
We already installed these requirements in Install Packages for the web scraping portion of this book. If you skipped that step, install them now:
pip install requests requests-html
6.3. stocks.scraper.py
Create scraper.py in the stocks app for our scraper client. As you can see below, there are a few service options that were working at the time of writing. The echo service is merely there to ensure the rest of our app works even if our scraper does not.
# stocks/scraper.py
import json
import random

import requests
from requests_html import HTML

# URL templates keyed by service name; {ticker} is filled in by scrape().
SERVICES = {
    "business_insider": "https://markets.businessinsider.com/stocks/{ticker}-stock",
    "google_finance": "https://www.google.com/finance/quote/{ticker}:NASDAQ",
    "echo": "https://www.httpbin.org/anything/{ticker}",
}


class StockTickerScraper:
    '''
    Usage:
        StockTickerScraper(service='echo', ticker='AAPL').scrape()
    '''
    service = 'echo'
    url = None
    ticker = "AAPL"

    def __init__(self, service='echo', ticker="AAPL"):
        self.service = service
        self.url = SERVICES[service]
        self.ticker = ticker

    def scrape_business_insider(self, url=None):
        '''
        Perform web scraping on markets.businessinsider.com/stocks
        Extract ticker's current price and name.
        '''
        if url is None:
            return None, None
        r = requests.get(url)
        html = HTML(html=r.text)
        name = html.find(".price-section__label")[-1].text
        price = html.find(".price-section__current-value")[-1].text
        return name, price

    def scrape_google_finance(self, url=None):
        '''
        Perform web scraping on google.com/finance
        Extract ticker's current price and name.
        '''
        if url is None:
            return None, None
        r = requests.get(url)
        html = HTML(html=r.text)
        name = html.find(".KY7mAb")[0].text
        price = html.find(".kf1m0")[0].text
        return name, price

    def scrape_echo(self, url=None):
        '''
        Fallback method if the above two stop working.
        '''
        # Generate a random price between 0.00 and 120.00.
        random_price = "%.2f" % (random.randint(0, 12000) / 100.00)
        r = requests.post(url, json={"ticker": self.ticker, "price": random_price})
        # httpbin echoes the posted body back as a JSON string under 'data'.
        data = json.loads(r.json()['data'])
        return data.get('ticker'), data.get("price")

    def scrape(self, ticker=None):
        # Fall back to the ticker given at construction time.
        to_scrape_ticker = ticker or self.ticker
        url = self.url.format(ticker=to_scrape_ticker)
        # Dispatch to the matching scrape_<service> method, e.g. scrape_echo.
        func = getattr(self, f"scrape_{self.service}")
        name, price = func(url)
        return name, price
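The scrape method above picks its handler by name: it fills the ticker into the service's URL template, then looks up scrape_<service> with getattr. Below is a minimal offline sketch of that dispatch pattern, not the scraper itself. DemoScraper, DEMO_SERVICES, and the fixed price are hypothetical names and values invented for illustration. No network request is made, and the canned dictionary only approximates what the echo service sends back. Unlike scrape_echo above, the demo handler reads the ticker from the URL so the result reflects whichever ticker was passed to scrape.

```python
# Offline sketch of the URL-template + getattr dispatch used by scrape().
# DemoScraper and DEMO_SERVICES are hypothetical; nothing here hits the network.
import json

DEMO_SERVICES = {
    "echo": "https://www.httpbin.org/anything/{ticker}",
}


class DemoScraper:
    def __init__(self, service="echo", ticker="AAPL"):
        self.service = service
        self.url = DEMO_SERVICES[service]  # raises KeyError for an unknown service
        self.ticker = ticker

    def scrape_echo(self, url):
        # Take the ticker from the last path segment of the formatted URL.
        ticker = url.rsplit("/", 1)[-1]
        # Stand-in for r.json(): the request body comes back as a string
        # under "data", so it has to be decoded a second time.
        body = json.dumps({"ticker": ticker, "price": "12.34"})
        fake_response = {"data": body, "url": url}
        data = json.loads(fake_response["data"])
        return data.get("ticker"), data.get("price")

    def scrape(self, ticker=None):
        url = self.url.format(ticker=ticker or self.ticker)
        # Look up the handler by name, e.g. "scrape_echo".
        handler = getattr(self, f"scrape_{self.service}")
        return handler(url)


name, price = DemoScraper(service="echo", ticker="MSFT").scrape()
print(name, price)
```

Adding a new service under this pattern takes two steps: a new entry in the template dictionary and a method whose name matches scrape_<key>.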
6.3.1. Reference Companies & their Stock Symbols
A few reference companies are:
companies = [
    {'name': "Apple Inc.", "ticker_symbol": "AAPL"},
    {'name': "Alphabet C", "ticker_symbol": "GOOG"},
    {'name': "Amazon", "ticker_symbol": "AMZN"},
    {'name': "Microsoft Corp", "ticker_symbol": "MSFT"},
    {'name': "Tesla", "ticker_symbol": "TSLA"},
]
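To see how these symbols plug into the scraper's URL templates, here is a small sketch that formats one URL per company. It repeats the google_finance template from stocks/scraper.py. The build_urls helper is hypothetical, and the sketch assumes every symbol trades on NASDAQ, since the template hard-codes that exchange.

```python
# Sketch: turn the reference companies into service URLs.
# build_urls is a hypothetical helper, not part of stocks/scraper.py.
GOOGLE_FINANCE = "https://www.google.com/finance/quote/{ticker}:NASDAQ"

companies = [
    {"name": "Apple Inc.", "ticker_symbol": "AAPL"},
    {"name": "Alphabet C", "ticker_symbol": "GOOG"},
    {"name": "Amazon", "ticker_symbol": "AMZN"},
    {"name": "Microsoft Corp", "ticker_symbol": "MSFT"},
    {"name": "Tesla", "ticker_symbol": "TSLA"},
]


def build_urls(companies, template=GOOGLE_FINANCE):
    """Map each ticker symbol to its formatted URL."""
    return {
        company["ticker_symbol"]: template.format(ticker=company["ticker_symbol"])
        for company in companies
    }


urls = build_urls(companies)
for symbol, url in urls.items():
    print(symbol, url)
```

A symbol listed on a different exchange would need its own template or an extra exchange field, because the :NASDAQ suffix is fixed here.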