Beginner

Fetching & Analysing Stock Data

Stock data is your raw material. Before you can build strategies or analyse risk, you need to know how to fetch, clean, and explore price data. This guide covers everything from downloading data to calculating key metrics.

Downloading OHLCV Data

OHLCV stands for Open, High, Low, Close, Volume: the five pieces of data recorded for every trading day. The yfinance library gives you all of these for free.

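    import yfinance as yf
    import pandas as pd

    # Download 2 years of data for Microsoft
    msft = yf.download("MSFT", start="2022-01-01", end="2024-01-01")

    # Newer yfinance releases may return two-level (field, ticker) columns
    # even for a single ticker; flatten them so msft["Close"] is a plain Series
    if isinstance(msft.columns, pd.MultiIndex):
        msft.columns = msft.columns.get_level_values(0)

    # See what we got
    print(msft.head())
    print(f"\nShape: {msft.shape[0]} rows x {msft.shape[1]} columns")
    print(f"Date range: {msft.index[0].date()} to {msft.index[-1].date()}")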
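Before calculating anything, it helps to check the download for gaps. The sketch below is only an illustration of that cleaning step: it uses a small made-up DataFrame (not real market data) in place of the yfinance result, and dropping incomplete rows is one possible policy, not the only one.

```python
import numpy as np
import pandas as pd

# Made-up data with two deliberate gaps (illustration only, not real prices)
prices = pd.DataFrame(
    {
        "Close": [100.0, np.nan, 102.0, 103.0],
        "Volume": [1000.0, 1100.0, np.nan, 1300.0],
    },
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"]),
)

# Count missing values per column
missing = prices.isna().sum()
print(missing)

# One simple policy: drop any row that has a missing value
cleaned = prices.dropna()
print(cleaned)
```

The same two calls, `isna().sum()` and `dropna()`, should work unchanged on a DataFrame such as `msft`.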

DataFrames = Spreadsheets in Code

A pandas DataFrame is essentially an Excel spreadsheet. Rows are dates, columns are Open/High/Low/Close/Volume. You can filter, sort, calculate, and plot, all in a few lines of code. If you can use a spreadsheet, you can use pandas.
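To make the spreadsheet comparison concrete, here is a minimal sketch of filtering, sorting, and calculating. The prices are made up for illustration, and the variable names (`prices`, `up_days`, `by_volume`) are just placeholders.

```python
import pandas as pd

# Made-up prices for illustration only (not real market data)
prices = pd.DataFrame(
    {
        "Open": [100.0, 102.0, 101.0, 105.0],
        "Close": [102.0, 101.0, 105.0, 104.0],
        "Volume": [1000, 1500, 1200, 2000],
    },
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"]),
)

# Filter: keep only the days that closed above the open
up_days = prices[prices["Close"] > prices["Open"]]

# Sort: highest-volume day first
by_volume = prices.sort_values("Volume", ascending=False)

# Calculate: add a new column, much like a spreadsheet formula
prices["Change"] = prices["Close"] - prices["Open"]

print(up_days)
print(by_volume.head(1))
print(prices["Change"])
```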

Daily Returns

Returns tell you how much the price moved as a percentage. This is more useful than raw prices because it normalises everything: a 1% move on a $200 stock is comparable to a 1% move on a $50 stock.

    import numpy as np

    # Simple daily return: (today - yesterday) / yesterday
    msft["Daily_Return"] = msft["Close"].pct_change()

    # Log return (preferred in quant finance):
    # log returns are additive over time, which makes the maths easier
    msft["Log_Return"] = np.log(msft["Close"] / msft["Close"].shift(1))

    print(msft[["Close", "Daily_Return", "Log_Return"]].tail(10))
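The claim that log returns are additive can be checked on a tiny example. This sketch uses made-up closing prices. It shows that simple returns have to be compounded to recover the total return, while log returns can simply be summed.

```python
import numpy as np
import pandas as pd

# Made-up closing prices for illustration only
close = pd.Series([100.0, 110.0, 99.0, 108.9])

simple = close.pct_change().dropna()
log_ret = np.log(close / close.shift(1)).dropna()

# Total return over the whole period, straight from first and last price
total_simple = close.iloc[-1] / close.iloc[0] - 1

# Simple returns must be compounded (multiplied together)
compounded = (1 + simple).prod() - 1

# Log returns can just be added up, then converted back with exp
summed_log = log_ret.sum()
from_logs = np.exp(summed_log) - 1

print(f"Total return from prices:    {total_simple:.4f}")
print(f"Compounded simple returns:   {compounded:.4f}")
print(f"exp(sum of log returns) - 1: {from_logs:.4f}")
print(f"Naive sum of simple returns: {simple.sum():.4f}")
```

The first three figures should agree. The naive sum of simple returns does not match them, which is why simple returns are compounded instead of added.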

Cumulative Returns

Cumulative returns show the total growth of an investment over time. "If I invested $10,000 at the start, what would it be worth now?"

    # Cumulative return: compound all daily returns
    msft["Cumulative_Return"] = (1 + msft["Daily_Return"]).cumprod() - 1

    # If you started with $10,000
    initial_investment = 10000
    msft["Portfolio_Value"] = initial_investment * (1 + msft["Cumulative_Return"])

    print(f"Starting value:  ${initial_investment:,.2f}")
    print(f"Ending value:    ${msft['Portfolio_Value'].iloc[-1]:,.2f}")
    print(f"Total return:    {msft['Cumulative_Return'].iloc[-1]:.2%}")
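One detail worth knowing: `pct_change()` leaves the first row as NaN, so the cumulative series also starts with NaN. The sketch below uses made-up prices. It shows one optional way to handle that, treating the first day as a 0% return so the portfolio starts exactly at the initial investment.

```python
import pandas as pd

# Made-up closing prices for illustration only
close = pd.Series([50.0, 55.0, 52.25, 57.475])

daily = close.pct_change()              # first value is NaN
cumulative = (1 + daily).cumprod() - 1  # first value stays NaN

# Optional: treat day one as a 0% return so the series starts at zero growth
cumulative_filled = (1 + daily.fillna(0)).cumprod() - 1

initial_investment = 10000
portfolio = initial_investment * (1 + cumulative_filled)

print(cumulative)
print(portfolio)
```

Either way, the final cumulative return equals the last price divided by the first price, minus 1.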

Moving Averages

Moving averages smooth out daily noise and reveal trends. The 50-day and 200-day moving averages are among the most widely watched trend indicators in finance.

    import matplotlib.pyplot as plt

    # Calculate moving averages
    msft["MA_50"] = msft["Close"].rolling(window=50).mean()
    msft["MA_200"] = msft["Close"].rolling(window=200).mean()

    # Plot price with moving averages
    plt.figure(figsize=(12, 6))
    plt.plot(msft.index, msft["Close"], label="Close", alpha=0.7)
    plt.plot(msft.index, msft["MA_50"], label="50-day MA", linewidth=2)
    plt.plot(msft.index, msft["MA_200"], label="200-day MA", linewidth=2)
    plt.title("Microsoft (MSFT) with Moving Averages")
    plt.xlabel("Date")
    plt.ylabel("Price ($)")
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
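A rolling mean has no value until a full window of data is available, which is why the 200-day line starts well after the price line on the chart. This small sketch uses made-up numbers and a 3-day window so the effect is easy to see. The `min_periods=1` variant is shown only as an optional alternative.

```python
import pandas as pd

# Made-up closing prices for illustration only
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])

# Default: NaN until 3 values are available
ma_3 = close.rolling(window=3).mean()

# Optional: average over however many values exist so far
ma_3_partial = close.rolling(window=3, min_periods=1).mean()

print(pd.DataFrame({"Close": close, "MA_3": ma_3, "MA_3_partial": ma_3_partial}))
```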

Comparing Multiple Stocks

To compare stocks fairly, normalise them to start at 100 (or 1). That way you're comparing percentage growth, not raw dollar values.

    # Download multiple stocks
    tickers = ["AAPL", "MSFT", "GOOGL", "AMZN"]
    data = yf.download(tickers, start="2023-01-01", end="2024-01-01")["Close"]

    # Normalise to 100 at start
    normalised = data / data.iloc[0] * 100

    # Plot comparison
    plt.figure(figsize=(12, 6))
    for ticker in tickers:
        plt.plot(normalised.index, normalised[ticker], label=ticker, linewidth=1.5)
    plt.title("Tech Stock Comparison (Normalised to 100)")
    plt.xlabel("Date")
    plt.ylabel("Normalised Price")
    plt.legend()
    plt.grid(True, alpha=0.3)
    # Reference line at the starting level (gray so it shows on a light background)
    plt.axhline(y=100, color="gray", linestyle="--", alpha=0.3)
    plt.tight_layout()
    plt.show()

    # Print total returns
    for ticker in tickers:
        total_return = (data[ticker].iloc[-1] / data[ticker].iloc[0] - 1) * 100
        print(f"{ticker}: {total_return:+.1f}%")
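The normalisation step is easiest to follow on a tiny table. In this sketch the tickers `AAA` and `BBB` and their prices are hypothetical. Dividing every column by its first row puts both stocks on the same starting level of 100.

```python
import pandas as pd

# Hypothetical tickers and made-up prices for illustration only
data = pd.DataFrame(
    {
        "AAA": [200.0, 210.0, 220.0],
        "BBB": [50.0, 55.0, 60.0],
    }
)

# Each column is divided by its own first value, then scaled to 100
normalised = data / data.iloc[0] * 100

# Total return per ticker, in percent
total_return = (data.iloc[-1] / data.iloc[0] - 1) * 100

print(normalised)
print(total_return)
```

Although `AAA` gained more dollars per share, `BBB` shows the larger percentage gain, which is the point of normalising.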