Book a Demo Start Instantly
streamlit_banner

In today’s data-driven world, analyzing large datasets quickly and easily is essential. With powerful tools like TiDB Cloud and Streamlit, developers can build custom, interactive user interfaces that make it easier to analyze and visualize data. 

TiDB Cloud is the fully-managed service of TiDB, an advanced, open-source, distributed SQL database that provides real-time access to transactional data with low latency and horizontal scalability. Combined with Streamlit, TiDB Cloud can provide real-time access to financial data and enable users to analyze and visualize the data in real-time. 

Streamlit, on the other hand, is an open-source Python framework that simplifies the process of building web applications. It provides an intuitive API that enables developers to create powerful data applications with minimal effort. 

In this post, we’ll show you how to use TiDB Cloud and Streamlit to build an interactive Stock Data web application. We’ll walk you through the code to explain some of the building blocks of the application. You’ll learn how to connect to TiDB Cloud, retrieve data from the Yahoo Finance API, and use Plotly to visualize the data. By the end of this tutorial, you’ll have a better understanding of how to leverage these powerful tools to build similar applications that can streamline your data analysis and visualization workflows.

Prerequisites

Before you begin, you will need to set up the following:

Initial setup 

Setting up the Stock Data Project

1. Clone the integration example code repository for TiDB Cloud and Streamlit. From your terminal or command prompt (Windows user), run the following command:

git clone https://github.com/tidbcloud/streamlit-stockdata-analysis.git 

2. Navigate to the project’s root directory “streamlit-stockdata-analysis”. Enter:

cd streamlit-stockdata-analysis

3. Install the project dependencies listed in requirements.txt. In the terminal window, enter:

python3 -m pip install -r requirements.txt

Storing Database Secrets

We will now create a secrets file to store the connection information for the TiDB Cloud cluster. The Streamlit app will reference this file to establish a connection with the TiDB database. For more information on connecting a Streamlit app with TiDB, please refer to the Streamlit tutorial, Connect Streamlit to TiDB.

  1. In the project’s root directory, create a file named .streamlit/secrets.toml.
  2. Add the following content to the file, replacing the placeholder values with your TiDB Cloud Cluster information:
#.streamlit/secrets.toml
[tidb]
host = "<TiDB_cluster_host>"
port = 4000
database = "test"
user = "<TiDB_cluster_user>"
password = "<TiDB_cluster_password>"
ssl_ca = "<path_to_CA_store>"

The Certificate Authority (CA) store path depends on your operating system. You can find this path in the Connection tab of the TiDB Cloud cluster. For example, on macOS, the path would be /etc/ssl/cert.pem.

For more information on connecting a Streamlit app with TiDB, refer to the Streamlit documentation.

Creating a TiDB Table for Storing Stock Data

To store daily historical stock data for a ticker symbol in TiDB, follow these steps:

  1. Log in to the TiDB Console and navigate to the cluster overview page. 
  2. On the left navigation pane, select the Chat2Query interface, and execute the SQL script shown below. The script creates a `stock_price_history` table inside the test database. This table stores information such as the volume of stock traded, dividends gained, and opening and closing prices.
SQL
CREATE TABLE test.stock_price_history
(
  id int PRIMARY KEY AUTO_INCREMENT,
  Opening_Price DOUBLE,
  High DOUBLE,
  Low DOUBLE,
  Closing_Price DOUBLE,
  Volume BIGINT,
  Dividends DOUBLE,
  Market_Date DATE,
  Ticker VARCHAR(10)
);

The stock data project in action

Now that you have completed the initial setup, it’s time to put the project into action.  For this walkthrough, we will gather and analyze 10 years of stock historical data for Microsoft (MSFT) and Google (GOOGL). If you’d rather track other stocks, substitute the ticker symbols you’d like. The application retrieves data from the Yahoo Finance API, saves it in TiDB Cloud, and visualizes it using Plotly’s 3D charts.  

  1. Run the project on your local machine. From the terminal or command prompt (Windows user), enter streamlit run stocks.py.
  2. Launch the application in your web browser at http://localhost:8502.
  3. In the default menu, select Collect Trade Data. Enter the ticker symbol for the stock you want to analyze, in this case, MSFT.
  4. Leave the start and end dates to their default values, and click Get Data to retrieve 10 years of historical data for the selected stock. 
  5. Click Save Data to store the retrieved stock data in TiDB Cloud.
  6. Repeat steps 3-5 to get data for Google, with the ticker symbol GOOGL.
  7. To analyze the data, click Visualize Trade Data, enter the corresponding ticker symbols, keep the default date values, and click Visualize.

This action will execute an analytical query on TiDB and display the results in a 3D chart that compares the total dividend paid and volume traded over the years for Google and Microsoft stocks.

Code walkthrough

In this section, we’ll examine important elements of the Stock Data Project’s code. We’ll explore the application’s components and how they interact to collect, analyze, and visualize historical stock data. We’ll also discuss the various packages used and how they contribute to the application’s overall functionality.

Packages used

import streamlit as st
import yfinance as yf
import pandas as pd
import plotly.express as px
import pymysql
PackageDescription
streamlitBuilds web applications for data science and machine learning projects
yfinanceRetrieve and analyze historical stock price data from Yahoo Finance
pandasData manipulation and analysis
plotly.expressCreates interactive visualizations
pymysqlInteracts with TiDB 

Together, these packages and TiDB provide a powerful toolkit to build interactive web pages that analyze, visualize, and store data.

Accessing TiDB 

The functions that interact with TiDB provide the foundation for the analysis and visualization capabilities of the project. 

Connecting to the database

The init_connection() function connects to the TiDB database using credentials stored in Streamlit Secrets Manager. The function returns a connection object that is used to fetch and store data.

def init_connection():
    config = st.secrets["tidb"]
    return pymysql.connect(
        host=config["host"],
        port=config["port"],
        user=config["user"],
        password=config["password"],
        database=config["database"],
        ssl_verify_cert=True,
        ssl_verify_identity=True,
        ssl_ca= config["ssl_ca"]
    ) 

conn = init_connection()

Saving data

The get_ticker_data() function retrieves historical stock data for a given ticker symbol, start date, and end date. Using the yfinance library of Yahoo Finance, the function returns a pandas.DataFrame that contains the stock’s open, high, low, close, volume, and dividends:

def get_ticker_data(symbol, start_date, end_date):
    ticker = yf.Ticker(symbol)
    data = ticker.history(start=start_date, end=end_date)
    return data

The save_data() function inserts the stock data into the TiDB database using the bulk insert method executemany of the pymysql library. The function takes a pandas.DataFrame and a ticker symbol as input. The function then formats the data and inserts it into the database.

def save_data(data, symbol):
    data["Date"] = data.index
    data["Ticker"] = symbol
    data.reset_index(drop=True, inplace=True)

    df = data.loc[:, ['Open', 'High', 'Low', 'Close', 'Volume', 'Dividends', 'Date', 'Ticker']]

    cur = conn.cursor()

    data = [tuple(row) for row in df.itertuples(index=False)]

    query = 'INSERT INTO stock_price_history (Opening_Price, High, Low, Closing_Price,Volume, Dividends , Market_Date, Ticker) VALUES (  %s,  %s,  %s,  %s, %s,  %s, %s, %s);'
    cur.executemany(query, data)

    rows_upserted = cur.rowcount

    # commit the changes
    conn.commit()
    cur.close()
    conn.close()
    st.success( str(rows_upserted) +  " data saved successfully!")

Retrieve data

The fetch_data() function leverages TiDB’s Hybrid Transactional/Analytical Processing (HTAP) capabilities to execute SQL queries and retrieve aggregated stock data information.  

The function accepts two stock ticker symbols and a date range and returns a pandas.DataFrame containing the ticker symbols, year, total dividends, and the average volume traded.

def fetch_data(symbol1, symbol2, start_date, end_date):
    cursor = conn.cursor()
    query = f"""
    SELECT Ticker, YEAR(Market_Date) AS Year, ROUND(SUM(Dividends), 2) AS Total_Dividends, CAST(ROUND(AVG(Volume), 2) AS DOUBLE) AS Avg_Volume
    FROM stock_price_history
    WHERE Ticker IN ('{symbol1}', '{symbol2}') AND Market_Date BETWEEN '{start_date}' AND '{end_date}'
    GROUP BY Ticker, YEAR(Market_Date)
    ORDER BY Ticker, YEAR(Market_Date) ASC;
    """
    cursor.execute(query)
    data = cursor.fetchall()
    cols = ['Ticker', 'Year', 'Total_Dividends', 'Avg_Volume']
    df = pd.DataFrame(data, columns=cols)
    return df

Web application

The web application lets you collect and visualize historical stock trade data. 

Collecting trade data 

The Collect Trade Data page lets users enter a stock ticker symbol, a start date, and an end date. When users click Get Data, the app retrieves historical trade data for the specified stock within the given date range and displays it in a table. The data is stored in the session state until the user saves it or navigates away from the page.

# Define the inputs
symbol = st.text_input("Ticker Symbol").upper()
start_date = st.date_input("Start Date", value=pd.to_datetime("today").floor("D") - pd.offsets.DateOffset(years=10))
end_date = st.date_input("End Date", value=pd.to_datetime("today").floor("D"))

# Define the button to retrieve the data
if st.button("Get Data"):
    data = get_ticker_data(symbol, start_date, end_date)
    st.write(data)
# store data in session
    st.session_state.data = data

Clicking the Save Data button will add the data to TiDB and clear the session.

# Define the button to save the data to TiDB
if st.button("Save Data"):
    if st.session_state.get("data") is None:
        st.write("No data to save.")
        return
    data = st.session_state.data
    save_data(data, symbol)
    del st.session_state['data']

Visualizing trade data 

The Visualize Trade Data page lets users enter two stock ticker symbols, a start date, and an end date. When users click Visualize, the app fetches historical trade data for the specified stocks within the given date range and plots a 3D line chart of the dividend paid versus the volume traded for each stock.

# Create two text boxes for entering the stock symbols
symbol1 = st.text_input("Enter Ticker Symbol").upper()
symbol2 = st.text_input("Enter Ticker Symbol to Compare").upper()

# Create two date pickers for selecting the start and end dates
start_date = st.date_input("Start Date", value=pd.to_datetime("today").floor("D") - pd.offsets.DateOffset(years=10))
end_date = st.date_input("End Date", value=pd.to_datetime("today").floor("D"))

if st.button("Visualize"):
    # Fetch data from the database
    data = fetch_data(symbol1, symbol2, start_date, end_date)

    if data.empty:
        st.warning('No data found for the selected criteria. Please adjust the inputs.')
    else:
        # Display the chart
        plot_3d_line(data)

Plot 3D Chart

The plot_3d_line function uses the Plotly library to create the 3D line chart. This library offers a variety of interactive visualization tools that can make data more meaningful. By visualizing the relationship between the dividend paid, volume traded, and year for each stock ticker, users can gain insights into the stocks’ performance over time and make more informed investment decisions.

def plot_3d_line(data):
    fig = px.line_3d(data, x='Year', y='Total_Dividends', z='Avg_Volume', color='Ticker')
    fig.update_layout(title=f"Dividend Paid Vs Volume Traded", height=600, width=800, scene=dict(xaxis_title="Year", yaxis_title="Dividends Paid", zaxis_title="Volume Traded"))
    st.plotly_chart(fig)

Conclusion

Building a stock data analysis app using Python and Streamlit can be a powerful tool for visualizing and analyzing financial data. In addition, using TiDB as the backend database can provide horizontal scalability and high availability for the application. 

By following the steps outlined in this blog post, you can create an end-to-end trading data analysis app that collects data, saves it to TiDB Cloud, and visualizes it in a way that can provide insights into trading trends and patterns. If you have any questions or feedback, please feel free to contact us through Twitter, LinkedIn, or our Slack Channel.


Book a Demo


Have questions? Let us know how we can help.

Contact Us
TiDB Dedicated

TiDB Dedicated

A fully-managed cloud DBaaS for predictable workloads

TiDB Dedicated

TiDB Serverless

A fully-managed cloud DBaaS for auto-scaling workloads