Stock Market Data for Tesla between \(2016\) and \(2022\)
The dataset we use for the following statistical analysis is stock market data for Elon Musk’s publicly traded companies Tesla and Twitter from \(01/01/2018\) to \(05/20/2022\). We obtained this data from the Yahoo Finance API using the quantmod package in R. The data contains the daily percentage returns for the Tesla and Twitter stocks, with \(2208\) observations on the following \(13\) variables.
variable | type | description |
---|---|---|
symbol | character | The ticker symbol uniquely identifying a stock |
date | datetime | The trade day of the recorded observation |
open | float | Opening value of the stock that day |
close | float | Closing value of the stock that day |
high | float | Highest price of the stock on a given trade day |
low | float | Lowest price of the stock on a given trade day |
volume | integer | Number of daily shares traded in billions |
direction | factor | Factor indicating whether the market had a positive or negative return |
return | decimal | Percentage return for that day |
lag1 | decimal | Percentage return for previous day |
lag2 | decimal | Percentage return for 2 days previous |
lag3 | decimal | Percentage return for 3 days previous |
lag4 | decimal | Percentage return for 4 days previous |
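
As a rough illustration of how the derived columns in this table could be built, here is a minimal base R sketch. It assumes that `return` is the close-to-close percentage change and that a zero return counts as "Down"; neither definition is stated above. The helper name `add_returns_and_lags` and the toy prices are hypothetical and are not part of the original analysis.

```r
# Toy closing prices for illustration only; not real Tesla data.
prices <- data.frame(
  symbol = "TSLA",
  date   = as.Date("2022-01-03") + 0:6,
  close  = c(100, 102, 101, 105, 105, 103, 106)
)

# Hypothetical helper: adds return, direction, and lag1..lagN columns.
add_returns_and_lags <- function(df, n_lags = 4) {
  n <- nrow(df)
  # Assumed definition: percentage change from the previous day's close.
  df$return <- c(NA, 100 * diff(df$close) / head(df$close, -1))
  # Assumption: a strictly positive return is "Up", anything else is "Down".
  df$direction <- factor(ifelse(df$return > 0, "Up", "Down"),
                         levels = c("Down", "Up"))
  for (k in seq_len(n_lags)) {
    # Shift the return series down by k rows: the return from k days earlier.
    df[[paste0("lag", k)]] <- c(rep(NA, k), df$return)[seq_len(n)]
  }
  df
}

stocks <- add_returns_and_lags(prices)
print(stocks)
```

With real data, the `close` column might come from a call such as `quantmod::getSymbols("TSLA", src = "yahoo", auto.assign = FALSE)`, although the exact call used in this analysis is not shown here.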
Twitter Data for Elon Musk between \(2016\) and \(2022\)
This dataset contains Elon Musk’s most recent tweets from 2015 to 2022, stored in CSV format, where each row represents a separate tweet object. All tweets were collected, parsed, and plotted using the Twitter API and the rtweet package in R. In total, the dataset contains more than ten thousand tweets, including retweets, replies, and quotes. All objects go into a single database.
Here, we use the pairs function to create a scatterplot matrix for every pair of variables in the stock dataset, as shown below.
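library(dplyr)  # provides the %>% pipe
df.pairs <- df %>% dplyr::select(-alltext)  # drop the alltext column
pairs(stocks.data)  # scatterplot matrix of the stock variables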
Based on the correlation coefficients and their corresponding p-values, there is indeed an association between the daily return rate and the predictors volume, lag2, nfav, nretweet, and nreply.
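
One way such correlation coefficients and p-values could be tabulated is with `cor.test` from base R, applied to each predictor against the return. The sketch below runs on synthetic stand-in data rather than the real stock and tweet data, and it assumes Pearson correlation, since the test used above is not specified. The data frame `toy` and its values are hypothetical.

```r
set.seed(1)
n <- 200

# Synthetic stand-in data; these are NOT the real stock or tweet variables.
toy <- data.frame(
  volume = rnorm(n),
  lag2   = rnorm(n),
  nfav   = rnorm(n)
)
# Build a return that depends on volume so at least one association exists.
toy$return <- 0.5 * toy$volume + rnorm(n)

predictors <- setdiff(names(toy), "return")

# One row per predictor: correlation estimate and its p-value.
cor_table <- do.call(rbind, lapply(predictors, function(p) {
  ct <- cor.test(toy[[p]], toy$return)  # Pearson correlation by default
  data.frame(predictor = p,
             r         = unname(ct$estimate),
             p_value   = ct$p.value)
}))

print(cor_table)
```

A small p-value here only indicates evidence of a linear association; it says nothing about its direction of causation, and testing several predictors at once may call for a multiple-comparison adjustment such as `p.adjust`.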