Chapter 3 Data transformation

3.1 Select Data

The original data includes the records which is more than one year, so we firstly choose the data between the end of April last year to the beginning of May 2022.

3.2 Combine the Data

we combine the price of each cryptocurrency into one table, df.close:

Date BTC BNB ETH USDC USDT Price
record
  • Date: date of the crypto record
  • BTC: price of Bitcoin
  • BNB: price of Binance Coin
  • ETH: price of Ethereum
  • USDC: price of USD Coin
  • USDT: price of USD Tether

(all price are in USD)

This table can help us quickly find price of all 5 cryptos we selected given date information.

Then we create a comprehensive tidy data table include all financial information, df.crypto:

date cryptocurrency price volume market_cap
record

3.3 Calculate the Return and Volatility

Firstly, we calculate the return of each date by the formula: \[\frac{final\: pric e - initial\: price}{initial\: price}\]
Then we calculate the 15 days volatility of each crypto by calculating the stand deviation of 15 days return, and multiply square root of 365.

\[sd(15 \: days \: return) \cdot \sqrt{365}\]

(For stocks, the volatility is stand deviation multiply square root of 250, but the cryptocurrency can be traded every day, so we choose square root of 365)

3.4 Combine All the Value into One Table

Finally, we add the volatility, return, into df.crpyto, plus month and year column seperated from date in case we want to group data. The fianl format of our table df.crypto:

##   cryptocurrency    price       Date   market_cap        return volatility
## 1            BTC 54810.61 2021-04-29 1.024626e+12            NA         NA
## 2            BTC 53596.70 2021-04-30 1.001974e+12 -0.0221473496         NA
## 3            BTC 57828.51 2021-05-01 1.082722e+12  0.0789564931         NA
## 4            BTC 57812.97 2021-05-02 1.080898e+12 -0.0002687715         NA
## 5            BTC 56600.75 2021-05-03 1.057850e+12 -0.0209680266         NA
## 6            BTC 57200.30 2021-05-04 1.069571e+12  0.0105927052         NA
##        volume year month
## 1 48072930663 2021    04
## 2 46958036373 2021    04
## 3 53879045974 2021    05
## 4 43967070046 2021    05
## 5 39072664394 2021    05
## 6 54132470274 2021    05