Chapter 3 Data transformation
3.1 Select Data
The original data includes the records which is more than one year, so we firstly choose the data between the end of April last year to the beginning of May 2022.
3.2 Combine the Data
we combine the price of each cryptocurrency into one table, df.close:
| Date | BTC | BNB | ETH | USDC | USDT | Price | 
|---|---|---|---|---|---|---|
| record | 
- Date: date of the crypto record
 - BTC: price of Bitcoin
 - BNB: price of Binance Coin
 - ETH: price of Ethereum
 - USDC: price of USD Coin
 - USDT: price of USD Tether
 
(all price are in USD)
This table can help us quickly find price of all 5 cryptos we selected given date information.
Then we create a comprehensive tidy data table include all financial information, df.crypto:
| date | cryptocurrency | price | volume | market_cap | 
|---|---|---|---|---|
| record | 
3.3 Calculate the Return and Volatility
Firstly, we calculate the return of each date by the formula:
\[\frac{final\: pric e - initial\: price}{initial\: price}\]
Then we calculate the 15 days volatility of each crypto by calculating the stand deviation of 15 days return, and multiply square root of 365.
\[sd(15 \: days \: return) \cdot \sqrt{365}\]
(For stocks, the volatility is stand deviation multiply square root of 250, but the cryptocurrency can be traded every day, so we choose square root of 365)
3.4 Combine All the Value into One Table
Finally, we add the volatility, return, into df.crpyto, plus month and year column seperated from date in case we want to group data. The fianl format of our table df.crypto:
##   cryptocurrency    price       Date   market_cap        return volatility
## 1            BTC 54810.61 2021-04-29 1.024626e+12            NA         NA
## 2            BTC 53596.70 2021-04-30 1.001974e+12 -0.0221473496         NA
## 3            BTC 57828.51 2021-05-01 1.082722e+12  0.0789564931         NA
## 4            BTC 57812.97 2021-05-02 1.080898e+12 -0.0002687715         NA
## 5            BTC 56600.75 2021-05-03 1.057850e+12 -0.0209680266         NA
## 6            BTC 57200.30 2021-05-04 1.069571e+12  0.0105927052         NA
##        volume year month
## 1 48072930663 2021    04
## 2 46958036373 2021    04
## 3 53879045974 2021    05
## 4 43967070046 2021    05
## 5 39072664394 2021    05
## 6 54132470274 2021    05