Chapter 2 Data sources

2.1 Contribution

Tianyu is responsible for determining the cryptocurrencies we focus on. Juntian downloads the corresponding cryptocurrency datasets and determines the scale of the data.

2.2 Data Collection

We used the CoinMarketCap API to verify the top 5 cryptocurrencies by market cap, and then pulled each cryptocurrency's dataset, including historical trading information, from CoinGecko.
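As a rough illustration of the second step, the sketch below reshapes a CoinGecko-style historical response into rows with the columns described later in this chapter. It assumes the response contains the keys `prices`, `market_caps`, and `total_volumes`, each a list of [millisecond timestamp, value] pairs. The helper name `to_rows` and the sample numbers are hypothetical and are not taken from our datasets.

```python
from datetime import datetime, timezone


def to_rows(payload):
    """Reshape a market_chart-style payload into one dict per record.

    Assumes three parallel lists of [timestamp_ms, value] pairs.
    """
    rows = []
    for (ts, price), (_, cap), (_, vol) in zip(
        payload["prices"], payload["market_caps"], payload["total_volumes"]
    ):
        # Timestamps are assumed to be milliseconds since the Unix epoch (UTC).
        day = datetime.fromtimestamp(ts / 1000, tz=timezone.utc).date()
        rows.append(
            {
                "date": day.isoformat(),
                "price": price,
                "market_cap": cap,
                "total_volume": vol,
            }
        )
    return rows


# Made-up sample payload, for illustration only.
sample = {
    "prices": [[1609459200000, 100.0], [1609545600000, 110.0]],
    "market_caps": [[1609459200000, 1000.0], [1609545600000, 1100.0]],
    "total_volumes": [[1609459200000, 50.0], [1609545600000, 55.0]],
}
rows = to_rows(sample)
print(rows[0])
```

Keeping the reshaping separate from the download makes it possible to check the column layout without a network connection.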

We then explored multiple financial data sources and found Yahoo Finance to be the most practical and flexible website for downloading the datasets we wanted. It not only offers detailed information about the cryptocurrencies we are interested in, but also provides options to download datasets over a self-defined time frame.

2.3 Dataset Information

We downloaded 5 datasets, each corresponding to one of the major cryptocurrencies identified above. Each dataset includes 7 columns and up to 3292 rows.

2.3.1 Format

The format of each dataset:

Columns (one record per row): date, price, market_cap, total_volume

2.3.2 Column Details

  • date: date of the crypto record
  • price: trading price (USD)
  • market_cap: total market capitalization
  • total_volume: total trading volume (the amount of the cryptocurrency traded)
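A minimal sketch of how a file with these columns could be loaded and checked is shown below. It assumes a comma-separated file whose header uses exactly the column names listed above. The function name `load_records` and the two sample rows are hypothetical and only illustrate the layout.

```python
import csv
import io

EXPECTED_COLUMNS = ["date", "price", "market_cap", "total_volume"]


def load_records(text):
    """Parse CSV text and convert the numeric columns to float."""
    reader = csv.DictReader(io.StringIO(text))
    # Fail early if the header does not contain the columns we rely on.
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    records = []
    for row in reader:
        records.append(
            {
                "date": row["date"],
                "price": float(row["price"]),
                "market_cap": float(row["market_cap"]),
                "total_volume": float(row["total_volume"]),
            }
        )
    return records


# Made-up two-row file, for illustration only.
sample_csv = (
    "date,price,market_cap,total_volume\n"
    "2021-01-01,100.0,1000.0,50.0\n"
    "2021-01-02,110.0,1100.0,55.0\n"
)
records = load_records(sample_csv)
print(len(records), records[0]["price"])
```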

2.4 Issue

As the histogram above shows, the market cap of Bitcoin is far larger than that of any other cryptocurrency. This is a challenge because the difference in scale would distort direct comparisons between cryptocurrencies.
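One common way to compare series of very different magnitudes is to rebase each series to a shared starting value or to take logarithms. The sketch below illustrates both options. It is not a method we have adopted in this chapter, and the function names `rebase` and `log10_series` as well as the sample values are hypothetical.

```python
import math


def rebase(values, base=100.0):
    """Scale a series so that its first value equals `base`."""
    first = values[0]
    return [base * v / first for v in values]


def log10_series(values):
    """Take base-10 logarithms so large and small series share one axis."""
    return [math.log10(v) for v in values]


# Made-up market caps for a large and a small asset, for illustration only.
large = [1000.0, 1100.0, 1200.0]
small = [10.0, 11.0, 12.0]

print(rebase(large))
print(rebase(small))
print(log10_series(large))
```

After rebasing, both series describe relative growth from the same starting point, so an asset with a much larger market cap no longer dominates the comparison.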