Section 3 Data

Data is publicly available at https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page

A data dictionary explaining the columns and categories is available here: https://github.com/Mkassem16/NycTaxiPBI/blob/main/data/data_dictionary_trip_records_yellow.pdf

To build a demonstration dashboard, I extracted a sample of roughly 10,000 rows of 2020 data. The sample dataset is hosted publicly on Google Cloud Storage and can be queried directly from Power BI: https://storage.googleapis.com/powerbi_datacamp/nyc_taxi_db.csv
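
The same file can also be read into R. The snippet below is a minimal sketch, assuming the readr package is installed and the hosted CSV is still reachable; `load_trips` and `sample_url` are hypothetical names introduced here for illustration and are not part of the original project.

```r
library(readr)

# URL copied from the text above; its continued availability is an assumption.
sample_url <- "https://storage.googleapis.com/powerbi_datacamp/nyc_taxi_db.csv"

# Hypothetical helper: read the trip sample from a URL or a local file path.
# read_csv() guesses column types, so date-time strings in a standard
# format should come back as date-time columns.
load_trips <- function(path) {
  read_csv(path, show_col_types = FALSE)
}
```

Calling `df <- load_trips(sample_url)` should then give a data frame that can be inspected with `glimpse(df)`.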

glimpse(df)
## Rows: 10,018
## Columns: 17
## $ vendor_id           <dbl> 1, 1, 2, 1, 2, 1, 4, 2, 1, 2, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 2, 2,...
## $ pickup_datetime     <dttm> 2021-03-27 12:47:16, 2021-06-10 19:02:02, 2021-10-23 20:35:45, 2021-09-11 20:13:27, 2021-06-20 22...
## $ dropoff_datetime    <dttm> 2021-03-27 13:39:54, 2021-06-10 19:31:53, 2021-10-23 21:05:05, 2021-09-11 20:36:56, 2021-06-20 22...
## $ passenger_count     <dbl> 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 3, 2, 1, 1, 1, 2, 2, 1, 1, 2, 6, 1, 1, 1,...
## $ trip_distance       <dbl> 2.70, 15.10, 7.92, 6.50, 6.44, 10.00, 7.24, 36.97, 9.60, 2.31, 8.00, 9.62, 19.22, 6.73, 10.08, 10....
## $ rate_code           <dbl> 1, 1, 1, 1, 1, 1, 1, 5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 1, 1, 1,...
## $ store_and_fwd_flag  <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N"...
## $ payment_type        <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
## $ fare_amount         <dbl> 29.00, 42.00, 26.00, 22.50, 24.50, 29.00, 22.00, 83.50, 31.50, 23.00, 24.00, 39.00, 65.00, 23.50, ...
## $ extra               <dbl> 0.0, 0.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.5, 0.5...
## $ mta_tax             <dbl> 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5...
## $ tip_amount          <dbl> 5.95, 12.10, 5.46, 4.75, 3.87, 6.05, 4.66, 21.51, 5.81, 7.14, 4.95, 9.31, 5.55, 4.00, 8.51, 10.00,...
## $ tolls_amount        <dbl> 0.00, 5.76, 0.00, 0.00, 0.00, 0.00, 0.00, 23.76, 0.00, 0.00, 0.00, 5.76, 0.00, 0.00, 5.76, 5.76, 0...
## $ imp_surcharge       <dbl> 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3...
## $ total_amount        <dbl> 35.75, 60.66, 32.76, 28.55, 29.67, 36.35, 27.96, 129.07, 38.11, 30.94, 29.75, 55.87, 71.35, 28.30,...
## $ pickup_location_id  <dbl> 68, 138, 261, 262, 261, 100, 7, 132, 264, 170, 237, 138, 132, 87, 264, 161, 229, 138, 186, 186, 13...
## $ dropoff_location_id <dbl> 162, 88, 41, 231, 162, 127, 53, 265, 264, 236, 261, 163, 255, 262, 264, 138, 246, 151, 13, 265, 18...
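
As a quick sanity check on the fare columns, the sketch below recomputes total_amount from its components for the first four trips, with values copied from the glimpse() output above. That the total equals the sum of exactly these six columns is an assumption that happens to hold for these rows; it has not been checked against the full sample.

```r
# First four trips, values copied from the glimpse() output above.
trips <- data.frame(
  fare_amount   = c(29.00, 42.00, 26.00, 22.50),
  extra         = c(0.0, 0.0, 0.5, 0.5),
  mta_tax       = c(0.5, 0.5, 0.5, 0.5),
  tip_amount    = c(5.95, 12.10, 5.46, 4.75),
  tolls_amount  = c(0.00, 5.76, 0.00, 0.00),
  imp_surcharge = c(0.3, 0.3, 0.3, 0.3),
  total_amount  = c(35.75, 60.66, 32.76, 28.55)
)

# Assumed components of the total (see the lead-in for the caveat).
components <- c("fare_amount", "extra", "mta_tax",
                "tip_amount", "tolls_amount", "imp_surcharge")

# Sum the components row by row and compare with the reported total,
# allowing for floating-point rounding below half a cent.
trips$computed_total <- rowSums(trips[, components])
trips$matches <- abs(trips$computed_total - trips$total_amount) < 0.005

print(trips[, c("total_amount", "computed_total", "matches")])
```

On the full dataset, the same comparison could be run on `df` to flag rows where the reported total and the component sum disagree.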