|
| 1 | +# Overview |
| 2 | +Data extraction and visualization for camping parks in Nova Scotia, Canada |
| 3 | + |
| 4 | +## Dataset Source |
| 5 | + |
| 6 | +https://data.novascotia.ca/Lands-Forests-and-Wildlife/DNR-Camping-Parks-Reservation-Data-2016/4zt7-x443 |
| 7 | + |
| 8 | +### About the Dataset |
| 9 | + |
| 10 | +The DNR Camping Parks Reservation Data 2016 dataset lists camping sites in Nova Scotia. The information is collected through the reservation system |
| 11 | +that the general public uses to reserve camping sites in the province. The dataset has 34,900 rows and 13 columns. It includes each park’s |
| 12 | +name (ParkName), the origin state and country of the water body, the total booking size (partySize), the type of rate (RateType), the type of booking (BookingType), the Equipment, |
| 13 | +the booking start and end dates, the nights, and their permits. |
| 14 | + |
| 15 | +## File Description |
| 16 | + |
| 17 | +### file1.csv |
| 18 | + |
| 19 | +I used the csv module to copy the contents of the dataset file, DNR_Camping_Parks_Reservation_Data_2016.csv, into file1.csv. |
| 20 | +The rows are read with ‘csv.reader’ and written with ‘csv.writer’. |
| 21 | +Because these are CSV files, the delimiter is a comma (,). |
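The copy step described above can be sketched as follows. This is a minimal illustration rather than the project's actual script: the function name `copy_csv` and the UTF-8 encoding are assumptions, and only `csv.reader` and `csv.writer` come from the description.

```python
import csv


def copy_csv(src_path, dst_path):
    """Copy every row of src_path into dst_path and return the row count."""
    rows_written = 0
    # newline="" lets the csv module manage line endings itself.
    # The utf-8 encoding is an assumption about the dataset file.
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=",")
        writer = csv.writer(dst, delimiter=",")
        for row in reader:
            writer.writerow(row)
            rows_written += 1
    return rows_written
```

A call such as `copy_csv("DNR_Camping_Parks_Reservation_Data_2016.csv", "file1.csv")` would then produce file1.csv, header row included.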
| 22 | + |
| 23 | +### file2.csv |
| 24 | + |
| 25 | +Here I removed the unneeded columns and kept only ParkName, State, PartySize, BookingType, RateType and Equipment. |
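One way to keep only those columns is sketched below. It assumes the header row spells the column names exactly as listed above. The function name `select_columns` and the use of `csv.DictReader` and `csv.DictWriter` are illustrative choices, not necessarily what the project's code does.

```python
import csv

# Column names as listed in this README; the exact header spelling is an assumption.
KEEP = ["ParkName", "State", "PartySize", "BookingType", "RateType", "Equipment"]


def select_columns(src_path, dst_path, keep=KEEP):
    """Write a copy of src_path that contains only the columns named in keep."""
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        # extrasaction="ignore" silently drops every column not listed in keep.
        writer = csv.DictWriter(dst, fieldnames=keep, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
```

Reading rows as dictionaries keeps the selection independent of column order in the source file.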
| 26 | + |
| 27 | +### file3.csv |
| 28 | + |
| 29 | +I scanned the "Equipment" column and replaced every “less than” with “LT” [e.g. “less than 30 ft.” becomes “LT30ft”]. Similarly, I replaced every “Single tent” with “ST”. |
| 30 | +I used the regex module for these substitutions, with the ‘.sub’ function doing the find and replace. I applied the two substitutions |
| 31 | +one after the other rather than in a single pass, to keep the code simpler. |
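The two consecutive substitutions might look like the sketch below. The patterns are assumptions chosen to reproduce the “LT30ft” example: replacing “less than” with “LT” alone would leave the spaces and the trailing period in place, so the first pattern also captures the number and the unit. The helper name `abbreviate_equipment` is hypothetical.

```python
import re


def abbreviate_equipment(value):
    """Shorten an Equipment value with two consecutive re.sub calls."""
    # First pass: "less than 30 ft." -> "LT30ft".
    # Dropping the spaces and the period is an assumption based on the example.
    value = re.sub(r"less than\s*(\d+)\s*ft\.?", r"LT\1ft", value, flags=re.IGNORECASE)
    # Second pass: "Single tent" -> "ST".
    value = re.sub(r"single tent", "ST", value, flags=re.IGNORECASE)
    return value
```

Running the passes one after the other keeps each pattern short and easy to check on its own.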
| 32 | + |
| 33 | +### file4.csv |
| 34 | + |
| 35 | +This file contains only the 20 unique parks in Nova Scotia with the highest "partySize". |
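The selection could be sketched as below, under one possible reading of the description: keep the largest party size seen for each park, then take the 20 parks with the highest values. The function name `top_parks` and the column names `ParkName` and `PartySize` are assumptions.

```python
import csv


def top_parks(src_path, n=20, park_col="ParkName", size_col="PartySize"):
    """Return up to n (park, largest party size) pairs, largest first."""
    best = {}
    with open(src_path, newline="", encoding="utf-8") as src:
        for row in csv.DictReader(src):
            try:
                park = row[park_col]
                size = int(row[size_col])
            except (KeyError, TypeError, ValueError):
                continue  # skip rows with a missing or non-numeric party size
            # Keep one entry per park so each park appears only once.
            if park not in best or size > best[park]:
                best[park] = size
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]
```

If the intended ranking is by total party size per park instead, the update step would add `size` to a running sum rather than keep the maximum.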
| 36 | + |
| 37 | +## Visualization using Neo4j |
| 38 | + |
| 39 | +### Load data and node creation |
| 40 | + |
| 41 | + |
| 42 | +### Parks with identical ‘RateType’, connected using a ‘NeghbourByRate’ relation |
| 43 | + |
| 44 | + |
| 45 | +### Parks with identical ‘Equipment’, connected using a ‘NeghbourByEquipment’ relation |
| 46 | + |
| 47 | + |
| 48 | +### Final image with both relations: NeghbourByRate and NeghbourByEquipment |
| 49 | + |
| 50 | + |
| 51 | +### Park with the maximum "partySize", found using the visualization |
| 52 | + |
| 53 | + |