Commit de46d66

Create README.md
1 parent 0d97375 commit de46d66

1 file changed: +53 -0 lines changed

README.md

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
# Overview
Data extraction and visualization for parks in Nova Scotia, Canada.
## Dataset Source
https://data.novascotia.ca/Lands-Forests-and-Wildlife/DNR-Camping-Parks-Reservation-Data-2016/4zt7-x443
### About the Dataset
The dataset, DNR Camping Parks Reservation Data 2016, lists camping sites in Nova Scotia. The information is collected through the reservation system that the general public uses to reserve camping sites in the province. The dataset has 34,900 rows and 13 columns. It includes the park’s name (ParkName), the origin state and country of the water body, the total booking size (partySize), the type of rate (RateType), the type of booking (BookingType), Equipment, the booking start and end dates, the nights and their permits.
## File Description
### file1.csv
I used the csv module to read the contents of the dataset file, DNR_Camping_Parks_Reservation_Data_2016.csv, and write them to file1.csv. The data is read with ‘csv.reader’ and written with ‘csv.writer’. Because these are csv files, the delimiter is a comma (,).
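
A minimal sketch of this read-and-write step, assuming Python's standard csv module and UTF-8 encoded files. The function name `copy_csv` is hypothetical and is not taken from the project's code.

```python
import csv


def copy_csv(src_path, dst_path):
    """Copy every row of src_path to dst_path and return the row count."""
    rows_written = 0
    # newline="" lets the csv module handle line endings itself.
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=",")
        writer = csv.writer(dst, delimiter=",")
        for row in reader:
            writer.writerow(row)
            rows_written += 1
    return rows_written


# Example call with the file names mentioned above:
# copy_csv("DNR_Camping_Parks_Reservation_Data_2016.csv", "file1.csv")
```

Going through ‘csv.reader’ and ‘csv.writer’ rather than copying raw text keeps quoted fields that contain commas intact.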
### file2.csv
Here I removed the unnecessary columns and kept only the data on ParkName, State, PartySize, BookingType, RateType and Equipment.
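
One possible way to do this column selection is sketched below, assuming the first row of the input is a header whose names match the list above exactly (the real header spelling may differ). The names `KEEP_COLUMNS` and `keep_columns` are hypothetical.

```python
import csv

# Column names as listed in the text; the exact header spelling in the
# real file is an assumption.
KEEP_COLUMNS = ["ParkName", "State", "PartySize",
                "BookingType", "RateType", "Equipment"]


def keep_columns(src_path, dst_path, columns=KEEP_COLUMNS):
    """Write only the named columns of src_path to dst_path."""
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        # extrasaction="ignore" drops every column not named in `columns`.
        writer = csv.DictWriter(dst, fieldnames=columns,
                                extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)


# Example call:
# keep_columns("file1.csv", "file2.csv")
```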
### file3.csv
Scanned the "Equipment" column and replaced every “less than” with “LT” [e.g. “less than 30 ft.” becomes “LT30ft” after the transformation]. Similarly, replaced every “Single tent” with “ST”. I used the regex module to perform these substitutions, with the ‘.sub’ function doing the find and replace. I applied the two substitutions one after the other rather than simultaneously to keep the code simple.
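
The two consecutive substitutions could look roughly like the sketch below, assuming Python's standard `re` module and exact-case matches. The function name `abbreviate_equipment` is hypothetical. The sketch only swaps the words; any extra removal of spaces or punctuation implied by the “LT30ft” example is not reproduced here.

```python
import re


def abbreviate_equipment(value):
    """Abbreviate the two phrases in an Equipment value, one pass each."""
    # First pass: "less than" -> "LT".
    value = re.sub(r"less than", "LT", value)
    # Second pass: "Single tent" -> "ST".
    value = re.sub(r"Single tent", "ST", value)
    return value


print(abbreviate_equipment("less than 30 ft."))  # prints "LT 30 ft."
print(abbreviate_equipment("Single tent"))       # prints "ST"
```

Running the passes one after the other keeps each pattern trivial; a single pass would need an alternation pattern plus a lookup from each match to its abbreviation.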
### file4.csv
This file contains only the 20 unique parks in Nova Scotia that have the highest "partySize".
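
A hedged sketch of one way to pick these parks. It assumes that "highest partySize" means the largest single booking per park, that the input has a header row, and that the columns are named ParkName and PartySize. The function name `top_parks` is hypothetical.

```python
import csv


def top_parks(src_path, dst_path, n=20, size_column="PartySize"):
    """Keep one row per park (its largest party) and write the top n."""
    best = {}  # park name -> row holding that park's largest party size
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        for row in reader:
            try:
                size = int(row[size_column])
            except (KeyError, ValueError, TypeError):
                continue  # skip rows without a usable party size
            park = row["ParkName"]
            if park not in best or size > int(best[park][size_column]):
                best[park] = row
    # Sort the per-park rows by party size, largest first, and keep n.
    ranked = sorted(best.values(),
                    key=lambda r: int(r[size_column]),
                    reverse=True)[:n]
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames,
                                extrasaction="ignore")
        writer.writeheader()
        writer.writerows(ranked)
    return [r["ParkName"] for r in ranked]


# Example call:
# top_parks("file3.csv", "file4.csv")
```

If the intended measure is instead the total party size per park, the dictionary would accumulate a sum per park rather than keep the largest row.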
## Visualization using Neo4j
### Load data and node creation
![](graph%20images/load_data_and_node_creation.png)
### Parks with identical ‘RateType’, connected using a ‘NeghbourByRate’ relation
![](graph%20images/NeghbourByRate.png)
### Parks with identical ‘Equipment’, connected using a ‘NeghbourByEquipment’ relation
![](graph%20images/NeghbourByEquipment.png)
### Final image with both relations: NeghbourByRate and NeghbourByEquipment
![](graph%20images/graph.png)
### Using the visualization to find the park with the maximum "partySize"
![](graph%20images/visualization.png)