The workshop will be streamed on YouTube live: Spark SQL Workshop: Advanced join & group by techniques. Post stream, it will be available to watch and follow at your own pace.
Note: Please remember to switch off your codespaces.
- Start a codespace machine.
- Wait for the terminal to start, then run this command in the terminal: docker compose up -d && sleep 30
- Click on the ports tab, then click on the globe icon in the address for port 8888.
- Click on the notebooks folder and open adv_joins_group_by.ipynb.
- Open it as Jupyter lab for a better experience.

Follow along with the workshop!
Note: Remember to switch off codespaces.

Prerequisites:
Start by cloning the repo and starting the containers (note that you will have to stop any other containers that you may have running on ports 8888 and 8080):
git clone https://github.com/josephmachado/advanced_spark_sql_for_data_engineers.git
cd advanced_spark_sql_for_data_engineers
docker compose up -d
sleep 30
Open Jupyter lab at http://localhost:8888/lab/tree/notebooks.
Spark UI is available at http://localhost:8080.
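The port conflict note and the fixed sleep 30 can both be replaced by probing the ports directly. The following is a minimal sketch, not part of the workshop repo: the helper names port_open and wait_for_url are hypothetical, and it assumes bash (for its /dev/tcp feature) and that curl is installed. It treats any non-error HTTP response as "ready", which is only a rough signal.

```shell
#!/usr/bin/env bash
# Sketch only: port_open and wait_for_url are hypothetical helpers,
# not something shipped with the workshop repo.

# Return 0 if something accepts TCP connections on host:port, non-zero otherwise.
# Uses bash's built-in /dev/tcp; the connection closes when the subshell exits.
port_open() {
  local host="$1" port="$2"
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Poll a URL once per second until it returns a non-error HTTP response,
# giving up after `tries` attempts (default 60). Assumes curl is available.
wait_for_url() {
  local url="$1" tries="${2:-60}"
  local i
  for ((i = 0; i < tries; i++)); do
    if curl -s -f -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Example usage (commented out so the sketch itself has no side effects):
#   for p in 8888 8080; do
#     port_open localhost "$p" && echo "port $p is already in use, stop that container first"
#   done
#   docker compose up -d
#   wait_for_url http://localhost:8888/lab/tree/notebooks 60 && echo "Jupyter lab is reachable"
```

Checking the ports before docker compose up -d surfaces a conflict early, and polling the URL returns as soon as Jupyter lab answers instead of always waiting the full 30 seconds.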
Stop the containers with:
docker compose down