Recently I’ve gained an interest in Jupyter Notebook, datasets, and acquiring data. Of course, I just had to spin up a pet project for such an adventure, and thus I decided to build one in Jupyter Notebook.
I honestly wasn’t sure what to build, as I really just wanted to learn how to use Jupyter Notebook, plot data, and so on. I’ve always enjoyed watching videos and reading about how expensive the Hong Kong housing market is, and after realizing that the Government of Hong Kong provides a decent bit of economic data for public use, I decided to chart its data on the housing market.
Starting Off With Pandas
My first challenge was to acquire data, so I decided that I needed multiple scripts to get the economic data. Based on what I looked at while figuring out how to extract data from spreadsheets, it seemed that Pandas was the best candidate.
With the exception of the capital markets script, all the scripts for the project were written in Python. Why was capital markets in PHP? The capital markets data was in JSON format, which made it far easier to get than data from spreadsheets, so I wrote that one in PHP with the thought of migrating it to Python in the near future. The data provider for my capital markets data also suggested that I pull the most recent data every time. However, I wanted to extract the data, and since it was only updated monthly or so, I didn’t exactly need the latest data if it was going to be the same for the entire month.
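For anyone curious what "don’t pull it every time" could look like in practice, here is a minimal sketch in Python. It is not the PHP script from the project. The cache file, the one-month refresh interval, and the `fetch` callable are all assumptions made up for illustration; `fetch` stands in for whatever function downloads and decodes the provider’s JSON.

```python
import json
import time
from pathlib import Path

# Hypothetical refresh interval: the data is described as updating "monthly or so".
CACHE_MAX_AGE_SECONDS = 30 * 24 * 60 * 60


def load_with_cache(cache_path, fetch, max_age=CACHE_MAX_AGE_SECONDS):
    """Return cached JSON if the file is fresh enough, else call fetch() and save it."""
    cache_path = Path(cache_path)

    # Reuse the local copy while it is younger than max_age seconds.
    if cache_path.exists():
        age = time.time() - cache_path.stat().st_mtime
        if age < max_age:
            with cache_path.open(encoding="utf-8") as f:
                return json.load(f)

    # Otherwise ask the provider again and overwrite the local copy.
    data = fetch()
    with cache_path.open("w", encoding="utf-8") as f:
        json.dump(data, f)
    return data
```

Passing the download step in as `fetch` keeps the caching logic separate from the provider, so the same helper would work for any JSON source.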
It turns out that Pandas does a lot of the heavy lifting when parsing spreadsheets. It took a bit of time to figure out how the data was structured after parsing, and it still takes me some trial and error to read individual elements, even when Pandas shows me the data structure.
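A rough sketch of the kind of poking around that helps here. The DataFrame below is a stand-in for what `pd.read_excel` might hand back; the column names and numbers are made-up placeholders, not values from the actual government spreadsheets.

```python
import pandas as pd

# Placeholder data standing in for a parsed spreadsheet (not real figures).
df = pd.DataFrame(
    {
        "Year": [2019, 2020, 2021],
        "Price Index": ["100.0", "101.5", "103.2"],  # numbers can arrive as text
    }
)

# Three quick ways to see what the parser produced.
print(df.head())            # the first few rows
print(df.dtypes)            # the type pandas assigned to each column
print(df.columns.tolist())  # the exact column names, including odd spacing

# Turn text into numbers; anything unparseable becomes NaN instead of raising.
df["Price Index"] = pd.to_numeric(df["Price Index"], errors="coerce")

# Reading individual elements.
first_value = df.loc[0, "Price Index"]  # by row label and column name
same_value = df.iloc[0, 1]              # by integer row and column position
year_2020 = df[df["Year"] == 2020]      # every row matching a condition
```

`loc` works with labels and `iloc` with positions, and mixing the two up is a common source of the trial and error mentioned above.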
A Website To Accompany The Notebook
I wanted an alternative to just a Jupyter Notebook to display data and thought that allowing people to easily read this data through a website could be useful. The plan for a website was met by me telling myself “Another website!?” and to be honest, I got so sick of web projects. Eventually, I was like “you know what, I’m gonna spend most of the time on the Jupyter Notebook anyways” and decided to “greenlight” the website idea.
Back to PHP
Initially I wanted to create the website in Node.js with Express.js. However, after realizing how much time I was spending on the website just trying to figure out how to connect to the MySQL database from Node.js, I decided to scrap the idea for now and move back to PHP. I really did not want to push through another PHP project, but here we go again I guess.
Switching to PHP made the website development go so much faster, and I can see why people call the LAMP stack one of the easiest to use.
Charting And Data Troubles
I can’t tell whether I messed up the data types when inserting into the database, didn’t properly understand how to chart with matplotlib, or both, but I had to do a decent bit of converting to get matplotlib to recognize the series of values I wanted to plot on the X and Y axes.
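As an illustration of that kind of converting, here is a small sketch, assuming the database hands back rows as (date, value) pairs where the date may be a string and the value may be a `Decimal`, a string, or `NULL`. The row layout, the date format, and both function names are hypothetical and not taken from the project’s notebook.

```python
from datetime import datetime
from decimal import Decimal


def rows_to_series(rows, date_format="%Y-%m-%d"):
    """Split (date, value) rows into parallel lists of datetimes and floats."""
    xs, ys = [], []
    for raw_date, raw_value in rows:
        if raw_value is None:  # skip NULL values rather than plotting gaps as zero
            continue
        if isinstance(raw_date, str):
            raw_date = datetime.strptime(raw_date, date_format)
        xs.append(raw_date)
        ys.append(float(raw_value))  # float() accepts Decimal and numeric strings
    return xs, ys


def plot_series(xs, ys, out_path):
    """Draw a simple line chart and save it to out_path."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(xs, ys)
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    fig.savefig(out_path)
    plt.close(fig)
```

Doing the conversion in one place before plotting means matplotlib only ever sees real dates and floats, whichever end the type problem started on.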
Finishing The Project
After finishing the project, I decided that I should probably make the repository public on GitHub. However, I really wanted to separate the website in case I wanted to host it on its own down the road. In the end, the repository for the Jupyter Notebook itself was posted on GitHub as a nice example of what I learned to do with Pandas, Python, and Jupyter Notebook.
I’m still not overly accustomed to Jupyter Notebook, since if you split a block of code across different cells, each cell must be executed for the next part to work. But I must say, the layout for code and output in Jupyter Notebook is way cleaner and nicer to look at than what I see in Spyder or on the command line.
I think I’ll continue to poke around in Jupyter Notebook, with the hopes of improving my ability to chart database rows and of trying a library that isn’t matplotlib.
To check out the Jupyter Notebook: https://github.com/angusleung100/hkhousinganalysis
To check out the website: https://hkhousingstats.techiskey.net/