My first interview was with the potential direct supervisor and the Vice President of Technology & Innovation. I was told that I would be given an Excel spreadsheet dataset and should respond with my findings within two days. I asked for a few more days because I was in the middle of a large work project due at the end of the week, and they were okay with that.
The spreadsheet consisted of four columns: Property ID, Account ID, City Supplied Lot Area, and the Lot Area in SMLS's system. The first thing I wanted to do was check for duplicates. I did this in Excel with conditional formatting, and I also added columns that concatenated the existing ones, both to make the duplication easier to see and to try to create a unique identifier key for each row. I then hashed the concatenated column, since a hash would save space in a database. There were duplicates across multiple rows, sometimes matching on every single column. I deleted the duplicates and noticed that there were often large discrepancies between the Lot Area in SMLS's system and the City Supplied Lot Area.
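For anyone curious, the concatenate-then-hash deduplication described above could be sketched in Python roughly like this. It is not the Excel workbook from the assignment: the sample rows, the `|` separator, the choice of SHA-256, and the function names are all assumptions made for illustration.

```python
import hashlib

def row_key(row):
    """Join all fields into one string; the separator keeps (1, 23) distinct from (12, 3)."""
    return "|".join(str(value) for value in row)

def row_hash(row):
    """Fixed-length SHA-256 hex digest of the concatenated key."""
    return hashlib.sha256(row_key(row).encode("utf-8")).hexdigest()

def drop_duplicates(rows):
    """Keep the first occurrence of each row, comparing rows by their hash."""
    seen = set()
    unique = []
    for row in rows:
        digest = row_hash(row)
        if digest not in seen:
            seen.add(digest)
            unique.append(row)
    return unique

# Made-up sample rows: (property_id, account_id, city_lot_area, smls_lot_area)
rows = [
    (1, "A1", 5000, 5100),
    (1, "A1", 5000, 5100),  # exact duplicate of the first row
    (2, "B7", 7200, 3000),
]

unique_rows = drop_duplicates(rows)
print(len(rows), "rows in,", len(unique_rows), "unique rows out")
```

Whether the hash is actually smaller than the concatenated key depends on how long the key is; the main benefit is that every row gets an identifier of the same length.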
I calculated the Pearson correlation coefficient between the City Supplied Lot Area and the SMLS Lot Area, both before and after removing the duplicates. I then brought both my modified dataset and the original into SQL and found that there were 46,770 unique rows of data. Next, I filtered out outliers by keeping only the rows where the two lot areas were within 10% of each other in either direction, and I recalculated the Pearson coefficient between the two lot area columns in SQL.
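The outlier filter and the correlation can be sketched in a few lines of Python as well. This is only an illustration, not the SQL from the assignment: the sample pairs are made up, and measuring the 10% tolerance relative to the city-supplied value is an assumption, since "within 10% of each other" could also be measured against the SMLS value or the larger of the two.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def within_tolerance(city_area, smls_area, tolerance=0.10):
    """True if the SMLS area is within +/- tolerance of the city area (assumed baseline)."""
    return city_area > 0 and abs(smls_area - city_area) <= tolerance * city_area

# Made-up (city_lot_area, smls_lot_area) pairs
pairs = [(5000, 5100), (7200, 3000), (6000, 5900), (4000, 4300)]

# Drop rows where the two areas disagree by more than 10%
kept = [(city, smls) for city, smls in pairs if within_tolerance(city, smls)]

r_all = pearson([c for c, _ in pairs], [s for _, s in pairs])
r_kept = pearson([c for c, _ in kept], [s for _, s in kept])
print(f"{len(kept)} of {len(pairs)} rows kept; r before = {r_all:.3f}, r after = {r_kept:.3f}")
```

Comparing the coefficient before and after the filter shows how much of the disagreement between the two sources comes from a handful of badly mismatched rows.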
I also uploaded the original dataset and my cleaned version to Weka and ran some modeling and created visualizations from there.
Finally, I brought the data into Power BI, created some measures and calculations, and built a dashboard.
I am not surprised that I wasn't the top candidate in such a competitive field, but their communication ranges from awful to nonexistent. I followed up after a month and was told that they would get back to me soon. It has been five months now. It's safe to say that I didn't get the job, and that's okay, but some sort of follow-up on a project that involved four different software suites, double-digit hours, and a report would be a common courtesy.
If you value your time and want to be treated with respect, I would steer clear of this company.