Scale By the Bay

Friday, October 29, 2021

Location-Based Data Engineering for Good with PySpark/Graphs
Evan Chan
UrbanLogiq, Inc., Senior Data Engineer

Cell phones are ubiquitous, and a huge amount of location-based data is generated by apps, advertiser networks, libraries, mobile software and hardware providers, and cell networks. This is a talk about working with location-based data for good: to understand mobility patterns and help improve urban, traffic, and economic planning. PySpark is used to wrangle the large volume of data, sessionize it and calculate potential trips, and load the results into different representations, such as graphs, to understand mobility patterns. I also discuss data quality engineering and experimentation.
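As a rough illustration of what sessionizing location pings and deriving potential trips can look like, here is a minimal sketch. It is not the pipeline from the talk: it uses plain Python rather than PySpark so it runs without a cluster, and the record layout, the 30-minute gap threshold, the coordinate rounding, and the function names `sessionize` and `trip_graph` are all assumptions made for the example. In Spark, the same gap-based idea is commonly expressed with window functions over each device's pings ordered by time.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

# Assumed threshold: a gap longer than this starts a new session.
SESSION_GAP_SECONDS = 30 * 60


def sessionize(pings, gap_seconds=SESSION_GAP_SECONDS):
    """Assign a per-device session index to each ping.

    pings: iterable of (device_id, unix_timestamp, lat, lon) tuples
    (a hypothetical layout). Returns a list of
    (device_id, session_index, unix_timestamp, lat, lon) tuples,
    sorted by device and then by time.
    """
    ordered = sorted(pings, key=itemgetter(0, 1))
    out = []
    for device_id, group in groupby(ordered, key=itemgetter(0)):
        session_index = 0
        previous_ts = None
        for _, ts, lat, lon in group:
            # A long silence between two pings closes the current session.
            if previous_ts is not None and ts - previous_ts > gap_seconds:
                session_index += 1
            out.append((device_id, session_index, ts, lat, lon))
            previous_ts = ts
    return out


def trip_graph(sessionized, precision=2):
    """Count origin -> destination edges, one potential trip per session.

    Nodes are coordinates rounded to `precision` decimals (an assumed,
    very coarse stand-in for a real spatial grid). Sessions that start
    and end in the same cell are not counted as trips.
    """
    endpoints = {}
    for device_id, session_index, _, lat, lon in sessionized:
        cell = (round(lat, precision), round(lon, precision))
        key = (device_id, session_index)
        if key not in endpoints:
            endpoints[key] = [cell, cell]  # first ping: origin and destination
        else:
            endpoints[key][1] = cell  # later pings move the destination
    edges = Counter()
    for origin, destination in endpoints.values():
        if origin != destination:
            edges[(origin, destination)] += 1
    return edges
```

Calling `trip_graph(sessionize(pings))` on a list of ping tuples yields a `Counter` keyed by `(origin_cell, destination_cell)`, which can be treated as a weighted edge list and loaded into a graph library or graph database of choice.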