Extending the toolbox for open geolocation data analytics
Thousands of researchers and practitioners in travel behavior, city science, and urban planning are eager to make good use of geolocation data. But how? x y t is an open library that eases the privacy-by-design computation of basic human mobility patterns, integrated with open contextual data.
In 2019, the Digital Switzerland Strategy set out a clear agenda for the future of mobility: first, to foster and accompany the development of new technologies and services; and second, to raise the threshold of standard open data. Within the same time frame, business-driven stakeholders (e.g., transport operators, telecom companies) started to offer insights into their data, such as the Urban Mobility platform from Swisscom, the Movement platform from Uber, Metro from Strava, or the TomTom Traffic Index. Today, geolocation is everywhere. Yet the generated data is extensive and heterogeneous (when accessible at all). Datasets are large, multi-sourced, often noisy, and come in various formats and standards. In addition, urban data is distinctive in its mixed levels of restriction: geolocation data is private and sensitive, whereas public transit schedules are open data.
In this context, there is a need:
- to provide a framework that unifies the many forms of urban dynamics data;
- to articulate open and restricted data;
- to reconcile supply and demand data;
- and to keep track of a privacy metric.
This project proposes to develop and release an open Python package that addresses these four needs and thereby contributes to Urban Mobility Open Research Data practices. The package will complement, enhance, or build on a set of existing open Python packages [6-11], listed further below.
From raw geolocation data to mobility data
The preprocessing part is probably the most important: how can raw geolocation data (basically timestamped longitude/latitude tuples) be transformed into mobility data (e.g., trips, modes, points of interest)? Raw geolocation data is valuable because it can be collected seamlessly and at scale. It can therefore be representative of the population at low cost, and it is not subject to respondent fatigue. However, raw geolocation data is far from usable in behavioral analyses, as it contains only noisy georeferenced timestamps. Making it usable requires both extensive expertise in transportation engineering and advanced data science skills. Today, only a few companies offer to transform raw geolocation data into mobility data (e.g. ). To the best of our knowledge, no open-source library does so with sufficient accuracy.
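To make the transformation concrete, here is a minimal sketch of one common first step, stay-point detection, which splits a stream of timestamped fixes into stays (candidate points of interest) separated by trips. It is not the x y t implementation: the names `Fix` and `detect_stays`, the thresholds, and the sample coordinates are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Fix:
    """One raw geolocation record (hypothetical structure)."""
    t: float    # seconds since some reference time
    lon: float  # longitude, WGS 84 degrees
    lat: float  # latitude, WGS 84 degrees


def haversine_m(a: Fix, b: Fix) -> float:
    """Great-circle distance between two fixes, in meters."""
    lon1, lat1, lon2, lat2 = map(radians, (a.lon, a.lat, b.lon, b.lat))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def detect_stays(fixes, max_radius_m=100.0, min_duration_s=600.0):
    """Group consecutive fixes that remain within max_radius_m of the first
    fix of the group for at least min_duration_s. Thresholds are assumptions."""
    stays = []
    i, n = 0, len(fixes)
    while i < n:
        j = i + 1
        while j < n and haversine_m(fixes[i], fixes[j]) <= max_radius_m:
            j += 1
        group = fixes[i:j]  # all within max_radius_m of fixes[i]
        if group[-1].t - group[0].t >= min_duration_s:
            stays.append({
                "t_start": group[0].t,
                "t_end": group[-1].t,
                "lon": sum(f.lon for f in group) / len(group),
                "lat": sum(f.lat for f in group) / len(group),
            })
            i = j  # continue after the stay
        else:
            i += 1  # the fix belongs to a trip; move on
    return stays


# Invented sample: a stay, two fixes on the move, then a second stay.
sample = [
    Fix(0, 6.5660, 46.5190), Fix(300, 6.5660, 46.5190), Fix(900, 6.5660, 46.5190),
    Fix(1200, 6.5800, 46.5200), Fix(1500, 6.6000, 46.5210),
    Fix(1800, 6.6320, 46.5200), Fix(2400, 6.6320, 46.5200), Fix(3000, 6.6320, 46.5200),
]
stays = detect_stays(sample)
```

The interval between the end of one stay and the start of the next is then a candidate trip. Mode inference and noise filtering, which the text identifies as the hard part, are deliberately left out of this sketch.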
Data alteration for privacy control
In addition to transforming raw geolocation data into mobility data, the preprocessing steps include systematic spatial aggregation that purposely lowers the resolution of the data (mainly for storage purposes). This can be leveraged as a privacy control to (a) choose different levels of aggregation depending on the needs of the final application (e.g., traffic analysis zone, neighborhood, zip code, town); and (b) geofence the raw data (e.g., remove points in the vicinity of home or work). Put together, this preprocessing ensures that privacy is taken into account from the very first stage of data transformation. It constitutes a building block for compliance with the GDPR (General Data Protection Regulation) or the DPA (Data Privacy Act).
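The two privacy controls described above can be sketched as follows. This is an illustration under stated assumptions, not the x y t API: a regular degree grid stands in for the zone systems named in the text (traffic analysis zone, neighborhood, zip code, town), and the function names, radius, cell size, and coordinates are hypothetical.

```python
from collections import Counter
from math import asin, cos, floor, radians, sin, sqrt


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two WGS 84 points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def geofence(points, sensitive, radius_m=200.0):
    """Drop every (t, lon, lat) point lying within radius_m of a sensitive
    location such as home or work. The radius is an assumption."""
    return [
        (t, lon, lat)
        for t, lon, lat in points
        if all(haversine_m(lon, lat, s_lon, s_lat) > radius_m for s_lon, s_lat in sensitive)
    ]


def to_grid_cells(points, cell_deg=0.01):
    """Replace exact coordinates by integer grid-cell indices.
    A larger cell_deg means coarser data and stronger aggregation."""
    return [(floor(lon / cell_deg), floor(lat / cell_deg)) for _, lon, lat in points]


# Invented sample: the first point is a few meters from "home" and is removed.
home = [(6.5660, 46.5190)]
points = [
    (0, 6.5661, 46.5191),
    (600, 6.5804, 46.5203),
    (1200, 6.5847, 46.5236),
    (1800, 6.6323, 46.5204),
]
kept = geofence(points, home)
cells = to_grid_cells(kept)
visits_per_cell = Counter(cells)  # aggregated output: counts only, no raw coordinates
```

Because the cell size and geofence radius are explicit parameters, they can be recorded alongside the output, which is one simple way to keep track of how much the data was altered.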
Integrate open contextual data
Lastly, additional effort will go into integrating “contextual data”. Based on a promising study that leverages transit data to assess demand-supply gaps in public transport, the goal is to integrate General Transit Feed Specification (GTFS) or OpenStreetMap (OSM) data. A growing (and well-organized) community is contributing to such efforts (see ). This data integration must happen at this stage to ensure that the spatial (coordinate) reference systems correspond.
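As a rough illustration of such an integration, the sketch below attaches the nearest transit stop to a location. It relies on the fact that the GTFS `stops.txt` file stores `stop_lat` and `stop_lon` in WGS 84, so the mobility points must be expressed in the same reference system before matching. The stop records, the function names, and the 300 m search radius are invented for the example and are not part of x y t.

```python
import csv
import io
from math import asin, cos, radians, sin, sqrt

# Invented excerpt in the layout of a GTFS stops.txt file (WGS 84 coordinates).
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Example Stop A,46.5191,6.5662
S2,Example Stop B,46.5205,6.6321
"""


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two WGS 84 points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def load_stops(text):
    """Parse stops.txt content into a list of dicts with float coordinates."""
    return [
        {"stop_id": row["stop_id"], "lon": float(row["stop_lon"]), "lat": float(row["stop_lat"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def nearest_stop(lon, lat, stops, max_dist_m=300.0):
    """Return (stop_id, distance_m) of the closest stop, or None if no stop
    lies within max_dist_m. The point must already be in WGS 84."""
    if not stops:
        return None
    best = min(stops, key=lambda s: haversine_m(lon, lat, s["lon"], s["lat"]))
    dist = haversine_m(lon, lat, best["lon"], best["lat"])
    return (best["stop_id"], dist) if dist <= max_dist_m else None


stops = load_stops(STOPS_TXT)
match_near = nearest_stop(6.5660, 46.5190, stops)  # a few tens of meters from S1
match_far = nearest_stop(6.6000, 46.5210, stops)   # more than 2 km from any stop
```

A linear scan is enough for a sketch; a real feed with thousands of stops would call for a spatial index, and data delivered in a projected national grid would first need reprojection to WGS 84.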
To conclude, x y t abstracts activity space and activity organization as simple standard objects to facilitate scalability and reproducibility. x y t allows privacy-by-design analytics and the smooth integration of open contextual data. x y t brings different fields and methods together, at the service of human mobility. x y t covers multiple needs and tends to comply with the emerging practices and skills of researchers and tomorrow’s practitioners. x y t contributes to the transportation-related open research community, which gathers more than a hundred contributors: the awesome-transit, Urban Data Lab, and MATSim communities.
- tracktotrip 
- movingpandas 
- scikit-mobility 
open data manipulation
- gtfs_functions
- osmnx 
- PySAL 
- NetworkX 
- Spatial Access or tracc for accessibility
 “iNUA #9: Avoid-Shift-Improve (A-S-I),” SUTP. https://sutp.org/publications/sustainable-urban-transport-avoid-shift-improve-a-s-i-inua-9/ (accessed Mar. 28, 2022).
 D. opérationnelle S. numérique GDS, “Stratégie Suisse numérique - Plan d’action,” Stratégie Suisse numérique. https://www.digitaldialog.swiss/fr/plan-d-action (accessed Mar. 28, 2022).
 “Swisscom Mobility Insights | Swisscom.” https://www.swisscom.ch/en/business/enterprise/offer/enterprise-mobile/mobility-insights.html (accessed Mar. 28, 2022).
 “Uber Movement: Let’s find smarter ways forward, together.” https://movement.uber.com/?lang=en-US (accessed Mar. 28, 2022).
 “Strava Metro Home.” https://metro.strava.com/ (accessed Jan. 07, 2021).
 R. Gil, “TrackToTrip.” Dec. 21, 2021. Accessed: Mar. 29, 2022. [Online]. Available: https://github.com/ruipgil/TrackToTrip
 “MovingPandas,” MovingPandas. https://anitagraser.github.io/movingpandas/ (accessed Mar. 25, 2022).
 L. Pappalardo, F. Simini, G. Barlacchi, and R. Pellungrini, “scikit-mobility: a Python library for the analysis, generation and risk assessment of mobility data,” ArXiv190707062 Phys., Feb. 2021, Accessed: May 10, 2021. [Online]. Available: http://arxiv.org/abs/1907.07062
 S. Toso, “Library GTFS functions.” 2020. Accessed: Nov. 28, 2021. [Online]. Available: https://github.com/Bondify/gtfs_functions
 G. Boeing, “OSMnx: New methods for acquiring, constructing, analyzing, and visualizing complex street networks,” Comput. Environ. Urban Syst., vol. 65, pp. 126–139, Sep. 2017, doi: 10.1016/j.compenvurbsys.2017.05.004.
 “About MATSim,” MATSim.org. https://www.matsim.org/about-matsim (accessed Mar. 31, 2022).
 “Motion Tag.” https://motion-tag.com/de/ (accessed Mar. 23, 2022).
 M.-E. Schultheiss, “Assessment of the Bus Transit Network: A Perspective from the Daily Activity-Travel Organization of Travelers,” Sustainability, vol. 14, no. 4, Art. no. 4, Jan. 2022, doi: 10.3390/su14042406.
 Google Transit API, “Documentation for GTFS data.” 2021. [Online]. Available: https://developers.google.com/transit/gtfs/reference
 “OpenStreetMap Wiki,” 2021. https://wiki.openstreetmap.org/wiki/Downloading_data (accessed Feb. 02, 2021).
 Center for Urban Transportation Research, “Library awesome-transit.” 2021. Accessed: Nov. 28, 2021. [Online]. Available: https://github.com/CUTR-at-USF/awesome-transit
 “Urban Data Lab,” Geoff Boeing, Jun. 04, 2020. https://geoffboeing.com/lab/ (accessed Mar. 31, 2022).
 “PySAL.” https://pysal.org/ (accessed Mar. 25, 2022).
 “NetworkX — NetworkX documentation.” https://networkx.org/ (accessed Mar. 25, 2022).