Bilal Farooq, Ryerson University IATBR 2018, Santa Barbara

 
§ Topics covered
  § Route choice prediction
  § Mode choice prediction
  § Discrete-continuous mix prediction
  § Spatial structure of travel/activities
  § Travel/activity pattern inference
§ Artificial Neural Networks [2 papers], DNN [3], CNN [1], RBM [1]
§ Decision Tree/Random Forest [3]
§ Clustering approaches [3]
§ Ensemble machines [1]
1. What is the scope of data-driven learning in the context of travel
   behaviour modelling?
2. What are the key gaps in the research?
3. Develop three research projects that address these gaps
§ July 16, 2018
  § 4:00-4:15PM    Introduction to the workshop
  § 4:15-4:30PM    Participants introduction
  § 4:30-5:15PM    Developing the problem statement
  § 5:15-6:30PM    Identifying key research gaps
§ July 18, 2018
  § 4:00-4:15PM    Recap of the workshop
  § 4:15-4:20PM    Formation of three groups
  § 4:20-4:50PM    Research projects sketch
  § 4:50-5:00PM    Presentation/feedback
  § 5:00-6:00PM    Interaction with time use and travel workshop
§ Dr. Shadi Djavadian (Ryerson)
§ Melvin Wong (Ryerson)
§ Georges Sfeir (AUB)
§ Vishnu Baburajan (IST)
Discriminative models
§ Good for:
  § Extraction and analysis of travel patterns
    § Purpose of trip
    § Mode of transportation
    § Travel activity/diary
  § Classification of travellers
§ Key advantages in the case of activity/mobility surveys using GPS
 data from smartphones
 § Cheaper
 § Managing big data sources
 §…
Discriminative models
§ Classifying major modes/purpose only
  § Ignoring the purpose since it’s not an easy task to detect?
  § Abstract representation of purpose
§ Such models are good for capturing unique patterns
   § It is our responsibility to attach semantic meaning to them
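
As a rough illustration of the classification use case above (detecting mode from smartphone GPS traces), the following is a minimal sketch of a k-nearest-neighbours classifier. The two features (mean speed and stops per km), the training records, and the function name `predict_mode` are all hypothetical and chosen only for illustration; they do not come from the workshop material.

```python
from collections import Counter
from math import dist

# Hypothetical toy training data:
# (mean speed in km/h, stops per km) -> observed mode label.
TRAIN = [
    ((4.5, 0.2), "walk"),
    ((5.2, 0.4), "walk"),
    ((15.0, 1.0), "bike"),
    ((17.5, 0.8), "bike"),
    ((32.0, 2.5), "car"),
    ((45.0, 1.5), "car"),
]

def predict_mode(train, features, k=3):
    """Label a trip by majority vote among its k nearest training trips."""
    # Sort training trips by Euclidean distance to the query trip.
    # Features are left unscaled here to keep the sketch short.
    nearest = sorted(train, key=lambda rec: dist(rec[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(predict_mode(TRAIN, (5.0, 0.3)))
print(predict_mode(TRAIN, (40.0, 2.0)))
```

The classifier only returns a label for a pattern in the features. Whether that label means what the analyst thinks it means still has to be checked against ground truth, which is the point made in the last bullet above.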
Generative models
§ Good for:
  § Predictive modelling
  § Exploring the distribution and correlations of variables
  § Dealing with missing data
  § Population synthesis
  § Merging multiple data sources
§ Spatio-temporal transferability of models
   § Assumption that behaviour remains the same
Generative models
§ Imbalanced data: applications can be risky
   § Case of elderly population without smartphones
  § Use of probabilistic models based on historical data to predict
   the missing part of the data (e.g. when the phone is off)
§ Such models can be useful for diagnostics
   § Case of identification of latent classes
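
To make the generative use cases above more concrete (population synthesis and predicting missing data), here is a minimal sketch that fits an empirical joint distribution and then either samples from it or conditions on an observed attribute. The survey records, the two attributes, and all function names are hypothetical. A real application would use a richer model than a frequency table.

```python
import random
from collections import Counter

# Hypothetical toy survey records: (age_group, main_mode).
SURVEY = [
    ("young", "transit"), ("young", "transit"), ("young", "bike"),
    ("adult", "car"), ("adult", "car"), ("adult", "transit"),
    ("senior", "car"), ("senior", "walk"),
]

def fit_joint(records):
    """Estimate P(age_group, mode) as relative frequencies."""
    counts = Counter(records)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}

def sample_population(joint, size, rng):
    """Draw synthetic agents from the fitted joint distribution."""
    cells = list(joint)
    weights = [joint[cell] for cell in cells]
    return rng.choices(cells, weights=weights, k=size)

def conditional_mode(joint, age_group):
    """Return P(mode | age_group), usable to impute a missing mode."""
    matching = {mode: p for (age, mode), p in joint.items() if age == age_group}
    norm = sum(matching.values())
    return {mode: p / norm for mode, p in matching.items()}

joint = fit_joint(SURVEY)
synthetic = sample_population(joint, 1000, random.Random(0))
print(Counter(synthetic).most_common(3))
print(conditional_mode(joint, "young"))
```

The sketch also shows the risk noted above: any group that is under-represented in the survey (here, a group with no records at all) simply cannot appear in the synthetic population.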
§ When and how to use data-driven learning techniques?
  § Alchemy!
§ Interpretation of the models: what they can and cannot do, and
 what they are useful for
§ Incorporating dynamics in data-driven models
   § Beyond LSTM/time series models
§ Use in forecasting (especially the generative models)
§ Data-driven estimation techniques for hypothesis-driven modelling
  § Advancements in stochastic gradient descent
§ Exploring the abstract representation of travel purpose (and mode)
§ Benchmark datasets
   § Openly available datasets from North America, Europe, Asia
§ Using such techniques to capture unexplainable dimensions of
 hypothesis-driven modelling
§ Individual specific modelling
   § Rich longitudinal data on individuals
§ Privacy-preserving model estimation
§ Incorporating context-aware variables in data-driven approaches
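
One of the gaps listed above is the use of data-driven estimation techniques, such as stochastic gradient descent, for hypothesis-driven models. The sketch below estimates a binary logit choice model by plain SGD on simulated data. The "true" coefficients, the step size, the number of epochs, and all function names are assumptions made for illustration, and plain constant-step SGD stands in for the more recent variants the bullet alludes to.

```python
import math
import random

def choice_prob(beta, x):
    """Binary logit probability of choosing alternative 1.

    x holds attribute differences (alternative 1 minus alternative 0).
    """
    v = sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-v))

def log_likelihood(beta, data):
    """Sum of log choice probabilities over all observations."""
    ll = 0.0
    for x, y in data:
        p = choice_prob(beta, x)
        ll += math.log(p if y == 1 else 1.0 - p)
    return ll

def simulate(true_beta, n, seed=1):
    """Generate hypothetical choices from a known binary logit model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(-2, 2) for _ in true_beta]
        y = 1 if rng.random() < choice_prob(true_beta, x) else 0
        data.append((x, y))
    return data

def sgd_estimate(data, n_params, epochs=50, step=0.01, seed=0):
    """Maximise the log-likelihood one observation at a time."""
    rng = random.Random(seed)
    beta = [0.0] * n_params
    order = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            # Per-observation gradient of the log-likelihood is (y - p) * x.
            err = y - choice_prob(beta, x)
            beta = [b + step * err * xi for b, xi in zip(beta, x)]
    return beta

# Assumed coefficients for (travel time difference, cost difference).
data = simulate((-1.2, -0.8), 500)
beta_hat = sgd_estimate(data, 2)
print(beta_hat, log_likelihood(beta_hat, data))
```

With a constant step size the estimates keep fluctuating around the maximum-likelihood solution rather than converging to it exactly, which is one reason the choice of optimiser matters for models whose coefficients are meant to be interpreted.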
§ Improving predictive accuracy of discrete choice models with
 machine learning while maintaining interpretability
 § Exploration of hybrid model formulations
§ Selection processes for variables/features for interpretable and
 uninterpretable parts of utility function
§ Exploration of models for the uninterpretable information
§ Trade-off analysis
§ Benchmark dataset for comparative analysis
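
One possible way to write down the hybrid formulation discussed above, offered as an assumption rather than as the workshop's agreed specification, is to split the systematic utility of alternative i for individual n into an interpretable linear part and a flexible learned part. The symbols x, z, β, θ and g are notation introduced here for illustration.

```latex
V_{in} = \beta^{\top} x_{in} + g(z_{in}; \theta),
\qquad
P_n(i) = \frac{\exp(V_{in})}{\sum_{j \in C_n} \exp(V_{jn})}
```

Here x collects the variables kept in the interpretable part (e.g. time and cost), z collects the remaining features, g is any flexible learner such as a small neural network, and C_n is the choice set. The variable selection question above then becomes the question of which features go into x and which into z.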
§ Definition of dataset
  § Which decision variable? Or families of decisions?
    § Balanced data
  § What location?
  § 1 day vs multiple days
  § Size of data
  § Related transportation data
§ Role of Kaggle-style data repositories
§ Use of synthetic data?
§ Predictive power: for what usage and at what cost
§ To what extent is privacy important in travel behaviour?
§ What could be the implications of masking/filtering private data in
 travel behaviour?
§ Training of privacy-aware models and their non-private counterparts
§ Quantification of:
  § Improvement in privacy
  § Semantic data needs
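
As one hypothetical way to quantify the effect of masking, the sketch below coarsens home coordinates by rounding and reports the size of the smallest group of travellers that share a masked location (a k-anonymity-style measure). The coordinates, the rounding rule, and the function names are illustrative assumptions, not a method proposed in the workshop.

```python
from collections import defaultdict

# Hypothetical home locations: (traveller_id, latitude, longitude).
HOMES = [
    ("a", 43.6581, -79.3792),
    ("b", 43.6584, -79.3799),
    ("c", 43.6612, -79.3805),
    ("d", 43.7001, -79.4163),
]

def mask(lat, lon, decimals):
    """Coarsen a coordinate pair by rounding to a fixed number of decimals."""
    return (round(lat, decimals), round(lon, decimals))

def smallest_group(points, decimals):
    """Size of the smallest set of travellers sharing one masked location."""
    groups = defaultdict(set)
    for traveller, lat, lon in points:
        groups[mask(lat, lon, decimals)].add(traveller)
    return min(len(members) for members in groups.values())

for decimals in (4, 1):
    print(decimals, smallest_group(HOMES, decimals))
```

Coarser masking raises the smallest group size but also removes spatial detail that a travel behaviour model may need, which is the trade-off the two quantification bullets above point at.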
§ Joint discussion on:
   § Theory/Hypothesis-driven and Data-driven approaches
§ Large datasets can inspire new theories
§ Predictability vs Transferability
§ Interpretability
   § What’s inside!?
§ Bayesian origin of machine learning
§ Which problems is this tool well suited to, and which is it not?