Projects

Over the last few years I have worked in several application domains whose common denominator is data driven modeling and machine learning. Most of these applications are interdisciplinary spatial problems, and over time I have moved toward larger scales and larger data sets.

Machine Learning and Data Driven Modeling in Urban and Spatial Problems

  • Urban morphology and deep learning: A comparative study of urban development patterns in 1.1 million cities across the planet (Ongoing)
  • Fast and scalable urban flood risk estimation: Learning to emulate slow physics based simulation engines (Ongoing)
  • Data driven urban air quality assessment at the global scale (Ongoing)
  • Remote sensing and slum detection: Using satellite imagery and deep learning techniques to detect informal settlements (slums) and poverty estimates across the globe (Ongoing)
  • Exploratory city mining: Geo-visualization of high dimensional spatial patterns
  • Data driven urban air pollution estimation using a large set of urban parameters in Singapore
  • Data driven urban traffic simulation using GPS traces of cars

Economic and Financial Problems

  • Urban economy and real estate market dynamics: Developing a real estate portal by crawling publicly available data streams in Switzerland
  • Systemic risk in world economic networks
  • An image of the market: A macroscopic view of the dynamics of 6000 stocks at the New York Stock Exchange

Other Application Domains

  • Structural design and design space exploration: Machine learning based modeling of the dynamics and interdependencies of force and form diagrams in the context of structural design (Ongoing)
  • Atmospheric science: Understanding the role of cyclone activity in the accumulation of ice cores in coastal West Antarctica
  • Natural Language Processing: Developing an automatic smart news application that crawls thousands of news items from Twitter and other news channels and, using methods such as word2vec, automatically produces private and personalized clusters of news for the user

………………………………………………………………

(Urban Morphology)

1- Urban morphology meets deep learning: A comparative study of urban development patterns in 1.1 million cities across the planet (ongoing project)

………………………………………………………………

Short Description: Most studies on urban form and urban morphology are based on very few observations. On the other hand, the availability of large spatial data sets covering the whole planet, such as OpenStreetMap (OSM), offers a new opportunity for the study of urban development patterns.

We use spatial data of 1.1 million cities, towns and villages across the planet, which is publicly available in OpenStreetMap. Higher resolution: http://bit.ly/2x9glbp

In this work, using deep neural networks from computer vision (specifically, deep convolutional auto-encoders), we automatically compare and identify the main patterns of urban form from images of the street networks of these 1.1 million locations, all taken at the same spatial scale. After training the model, one can visually find patterns of development similar to any selected area.

Using deep learning we find similar patterns of development across the planet. For a given location (first cell in each column) the trained model automatically finds the most similar urban forms all over the planet.
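The retrieval step described above can be sketched as a nearest-neighbour search over the codes produced by a trained encoder. This is only an illustrative sketch: the encoder itself is not shown, random vectors stand in for its outputs, and the function name `most_similar`, the code size of 32 and the use of cosine similarity are assumptions rather than details taken from the project.

```python
import numpy as np

def most_similar(embeddings: np.ndarray, query_index: int, k: int = 5) -> np.ndarray:
    """Return the indices of the k codes closest (by cosine similarity) to the query."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)   # avoid division by zero
    similarity = unit @ unit[query_index]             # cosine similarity to the query
    similarity[query_index] = -np.inf                 # never return the query itself
    return np.argsort(-similarity)[:k]

# Hypothetical stand-in for encoder outputs: 1000 locations, 32-dimensional codes.
rng = np.random.default_rng(0)
codes = rng.normal(size=(1000, 32))
# Make location 1 a near-duplicate of location 0 so the result is easy to check.
codes[1] = codes[0] + 0.01 * rng.normal(size=32)

neighbours = most_similar(codes, query_index=0, k=5)
```

With real encoder outputs, `neighbours` would hold the locations whose street network images the model considers most similar to the selected one.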

Further, the following map shows a section of the automatically generated spectrum of urban forms all over the world.

An automatically generated spectrum of urban development patterns for 1.1 million cities, towns and villages across the planet. Interactive version: http://bit.ly/2tzvLGT

Topological Data Analysis on main patterns of urbanization at the global level

Next step: How can cities learn from each other? Building a search engine of development patterns for spatial planners

At the moment, in addition to spatial form, we are working to simultaneously consider functional use and other urban quality indicators including economy, health, environment, and transportation. The American Community Survey, with information on more than 60K locations (census tracts), is a readily available large data set for training multimodal machine learning algorithms. The final results can be presented through easy-to-browse interfaces for urban design and spatial development researchers.

Project website: https://sevamoo.github.io/cityastext/
Related publication:
Vahid Moosavi, Urban morphology meets deep learning: Exploring urban forms in one million cities, town and villages across the planet, (Under review at the journal of Environment and Planning B), https://arxiv.org/abs/1709.02939.

………………………………………………………………

(Urban Flood Modeling and Learning Physics)

2- Fast and scalable urban flood risk estimation: Learning to emulate slow physics based simulation engines (ongoing project)

………………………………………………………………

Short Description: Urban flood risk estimation at a high spatial resolution (e.g. building level) is an important problem. However, the computational complexity of theory driven models, the so-called “physics based simulation engines”, makes their use very hard if not impossible. The main idea here is to use machine learning to learn the behavior of these slow engines in relation to the final flood risk, without the need to run the whole simulation again and again. Training the model might take a long time, depending on the scale and the number of samples. However, the trained model is then much faster than the original simulation and easy to rerun with different initial conditions.

An early stage result for calculating the maximum water levels in the water catchment area of Lucerne: (left) the physics based model, which takes 5 days to simulate, in comparison to (right) the data driven model, which takes around 5 minutes to produce the final maximum water levels.
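The emulation idea can be illustrated with a deliberately small sketch: sample a slow simulator a limited number of times, fit a cheap model to those samples, and then query the cheap model instead. Everything here is a toy assumption. `slow_simulator`, its two inputs and the quadratic feature map are hypothetical, and a least-squares fit replaces the neural networks one would use for real flood maps.

```python
import numpy as np

def slow_simulator(rainfall, imperviousness):
    """Hypothetical stand-in for an expensive physics based flood model."""
    return 0.8 * rainfall * imperviousness + 0.1 * rainfall ** 2

def features(x):
    """Quadratic feature expansion of the two input parameters."""
    r, s = x[:, 0], x[:, 1]
    return np.column_stack([np.ones_like(r), r, s, r * s, r ** 2, s ** 2])

rng = np.random.default_rng(0)

# Step 1: run the slow engine on a limited set of sampled initial conditions.
train_inputs = rng.uniform(0.0, 1.0, size=(200, 2))
train_levels = slow_simulator(train_inputs[:, 0], train_inputs[:, 1])

# Step 2: fit the emulator (ordinary least squares on the expanded features).
weights, *_ = np.linalg.lstsq(features(train_inputs), train_levels, rcond=None)

def emulator(x):
    """Fast approximation of the simulator for new initial conditions."""
    return features(x) @ weights

# Step 3: compare emulator and simulator on unseen conditions.
test_inputs = rng.uniform(0.0, 1.0, size=(50, 2))
reference = slow_simulator(test_inputs[:, 0], test_inputs[:, 1])
max_error = float(np.max(np.abs(emulator(test_inputs) - reference)))
```

The toy simulator was chosen so that quadratic features can represent it exactly. A real hydraulic model would not be this convenient, which is why nonlinear learners are needed in practice.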

Collaborators: Dr. Joao Paulo Leitao, research scientist, systems engineering and intelligent network operations, department of urban water management, EAWAG and Dr. Mohamed Zaghloul, post-doc, CAAD, ITA, ETH Zurich

The use of physics based engines as data generators, together with the state of the art in machine learning and computer vision such as convolutional neural networks and recurrent nets, is becoming a hot topic in fields such as computer graphics and physics. This new field, called “learning physics”, has great potential in many engineering simulations that are based on valid theories but suffer from computational complexity. It has some similarities to the field of “model reduction” in nonlinear dynamical systems, but here the emulations are based on nonlinear methods.

………………………………………………………………

(Urban Air Pollution at the Global Scale)

3- Data driven urban air quality assessment at the global scale (Initiated recently)

………………………………………………………………

Short Description: Current approaches to the study of urban air quality are based either on slow physics-based simulations or on qualitative observations from a single location. Our idea in this work is to invert the game by coupling measurements from thousands of urban air pollution monitoring stations, climatic measurements, high resolution satellite images, land use models and the state of the art in deep learning (e.g. multi-modal Siamese networks) to learn a model that predicts the overall air quality levels of any region without the need for direct measurements.

We are continuously crawling the air quality measurements of more than 8000 monitoring stations, which are collected in real time and generously offered by aqicn.org
Can we learn to predict the effect of urban parameters on air pollution levels by analyzing high-resolution satellite images of more than 8000 locations coupled with the measurements from their corresponding monitoring stations? (The red square in the middle of each image indicates the location of the sensor.)

Collaborators: Dr. Erik Velasco, research scientist, center for environmental sensing and modeling (CENSAM), MIT Singapore SMART and Giulio Isacchini, Master student, Department of Physics, ETH Zurich, Switzerland.

………………………………………………………………

(GIS and High Dimensional Data Visualization)

4- Exploratory city mining: Geo-visualization of high-dimensional spatial patterns

………………………………………………………………

Short Description: In this work, we address a technical problem in the field of spatial visualization, dimensionality reduction and map coloring of high dimensional patterns. We developed a new method that projects high dimensional data onto a linear one-dimensional space in such a way that similar high dimensional patterns receive similar one-dimensional values. The one-dimensional values can then be used directly as a color spectrum on a map, while still referring contextually to the high dimensional patterns. In the maps below, the color spectra visualize the high dimensional patterns (explained in the radar diagrams). For example, in the figure below, dark blue (i.e. bad and very bad health conditions with low education levels) is the extreme opposite of red (i.e. high education levels and good health conditions). Interestingly, there are some neighborhoods in the center of London with these extreme contrasts.

Contextual map of London, based on 12 variables in categories of health and education

Contextual map of NYC, showing the level of spatial segregation based on ethnic groups

Related publication: Vahid Moosavi. “Contextual mapping: Visualization of high-dimensional spatial patterns in a single geo-map.” Computers, Environment and Urban Systems 61 (2017): 1-12.

This computational method can be easily used as a Visual Spatial Assessment Tool, where different stakeholders choose different spatial aspects of their interest and the method automatically identifies the main patterns related to the chosen criteria and renders them in one single map with an intuitive coloring of the clusters.
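To make the idea of a one-dimensional color index concrete, here is a minimal sketch that projects each neighborhood's variables onto a single axis and rescales the result to [0, 1] so it can drive a color map. It uses the first principal component as a plainly swapped-in technique. It is not the method from the publication above, and the function name and toy data are assumptions for illustration only.

```python
import numpy as np

def one_dimensional_index(data: np.ndarray) -> np.ndarray:
    """Project rows onto the first principal component and rescale to [0, 1]."""
    standardized = (data - data.mean(axis=0)) / (data.std(axis=0) + 1e-12)
    _, _, components = np.linalg.svd(standardized, full_matrices=False)
    scores = standardized @ components[0]
    return (scores - scores.min()) / (scores.max() - scores.min())

# Hypothetical toy data: 100 neighborhoods described by 12 variables,
# drawn from two clearly different groups of 50 neighborhoods each.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(50, 12))
group_b = rng.normal(loc=5.0, scale=1.0, size=(50, 12))
neighborhoods = np.vstack([group_a, group_b])

color_values = one_dimensional_index(neighborhoods)
gap = abs(color_values[:50].mean() - color_values[50:].mean())
```

Neighborhoods with similar profiles receive similar values in `color_values`, so the two toy groups end up at opposite ends of the spectrum. A linear projection like this cannot capture nonlinear structure, which is one reason a dedicated method is needed for real data.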

………………………………………………………………

(Urban Air Pollution)

5- Data driven urban air pollution estimation: Finding candidate locations for air pollution monitoring stations in streets of Singapore

………………………………………………………………

Short Description: Air pollution at the street level, where people are directly exposed to it, is very important but very expensive to measure. Using a collected data set of more than 80 urban parameters, such as land use and road network measures, in the Rochor area of Singapore, together with expensive direct measurements covering only 3% of the area (red lines in the middle of the figure below), we developed a nonlinear air pollution estimation model that accurately estimates the air pollution for the whole area.

More than 80 urban parameters plus a few direct air pollution measurements as the input to our algorithm

Based on the estimated pollution patterns and the chosen urban parameters, we identified four main spatial patterns with different dynamics. Next, the center of each cluster was chosen as the best candidate location for a fixed monitoring station.

Identified urban clusters and the candidate locations for monitoring stations
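The site selection step can be sketched as clustering followed by picking, for each cluster, the observed location closest to the cluster center. The sketch below uses plain k-means with a deterministic farthest-point initialization as an assumed stand-in for the clustering used in the study. The two-dimensional toy features and all function names are hypothetical.

```python
import numpy as np

def farthest_point_init(points, k):
    """Pick k starting centres: the first point, then repeatedly the farthest one."""
    centres = [points[0]]
    for _ in range(k - 1):
        gaps = np.linalg.norm(points[:, None, :] - np.array(centres)[None, :, :], axis=2)
        centres.append(points[np.argmax(gaps.min(axis=1))])
    return np.array(centres)

def kmeans(points, k, iterations=50):
    """Plain Lloyd iterations; an empty cluster keeps its previous centre."""
    centres = farthest_point_init(points, k)
    for _ in range(iterations):
        gaps = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = np.argmin(gaps, axis=1)
        centres = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
    return centres

def candidate_sites(points, centres):
    """For each cluster centre, return the index of the closest observed location."""
    gaps = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
    return np.argmin(gaps, axis=0)

# Hypothetical toy data: 4 groups of 25 street segments, 2 features each.
rng = np.random.default_rng(0)
offsets = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
segments = np.vstack([offset + 0.5 * rng.normal(size=(25, 2)) for offset in offsets])

centres = kmeans(segments, k=4)
sites = candidate_sites(segments, centres)
```

Choosing an observed segment rather than the raw centroid keeps every candidate on a street that actually exists, since a centroid in feature space need not correspond to any real location.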

Related publication: Vahid Moosavi, Gideon Aschwanden and Erik Velasco, Finding candidate locations for aerosol pollution monitoring at street level using a data-driven methodology, the journal of Atmospheric Measurement Techniques 8, 3563-3575, 2015.

In collaboration with Dr. Erik Velasco, Center for Environmental Sensing and Modeling (CENSAM), MIT Singapore SMART

Project website: http://goo.gl/TN3XdP

………………………………………………………………

(Urban Transportation Networks)

6- Data driven traffic modeling: Case of Beijing

………………………………………………………………

Short Description: Using GPS traces of 30,000 taxicabs in Beijing, we constructed a Markov Chain (MC) that encapsulates the dynamics of the traffic in a probabilistic network. The MC model has very interesting (but less explored) mathematical properties that can be used for community detection and for identifying critical nodes or sections of the network in terms of systemic risk.

GPS traces of 30,000 taxicabs encapsulate the spatial dynamics over the road network of Beijing (video: http://bit.ly/2fTHNG3)
The constructed Markov Chain from GPS traces has several useful mathematical properties
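As a small illustration of how such a chain can be built, the sketch below counts transitions between consecutive states in each trip, normalizes the counts row by row, and estimates the steady state by repeated multiplication. It assumes each GPS trace has already been mapped to a sequence of integer state IDs (for example road segments or grid cells). That preprocessing step and the three toy trips are assumptions, not part of the original work.

```python
import numpy as np

def transition_matrix(trips, n_states):
    """Row-stochastic matrix estimated from observed state-to-state moves."""
    counts = np.zeros((n_states, n_states))
    for trip in trips:
        for origin, destination in zip(trip[:-1], trip[1:]):
            counts[origin, destination] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # States that were never left keep an all-zero row instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

def stationary_distribution(P, iterations=1000):
    """Approximate the steady state probabilities by power iteration."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iterations):
        pi = pi @ P
    return pi / pi.sum()

# Hypothetical trips over three states (0, 1, 2).
trips = [[0, 1, 2, 1, 0], [0, 2, 1], [1, 0, 1, 2, 2]]
P = transition_matrix(trips, n_states=3)
pi = stationary_distribution(P)
```

For these toy trips the steady state is (0.2, 0.4, 0.4), meaning that in the long run the chain spends twice as much time in states 1 and 2 as in state 0.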

Related publication: Vahid Moosavi and Ludger Hovestadt, Modeling urban traffic dynamics in coexistence with urban data streams, Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing. ACM, 2013.

………………………………………………………………

(Urban Economy and Real Estate Market)

7- Real estate market dynamics: Developing a real estate portal by crawling publicly available data streams in Switzerland

………………………………………………………………

Short Description: While the current business model in the real estate consulting market is based on expensive services with limited access to data sets, we discovered that a lot of valuable spatial data is available for free. Since last year, by collecting several spatial data sets and crawling the main real estate portals in Switzerland (and recently in Germany) daily, we first developed an automated property evaluation covering all of Switzerland (above 94% accuracy for rental price). This led to further business ideas and to a real estate portal for business analytics that is freely available to the public. We believe this brings more transparency to the market.

Estimated rental price per square meter in buildings in Geneva in a certain period of time

This platform offers several data driven analytics to different types of stakeholders.

Live Prototype: http://www.keylead.ch

Different publicly available analytics shown for Zurich: (left) public transport accessibility, (middle) noise levels, and (right) clusters of available rental properties with different aggregated prices

Related publication: Vahid Moosavi, Urban data streams and machine learning: A case of Swiss real estate market (Technical report 2016). https://arxiv.org/abs/1704.04979

………………………………………………………………

(Systemic Risk)

8- Systemic risk in world economic networks

………………………………………………………………

Short Description:

In 1952, the Nobel Prize winning economist Robert Solow showed the power of algebraic methods such as Markov Chains (MC) for the analysis of Input-Output economic networks. However, due to the lack of data and computational power at the time, the majority of economic models focused on analytical approaches. In this work, using recently available data on transactions among 35 industries within 41 main economies of the world for the period 1995 to 2011, we developed a dynamic probabilistic model of this network based on the MC formalism. Unlike the majority of theory driven simulations, which rely on assumptions about the behavior of economic agents, here the only ingredient is data on the flow of money/commodities between the economic nodes. The proposed model has several mathematical properties that can be translated into systemic risk measures for countries and economies.

In addition to well-known properties such as steady state probabilities, we show how less explored features of MC, such as the Kemeny constant, combined with perturbation analysis of the constructed matrices, reveal interesting and paradoxical patterns in the economic networks. In this direction, we developed two systemic risk measures, called systemic fragility and systemic influence.

Markov chain model of monetary flows between industries in different countries
GDP shares of economies (red line) compared with their structural fitness (blue line) over time. The ratio between the two time series reveals the structural risk of economic failure (red gaps).
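For readers unfamiliar with the Kemeny constant, a minimal sketch of how it can be computed and combined with a simple perturbation is given below. It uses the standard eigenvalue formula, the sum of 1 / (1 - λ) over all eigenvalues λ of the transition matrix except the eigenvalue 1 (some texts add 1 to this value). The 3 x 3 flow matrix and the 10% perturbation are invented for illustration, and the code does not reproduce the systemic fragility or systemic influence measures of the paper.

```python
import numpy as np

def normalise_rows(flows):
    """Turn a matrix of non-negative flows into a row-stochastic matrix."""
    return flows / flows.sum(axis=1, keepdims=True)

def kemeny_constant(P):
    """Sum of 1 / (1 - eigenvalue) over all eigenvalues except the one equal to 1."""
    eigenvalues = np.linalg.eigvals(P)
    unit_index = np.argmin(np.abs(eigenvalues - 1.0))  # locate the eigenvalue 1
    others = np.delete(eigenvalues, unit_index)
    return float(np.real(np.sum(1.0 / (1.0 - others))))

# Sanity check on a two-state chain: eigenvalues are 1 and 0.7, so K = 1 / 0.3.
two_state = np.array([[0.9, 0.1], [0.2, 0.8]])
k_two_state = kemeny_constant(two_state)

# Hypothetical flows between three economic nodes (rows pay columns).
flows = np.array([[5.0, 2.0, 1.0],
                  [1.0, 6.0, 2.0],
                  [2.0, 1.0, 4.0]])
baseline = kemeny_constant(normalise_rows(flows))

# Perturb one flow by 10% and measure how the Kemeny constant responds.
perturbed = flows.copy()
perturbed[0, 1] *= 1.1
shift = kemeny_constant(normalise_rows(perturbed)) - baseline
```

Repeating the perturbation for every link gives a sensitivity map of the network, which is the general flavor of analysis the paragraph above refers to.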

Related publication: Vahid Moosavi, Giulio Isacchini, A Markovian model of evolving world economic network, PLOS ONE | https://doi.org/10.1371/journal.pone.0186746, 2017.

Supporting materials: https://sevamoo.github.io/Markovian_IO_SI_PLOSONE/

………………………………………………………………

(Financial Markets)

9- An image of the market: A macroscopic view of the dynamics of 6000 stocks at the New York Stock Exchange

………………………………………………………………

Short description: If we simply visualize 6000 time series together, we are very unlikely to see any clusters of patterns in the market. In this project, based on the idea of random fields, we developed a sorting algorithm that reveals the underlying dynamics of the market (if any). Each box in the first figure corresponds to one week from June 2010 to June 2012. Each dot in each box shows the value of one of the 6000 stocks. The location of each stock within the boxes is identified automatically so that stocks with similar dynamics over the two years have similar locations. As a result, all the boxes together reveal the underlying dynamics of the market like a series of heat maps.

Main mathematical concepts: Random fields and Self Organizing Maps
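The placement of stocks inside each box can be illustrated with a very small Self Organizing Map: every time series is assigned to the grid cell whose weight vector it matches best, and neighboring cells are trained to hold similar series. This is a generic textbook-style sketch written for illustration. The grid size, learning schedule and toy "rising" and "falling" series are assumptions and do not reproduce the project's sorting algorithm.

```python
import numpy as np

def train_som(data, rows, cols, epochs=20, seed=0):
    """Train a small SOM online; returns cell weights and cell grid coordinates."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(scale=0.1, size=(rows * cols, data.shape[1]))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for index in rng.permutation(len(data)):
            progress = step / total_steps
            learning_rate = 0.5 * (1.0 - progress) + 0.01         # decays over time
            radius = max(rows, cols) / 2.0 * (1.0 - progress) + 0.5
            winner = np.argmin(np.linalg.norm(weights - data[index], axis=1))
            grid_distance = np.linalg.norm(grid - grid[winner], axis=1)
            influence = np.exp(-(grid_distance ** 2) / (2.0 * radius ** 2))
            # Pull the winner and its grid neighbours toward the sample.
            weights += learning_rate * influence[:, None] * (data[index] - weights)
            step += 1
    return weights, grid

def best_matching_cells(data, weights, grid):
    """Grid coordinates of the best matching cell for every series."""
    distances = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return grid[np.argmin(distances, axis=1)]

# Hypothetical toy market: 30 rising and 30 falling series over 104 weeks.
rng = np.random.default_rng(1)
weeks = np.linspace(0.0, 1.0, 104)
rising = weeks + 0.05 * rng.normal(size=(30, 104))
falling = -weeks + 0.05 * rng.normal(size=(30, 104))
series = np.vstack([rising, falling])

weights, grid = train_som(series, rows=4, cols=4)
positions = best_matching_cells(series, weights, grid)
```

Each row of `positions` is a fixed (row, column) location for one stock. Coloring those locations by the stock's value in a given week would produce one box of the kind described above.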

In this case, as the figure below shows, there are two main distinct patterns in the market. Further, each blob represents a portfolio of stocks with very similar dynamical patterns.

Further details can be found here: http://bit.ly/2wbsuyQ

………………………………………………………………

Other Ongoing projects



Using satellite imagery and deep learning to detect informal settlements across the globe 

In collaboration with DLR Germany



Structural design space exploration with machine learning  

In collaboration with the chair of Structural Design at ETH Zurich

Here, our main goal is to learn the underlying manifolds of the design space parameters together with the final structural forms. While the method of graphic statics takes care of finding forms in equilibrium, it is hard to understand the interrelationships between force-diagram parameters and the final forms. Moreover, from the design point of view we need another model that can sort the final designs based on several factors, including their geometric forms. As a first step, using a set of 17K randomly generated forms (with a set of defined functional boundaries), we trained a Self Organizing Map, which automatically creates a spectrum of emerged forms. The result shows distinct clusters of forms, each with its own internal variations.

Different sample sections of the identified spectrum:

Project repository: https://github.com/sevamoo/Structural_Design_Machine_Learning

 
