MUD3 · Mining Urban Data Workshop

MUD3 is a full-day workshop organized in conjunction with ACM SIGKDD 2018 in London, UK. It is the third workshop in the successful MUD series, following MUD@EDBT in 2014 and MUD2@ICML in 2015.

News: The workshop's program is available here.


We are gradually moving towards a smart city era. Innovative applications that utilize massive urban data streams arise daily. Technologies that apply machine learning algorithms to urban data will have a significant impact on many aspects of citizens' everyday lives. Examples of such applications include managing disastrous events, understanding the city's sentiment and opinion, tracking health issues, monitoring crucial environmental factors, improving energy efficiency and optimizing traffic. Unfortunately, urban data have characteristics that challenge state-of-the-art machine learning algorithms: diversity, privacy constraints, lack of labels, noise, complementarity of multiple sources and the requirement for online learning. Many smart city applications must tackle all these problems at once. This workshop aims to discuss a set of new Machine Learning applications and paradigms emerging from the smart city environment.

We especially welcome contributions based on data that can be reused by the community, and we plan to make a list of these data sets available on the workshop website. For researchers who would like to try their hand at smart city data, data sets from the city of Dublin are already listed on this website.

A non-exhaustive list of the challenges posed by smart city data and welcomed at the workshop includes:

  • Online Learning: data generated by sensors can be stored only partially or temporarily. A major requirement is therefore to process and analyse them as they arrive from the sources. Algorithms should be online and adaptive.

  • Large Scale Learning: the massive volume of data demands distributed / parallel processing technologies. Other issues include the complexity of the data coming from different sources with different spatial and temporal references or granularity.

  • Learning in Mobile Environments: special techniques are required for storing and learning in mobile environments.

  • Heterogeneous Data and Information Fusion: in many smart-city applications, different types of information (GPS, weather, Twitter, traffic data) should be analysed and combined in order to draw conclusions.

  • Learning with Social Media: the main issue in mining micro-blogging data (e.g. Twitter) is that the text is very short, cursorily written and in different languages.

  • Event Detection: a very interesting research issue that arises from such data is the identification of real world events (e.g. "traffic jam", "accident", "flood", "concert").

  • Learning with Uncertain/Noisy Data: data generated by a smart city are typically very noisy. Uncertainty management procedures as well as crowdsourcing techniques might be required to help the data models disambiguate the information.

  • Learning without Labels: given the size of the data sets and the area they cover, labeling the full data set can be prohibitively expensive. Learning must therefore typically start with no labels or extremely few. Semi-supervised or active learning approaches could be very interesting for such applications.

  • Computer Vision: CCTV cameras are a rich source of information. They can be used to count pedestrians, detect accidents, support security monitoring, etc.

  • Sensor Networks

  • Visual Analytics

  • Traffic Management

  • Crowd Sourcing

  • Emergency Response

This workshop aims to raise the Machine Learning community's awareness of the challenges and opportunities of the Urban Data research arena. Mining Urban Data is a multi-disciplinary field. The Data Management and Knowledge Discovery communities have already started working in this direction. Although there are some first efforts from a Machine Learning perspective (see references), greater ML involvement is required. ML will contribute to a better understanding and long-term prediction of the urban sensor-citizen environment.

The workshop is organized by the consortium of the Horizon2020 Project VaVeL. For more information, please visit the project's website.