Urban mobility through quality public transport is one of the major challenges for the consolidation of smart cities. Public bus system administrators, such as SPTrans in São Paulo, define schedules for departures from and arrivals at bus stops, but bus operators rarely comply with them. Understanding the behavior of this system under different contexts, such as day of the week, time of day, and holidays, is vital for better planning of bus transportation systems. A large amount of data on bus lines is available, including GPS data from all buses in São Paulo, which could be used to model bus travel time behavior and predict travel times under different contexts.
Modeling bus lines is useful for predicting the travel time of individual buses on those lines. Current prediction schemes use statistical evaluation or apply machine learning techniques to the current positions of other buses on the same line or on multiple lines, without considering historical data. Although current information is useful for predicting travel times over the next few minutes, determining the travel time for a complete itinerary, which can take more than one hour, requires considering how travel times evolve from the bus's departure until it reaches its destination. This prediction is useful for passengers, who will have better estimates of bus arrival times, and for bus line administrators, who can better schedule bus departure times.
The objective of this project is to employ machine learning techniques, trained using computer simulation and historical data, to improve scheduling quality and predict future behavior in bus transportation systems. To achieve this objective, we plan to improve bus scheduling and travel time predictions using a novel graph-based model of bus travel time behavior under different contexts. The model relies on: (i) regression and clustering techniques applied to historical GPS data to extract spatio-temporal features, considering factors such as day of the week, time of day, and holidays; and (ii) the combination of the graph-based model with real-time information from multiple bus routes to predict the performance of bus lines and the behavior of individual buses.
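
As a rough illustration of the graph-based idea, the following minimal sketch treats bus stops as nodes and the segments between consecutive stops as edges, each annotated with a historical travel time per context. It is not the project's model: a plain per-context mean stands in for the regression and clustering step, and the observation format, the `day_type` labels, and the function names `build_segment_model` and `predict_itinerary` are assumptions made only for this example. The prediction advances a clock along the itinerary, so that each segment is looked up in the hour at which the bus is expected to enter it rather than the hour of departure.

```python
from collections import defaultdict
from statistics import mean


def build_segment_model(observations):
    """Average historical travel time per (segment, day type, hour) context.

    Each observation is a hypothetical tuple:
    (from_stop, to_stop, day_type, hour, travel_seconds).
    A simple mean stands in for the regression/clustering step.
    """
    samples = defaultdict(list)
    for from_stop, to_stop, day_type, hour, seconds in observations:
        samples[(from_stop, to_stop, day_type, hour)].append(seconds)
    return {context: mean(values) for context, values in samples.items()}


def predict_itinerary(model, stops, day_type, departure_seconds):
    """Predict total travel time (seconds) along an ordered list of stops.

    departure_seconds is the departure time in seconds since midnight.
    Raises KeyError if a segment has no history for the required context.
    """
    clock = departure_seconds
    for from_stop, to_stop in zip(stops, stops[1:]):
        # Use the hour in which the bus is expected to enter this segment.
        hour = int(clock // 3600) % 24
        clock += model[(from_stop, to_stop, day_type, hour)]
    return clock - departure_seconds
```

With such a model, a bus departing shortly before a change in traffic conditions would be assigned the later, different segment times for the remainder of its route, which is the effect a prediction based only on current positions cannot capture. Real-time information could then be used to correct the historical estimate for the segments closest to the bus.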