Abstract
Traffic sensing is a key baseline input for sustainable cities to plan and administer demand-supply management through better road networks, public transportation, urban policies, etc. Humans sense the environment frugally using a combination of complementary information signals from different sensors. For example, by viewing and/or hearing traffic, one can identify the state of traffic on a road. In this paper, we demonstrate a fusion-based learning approach that classifies traffic states using low-cost audio and image data analysis on a real-world dataset. Roadside traffic acoustic signals and traffic image snapshots obtained from a fixed camera are used to classify the traffic condition into three broad classes, viz. Jam, Medium, and Free. The classification is performed on {10 s audio clip, image snapshot within that 10 s} data tuples. We extract traffic-relevant features from the audio and image data to form a composite feature vector. In particular, the audio features comprise MFCC (Mel-Frequency Cepstral Coefficient) classifier-based features, honk events, and energy peaks. A simple heuristic-based image classifier is used, where the vehicular density and the number of corner points within the road segment are estimated and used as features for traffic sensing. Finally, the composite vector is tested for its ability to discriminate the traffic classes using decision tree, SVM, discriminant, and logistic regression classifiers. Information fusion at multiple levels (audio, image, overall) consistently outperforms decision making at any individual level. Low-cost sensor fusion based on complementary weak classifiers and noisy features still yields high-quality results, with an overall accuracy of 93–96%.
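The sketch below illustrates the composite-feature fusion pipeline described above; it is not the authors' implementation. It assumes librosa for MFCC extraction, OpenCV for corner detection, and scikit-learn's SVC as one of the tested classifiers; the honk/energy-peak detection is reduced to a simple threshold heuristic, and all function names are illustrative.

    # Minimal sketch, assuming librosa, OpenCV (cv2) and scikit-learn.
    import numpy as np
    import librosa
    import cv2
    from sklearn.svm import SVC

    def audio_features(wav_path):
        # Load a 10 s roadside clip and compute MFCCs plus a crude
        # energy-peak count (stand-in for honk/peak events).
        y, sr = librosa.load(wav_path, sr=None, duration=10.0)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
        rms = librosa.feature.rms(y=y)[0]
        energy_peaks = int(np.sum(rms > rms.mean() + 2 * rms.std()))
        return np.concatenate([mfcc.mean(axis=1), [energy_peaks]])

    def image_features(img_path):
        # Corner-point count and edge density as a crude stand-in
        # for vehicular density on the road segment.
        gray = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
        corners = cv2.goodFeaturesToTrack(gray, maxCorners=500,
                                          qualityLevel=0.01, minDistance=5)
        n_corners = 0 if corners is None else len(corners)
        density = cv2.Canny(gray, 100, 200).mean() / 255.0
        return np.array([n_corners, density])

    def composite_vector(wav_path, img_path):
        # Fuse the two complementary modalities into one feature vector.
        return np.concatenate([audio_features(wav_path),
                               image_features(img_path)])

    # Training on labelled {audio, image} tuples; labels: 0=Free, 1=Medium, 2=Jam.
    # X = np.stack([composite_vector(a, i) for a, i in tuples])
    # clf = SVC(kernel="rbf").fit(X, labels)

Any of the other classifiers named in the abstract (decision tree, discriminant analysis, logistic regression) can be swapped in for SVC, since all operate on the same composite vector.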