Prediction of Future Movement of Surrounding Traffic Actors in Autonomous Vehicles using Deep Learning

12 min read · Aug 23, 2021

Autonomous vehicles are among the most disruptive technologies of this generation and have great research potential. These self-driving vehicles (SDVs) are expected to save millions of lives by preventing road accidents caused by human error, such as faulty driver behavior, and so improve road safety. Despite the progress achieved thus far, one critical piece of the puzzle remains unsolved: predicting the uncertain behavior of surrounding traffic actors. Artificial Intelligence (AI), with its self-learning capabilities, can help predict pedestrian behavior, assess each traffic actor’s surroundings, and decide the car’s future route so that it avoids collisions with the traffic actors around it. AI can also help a self-driving vehicle operate more safely by understanding the movement of nearby actors and anticipating their intent and future behavior.

Introduction to Autonomous Cars and Future Behavior Prediction

An Autonomous Vehicle, or SDV, is a vehicle that functions without human intervention, using built-in sensors located in different parts of the vehicle to sense the environment and surrounding objects. These cars rely on several sensory components: radar sensors to detect nearby vehicles, video cameras to read traffic lights and road signs, Lidar (light detection and ranging) sensors to detect lanes and road edges and measure distances to other traffic actors, and ultrasonic sensors in the wheels to assist with parking.

There are many potential benefits that SDVs have to offer. Primarily, automation can help reduce the number of accidents on our roads [1]. The leading cause of most crashes is dangerous and faulty driver behavior, such as inattentiveness, inadequate driving experience, speeding, and aggressiveness; introducing autonomy could reduce such risky behaviors. In addition to promising greater road safety, SDVs could also help reduce traffic congestion: because these cars are programmed to space themselves out automatically, they maintain a safe distance between vehicles and allow other cars to move freely between lanes.

Another benefit that SDVs promise is higher productivity while commuting, since occupants can pursue other activities such as reading and working. Passengers could also save parking time, since a self-driving car would park itself after dropping them off at their destinations. Furthermore, autonomous vehicles give the elderly and people with physical disabilities greater independence, letting them travel without depending on others and live free, self-sufficient lives. Most importantly, these cars could offer a cleaner environment: by reducing traffic congestion, SDVs would lower fuel or battery consumption and, in turn, greenhouse gas emissions [2].

With the recent advent of breakthrough hardware technologies and electric cars, AI holds promise for the future. The automotive industry has begun adopting AI and autonomous algorithms [4]. AI is being widely implemented in advanced driver assistance systems (ADAS) to create a risk assessment module that computes the potential dangers associated with surrounding vehicles [3]. The module then assists drivers by suggesting a safe trajectory with the lowest risk level [1]. AI is also being tested for preventing heavy congestion: a traffic control system based on machine learning predictions dynamically routes vehicles through low-congestion routes [2].

Prediction of future trajectory of a surrounding vehicle

Given the plethora of entertainment options, such as listening to a radio station or an audiobook or talking to a passenger, the chances of drivers being distracted have increased further. Autonomous vehicles can potentially detect the driver’s attention level and suggest safer trajectories to the SDV. One of the challenges for prediction algorithms is to account for the wide variety of scenarios they may encounter upon deployment. While there are 6 second filters and techniques such as the Kalman filter (KF) [3], these techniques are good only for short-term prediction and degrade over a longer horizon because the model ignores the surrounding context, leaving autonomous vehicles still unable to outperform human drivers.

To ensure a safe driving experience, there needs to be a method that accurately predicts the future movement of surrounding traffic actors, and deep learning (DL) can provide one. DL is a subset of machine learning based on neural networks, which are loosely inspired by the human brain, with three or more layers. The DL model encodes each actor’s surroundings into an image and automatically derives the features relevant to the scenario. This enables early prediction of future routes, because the model reasons about their probabilities based on map data and predicts the actor’s future behavior from its current state.

Literature Review

Several past studies have dealt with the problem of predicting an actor’s future movement in autonomous vehicles. This section summarizes previous research on the topic by going over some of the problems related to the implementation of autonomous vehicles and how AI, and DL specifically, can help solve them.

Prediction of future movement in self-driving systems

Most autonomous vehicles predict the future movement of other vehicles using techniques like the Kalman Filter, which is usually beneficial only for short-term predictions [5]. One of its disadvantages is that it ignores the surrounding context, such as roads, traffic signals, and road signs, which is why its performance deteriorates over longer horizons.
The Kalman Filter is a methodology used in autonomous vehicles to predict the future actions of a car in front of the subject car [3]. It uses data received from the Lidar and radar sensors installed in the car [7]. It can be combined with the CAMS (collision avoidance/mitigation system) methodology, which handles an unexpectedly approaching vehicle by overriding the normal driving algorithm with a more suitable collision avoidance trajectory that concentrates on just a single obstacle [1].
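
To make the short-horizon behavior concrete, here is a minimal sketch of a one-dimensional, constant-velocity Kalman filter that tracks a lead vehicle's position and then extrapolates it. The class name `KalmanCV1D`, the helper `forecast_positions`, and all noise values are assumptions chosen for illustration; they do not come from any of the cited systems.

```python
from dataclasses import dataclass


@dataclass
class KalmanCV1D:
    """Constant-velocity Kalman filter along one axis (illustrative values)."""
    p: float = 0.0       # estimated position (m)
    v: float = 0.0       # estimated velocity (m/s)
    p_pp: float = 100.0  # covariance entries of [[p_pp, p_pv], [p_pv, p_vv]]
    p_pv: float = 0.0
    p_vv: float = 100.0
    q: float = 0.01      # assumed process noise added to each diagonal entry
    r: float = 0.25      # assumed measurement noise variance

    def predict(self, dt):
        # State: x = F x with F = [[1, dt], [0, 1]]
        self.p += dt * self.v
        # Covariance: P = F P F^T + Q (uses the old entries on the right-hand side)
        p_pp = self.p_pp + 2 * dt * self.p_pv + dt * dt * self.p_vv + self.q
        p_pv = self.p_pv + dt * self.p_vv
        self.p_pp, self.p_pv = p_pp, p_pv
        self.p_vv += self.q

    def update(self, z):
        # Only position is measured, so H = [1, 0]
        s = self.p_pp + self.r        # innovation variance
        k_p = self.p_pp / s           # Kalman gain for position
        k_v = self.p_pv / s           # Kalman gain for velocity
        y = z - self.p                # innovation
        self.p += k_p * y
        self.v += k_v * y
        # Covariance: P = (I - K H) P
        p_pp = (1 - k_p) * self.p_pp
        p_pv = (1 - k_p) * self.p_pv
        p_vv = self.p_vv - k_v * self.p_pv
        self.p_pp, self.p_pv, self.p_vv = p_pp, p_pv, p_vv


def forecast_positions(kf, measurements, dt, horizon_steps):
    """Filter the observed positions, then roll the motion model forward."""
    for z in measurements:
        kf.predict(dt)
        kf.update(z)
    future = []
    for _ in range(horizon_steps):
        kf.predict(dt)  # no measurement: pure constant-velocity extrapolation
        future.append(kf.p)
    return future
```

Because the forecast is a straight-line extrapolation that knows nothing about lanes, signals, or other actors, it illustrates why this kind of model tends to be adequate for short horizons only.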

Predicting motion and trajectories

Several DL-based algorithms are used to predict the movement of surrounding traffic actors in autonomous vehicles. Most of these build on recent research into Long Short-Term Memory (LSTM) networks, a type of Recurrent Neural Network (RNN), for sequence prediction tasks. An LSTM is an artificial neural network that is well suited to operations like classification and prediction on time-series data [4]. For SDVs, these neural networks help predict motion from past trajectory data.
Another technique for motion prediction is based on Convolutional Neural Networks (CNNs). A CNN is a deep learning algorithm that takes an input image and assigns learnable weights to different features in the image so that it can differentiate objects from each other.
In the case of an SDV, it primarily works by encoding a vehicle’s surrounding context as an image and using the encoded image to predict the motion of surrounding cars. Additionally, congestion avoidance with surrounding cars should be considered by deploying a real-time rerouting strategy with a lower turnaround time [2].
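
As a rough illustration of encoding the surrounding context as an image, the sketch below marks nearby actors on a small top-down grid and then applies a single convolution, the basic operation inside a CNN. The function names `rasterize_scene` and `conv2d_valid`, the grid size, and the hand-written kernel are all assumptions for illustration; a real model would learn its kernels and use a much richer raster (lanes, signals, actor shapes).

```python
def rasterize_scene(actors_xy, center_xy, size=9, resolution=1.0):
    """Mark each actor on a size x size top-down grid centered on center_xy.

    Row 0 is the far "north" edge, so a larger y maps to a smaller row index.
    """
    grid = [[0.0] * size for _ in range(size)]
    half = size // 2
    for x, y in actors_xy:
        col = half + int(round((x - center_xy[0]) / resolution))
        row = half - int(round((y - center_xy[1]) / resolution))
        if 0 <= row < size and 0 <= col < size:  # drop actors outside the view
            grid[row][col] = 1.0
    return grid


def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]
```

With an all-ones 3x3 kernel, each output cell simply counts the actors in its 3x3 neighborhood; learned kernels generalize this idea to detect patterns that matter for the prediction.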

GAN based trajectory predictions

Several recent studies demonstrate the effectiveness of Generative Adversarial Network (GAN) based architectures on problems related to autonomous vehicles. A GAN is an architecture composed of a generator, which produces synthetic samples, and a discriminator, which learns to tell real samples from generated ones [3]. One study proposes Social-BiGAT, a graph-based GAN that uses a scene context image as an input to the discriminator [4]. The discriminator has a built-in CNN that extracts features from the scene context image for motion prediction.
In contrast, a self-attention conditional GAN (SC-GAN) based architecture converts the “trajectory input into a sequence of 2D occupancy grids in a differentiable way” and uses these to predict the future motion of the vehicles in front of it [5].
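
One plausible way to turn a trajectory into a sequence of 2D occupancy grids in a differentiable way is to draw each waypoint as a smooth Gaussian bump instead of a hard 0/1 cell, so that cell values change smoothly as the waypoint moves. The sketch below assumes this Gaussian formulation; it is not necessarily the exact rasterizer used in the cited work, and the names `soft_occupancy_grid` and `rasterize_trajectory` are hypothetical.

```python
import math


def soft_occupancy_grid(x, y, size=5, sigma=1.0):
    """Grid whose cell (row, col) holds exp(-d^2 / (2 sigma^2)).

    d is the distance from the cell center (col, row) to the waypoint (x, y),
    so the value is a smooth function of the waypoint coordinates.
    """
    return [
        [
            math.exp(-((col - x) ** 2 + (row - y) ** 2) / (2 * sigma ** 2))
            for col in range(size)
        ]
        for row in range(size)
    ]


def rasterize_trajectory(waypoints, size=5, sigma=1.0):
    """One soft occupancy grid per waypoint, in time order."""
    return [soft_occupancy_grid(x, y, size, sigma) for x, y in waypoints]
```

Since every cell value is a smooth function of the waypoint, gradients from an image-based discriminator can, in principle, flow back to the predicted coordinates.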

Machine-learned prediction models

Some of the most popular machine learning models used initially were Hidden Markov Models, Bayesian networks, and Gaussian Processes. However, CNNs and LSTMs outperformed these in terms of prediction efficiency and accuracy [3]. LSTMs have also been used to predict pedestrians’ future trajectories, but in the case of SDVs, factors like traffic rules, signals, and congestion play a vital role [10]. These models make predictions from the past trajectory data of the vehicles. However, the performance of each model depends on the number of parameters used for the prediction and on the dataset being used [9].
Additionally, factors such as environmental conditions and traffic rules are sometimes ignored, which eventually affects the accuracy of the predicted future movement of traffic actors on roads [7]. These should be included in the model input in a quantifiable manner for a better output.

Why do we need Future Behavior Prediction in SDVs?

Having matched or surpassed human performance on a number of specific tasks, AI has made immense progress in various industries in the last couple of decades, ranging from advertising to finance and now the automobile industry. Over the last two decades, car companies have begun developing and tapping AI for autonomous vehicles.

Driving is a part of our everyday lives, but in many ways, people tend to underestimate the potential danger that comes with it. To put this into absolute numbers, there are about five million more deaths from driving compared to twenty from plane crashes, with human factors accounting for the vast majority [1]. Even though newer cars come with better seatbelt designs, airbags, and other advancements, AI is yet to be fully tapped by the automotive industry. It could be a game-changing revolution that makes driving an even more enjoyable and safer experience. As this technology continues to progress, a question arises: how can an autonomous vehicle predict the movement of other traffic participants with high accuracy?

One of the major questions is what the surrounding actors want to do in the future, that is, in what direction they want to move, rather than how they will do it. This can be treated as a classification problem in which the different directions are the classes. The input to the classifier is the state of the object whose behavior we are trying to predict: its position, velocity, type, shape, orientation, and location [4].
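
The classification framing can be sketched as a linear scorer followed by a softmax over direction classes. The three classes in `DIRECTIONS`, the two-feature state, and the hand-picked weights in the usage note are hypothetical; a trained model would learn its weights from data and use the full actor state.

```python
import math

DIRECTIONS = ["left", "straight", "right"]  # hypothetical class set


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_direction(state, weights, biases):
    """Score each direction from the actor state and pick the most likely.

    state:   feature vector, e.g. [speed, yaw_rate]
    weights: one row of per-feature weights for each direction
    biases:  one bias for each direction
    """
    scores = [
        sum(w * f for w, f in zip(row, state)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return DIRECTIONS[best], probs
```

For example, with weights `[[0.0, 2.0], [0.0, 0.0], [0.0, -2.0]]`, zero biases, and state `[10.0, 0.5]` (a positive yaw rate), the scores are 1, 0, and -1, so the classifier picks "left".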

A main concern is what happens if the SDV makes mistakes. If an autonomous vehicle can accurately predict the movement of other traffic participants, it can improve its decision making and decrease the chances of an accident, which would help alleviate consumer concerns. It could also encourage other commercial car companies to begin investing in autonomous vehicles without fear of liability. Multiple studies by researchers from industry and academia have proposed algorithms to solve the problem of predicting future movements [4]. Deep Learning based models such as GANs, CNNs, RNNs, and LSTMs address this problem and aim to predict a safer future route for SDVs with high accuracy.

System Architecture

The first, GAN-based model architecture can be divided into two sections: a Generator and a Discriminator. The Generator is fed a scene-context image as input. The input is passed through a CNN that extracts the features of the image; these features are flattened and concatenated with the state features obtained from a State Encoder. The concatenated features are passed through a Trajectory Decoder that outputs the trajectory of the surrounding traffic actor.
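
The data flow through the Generator (flatten the CNN features, concatenate them with the state features, decode a trajectory) can be sketched as follows. This is a toy with a single linear layer standing in for the Trajectory Decoder; the function names, shapes, and weights are assumptions for illustration, not the architecture of the cited model.

```python
def flatten(feature_map):
    """Turn a 2D feature map into a flat list, row by row."""
    return [value for row in feature_map for value in row]


def generator_forward(feature_map, state_features, decoder_weights, horizon):
    """Fuse image and state features, then decode `horizon` (x, y) waypoints.

    decoder_weights needs 2 * horizon rows, each as long as the fused vector.
    """
    fused = flatten(feature_map) + list(state_features)
    if len(decoder_weights) != 2 * horizon:
        raise ValueError("decoder must produce one x and one y per time step")
    outputs = [
        sum(w * f for w, f in zip(row, fused)) for row in decoder_weights
    ]
    # Regroup the flat output as [(x_1, y_1), (x_2, y_2), ...]
    return [(outputs[2 * t], outputs[2 * t + 1]) for t in range(horizon)]
```

In a real network the decoder would be deeper and trained jointly with the CNN, but the fused vector plays the same role: it lets the prediction depend on both the scene and the actor's current state.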

The Discriminator is used to identify trajectories that are not scene compliant. In this case, the gradients flow back through the network to the Generator and help it generate only scene-compliant trajectories to be fed as input to the Discriminator, as shown in the Architecture Diagram below.

Another technique for predicting the future trajectories of surrounding traffic actors on the road is a top-down two-dimensional convolutional RNN. It leverages past movements to predict future ones. As is typically the case with RNNs, as shown in a figure below, the time steps are passed sequentially as inputs into a deep neural network. Each time step is represented by a top-down image that displays the surroundings of the vehicle at that instant. This image shows both the obstacles near the car, captured via live cameras, and the landmarks provided by a map.

The generated image is then passed through 2D convolutions into an RNN. Finally, the RNN predicts the movements of surrounding cars one to five seconds into the future based on the kind of scene, for instance a highway, street, or freeway.
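
The sequential part of this pipeline can be sketched with a plain (Elman-style) recurrence. The sketch assumes each top-down frame has already been reduced to a small feature vector by the convolutions, and it uses a simple tanh cell rather than the convolutional RNN described above; `rnn_step` and `encode_history` are hypothetical names.

```python
import math


def rnn_step(x, h_prev, w_in, w_rec):
    """One time step: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_in[j], x))
            + sum(w * hi for w, hi in zip(w_rec[j], h_prev))
        )
        for j in range(len(w_rec))
    ]


def encode_history(frame_features, w_in, w_rec):
    """Feed per-frame features oldest first; return the final hidden state."""
    h = [0.0] * len(w_rec)  # start with an empty memory
    for x in frame_features:
        h = rnn_step(x, h, w_in, w_rec)
    return h
```

The final hidden state summarizes the whole history, and because each step depends on the previous one, feeding the same frames in a different order generally gives a different summary. A prediction head on top of this state would then output the future movement.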

Future Scope

Autonomous vehicles have come a long way, from recognizing a stop sign and braking automatically to clocking several thousand miles of test runs on roads shared with human drivers. It is now common for riders to summon their cars remotely while they wait outside the parking spot.
Despite these advancements, some questions are left unanswered. For example, how would SDVs behave in a blizzard, or adapt to different or extreme weather conditions? Would SDVs recognize bridges and underpasses the same way humans do and plan their movement accordingly, without any accidents?

Movement of Autonomous Vehicles through underpasses and during extreme weather conditions

If future research succeeds in answering these questions, traveling by road can become much safer: given that 1.2 million people die in auto accidents every year [2], expected highway fatalities could move closer to zero. Additionally, current models are yet to overcome challenges in estimating the accelerations and velocities of the vehicles moving ahead of the subject vehicle on the scene. Overcoming them would help autonomous vehicles accurately predict the speeds of both fast-moving vehicles and vehicles that are parked or fully stopped.

I hope this article inspires you to further explore the area of Future Behavior Prediction in Autonomous Vehicles using Deep Learning based methods.

Happy learning! :)

References

[1] Lee, Kibeom, & Kum, Dongsuk. (2019). Collision Avoidance/Mitigation System: Motion Planning of Autonomous Vehicle via Predictive Occupancy Map. IEEE Access, 7, 52846–52857. https://doi.org/10.1109/ACCESS.2019.2912067

[2] Younghoon, Kahng, Hyungu, Lee, Soon-Kyo, Chung, Seokhyun, Lee, Sangmin, Kim, Cheong, Taesu, Shin, Keeyong, Park, Jeehyuk, & Kim, Seoung Bum. (2020). Intelligent traffic control for autonomous vehicle systems based on machine learning. Expert Systems with Applications, 144. https://doi.org/10.1016/j.eswa.2019.113074

[3] Cui, Henggang, Radosavljevic, Vladan, Chou, Fang-Chieh, Lin, Tsung-Han, Nguyen, Thi, Huang, Tzu-Kuo, Schneider, Jeff, & Djuric, Nemanja. (2019). Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks. In 2019 International Conference on Robotics and Automation (ICRA) (pp. 2090–2096). IEEE. https://doi.org/10.1109/ICRA.2019.8793868

[4] Wang, Eason, Cui, Henggang, Yalamanchi, Sai, Moorthy, Mohana, & Djuric, Nemanja. (n.d.). Improving Movement Predictions of Traffic Actors in Bird’s-Eye View Models using GANs and Differentiable Trajectory Rasterization. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 8024–2348). ACM. https://doi.org/10.1145/3394486.3403283

[5] Shi, Xiupeng, Wong, Yiik Diew, Chai, Chen, & Li, Michael Zhi-Feng. (2020). An Automated Machine Learning (AutoML) Method of Risk Prediction for Decision-Making of Autonomous Vehicles. IEEE Transactions on Intelligent Transportation Systems, 1–10. https://doi.org/10.1109/TITS.2020.3002419

[6] S. Kuutti, R. Bowden, Y. Jin, P. Barber and S. Fallah, “A Survey of Deep Learning Applications to Autonomous Vehicle Control,” in IEEE Transactions on Intelligent Transportation Systems, vol. 22, no. 2, pp. 712–733, Feb. 2021, doi: https://doi.org/10.1109/TITS.2019.2962338

[7] S. Mozaffari, O. Y. Al-Jarrah, M. Dianati, P. Jennings and A. Mouzakitis, “Deep Learning-Based Vehicle Behavior Prediction for Autonomous Driving Applications: A Review,” in IEEE Transactions on Intelligent Transportation Systems, doi: https://doi.org/10.1109/TITS.2020.3012034

[8] K. Samal, M. Wolf and S. Mukhopadhyay, “Attention-Based Activation Pruning to Reduce Data Movement in Real-Time AI: A Case-Study on Local Motion Planning in Autonomous Vehicles,” in IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 10, no. 3, pp. 306–319, Sept. 2020, doi: http://doi.org/10.1109/JETCAS.2020.3015889

[9] S. Mandal, S. Biswas, V. E. Balas, R. N. Shaw and A. Ghosh, “Motion Prediction for Autonomous Vehicles from Lyft Dataset using Deep Learning,” 2020 IEEE 5th International Conference on Computing Communication and Automation (ICCCA), 2020, pp. 768–773, doi: http://doi.org/10.1109/ICCCA49541.2020.9250790

[10] Z. Ouyang, J. Niu, Y. Liu and M. Guizani, “Deep CNN-Based Real-Time Traffic Light Detector for Self-Driving Vehicles,” in IEEE Transactions on Mobile Computing, vol. 19, no. 2, pp. 300–313, 1 Feb. 2020, doi: http://doi.org/10.1109/TMC.2019.2892451

[11] K. Messaoud, I. Yahiaoui, A. Verroust-Blondet and F. Nashashibi, “Relational Recurrent Neural Networks For Vehicle Trajectory Prediction,” 2019 IEEE Intelligent Transportation Systems Conference (ITSC), 2019, pp. 1813–1818, doi: http://doi.org/10.1109/ITSC.2019.8916887