Integrating Expert Guidance for Efficient Learning of Safe Overtaking in Autonomous Driving Using Deep Reinforcement Learning

Preprints
J. Lu, G. Alcan, V. Kyrki
Submitted to IEEE Transactions on Intelligent Transportation Systems
Publication date: 2023

Overtaking on two-lane roads is a great challenge for autonomous vehicles, as oncoming traffic appearing on the opposite lane may require the vehicle to change its decision and abort the overtaking. Deep reinforcement learning (DRL) has shown promise for difficult decision problems such as this, but it requires a massive amount of data, especially if the action space is continuous. This paper proposes to incorporate guidance from an expert system into DRL to increase its sample efficiency in the autonomous overtaking setting. The guidance system developed in this study is composed of constrained iterative LQR and PID controllers. The novelty lies in the incorporation of a fading guidance function, which gradually decreases the effect of the expert system, allowing the agent to initially learn an appropriate action swiftly and then improve beyond the performance of the expert system. This approach thus combines the strengths of traditional control engineering with the flexibility of learning systems, expanding the capabilities of the autonomous system. The proposed methodology for autonomous vehicle overtaking does not depend on a particular DRL algorithm, and three state-of-the-art algorithms are used as baselines for evaluation. Simulation results show that incorporating expert system guidance greatly improves state-of-the-art DRL algorithms in both sample efficiency and driving safety.
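As a rough illustration of the fading-guidance idea in the abstract above, the sketch below blends an expert action with the agent's action using a weight that decays with the training step. The exponential schedule, the convex blend, the names `fading_weight` and `guided_action`, and the decay constant are all assumptions made for illustration; the abstract does not state the form of the fading function or how the guidance enters the learning update.

```python
import math


def fading_weight(step: int, decay_steps: float = 10_000.0) -> float:
    """Hypothetical fading schedule: 1.0 at step 0, decaying towards 0."""
    return math.exp(-step / decay_steps)


def guided_action(agent_action: float, expert_action: float, step: int,
                  decay_steps: float = 10_000.0) -> float:
    """Convex blend of expert and agent actions (an assumed form of guidance)."""
    w = fading_weight(step, decay_steps)
    # Early in training w is close to 1, so the expert dominates;
    # later w approaches 0 and the agent acts on its own.
    return w * expert_action + (1.0 - w) * agent_action


if __name__ == "__main__":
    for step in (0, 10_000, 100_000):
        action = guided_action(agent_action=0.2, expert_action=1.0, step=step)
        print(step, round(action, 4))
```

With a schedule of this kind the expert only shapes early exploration, which leaves room for the learned policy to eventually outperform the expert, as the abstract describes.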

QDP: Learning to Sequentially Optimise Quasi-Static and Dynamic Manipulation Primitives for Robotic Cloth Manipulation

Conference Papers
D. Blanco-Mulero, G. Alcan, F. J. Abu-Dakka, V. Kyrki
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023), Detroit, USA, October 1-5, 2023
Publication date: 2023

Abstract

Pre-defined manipulation primitives are widely used for cloth manipulation. However, cloth properties such as stiffness or density can strongly affect the performance of these primitives. Although existing solutions have tackled the parameterisation of pick and place locations, the effect of factors such as the velocity or trajectory of quasi-static and dynamic manipulation primitives has been neglected. Choosing appropriate values for these parameters is crucial to cope with the range of materials present in household cloth objects. To address this challenge, we introduce the Quasi-Dynamic Parameterisable (QDP) method, which optimises parameters such as the motion velocity in addition to the pick and place positions of quasi-static and dynamic manipulation primitives. In this work, we leverage the framework of Sequential Reinforcement Learning to sequentially decouple the parameters that compose the primitives. To evaluate the effectiveness of the method, we focus on the task of cloth unfolding with a robotic arm in simulation and real-world experiments. Our results in simulation show that by deciding the optimal parameters for the primitives the performance can improve by 20% compared to sub-optimal ones. Real-world results demonstrate the advantage of modifying the velocity and height of manipulation primitives for cloths with different mass, stiffness, shape and size.


Differential Dynamic Programming with Nonlinear Safety Constraints Under System Uncertainties

Journal Articles, Conference Papers
G. Alcan, V. Kyrki
IEEE Robotics and Automation Letters, Volume 7, Issue 2, Pages 1760 - 1767
IEEE International Conference on Robotics and Automation (ICRA 2023), ExCeL London, UK, May 29 - June 2, 2023
Publication date: April, 2022

Abstract

Safe operation of systems such as robots requires them to plan and execute trajectories subject to safety constraints. When those systems are subject to uncertainties in their dynamics, it is challenging to ensure that the constraints are not violated. In this letter, we propose Safe-CDDP, a safe trajectory optimization and control approach for systems under additive uncertainties and nonlinear safety constraints based on constrained differential dynamic programming (DDP). The safety of the robot during its motion is formulated as chance constraints with user-chosen probabilities of constraint satisfaction. The chance constraints are transformed into deterministic ones in DDP formulation by constraint tightening. To avoid over-conservatism during constraint tightening, linear control gains of the feedback policy derived from the constrained DDP are used in the approximation of closed-loop uncertainty propagation in prediction. The proposed algorithm is empirically evaluated on three different robot dynamics with up to 12 degrees of freedom in simulation. The computational feasibility and applicability of the approach are demonstrated with a physical hardware implementation.
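To make the constraint-tightening step more concrete, the sketch below shows the standard textbook case of a linear constraint a^T x <= b on a Gaussian state with mean `mean` and covariance `cov`. The chance constraint P(a^T x <= b) >= prob is replaced by the deterministic condition a^T mean + Phi^{-1}(prob) * sqrt(a^T cov a) <= b. The linear constraint, the Gaussian assumption, and the function names are simplifications chosen for illustration; the abstract addresses nonlinear constraints, and this is not claimed to be the formulation used in Safe-CDDP.

```python
import math
from statistics import NormalDist


def tightened_margin(a, cov, prob):
    """Back-off term Phi^{-1}(prob) * sqrt(a^T cov a) for a linear constraint."""
    n = len(a)
    var = sum(a[i] * cov[i][j] * a[j] for i in range(n) for j in range(n))
    return NormalDist().inv_cdf(prob) * math.sqrt(var)


def satisfies_tightened_constraint(a, b, mean, cov, prob):
    """Deterministic surrogate of the chance constraint P(a^T x <= b) >= prob."""
    nominal = sum(ai * mi for ai, mi in zip(a, mean))
    return nominal + tightened_margin(a, cov, prob) <= b


if __name__ == "__main__":
    a = [1.0, 0.0]
    cov = [[0.04, 0.0], [0.0, 0.01]]  # illustrative state covariance
    print(tightened_margin(a, cov, prob=0.975))
    print(satisfies_tightened_constraint(a, 1.5, [1.0, 0.0], cov, prob=0.975))
```

The margin grows with the predicted covariance, which is why a closed-loop prediction that keeps the covariance small, as the abstract mentions, gives a less conservative tightened constraint.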

Keywords

    • Optimization and Optimal Control
    • Constrained Motion Planning
    • Planning under Uncertainty
    • Robot Safety
    • Motion and Path Planning

BibTeX

@article{alcan2022differential,
  title={Differential dynamic programming with nonlinear safety constraints under system uncertainties},
  author={Alcan, Gokhan and Kyrki, Ville},
  journal={IEEE Robotics and Automation Letters},
  volume={7},
  number={2},
  pages={1760--1767},
  year={2022},
  publisher={IEEE}
}

Learning Based High-Level Decision Making for Abortable Overtaking in Autonomous Vehicles

Preprints
E. Malayjerdi, G. Alcan, E. Kargar, H. Darweesh, R. Sell, V. Kyrki
Submitted to IEEE Transactions on Intelligent Transportation Systems
Publication date: 2022

Autonomous vehicles are a growing technology that aims to enhance safety, accessibility, efficiency, and convenience through autonomous maneuvers ranging from lane change to overtaking. Overtaking is one of the most challenging maneuvers for autonomous vehicles, and current techniques for autonomous overtaking are limited to simple situations. This paper studies how to increase safety in autonomous overtaking by allowing the maneuver to be aborted. We propose a decision-making process based on a deep Q-Network to determine if and when the overtaking maneuver needs to be aborted. The proposed algorithm is empirically evaluated in simulation with varying traffic situations, indicating that the proposed method improves safety during overtaking maneuvers. Furthermore, the approach is demonstrated in real-world experiments using the autonomous shuttle iseAuto.

Learning Visual Feedback Control for Dynamic Cloth Folding

Conference Papers
J. Hietala, D. Blanco-Mulero, G. Alcan, V. Kyrki
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022), Kyoto, Japan, October 23–27, 2022
Publication date: 2022

Robotic manipulation of cloth is a challenging task due to the high dimensionality of the configuration space and the complexity of dynamics affected by various material properties. The effect of the complex dynamics is even more pronounced in dynamic folding, for example, when a square piece of fabric is folded in two by a single manipulator. To account for the complexity and uncertainties, feedback of the cloth state using e.g. vision is typically needed. However, construction of visual feedback policies for dynamic cloth folding is an open problem. In this paper, we present a solution that learns policies in simulation using Reinforcement Learning (RL) and transfers the learned policies directly to the real world. In addition, to learn a single policy that manipulates multiple materials, we randomize the material properties in simulation. We evaluate the contributions of visual feedback and material randomization in real-world experiments. The experimental results demonstrate that the proposed solution can successfully fold different fabric types using dynamic manipulation in the real world.


BibTeX

@inproceedings{hietala2022learning,
  title={Learning visual feedback control for dynamic cloth folding},
  author={Hietala, Julius and Blanco-Mulero, David and Alcan, Gokhan and Kyrki, Ville},
  booktitle={2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  pages={1455--1462},
  year={2022},
  organization={IEEE}
}

Simultaneous and Independent Micromanipulation of Two Identical Particles with Robotic Electromagnetic Needles

Conference Papers
O. Isitman, H. Kandemir, G. Alcan, Z. Cenev, Q. Zhou
International Conference on Manipulation, Automation and Robotics at Small Scales (MARSS 2022), Toronto, Canada, July 25-29, 2022
Publication date: 2022

Abstract

Magnetic manipulation of particles in close vicinity is a challenging task. In this paper, we propose simultaneous and independent manipulation of two identical particles in close vicinity using two mobile robotic electromagnetic needles. We developed a neural network that can predict the magnetic flux density gradient for any given needle positions. Using the neural network, we developed a control algorithm to solve for the optimal needle positions that generate the forces in the required directions while keeping a safe distance between the two needles and particles. We applied our method in five typical cases of simultaneous and independent microparticle manipulation, with the closest particle separation of 30 µm.

BibTeX

@inproceedings{isitman2022simultaneous,
  title={Simultaneous and Independent Micromanipulation of Two Identical Particles with Robotic Electromagnetic Needles},
  author={Isitman, Ogulcan and Kandemir, Hakan and Alcan, Gokhan and Cenev, Zoran and Zhou, Quan},
  booktitle={2022 International Conference on Manipulation, Automation and Robotics at Small Scales (MARSS)},
  pages={1--6},
  year={2022},
  organization={IEEE}
}

Planning for Safe Abortable Overtaking Maneuvers in Autonomous Driving

Conference Papers
J. Palatti, A. Aksjonov, G. Alcan, V. Kyrki
24th IEEE International Conference on Intelligent Transportation Systems (ITSC 2021), Indianapolis, United States, September 19-22, 2021
Publication date: October, 2021

Overtaking is one of the most challenging tasks in driving, and the current solutions to autonomous overtaking are limited to simple and static scenarios. In this paper, we present a method for behaviour and trajectory planning for safe autonomous overtaking. The proposed method optimizes the trajectory by simultaneously enforcing safety and minimizing intrusion onto the adjacent lane. Furthermore, the method allows the overtaking to be aborted, enabling the autonomous vehicle to merge back into its lane if safety is compromised, e.g. because traffic in the opposing direction appears during the maneuver. A finite state machine is used to select an appropriate maneuver at each time step, and a combination of safe and reachable sets is used to iteratively generate intermediate reference targets based on the current maneuver. A nonlinear model predictive controller then plans dynamically feasible and collision-free trajectories to these intermediate reference targets. Simulation experiments demonstrate that the combination of intermediate reference generation and model predictive control is able to handle multiple behaviors, including following a lead vehicle, overtaking and aborting the overtake, within a single framework.
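The maneuver-selection layer described above can be pictured as a small finite state machine. The sketch below uses the three behaviors named in the abstract (following a lead vehicle, overtaking, aborting the overtake) as states. The transition conditions and the boolean inputs `gap_is_safe`, `lead_passed` and `back_in_lane` are hypothetical placeholders for the safety checks, which the abstract does not spell out.

```python
from enum import Enum, auto


class Maneuver(Enum):
    FOLLOW = auto()
    OVERTAKE = auto()
    ABORT = auto()


def next_maneuver(current, gap_is_safe, lead_passed, back_in_lane):
    """Hypothetical transition rule; the booleans stand in for safety checks."""
    if current is Maneuver.FOLLOW:
        # Start overtaking only when the opposite lane is judged safe.
        return Maneuver.OVERTAKE if gap_is_safe else Maneuver.FOLLOW
    if current is Maneuver.OVERTAKE:
        if not gap_is_safe:
            return Maneuver.ABORT
        # Once the lead vehicle is passed, return to normal lane following.
        return Maneuver.FOLLOW if lead_passed else Maneuver.OVERTAKE
    # ABORT: keep merging back until the vehicle is in its own lane again.
    return Maneuver.FOLLOW if back_in_lane else Maneuver.ABORT


if __name__ == "__main__":
    state = Maneuver.FOLLOW
    # Each tuple is (gap_is_safe, lead_passed, back_in_lane) at one time step.
    for inputs in [(True, False, False), (False, False, False), (False, False, True)]:
        state = next_maneuver(state, *inputs)
        print(state.name)
```

In a layered design like the one the abstract outlines, the selected maneuver would then determine which intermediate reference target is handed to the trajectory planner.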

BibTeX

@inproceedings{palatti2021planning,
  title={Planning for safe abortable overtaking maneuvers in autonomous driving},
  author={Palatti, Jiyo and Aksjonov, Andrei and Alcan, Gokhan and Kyrki, Ville},
  booktitle={2021 IEEE International Intelligent Transportation Systems Conference (ITSC)},
  pages={508--514},
  year={2021},
  organization={IEEE}
}

Driver Evaluation in Heavy Duty Vehicles Based on Acceleration and Braking Behaviors

Conference Papers
M. E. Mumcuoglu, G. Alcan, M. Unel, O. Cicek, M. Mutluergil, M. Yilmaz, K. Koprubasi
46th Annual Conference of the IEEE Industrial Electronics Society (IECON 2020), Singapore, October 18-21, 2020
Publication date: 2020

Abstract

In this paper, we present a real-time driver evaluation system for heavy-duty vehicles by focusing on the classification of risky acceleration and braking behaviors. We utilize an improved version of our previous Long Short Term Memory (LSTM) based acceleration behavior model [10] to evaluate varying acceleration behaviors of a truck driver in small time periods. This model continuously classifies a driver as one of six driver classes with specified longitudinal-lateral aggression levels, using driving signals as time-series inputs. The driver gets acceleration score updates based on assigned classes and the geometry of driven road sections. To evaluate the braking behaviors of a truck driver, we propose a braking behavior model, which uses a novel approach to analyze deceleration patterns formed during brake operations. The braking score of a driver is updated for each brake event based on the pattern, magnitude, and frequency evaluations. The proposed driver evaluation system has achieved significant results in both the classification and evaluation of acceleration and braking behaviors.

Keywords

  • Driver evaluation
  • Driver behaviors
  • Classification
  • LSTM networks
  • Heavy-duty vehicles
  • Acceleration
  • Braking

BibTeX

@inproceedings{mumcuoglu2020driver,
  title={Driver evaluation in heavy duty vehicles based on acceleration and braking behaviors},
  author={Mumcuoglu, Mehmet Emin and Alcan, Gokhan and Unel, Mustafa and Cicek, Onur and Mutluergil, Mehmet and Yilmaz, Metin and Koprubasi, Kerem},
  booktitle={IECON 2020 The 46th Annual Conference of the IEEE Industrial Electronics Society},
  pages={447--452},
  year={2020},
  organization={IEEE}
}

Optimization-Oriented High Fidelity NFIR Models for Estimating Indicated Torque in Diesel Engines

Journal Articles
G. Alcan, V. Aran, M. Unel, M. Yilmaz, C. Gurel, K. Koprubasi
International Journal of Automotive Technology, Volume 21, Issue 3, Pages 729 - 737
Publication date: January, 2020

Abstract

In this paper, optimization-oriented high fidelity indicated torque models which cover the whole operating regions under both steady-state and transient cycles for heavy-duty vehicles are developed. Two different experiments are performed and their data are merged to be utilized in the training of the models. In the first experiment, all combustion input channels are excited by quadratic chirp signals with different sweeps in their frequency profiles. Different from the first experiment, the engine speed is excited by ramp-hold signals in the second experiment. The estimations of friction, pumping and inertia torques in addition to the torque measured from the engine dynamometer are utilized in the indicated torque calculations. In order to model the calculated indicated torque, a nonlinear finite impulse response (NFIR) model with a single layer sigmoid neural network has been designed. A sensitivity analysis is performed by generating several models with different numbers of input regressors and neurons. Experimental results show that the majority of the models in a selected wide range of the model parameters are validated with fit accuracies higher than 90 % and 85 % on the World Harmonized Stationary Cycle (WHSC) and the World Harmonized Transient Cycle (WHTC), respectively.
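For readers unfamiliar with the NFIR structure, the sketch below shows its general shape: the output at time t depends only on current and past inputs, which are stacked into a regressor vector and passed through one hidden layer of sigmoid units with a linear output. It assumes a single input channel and hand-picked weights purely for illustration; the models in the paper use several input channels and trained parameters, and the function names here are hypothetical.

```python
import math


def nfir_regressors(u, n_lags):
    """Rows [u[t], u[t-1], ..., u[t-n_lags+1]] for every t with a full history."""
    return [[u[t - k] for k in range(n_lags)] for t in range(n_lags - 1, len(u))]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def nfir_predict(row, hidden_w, hidden_b, out_w, out_b):
    """One hidden layer of sigmoid units followed by a linear output."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, row)) + b)
              for ws, b in zip(hidden_w, hidden_b)]
    return sum(w * h for w, h in zip(out_w, hidden)) + out_b


if __name__ == "__main__":
    u = [0.0, 0.5, 1.0, 0.5]              # illustrative input signal
    rows = nfir_regressors(u, n_lags=2)   # two input regressors per sample
    hidden_w, hidden_b = [[1.0, -1.0], [0.5, 0.5]], [0.0, -0.2]  # two neurons
    out_w, out_b = [2.0, -1.0], 0.1
    print([nfir_predict(r, hidden_w, hidden_b, out_w, out_b) for r in rows])
```

The number of lags and the number of hidden neurons correspond to the two quantities varied in the sensitivity analysis mentioned in the abstract.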

Keywords

    • Diesel engine
    • Indicated torque
    • System identification
    • Experiment design
    • NFIR model

BibTeX

@article{alcan2020optimization,
  title={Optimization-oriented high fidelity NFIR models for estimating indicated torque in diesel engines},
  author={Alcan, Gokhan and Aran, Volkan and Unel, Mustafa and Yilmaz, Metin and Gurel, Cetin and Koprubasi, Kerem},
  journal={International Journal of Automotive Technology},
  volume={21},
  number={3},
  pages={729--737},
  year={2020},
  publisher={The Korean Society of Automotive Engineers}
}

Robust Trajectory Control of an Unmanned Aerial Vehicle Using Acceleration Feedback

Journal Articles
H. Zaki, G. Alcan, M. Unel
International Journal of Mechatronics and Manufacturing Systems, Volume 12, Issue 3-4, Pages 298 - 317
Publication date: October, 2019

Abstract

In this work, acceleration feedback is utilized in a hierarchical control structure for robust trajectory control of a quadrotor helicopter subject to external disturbances where reference attitude angles are determined through a nonlinear optimization algorithm. Furthermore, an acceleration-based disturbance observer (AbDOB) is designed to estimate disturbances acting on the positional dynamics of the quadrotor. For the attitude control, nested position, velocity, and inner acceleration feedback loops consisting of PID and PI type controllers are developed to provide high stiffness against external disturbances. Reliable angular acceleration is estimated through a cascaded filter structure. Simulation results show that the proposed controllers provide robust trajectory tracking performance when the aerial vehicle is subject to wind gusts generated by the Dryden wind model along with the uncertainties and measurement noise. Results also demonstrate that the reference attitude angles calculated through nonlinear optimization are smooth and within the desired bounds.

Keywords

  • Robust Control
  • Acceleration Feedback
  • Disturbance Observer
  • Quadrotor
  • Hierarchical Control
  • Nonlinear Optimization

BibTeX

@article{zaki2019robust,
  title={Robust trajectory control of an unmanned aerial vehicle using acceleration feedback},
  author={Zaki, Hammad and Alcan, Gokhan and Unel, Mustafa},
  journal={International Journal of Mechatronics and Manufacturing Systems},
  volume={12},
  number={3-4},
  pages={298--317},
  year={2019},
  publisher={Inderscience Publishers (IEL)}
}