Acceleration and optimization of dynamic parallelism for irregular applications on GPUs