Canor Coach: Towards Noise-Robust Human-In-The-Loop Reinforcement Learning