Completing endoscopic submucosal dissection (ESD) quickly and accurately within narrow lumens remains challenging because of the environment's high flexibility, invisible collisions, and natural tissue motion. This paper proposes a novel stereo visual servoing controller for a dual-segment robotic endoscope (DSRE) for ESD surgery. Unlike conventional monocular methods, our DSRE uses stereoscopic imaging to rapidly extract precise depth data, enabling quicker controller convergence and enhanced surgical accuracy. The system's dual-segment configuration enables agile maneuvering around lesions, while its compliant structure ensures adaptability within the surgical environment. The implemented stereo visual servo controller uses image features for real-time feedback and dynamically updates its gain coefficients, facilitating rapid convergence to the target. In visual servoing experiments, the controller demonstrated strong performance across various tasks. Even when subjected to unknown external forces, the controller maintained robust target tracking. The feasibility and effectiveness of the DSRE were further verified through ex vivo experiments. We posit that this novel system holds significant potential for clinical application in ESD surgeries.
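
To make the two ideas named above concrete, the following is a minimal illustrative sketch, not the controller used in this work. It assumes a rectified stereo pair, so that depth follows from disparity as Z = f·B/d, and it assumes an exponential gain schedule of the kind commonly used in visual servoing, where the gain rises as the feature error shrinks. All function names, parameter values (focal length, baseline, gain limits), and the identity mapping from feature error to motion are hypothetical placeholders; the abstract does not specify the actual gain law or the robot Jacobian.

```python
import math


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a point seen by a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


def adaptive_gain(error_norm, gain_zero=2.0, gain_inf=0.4, slope_zero=30.0):
    """Assumed exponential schedule: high gain near the target, low gain far away.

    gain(e) = (g0 - g_inf) * exp(-(s0 / (g0 - g_inf)) * e) + g_inf
    """
    span = gain_zero - gain_inf
    return span * math.exp(-(slope_zero / span) * error_norm) + gain_inf


def servo_step(feature, target, dt=0.02):
    """One proportional update driving the feature error toward zero.

    Assumption: the mapping from feature error to motion is the identity,
    standing in for the interaction matrix and robot Jacobian.
    """
    error = [f - t for f, t in zip(feature, target)]
    norm = math.sqrt(sum(e * e for e in error))
    gain = adaptive_gain(norm)
    updated = [f - gain * e * dt for f, e in zip(feature, error)]
    return updated, norm


if __name__ == "__main__":
    # Hypothetical camera values: 800 px focal length, 4 mm baseline.
    z = depth_from_disparity(focal_px=800.0, baseline_m=0.004, disparity_px=16.0)
    point = [0.010, -0.005, z]    # current 3D feature in metres (illustrative)
    goal = [0.0, 0.0, 0.03]       # desired 3D feature in metres (illustrative)
    for _ in range(500):
        point, err = servo_step(point, goal)
    print("final error norm:", err)
```

In this sketch the gain increases as the error norm decreases, which avoids large commanded motions far from the target while keeping convergence fast close to it. A real controller would replace the identity mapping with the stereo interaction matrix and the kinematics of the dual-segment endoscope.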