We propose “DUal-NET”, a novel transformer-based model for enhancing speech captured through bone-conduction headsets in human-centered sensing systems. As wearable bone-conduction devices become increasingly important for continuous health monitoring and ambient computing, they face a unique challenge: bone-conduction microphones pick up significant interference from the speakers playing audio to the user. Because the headset is in direct contact with the skull, its speakers induce vibrations that resemble those produced by the user’s own speech, degrading speech recognition accuracy and communication quality. Existing state-of-the-art speech enhancement and sound source separation methods are ‘blind’: they assume the interfering signal is unobservable, owing to the inherent difficulty of obtaining a clean copy of correlated noise. By contrast, a headset has full knowledge of the sounds it plays through its speakers, and DUal-NET takes advantage of this raw signal in its denoising process. We demonstrate that DUal-NET significantly improves standard speech quality metrics over existing state-of-the-art methods in realistic scenarios (PESQ: 135%, STOI: 50%, LSD: 66%), enabling more accurate speech sensing for human-centered applications including health monitoring, personalized assistants, and augmented communication.
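To make the non-blind setting concrete, below is a minimal PyTorch sketch of the core idea: a transformer denoiser that receives the known playback signal as a second input alongside the contaminated bone-conduction recording. This is an illustrative assumption, not the paper’s DUal-NET architecture; the class name, feature layout, and all hyperparameters are hypothetical.

```python
import torch
import torch.nn as nn

class ReferenceInformedDenoiser(nn.Module):
    """Toy non-blind denoiser conditioned on the known playback signal.

    Hypothetical sketch only -- NOT the DUal-NET architecture. It
    illustrates the key idea: the headset knows what it played, so the
    model sees both the microphone signal and that reference.
    """

    def __init__(self, n_fft: int = 512, d_model: int = 128, n_layers: int = 2):
        super().__init__()
        self.n_fft = n_fft
        n_bins = n_fft // 2 + 1
        # Project concatenated (mic, reference) magnitude spectra to d_model.
        self.proj = nn.Linear(2 * n_bins, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Predict a per-bin mask applied to the microphone spectrogram.
        self.mask = nn.Sequential(nn.Linear(d_model, n_bins), nn.Sigmoid())

    def _stft(self, x: torch.Tensor) -> torch.Tensor:
        return torch.stft(x, self.n_fft, hop_length=self.n_fft // 2,
                          window=torch.hann_window(self.n_fft, device=x.device),
                          return_complex=True)

    def forward(self, mic: torch.Tensor, playback: torch.Tensor) -> torch.Tensor:
        mic_spec = self._stft(mic)       # (batch, bins, frames), complex
        ref_spec = self._stft(playback)
        # Stack both magnitude spectra so the encoder can exploit the reference.
        feats = torch.cat([mic_spec.abs(), ref_spec.abs()], dim=1)
        h = self.encoder(self.proj(feats.transpose(1, 2)))  # (batch, frames, d_model)
        mask = self.mask(h).transpose(1, 2)                  # (batch, bins, frames)
        enhanced = mask * mic_spec                           # masked complex spectrum
        return torch.istft(enhanced, self.n_fft, hop_length=self.n_fft // 2,
                           window=torch.hann_window(self.n_fft, device=mic.device),
                           length=mic.shape[-1])

# A blind method would only see `mic`; here the playback audio is a second input.
model = ReferenceInformedDenoiser()
mic = torch.randn(1, 16000)       # 1 s of contaminated bone-conduction audio at 16 kHz
playback = torch.randn(1, 16000)  # the audio the headset played to the user
clean_estimate = model(mic, playback)
```

The design choice this sketch highlights is the dual-input conditioning: because the interference at the bone-conduction microphone is correlated with the playback signal, feeding that signal to the model turns an under-determined blind separation problem into an informed one.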