Your Breath Could Reveal Cancer: How AI Is Changing Early Detection

Artificial intelligence (AI) has transformed cancer screening over the last decade and is now making inroads into drug screening for personalized medicine. AI enables oncologists to identify patterns in radiological scans, predict disease progression, offer tailored therapies, and predict clinical trial outcomes using simulated algorithmic models. The technology relies on machine learning (ML) and deep learning (DL) frameworks, which use multi-layered networks of algorithms to process information and learn from it automatically, without explicit programming, thereby mimicking aspects of human cognition.

AI models and their uses:

Predictive AI: Learns patterns from training data to forecast outcomes in new scenarios. This approach has been used extensively to predict the structure of oncological target antigens and to diagnose breast cancer from mammography scans.

Generative AI: Creates novel outputs that are not explicitly in the training data. Applications include high-throughput drug screening through bioinformatics and computational biology–based pharmacophore modelling, as well as AI chatbots that interact with patients to support care and communication.

Applications of AI models in oncology

Diagnosis & Classification of cancer: Machine learning algorithms, particularly convolutional neural networks and deep learning architectures, successfully classify cancerous versus benign images with accuracy comparable to clinicians by learning patterns invisible to the human eye, especially in cases of skin and breast cancers.

Prediction of disease prognosis: AI-based prognostic models generate individualized risk predictions using both structured and unstructured data from electronic health records. These models can calculate 180-day risk of mortality with high accuracy (AUC: 0.95–0.96), offering a personalized, data-driven alternative to conventional prognostic models and decision-making frameworks derived from randomized controlled trials.
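To make the AUC figure concrete: AUC can be read as the probability that a randomly chosen patient who experienced the event received a higher risk score than a randomly chosen patient who did not. The following is a minimal Python sketch of that pairwise definition. The function name, labels, and scores are made up for illustration and are not data from the cited studies.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    labels: 1 = event occurred (e.g. death within 180 days), 0 = no event.
    scores: model risk scores; higher means higher predicted risk.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.91, 0.78, 0.40, 0.55, 0.30, 0.22, 0.10]
toy_auc = auc(toy_labels, toy_scores)  # 11 of 12 pairs ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so values of 0.95 to 0.96 mean the model orders almost every such pair of patients correctly.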

Personalised Medicine: Deep learning and neural network models are used to predict the protein structure of oncology markers. These structures inform pharmacophore models that support the design of novel anti-cancer compounds and the optimization of drug-like properties.

Chatbots for improving patient care: Generative AI large language models (LLMs) process large text-based datasets to generate context-aware responses to patient concerns and queries. LLM chatbots support patient education, enhance patient-clinician communication, and assist with mental health services by encoding clinical knowledge, automating medical documentation, enhancing telemedicine interactions, and assisting in clinical trial enrolment.

Furthermore, recent advances in deep learning, including attention mechanisms, residual networks, ensemble learning, and hierarchical architectures, offer powerful tools to distinguish the subtle, nonlinear, and overlapping features of the breath profiles of cancer patients.

Breath Analysis for Multi-cancer Screening

Byeongju Lee and colleagues at the Korea Advanced Institute of Science and Technology (KAIST) and the Electronics and Telecommunications Research Institute (ETRI), Korea, developed a hierarchical deep convolutional neural network (HD-CNN)-based platform for dual-cancer classification using a multimodal gas sensor array. The HD-CNN model employs a two-stage structure:

Coarse classifier: distinguishes healthy controls from cancer patients.

Fine classifier: classifies cancer cases as lung cancer or gastric cancer.
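The two-stage routing can be sketched in a few lines of Python. This is only an illustration of the coarse-then-fine idea under stated assumptions: the function names, the 0.5 threshold, and the toy stand-in models are hypothetical, and the published HD-CNN may combine its stages differently.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def hierarchical_predict(
    features: Features,
    coarse: Callable[[Features], float],
    fine: Callable[[Features], float],
    threshold: float = 0.5,
) -> str:
    """Route a sample through a coarse stage, then a fine stage.

    coarse returns an estimate of P(cancer); fine returns an estimate of
    P(lung cancer | cancer). Both callables and the threshold are placeholders.
    """
    if coarse(features) < threshold:
        return "healthy"  # the fine stage is never consulted
    return "lung cancer" if fine(features) >= threshold else "gastric cancer"


# Hypothetical stand-in models; real stages would be trained CNNs.
def toy_coarse(x: Features) -> float:
    return min(1.0, max(0.0, sum(x) / len(x)))


def toy_fine(x: Features) -> float:
    return 1.0 if x[0] > x[-1] else 0.0


print(hierarchical_predict([0.1, 0.2, 0.1], toy_coarse, toy_fine))
print(hierarchical_predict([0.9, 0.7, 0.6], toy_coarse, toy_fine))
print(hierarchical_predict([0.6, 0.7, 0.9], toy_coarse, toy_fine))
```

One consequence of this design is that the fine stage only has to separate the two cancer types, which is a narrower problem than a single three-way classification.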

This hierarchical design achieved superior classification performance: the coarse classifier identified healthy controls and cancer patients with accuracies of 82.1% and 85.9%, respectively, while the fine classifier identified lung cancer and gastric cancer with accuracies of 95.1% and 98.1%, respectively.

The sensor array combines semiconductor metal oxide (SMO), electrochemical (EC), and photoionization detector (PID) sensors housed within a flow-optimized chamber. Its time-resolved signals are converted into 2D response maps for classifying cancer types, achieving an overall accuracy of up to 88%. The sensors capture volatile organic compounds (VOCs) such as ethanol, isobutanol, formaldehyde, carbon monoxide, and hydrogen sulfide in exhaled breath at concentrations ranging from ppb to ppm. These VOCs are abundant in cancer patients as byproducts of metabolic processes and also serve as disease-specific biomarkers. Rather than targeting individual VOCs, the device focuses on each individual's overall “breathprint” pattern, which reflects integrated metabolic signatures.
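One way to picture a 2D response map is as a grid with one row per sensor and one column per time step. The Python sketch below builds such a grid from per-sensor time series. The function name, the sensor labels, and the normalization (baseline subtraction followed by scaling each row by its largest deviation) are assumptions made for illustration, not the preprocessing described by the authors.

```python
def build_response_map(channels, n_steps):
    """Stack per-sensor time series into a sensors x time grid.

    channels: dict mapping a sensor name to its list of readings.
    n_steps: number of time steps kept per sensor (longer series are truncated).
    Each row is baseline-subtracted (first reading) and scaled to [-1, 1]
    by its largest absolute deviation. This normalization is an assumption.
    """
    response_map = []
    for name in sorted(channels):  # fixed row order across samples
        series = channels[name][:n_steps]
        if len(series) < n_steps:
            raise ValueError(f"{name} has fewer than {n_steps} readings")
        baseline = series[0]
        deltas = [v - baseline for v in series]
        peak = max(abs(d) for d in deltas) or 1.0  # flat signal: avoid dividing by zero
        response_map.append([d / peak for d in deltas])
    return response_map


# Hypothetical readings from three sensors, four time steps each.
toy_channels = {
    "EC_1": [0.10, 0.10, 0.30, 0.50],
    "PID_1": [2.0, 2.0, 2.0, 2.0],
    "SMO_1": [1.0, 1.5, 2.0, 1.5],
}
toy_map = build_response_map(toy_channels, n_steps=4)
```

Arranging the signals this way lets a convolutional network look for patterns across sensors and across time at once, which fits the emphasis on the overall breathprint rather than on any single compound.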

The gas chamber is enclosed within a thermally insulated housing, which minimizes temperature-induced baseline drift and improves signal stability during prolonged clinical operation. A closed-loop feedback system precisely regulates the heating profile, thereby protecting sorbent integrity and ensuring consistent VOC release onto the sensors. In addition, an FPGA-based analog-to-digital converter (ADC) board enables high-speed, high-resolution signal acquisition through stable, synchronized sampling across multiple sensor channels. Each sensor is sampled at 50 Hz using an ADC chip (ADS8556, Texas Instruments), with signals routed through an FPGA (XC3S200AN-4FTG256C, AMD Xilinx), generating values at a rate of one per second.
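The step from 50 Hz sampling to one value per second implies a 50-to-1 reduction. How the device performs that reduction is not stated here; the Python sketch below assumes simple averaging of consecutive, non-overlapping blocks of 50 samples, which is one common choice. The function name and the sample values are hypothetical.

```python
def block_average(samples, sample_rate_hz=50, output_rate_hz=1):
    """Reduce a sample stream by averaging consecutive, non-overlapping blocks.

    With the defaults, every 50 input samples become one output value.
    Averaging is an assumed reduction method, not the documented one.
    A trailing partial block is dropped.
    """
    if sample_rate_hz % output_rate_hz != 0:
        raise ValueError("sample rate must be a multiple of the output rate")
    block = sample_rate_hz // output_rate_hz
    n_blocks = len(samples) // block
    return [
        sum(samples[i * block:(i + 1) * block]) / block
        for i in range(n_blocks)
    ]


# Hypothetical 3 seconds of one sensor channel sampled at 50 Hz.
raw = [1.0] * 50 + [2.0] * 50 + [4.0] * 50
per_second = block_average(raw)
```

Averaging within each one-second window also suppresses high-frequency noise, which may be one reason to sample faster than the rate at which values are reported.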

Scope & Applications:

The 2D breath analyzer thus offers a robust screening method capable of identifying multiple cancer types from a single breath sample within 30 minutes of sample introduction into the sensor chamber. The HD-CNN model used in the device achieved an accuracy of 84.7%, compared with 78.4% for ResNet and 79.3% for Transformer models.

The HD-CNN framework represents a critical step toward practical and scalable cancer screening. By reducing diagnostic burden and improving accessibility and cost-efficiency, such systems have strong potential for deployment across diverse clinical settings. Robust classification models with high sensitivity and specificity are essential to accurately distinguish the subtle, nonlinear, and overlapping features of oncogenic markers that vary across individuals and disease stages. Ongoing work aims to expand the sample size, incorporate additional diseases such as chronic obstructive pulmonary disease (COPD), and validate the system in larger clinical settings.

References:

1) Kolla, L., & Parikh, R. B. (2024). Uses and limitations of artificial intelligence for oncology. Cancer, 130(12), 2101–2107. doi.org/10.1002/cncr.35307

2) Lee, B., Lee, J., Noh, H., Bahn, H. K., Jeon, J. H., Park, I., Jheon, S., & Lee, D. S. (2026). Advanced breath analysis through hierarchical deep convolutional neural network for multi-cancer screening. npj Digital Medicine, 9(1), 138. doi.org/10.1038/s41746-025-02319-1