Project Overview
During my Ph.D. at a government-funded research institute, I participated in 15 projects focused on AI- and ML-driven medical imaging innovations. I worked on real-world clinical applications, solving complex challenges in data preprocessing, model optimization, and large-scale learning. This section highlights a selection of projects where my contributions were most impactful.
Machine Learning & AI: Medical imaging, pattern recognition, large-scale learning algorithms
Deep Learning: Foundation models, LLMs, Generative AI (GenAI)
Computer Vision: Image processing, segmentation, multi-modal data integration
Real-World Data Processing: Preprocessing heterogeneous clinical datasets, optimizing workflows
First, I will showcase the medical image analysis skills I have learned and applied through coursework and assignments during my Ph.D. program and Fast Campus training. Then, I will introduce the real-world projects I have contributed to while working at research institutes. For the projects I led independently, I have specified the institution and the duration of the project to provide clear context.
Radiology & Pathology Image Preprocessing
X-ray, CT, MRI, Pathology
MONAI, PyTorch, TensorFlow, DICOM, PyDicom, Nibabel, SimpleITK, etc.
Medical Image Classification
Medical Image Detection
Medical Image Segmentation
Medical Image Synthesis
Medical Image Scoring & Regression
Evaluation: Pearson Correlation Coefficient (PCC), Spearman’s Rank Correlation, Kendall’s Tau, Confusion Matrix, PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity Index Measure), MSE (Mean Squared Error), RMSE (Root Mean Squared Error), MAE (Mean Absolute Error), and R² (Coefficient of Determination)
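As a quick illustration, several of the regression and scoring metrics above can be computed in a few lines. This is a minimal sketch using NumPy/SciPy; the actual evaluation pipelines varied per project:

```python
import numpy as np
from scipy import stats

def regression_metrics(y_true, y_pred):
    """Compute a handful of the scoring metrics listed above (illustrative)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = np.mean(err ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "PCC": stats.pearsonr(y_true, y_pred)[0],
        "Spearman": stats.spearmanr(y_true, y_pred)[0],
        "R2": 1 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2),
    }

scores = regression_metrics([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
```

PSNR and SSIM (for image-to-image tasks) follow the same pattern but operate on whole images rather than score vectors.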
From 2022 to 2024, I conducted research under the theme "Invisible to Visible," focusing on enhancing the visibility of soft tissues in CT imaging. This study leverages advanced AI-based image translation techniques to reconstruct and highlight anatomical structures that are traditionally challenging to visualize in CT scans. By integrating multi-modal imaging data, particularly MR-guided segmentation and CT-to-MR translation, this research aims to bridge the gap between imaging modalities, improving diagnostic accuracy and clinical applicability. The findings of this study are currently under review in a top-tier journal.
***While detailed content cannot be shared at this stage, I will update this section once the paper is published online***
This collaborative research with a Malaysian medical university focused on analyzing cerebral small vessel disease (CSVD) through AI-driven image analysis. The project brought together a clinical research team, led by a physician PI in Malaysia, and our computational research lab, creating a unique synergy between clinical expertise and AI-based medical imaging techniques.
During this collaboration, a Malaysian postdoctoral researcher visited our lab, where I provided computational expertise, including object detection, medical image preprocessing, and AI-based diagnostic techniques. This experience reinforced the critical role of data preprocessing in medical imaging, as even minor variations significantly impacted diagnostic outcomes. Additionally, I introduced VR-based 3D tracing, enabling direct visualization of cerebral vasculature and brain structures, which enhanced the understanding of CSVD progression.
Our interdisciplinary approach led to a joint conference presentation and the publication of our research in Frontiers in Cardiovascular Medicine in 2021.
Conference: The 7th Asian Oceanian Congress on Clinical Neurophysiology (AOCCN), January 2021
Title: Non-Invasive 3D Image Analysis of Microcirculation Vascular Integrity in Asymptomatic Cerebral Small Vessel Disease
Collaboration: Dr. Che Mohd Nasril (Postdoc) & Professor Muzaimi Mustapha (Wet Lab)
My Role: Computational analysis using imaging informatics techniques (Dry Lab)
Summary
In this study, I applied neuTube, a neural tracing software, to detect cerebral small vessel disease (CSVD) using object detection-based AI techniques. By leveraging AI-driven image analysis, our method enables rapid and accurate identification of CSVD, facilitating early diagnosis and potential intervention.
Key Highlights
Background: CSVD is a neurological disorder affecting small brain vessels, often linked to stroke, dementia, Alzheimer’s disease (AD), and depression.
Objective: Investigate microcirculation vascular integrity in individuals with and without CSVD through advanced image analysis.
Methods:
Utilized neuTube for 3D reconstruction of cerebral vasculature from non-invasive 3D-time-of-flight MRA images.
Conducted quantitative analysis to assess microcirculation integrity.
Results:
neuTube successfully reconstructed fine details of the cerebral blood vessel network.
Analysis of ten human MRA images provided insights into vascular differences in CSVD patients.
Further quantitative studies will enhance clinical applications.
Conclusion: This study demonstrates a non-invasive AI-driven approach to assess cerebral small vessel integrity, offering a promising tool for early diagnosis and monitoring of CSVD.
Objective: Improve the detection and segmentation of densely packed nuclei in medical images using a Mask R-CNN-based approach.
Challenge: Traditional methods rely on fully annotated training images, which are time-intensive to create. To address this, the project explored learning from partially labeled exemplars to enhance efficiency.
Proposed Approach:
Designed a centerness score and bounding box prediction task using partially labeled ground truth.
Implemented predictions at FPN1, the first (highest-resolution) feature level of the feature pyramid, to capture most nuclei in the image.
Integrated standard Mask R-CNN components, including Backbone, RPN module, RCNN head, and Mask head.
Added a decomposed self-attention (SA) module to improve feature extraction.
My Role: Assisted my PI’s research, contributing to data preparation, model evaluation, and optimization of the Mask R-CNN framework.
Key Contribution: Enhanced segmentation performance by leveraging similarities among nuclei to improve prediction with partially labeled data.
Outcome: Improved segmentation accuracy while reducing dependency on fully annotated datasets. The source code is publicly available on GitHub.
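The centerness score mentioned above follows the general idea popularized by FCOS: locations near a nucleus center receive a target close to 1, locations near the box edge close to 0. A minimal sketch of that target (the exact formulation in our Mask R-CNN variant may differ):

```python
import numpy as np

def centerness_target(l, t, r, b):
    """FCOS-style centerness target for a location inside a box.

    l, t, r, b are the distances from the location to the left, top,
    right, and bottom box edges. The value is 1.0 at the exact center
    and decays toward 0 at the box boundary.
    """
    return np.sqrt(
        (np.minimum(l, r) / np.maximum(l, r))
        * (np.minimum(t, b) / np.maximum(t, b))
    )
```

Because the target depends only on relative position within a box, it can be supervised from the partially labeled exemplars without needing every nucleus annotated.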
Developed an AI-powered defect detection system for cardiovascular stents, automating quality control and reducing human inspection by 80%.
Implemented a camera-based defect detection system, significantly improving inspection accuracy and operational efficiency.
Extended internship twice for exceptional performance, completing three consecutive terms.
To gain access to high-quality coursework, I participated in exchange programs at KAIST and Korea University, where I took advanced AI and deep learning courses. These courses had a competitive grading system, and my scores were converted to a perfect 100 under the relative grading scale.
In the deep learning course, each project required end-to-end implementation, including coding, model development, report writing, and a final presentation, all within just one week per project. Over the span of four projects, I consistently achieved top scores, demonstrating strong technical and analytical skills. Additionally, I submitted Kaggle code contributions as part of my coursework.
The following section showcases the projects I completed, highlighting my approach and methodologies. 🚀
These projects were part of my training and research during my Ph.D. studies, which I began in September 2018 and completed ahead of schedule in September 2021. 🚀
For the first deep learning assignment, I worked with the BraTS 2020 dataset, which is widely used for multimodal brain tumor segmentation. The dataset consists of pre-operative multi-institutional MRI scans from 19 different institutions, including glioblastoma (GBM/HGG) and lower-grade glioma (LGG) cases, with expert-annotated tumor regions. The imaging data is provided in NIfTI format (.nii.gz) and includes four MRI modalities:
T1-weighted (T1): Native MRI scan
T1Gd (T1-post contrast): Post-contrast enhanced T1-weighted scan
T2-weighted (T2): Fluid-sensitive scan
T2-FLAIR (Fluid Attenuated Inversion Recovery): Highlights edema and tumor regions
Each scan is manually annotated by board-certified neuroradiologists, with labeled tumor subregions:
Enhancing Tumor (ET - label 4)
Peritumoral Edema (ED - label 2)
Necrotic/Non-Enhancing Tumor Core (NCR/NET - label 1)
Additionally, survival data, including overall survival (OS) in days, patient age, and resection status, is provided for survival prediction tasks.
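A typical way to prepare one BraTS case is to stack the four modalities into a single multi-channel volume, normalizing each modality over its brain (nonzero) voxels only. This is a sketch under assumptions: the t1/t1ce/t2/flair file-name suffixes are the standard BraTS ones, but the paths and helper names are illustrative:

```python
import numpy as np

def zscore_nonzero(vol):
    """Z-score a volume using only its nonzero (brain) voxels."""
    brain = vol[vol > 0]
    return (vol - brain.mean()) / (brain.std() + 1e-8)

def load_case(case_dir, case_id):
    """Stack the four BraTS modalities into one (4, H, W, D) array.

    Paths follow the standard BraTS naming convention; this loader is
    illustrative, not the exact assignment code.
    """
    import nibabel as nib  # deferred import: only needed when loading files
    channels = []
    for m in ("t1", "t1ce", "t2", "flair"):
        vol = nib.load(f"{case_dir}/{case_id}_{m}.nii.gz").get_fdata(dtype=np.float32)
        channels.append(zscore_nonzero(vol))
    seg = nib.load(f"{case_dir}/{case_id}_seg.nii.gz").get_fdata(dtype=np.float32)
    return np.stack(channels, axis=0), seg
```

Per-modality normalization matters here because the four sequences have very different intensity ranges.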
This was my first assignment in the deep learning course, and a major challenge was the large size of the dataset, which significantly impacted training time. With only one week to complete the assignment, I needed to efficiently manage data preprocessing, model training, and submission to a Kaggle competition.
Handling Large-Scale Data
The dataset required significant preprocessing, including normalization, skull-stripping, and data augmentation.
Given the high computational cost, I optimized batch sizes and memory usage to ensure smooth training within the given time.
Model Development & Implementation
I implemented a deep learning-based segmentation model trained on the multimodal MRI data.
The U-Net architecture was used for tumor segmentation, leveraging multi-channel MRI inputs for feature extraction.
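The multi-channel U-Net setup can be sketched at miniature scale: 4 input channels (one per MRI modality) and 4 output classes (background plus the three tumor subregions). This two-level toy version only illustrates the encoder/skip-connection/decoder pattern; the actual assignment model was deeper:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with batch norm and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """Minimal 2-level U-Net: 4 MRI modalities in, 4 class maps out."""
    def __init__(self, in_ch=4, n_classes=4):
        super().__init__()
        self.enc1 = conv_block(in_ch, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)  # 32 upsampled channels + 32 from skip
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                              # (B, 32, H, W)
        e2 = self.enc2(self.pool(e1))                  # (B, 64, H/2, W/2)
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        return self.head(d1)                           # (B, n_classes, H, W)

logits = TinyUNet()(torch.randn(1, 4, 64, 64))
```

The skip connection is what lets the decoder recover fine tumor boundaries that the pooled features alone would blur.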
Results & Optimization
I successfully submitted my model results within the deadline, adhering to the competition timeline.
Later, with extended training time, I fine-tuned my model to improve segmentation accuracy and fully analyze model performance.
First hands-on experience with large-scale medical image datasets and deep learning segmentation models.
Gained insights into the computational challenges of training on large medical datasets within time constraints.
Successfully applied U-Net for brain tumor segmentation, refining the model post-submission for improved accuracy.
In this project, the task was to predict patient survival based on demographic, clinical, and time-to-event data provided in an Excel dataset. However, a major challenge was the high proportion of missing data, making it difficult to train an accurate predictive model. Instead of discarding incomplete data, I focused on mathematically resolving this issue through an imputation strategy, which ultimately contributed to a higher evaluation score.
Approach & Methodology
Handling Missing Data with GAN-Based Imputation
Traditional imputation methods, such as mean imputation, kNN imputation, or multiple imputation (MICE), often fail to capture the underlying complex relationships in medical data, leading to potential bias or information loss.
To address this, I applied Generative Adversarial Networks (GANs) for missing data imputation. GANs are capable of learning the latent distribution of the data, generating plausible values that better reflect the original data structure.
This approach was particularly beneficial in this study, as patient survival data often involves nonlinear interactions between clinical variables, which GANs are well-suited to model.
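A compact sketch of the GAN-based imputation idea, loosely following GAIN (Yoon et al., 2018): the generator fills in missing entries, and the discriminator tries to tell observed from generated values given a partial hint about the mask. The feature count, layer sizes, and the assumption that features are scaled to [0, 1] are all illustrative, not the exact model I used:

```python
import torch
import torch.nn as nn

d = 8  # number of clinical features (placeholder)
G = nn.Sequential(nn.Linear(2 * d, 32), nn.ReLU(), nn.Linear(32, d), nn.Sigmoid())
D = nn.Sequential(nn.Linear(2 * d, 32), nn.ReLU(), nn.Linear(32, d), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

def impute_step(x, m, hint_rate=0.9):
    """One training step. x: data with missing slots zeroed; m: mask (1 = observed)."""
    z = torch.rand_like(x)                       # noise seeds the missing slots
    x_hat = G(torch.cat([m * x + (1 - m) * z, m], dim=1))
    x_imp = m * x + (1 - m) * x_hat              # observed values pass through
    b = (torch.rand_like(m) < hint_rate).float()
    hint = b * m + 0.5 * (1 - b)                 # reveal most, but not all, of the mask
    # Discriminator: classify each entry as observed (1) vs. generated (0)
    d_prob = D(torch.cat([x_imp.detach(), hint], dim=1))
    loss_d = bce(d_prob, m)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: fool D on missing entries, reconstruct observed ones
    d_prob = D(torch.cat([x_imp, hint], dim=1))
    loss_g = bce(d_prob * (1 - m) + m, torch.ones_like(m)) \
        + 10.0 * ((m * (x - x_hat)) ** 2).mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return x_imp.detach()
```

The key property is that observed entries are never altered; the adversarial game only shapes the distribution of the imputed ones.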
Survival Prediction using Deep Learning
After data imputation, I trained a deep learning model with a regression-style output layer to classify survival outcomes.
The goal was to determine how well machine learning methods could classify patient survival based on available features.
Kaplan-Meier Survival Estimation
To further validate the deep learning predictions, I conducted a Kaplan-Meier survival analysis, a widely used non-parametric statistical method for estimating survival functions.
This provided an intuitive visualization of survival probabilities over time and allowed for comparison with the deep learning classification results.
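The Kaplan-Meier estimator itself is simple enough to sketch directly: at each observed event time t_i, the survival curve is multiplied by (1 - d_i / n_i), where d_i is the number of events at t_i and n_i the number of subjects still at risk. A minimal NumPy version (illustrative; the project used standard statistical tooling):

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: time-to-event or time-to-censoring per subject.
    events: 1 = event observed, 0 = censored.
    Returns a list of (event_time, survival_probability) pairs.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    surv, s = [], 1.0
    for t in np.unique(times[events == 1]):
        n_at_risk = np.sum(times >= t)                    # still in the study at t
        d = np.sum((times == t) & (events == 1))          # events exactly at t
        s *= 1.0 - d / n_at_risk
        surv.append((t, s))
    return surv

curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored subjects (events == 0) never trigger a drop in the curve, but they do count in the at-risk denominator until their censoring time, which is what distinguishes this from a naive survival fraction.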
Comparison of ML-based and Statistical Approaches
Finally, I compared the Kaplan-Meier estimates with the deep learning model predictions, assessing their relative effectiveness in survival prediction.
Key Takeaways
GAN-based imputation effectively handled missing data, offering a more sophisticated alternative to traditional imputation methods by capturing the underlying data distribution.
Deep learning-based classification demonstrated strong predictive capabilities but required further validation.
Kaplan-Meier analysis provided a statistical benchmark, offering interpretability to complement ML-based predictions.
This project reinforced the importance of handling missing data in medical datasets and showcased the potential of AI-driven approaches in survival analysis.
For this project, the goal was to segment the lateral ventricles in brain CT images using a deep learning-based approach. The dataset consisted of NIfTI files, including:
Brain CT Slices (input images)
Lateral Ventricle Segmentation Slices (Ground Truth labels)
Test Slices (for model evaluation)
Model Selection & Training
Implemented a single-channel CNN trained on Brain CT slices with corresponding segmentation labels.
Used cross-validation to improve generalization and fine-tune model performance.
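Cross-validation here simply means rotating which slices are held out for validation. A minimal sketch of a k-fold index split (sample count and seed are placeholders; in the assignment the indices pointed at NIfTI CT slices):

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=42):
    """Split shuffled sample indices into k (train, val) pairs."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, k)
    return [
        (np.concatenate(folds[:i] + folds[i + 1:]), folds[i])
        for i in range(k)
    ]

splits = kfold_indices(200, k=5)
```

Each slice appears in exactly one validation fold, so the averaged fold scores estimate generalization without touching the held-out test slices.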
Evaluation & Testing
Evaluated the trained model using 100 test slices provided in the dataset.
Measured segmentation accuracy using common performance metrics for medical image segmentation.
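The most common of those segmentation metrics is the Dice coefficient; a minimal sketch for binary masks (illustrative, not the exact evaluation script):

```python
import numpy as np

def dice_coefficient(pred, gt, eps=1e-8):
    """Dice = 2 * |P ∩ G| / (|P| + |G|) for binary masks.

    eps guards against division by zero when both masks are empty.
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

score = dice_coefficient([[1, 1, 0], [0, 1, 0]], [[1, 0, 0], [0, 1, 1]])
```

Dice is preferred over plain pixel accuracy for structures like the lateral ventricles, which occupy only a small fraction of each CT slice.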
Applied CNN-based segmentation for brain structure analysis using single-channel medical imaging data.
Gained experience in training deep learning models on NIfTI-structured datasets.
Learned how to evaluate model performance effectively using test data.
This project reinforced my understanding of deep learning for medical image segmentation and provided insights into the challenges of anatomical structure identification in CT scans. 🚀
Upon entering graduate school in 2018, my primary goal was to make the invisible visible. Genes are not directly observable to the human eye, yet their expression patterns hold crucial insights into biological processes and disease progression. I wanted to leverage artificial intelligence to decode these patterns—predicting when and how gene expression occurs—and further visualize the likelihood and pathways of recurrence and metastasis.
To pursue this vision, I undertook a project utilizing Google Cloud Platform to analyze RNA sequencing and exome sequencing data. By applying AI-driven models, I aimed to identify meaningful genetic patterns and translate them into actionable insights. This project, which I initiated immediately upon starting graduate school, reinforced my commitment to integrating AI with genomics to uncover hidden biological mechanisms.
• LAB Projects: "Construction and Application of Bioinformatics Service System for Big Whole-Genome Data" and "Prediction and Application of Pathogenicity of Genome Variants by Machine Learning"
• Studied RNA-Seq analysis of human diseases, whole-exome analysis of rare diseases, genomic prediction and marker selection, and GWAS. Primarily, I performed whole-exome sequencing and RNA sequencing analysis on Autism Spectrum Disorder datasets.
A significant portion of my work involved analyzing Whole Exome Sequencing and RNA Sequencing data from Autism Spectrum Disorder (ASD) datasets, leveraging Linux and Google Cloud-based computing (Google Genomics). The flexibility of Google Cloud GPUs played a critical role in enabling large-scale genomic analysis, as they provided high-performance computing power that could be accessed anytime, anywhere—offering a significant advantage for researchers handling big genomic data.
I have been fortunate to explore a wide range of research topics, and this would not have been possible without the guidance and support of the senior researchers I had the privilege to learn from. Their mentorship shaped my growth, and I deeply respect and appreciate the time, effort, and wisdom they shared with me. I am profoundly grateful to each of them for their generosity and guidance. I could never have come this far alone.👏
My PhD journey was unconventional, as I pursued my degree while working as a researcher at a government-funded institute. Frequent transitions of senior researchers, many of whom moved to university positions, led me to adapt to multiple research topics. While focusing on government projects limited my publications, it allowed me to explore diverse fields. I truly enjoyed my PhD journey, as I love learning new things and appreciated the opportunity to gain knowledge across disciplines. I am grateful for these experiences and look forward to continuing my research with the same curiosity and enthusiasm.🙂