Intelligent Multidimensional Data Analysis for Imaging and Medical Advancement
The mechanisms of biomolecular interactions, which are the key to finding the causes of diseases and developing new drugs, have yet to be fully understood. Developing the concepts and tools of multidimensional data analysis and image recognition can therefore help advance medical science and other fields to the next level. Having made significant contributions to image and biomolecular pattern recognition techniques, Professor Yan Hong, a CityU expert in imaging science, has proposed new theories and computational methods for complex tensors to expand their application in imaging, biology, medicine and beyond.
Professor Yan’s current research focuses on tensor computing to detect and analyse meaningful patterns in datasets. A tensor is a multidimensional array of data. In mathematics, a number can be considered a tensor of order zero, a vector a first-order tensor, and a matrix a second-order tensor. These data representations and structures are now well understood. “However, the existing mathematical theories and computation methods are far from mature for analysing higher-order tensors of order three or more. We need new concepts and theories for tensors, which cannot be simply extended from matrix theories,” explained Professor Yan, Wong Chun Hong Professor of Data Engineering and Chair Professor of Computer Engineering in the Department of Electrical Engineering.
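As a rough illustration of the tensor orders described above, the following minimal Python sketch represents tensors as nested lists and reports their order and shape. The helper names (tensor_shape, tensor_order) and the "genes x conditions x time points" example are assumptions made for illustration only; they do not come from Professor Yan's work.

```python
def tensor_shape(data):
    """Return the shape of a rectangular nested list; a bare number has shape ()."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        if not data:          # empty list: no further levels to inspect
            break
        data = data[0]        # assumes every level is rectangular
    return tuple(shape)


def tensor_order(data):
    """The order of a tensor is the number of indices needed to address one entry."""
    return len(tensor_shape(data))


scalar = 3.5                          # order 0: a single number
vector = [1.0, 2.0, 3.0]              # order 1: one index
matrix = [[1, 2], [3, 4]]             # order 2: row and column indices

# Hypothetical third-order dataset: 2 genes x 3 conditions x 4 time points
cube = [[[0.0 for _ in range(4)] for _ in range(3)] for _ in range(2)]

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, "order", tensor_order(t), "shape", tensor_shape(t))
```

The first three cases are the well-understood ones; the last is the kind of order-three array for which, according to Professor Yan, matrix theory does not simply carry over.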
New theories for analysing higher-order tensors
Although biomolecules and image analysis are studied in different disciplines, biomolecular interactions appear to follow the same principle as the recognition of images by computers. “Computers recognise an object in an image with consistent positions of points, lines, areas and their relations. Similarly, two molecules interact with each other because they fit consistently with complementary surfaces and charges,” he elaborated. “Therefore, understanding tensors is crucial. They provide a rigorous mathematical model to represent consistent features and their higher-order relations.”
Working in close collaboration with mathematicians, biologists, medical doctors and computer engineers, Professor Yan and his team have developed co-clustering methods based on tensor models. While conventional machine learning and pattern recognition methods classify objects according to their features, the team's new approach can classify both objects and features.
“For example, a group of genes may be co-regulated under a group of conditions. These genes and conditions form co-clusters. If there are many genes and conditions, the computational time will increase exponentially,” he said. “But our group has solved this problem using tensor methods. Our new method enables the simultaneous detection of several types of co-clusters, which can even overlap in the data.”
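To make the idea of a co-cluster and the exponential cost concrete, here is a minimal Python sketch of a naive exhaustive search. It is not the team's tensor method, which the article does not describe in detail. The toy expression matrix, the "values within a tolerance" definition of a co-cluster, and all function names are assumptions chosen for illustration.

```python
from itertools import combinations


def is_co_cluster(matrix, rows, cols, tol=0.5):
    """Assumed definition: all selected entries lie within `tol` of each other."""
    values = [matrix[r][c] for r in rows for c in cols]
    return max(values) - min(values) <= tol


def subsets(n, min_size):
    """Yield every subset of range(n) with at least `min_size` elements."""
    for k in range(min_size, n + 1):
        yield from combinations(range(n), k)


def brute_force_co_clusters(matrix, min_rows=2, min_cols=2, tol=0.5):
    """Test every (row subset, column subset) pair; only feasible for tiny inputs."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    found = []
    for rows in subsets(n_rows, min_rows):
        for cols in subsets(n_cols, min_cols):
            if is_co_cluster(matrix, rows, cols, tol):
                found.append((rows, cols))
    return found


def candidate_count(n_rows, n_cols):
    """Number of non-empty (row subset, column subset) pairs to examine."""
    return (2 ** n_rows - 1) * (2 ** n_cols - 1)


# Made-up expression levels: rows are genes, columns are conditions
expression = [
    [5.0, 5.1, 0.2, 5.2],
    [4.9, 5.0, 3.1, 5.1],
    [1.0, 2.5, 0.3, 4.0],
    [5.1, 4.9, 1.7, 5.0],
]

clusters = brute_force_co_clusters(expression)
largest = max(clusters, key=lambda rc: len(rc[0]) * len(rc[1]))
print("largest co-cluster (genes, conditions):", largest)
print("candidates for 4 x 4:", candidate_count(4, 4))
print("candidates for 20 x 20:", candidate_count(20, 20))
```

In this toy matrix, genes 0, 1 and 3 behave alike under conditions 0, 1 and 3, so they form a co-cluster. The candidate count grows as roughly 2 to the power of (genes + conditions), which is why exhaustive search becomes impractical as the data grow.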
Based on tensor and hypergraph models, the research team has developed efficient computer algorithms for matching datasets. They have solved an optimisation problem to deal with all the compatibilities among matched data entries through high-order relations.
Application in lung cancer cell mutation analysis
Furthermore, Professor Yan has applied tensor computing to cell division data analysis and biomolecular surface characterisation. One application concerns lung cancer, the leading cause of cancer deaths worldwide. Non-small-cell lung cancer (NSCLC) constitutes about 85% of all lung cancer cases. Mutation of the epidermal growth factor receptor (EGFR), a type of protein, is a common cause of NSCLC, and the incidence of such mutations can reach 60% in East Asian populations.
Working with medical doctors at Queen Mary Hospital in Hong Kong, Professor Yan’s team analysed all known EGFR mutants and created a database of their 3D structures. The innovative methods proposed will help researchers understand the mechanisms of drug resistance and help doctors plan optimal personalised treatment for cancer patients.
In addition to contributing to medical advancement, Professor Yan has used tensor models to tackle other problems in science and engineering. One result is a method for detecting objects in images and tracking motion in videos that does not require prior training and represents a major improvement over commonly used classifier-based systems.
Professor Yan and his team will continue to work on tensor and hypergraph theories with the aim of developing robust computer algorithms and parallel processor-based hardware and software, and applying them to many more real-world systems for image, video and biomedical data analysis.
This research article originated from CityU RESEARCH.