Zhiyong Lu grew up in an ancient city in China outside of Shanghai. Because he was good at math and other STEM subjects, a high school teacher encouraged Zhiyong to study computer science in college, which he did. Zhiyong attended Nanjing University, one of China’s top schools. In addition to computer science, he also studied English to prepare for graduate school in North America.
Choosing to continue studying computer science, Zhiyong matriculated at the University of Alberta in Edmonton, Canada, to pursue a master’s degree. The University of Alberta was regarded as having a great computer science program and excelled in artificial intelligence, which was of interest to Zhiyong.
While Zhiyong was working on his master’s, a computer science professor suggested that he consider focusing on a bioinformatics project for his thesis. This was Zhiyong’s first exposure to the field of bioinformatics and to the idea that computers could be used to solve biology problems. He ended up writing his thesis on using machine learning to predict subcellular localization.
“I had never thought about applying computers to biology . . . I found it fascinating that you can use computer science to solve a biology or medicine problem.”
Zhiyong’s interest in the intersection of computer science and biology led him to attend two bioinformatics conferences, which only deepened his curiosity. He made contacts in the field and decided to pursue a PhD in bioinformatics at the University of Colorado School of Medicine.
In working toward his PhD, Zhiyong took several courses in biology, which he hadn’t studied since high school and which proved challenging. He also took courses in natural language processing and AI, which have since become his research focus.
Early on in the PhD program, Zhiyong saw himself on the path of probably becoming a professor. But through the rigor of the program at the University of Colorado, he saw himself turn into a researcher and an independent scientist. Over a roughly five-year period, Zhiyong matured as a researcher, learned about writing and publishing, and got exposure to reviewing papers and writing grants.
“[The PhD program at the University of Colorado] turned me from a very good student and prepared me to be a researcher, to be an independent scientist.”
As part of his PhD thesis, Zhiyong worked with a great deal of data from the National Library of Medicine (NLM). This exposed him to NLM and led others to joke that, given all the data he was accessing, perhaps NLM would hire him. Little did they know that this would prove to be true, though Zhiyong says this particular project was not the reason he was hired by NLM.
Zhiyong joined the National Center for Biotechnology Information (NCBI), which is part of NLM, in 2007. He uses natural language processing and text mining to mine the biomedical literature. He is constantly engaging in interesting research and is also creating and improving real-world applications. An example is using natural language processing to make PubMed, a research tool used by 2.5 million people worldwide, even more effective.
“I think research is fun. Research is really exciting. And on top of that, if I can apply my research into real-world applications and make some real-world effect, I can have an impact on end users. That’s extremely rewarding.”
These days, Zhiyong is branching out by engaging in some medical image analysis. Specifically, he is working on deep learning and machine learning algorithms that take retinal images of people’s eyes and do a better job than doctors at diagnosing and predicting macular degeneration.
Whether a person is working in industry, academia, or government (as Zhiyong does), he sees it as a tremendously exciting time for the field of biomedical informatics.