The Team
Our team brings together exceptional data scientists and faculty collaborators from multiple institutions who are keen to further our understanding of cancer and are on a mission to end this deadly disease. Here is a short summary of who they are (in alphabetical order) and the expertise they hold.
Abbreviations: MDA = UT MD Anderson Cancer Center; UT = University of Texas at Austin; Mayo = Mayo Clinic; NUS = National University of Singapore.
Data Scientist | Position | Expertise
Ali Pirani | Data Scientist, MDA | Spatial modeling, Graph Engineering
Gayatri Kumar | Post-doc, Mayo | Spatial modeling
Matt Flick | MD PhD Student, Mayo | Graph Engineering
Prahlad Bhat | Undergraduate, UT | Spatial modeling
Shruti Sridhar | PhD Candidate, NUS | Spatial modeling
Yang Liu | PhD Candidate, MDA | Graph Engineering
Faculty Collaborator | Position | Expertise
Dr. Anand Jeyasekharan, MD | Assistant Prof, NUS | Hematology
Dr. Jason Huse, MD PhD | Professor, MDA | Neuropathology
Dr. Kanishka Sircar, MD | Professor, MDA | Anatomical Pathology
Dr. Krishna Bhat, PhD | Associate Prof, Mayo | Cancer Biology
Dr. Leland Hu, MD | Associate Prof, Mayo | Radiology
Dr. Nhan Tran, PhD | Professor, Mayo | Cancer Biology
Dr. Yinyin Yuan, PhD | Professor, MDA | Machine Learning/AI
A little about me
I am an Associate Professor at The University of Texas MD Anderson Cancer Center, where I lead a team working on spatial and systems biology initiatives.
I am a data scientist with 15 years of experience and expertise in machine learning, AI, and bioinformatics.
My training is in statistical modeling and image processing, and I develop software primarily in R and Python, using databases such as Neo4j and PostgreSQL.
My research interests are in systems biology and spatial modeling of tissues.
My scientific interests are diverse and include cancer biology, formal logic, physical theories, computing, and mathematics.
My interest in biology, and in cancer genomics in particular, has resulted in numerous publications. I am also passionate about teaching and have designed, developed, and directed courses in data science, machine learning, and biomedical informatics.
I have a primary appointment in the Department of Translational Molecular Pathology and a joint appointment in the Department of Neurosurgery at MD Anderson Cancer Center, Houston. I also hold a research collaborator appointment at the Mayo Clinic in Arizona.
My formal training is summarized below:
Training | Institution | Date
Post-doctoral | Memorial Sloan-Kettering Cancer Center | 2011-2013
PhD, Computer Science | Texas A&M University, College Station | 2002-2008
MS, Mathematics | Texas A&M University, College Station | 2000-2002
MSc, Mathematics | Indian Institute of Technology, Madras | 1998-2000
BSc, Mathematics | University of Madras, Chennai | 1995-1998
Click below to view/download my CV.
Curriculum Vitae
Active Projects
Our team works on data science projects in two exciting and emerging areas: (1) Spatial Pathology and (2) Systems Biology and AI. Brief descriptions of some of our projects follow.
Geospatial modeling of tissues - Spatial point processes are powerful statistical frameworks for studying point patterns. Representing cells as points, annotated with measurements taken on those cells such as single-cell gene expression, lets us study cell-cell interactions using point processes. We make extensive use of spatstat, an R package, to model these interactions and derive directed insights in cancers.
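To make the point-pattern idea concrete, here is a minimal sketch in plain Python of a naive Ripley's K estimate, one of the standard summary statistics for point processes. It is an illustration only and not our analysis pipeline: it applies no edge correction (spatstat offers edge-corrected estimators such as `Kest`), and the coordinates and window area are made-up toy values.

```python
import math

def ripley_k(points, radius, area):
    """Naive Ripley's K estimate for 2D points, without edge correction.

    K(r) = area / (n * (n - 1)) * number of ordered pairs (i, j), i != j,
    whose distance is at most r.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    close_pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                close_pairs += 1
    return area * close_pairs / (n * (n - 1))

def csr_expectation(radius):
    """Theoretical K(r) under complete spatial randomness: pi * r^2."""
    return math.pi * radius ** 2

# Toy example (hypothetical data): four "cells" at the corners of a unit window.
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
k_hat = ripley_k(corners, radius=1.0, area=1.0)
print(k_hat, csr_expectation(1.0))
```

An estimate well above pi * r^2 suggests clustering at scale r, and one well below suggests regular spacing. With only four points and no edge correction, the comparison here is purely illustrative.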
Biomarker discovery using a graph database - Graph databases can help identify biomarkers by efficiently representing complex biological networks. By enabling powerful queries and community detection algorithms, graph databases make it easier to explore the relationships between multiple genes, thus facilitating the discovery of potential biomarkers that are critical for diagnostics or that can serve as therapeutic targets. We use Neo4j, a property graph database, and algorithms from the Graph Data Science Library (GDS) to derive insights and propose actionable biological targets. The concept paper for this effort can be viewed here.
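The idea of grouping genes into communities and looking for well-connected members can be sketched without a database. The example below is a simplified stand-in written in plain Python, not the Neo4j/GDS workflow itself: it uses connected components over thresholded edges instead of a true community detection algorithm such as Louvain, and the gene names, edge weights, and threshold are all hypothetical.

```python
from collections import defaultdict, deque

def build_graph(edges, min_weight):
    """Undirected adjacency sets, keeping only edges at or above min_weight."""
    graph = defaultdict(set)
    for a, b, weight in edges:
        if weight >= min_weight:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def communities(graph):
    """Connected components via breadth-first search (a crude community proxy)."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), []
        while queue:
            node = queue.popleft()
            members.append(node)
            for neighbor in sorted(graph[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(sorted(members))
    return groups

def hub(graph, members):
    """Member with the most neighbors; ties go to the later name alphabetically."""
    return max(members, key=lambda gene: (len(graph[gene]), gene))

# Hypothetical gene-gene association edges: (gene, gene, weight).
edges = [
    ("GENE_A", "GENE_B", 0.90),
    ("GENE_B", "GENE_C", 0.80),
    ("GENE_A", "GENE_C", 0.70),
    ("GENE_B", "GENE_F", 0.75),
    ("GENE_C", "GENE_D", 0.20),  # below threshold, so it is dropped
    ("GENE_D", "GENE_E", 0.85),
]
graph = build_graph(edges, min_weight=0.5)
groups = communities(graph)
for members in groups:
    print(members, "hub candidate:", hub(graph, members))
```

In a graph database the same questions are expressed as queries and library calls over stored nodes and relationships, which scales to networks far larger than an in-memory toy like this one.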
Biomarker validation using Graph Neural Networks (GNNs) - In silico validation of biomarkers is critical for providing actionable biological targets for experiments and prognosis. We apply GNNs (convolutional, attention-based, etc.) to validate critical biomarkers discovered through the graph database. GNNs enhance biomarker discovery by leveraging the structure of biological networks, such as gene-gene interactions, and learning meaningful node/edge representations. By aggregating information from a node's neighbors, GNNs capture local and global patterns, allowing the model to predict relationships between genes and disease associations more effectively.
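The neighbor-aggregation step at the heart of a GNN can be illustrated in a few lines of plain Python. This is a sketch under strong simplifying assumptions, not a trained model: there are no learned weights or nonlinearities, each round simply averages a node's feature vector with those of its neighbors, and the node names and features are hypothetical.

```python
def mean_aggregate(features, neighbors):
    """One message-passing round.

    Each node's new vector is the element-wise mean of its own vector and
    the vectors of its direct neighbors.
    """
    updated = {}
    for node, vector in features.items():
        group = [vector] + [features[nb] for nb in neighbors.get(node, [])]
        updated[node] = [sum(column) / len(group) for column in zip(*group)]
    return updated

# Hypothetical 3-gene chain: g1 - g2 - g3, each with a 2D feature vector.
features = {"g1": [1.0, 0.0], "g2": [0.0, 1.0], "g3": [0.0, 0.0]}
neighbors = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2"]}

round_one = mean_aggregate(features, neighbors)
round_two = mean_aggregate(round_one, neighbors)
print(round_one["g3"], round_two["g3"])
```

After one round, g3 has only seen its direct neighbor g2. After two rounds, its first feature becomes nonzero because information from g1, two hops away, has propagated through g2. Stacking rounds in this way is how a GNN widens each node's view from local to more global structure.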
Past Projects
I have obtained several directed insights in biology by applying data science to genomics data.
Most of my scientific contributions are in omics and in numerical algorithms applied to physics problems, also known as computational physics.
Contributions in omics highlights my key work during the years 2009-2020, primarily at New York University, where I was a faculty member, and at Memorial Sloan-Kettering Cancer Center, where I was a post-doc.
Contributions in computational physics covers my work as a graduate student at Texas A&M University between 2005 and 2008.
Teaching
Over the years, including my tenure at New York University School of Medicine, I developed and taught several courses in biomedical informatics.
Programming for Data Analysis covers the fundamentals of data science using the R programming language. The course is based primarily on the tidyverse and ggplot packages. We also cover a bit of mathematical modeling (such as optimization) towards the end, but the main aim is for non-programmers to gain some experience in data analysis. As case studies, we use clinical databases (diabetes and critical care databases).
Machine Learning and AI covers several important topics in this area, such as classification, ensemble methods, feature selection, and regularization. The focus is on treating these topics in depth from a statistical perspective. For example, the lecture on ensemble methods explains the statistical basis for bootstrapping and random forests. The idea is to make students realize that machine learning is not just programming or data exploration but is, at its core, statistics. To complement this, the lecture series also contains a hands-on tutorial on AI-based image classification.
Methods in Quantitative Biology is a set of four disparate lectures I developed in Fall 2017 as part of the biomedical informatics program at NYU. I believe these topics belong to the core of informatics and that data science students need a good understanding of them. For example, algorithms are the core engines of any computing task, and it is important to understand how their complexity is analyzed. Similarly, linear algebra plays a crucial role in fields ranging from quantum computing to deep learning.
A consolidated version of the Programming for Data Analysis and Methods in Quantitative Biology courses can be found here.
Contact