
Avi Ma'ayan, PhD
About Me
Dr. Ma'ayan is a Mount Sinai Endowed Professor in Bioinformatics, Director of the Mount Sinai Center for Bioinformatics, Professor in the Department of Pharmacological Sciences, Professor in the Department of Artificial Intelligence and Human Health, and faculty member of the Icahn Genomics Institute. Dr. Ma'ayan is also a Principal Investigator of the NIH Common Fund Data Resource Center (DRC) for the Common Fund Data Ecosystem (CFDE), an NCI-funded ITCR resource center, an NIDDK-funded diabetes hypothesis platform, and the NCI-funded Mount Sinai Proteogenomic Data Analysis Center. The Ma'ayan Laboratory applies computational methods to study the inner workings of regulatory networks in mammalian cells. His research team applies machine learning and other statistical mining techniques to study how intracellular regulatory systems function as networks to control cellular processes such as differentiation, dedifferentiation, apoptosis, and proliferation. The Ma'ayan Laboratory develops bioinformatics software applications that enable experimental biologists to form novel hypotheses from high-throughput omics datasets, while aiming to better understand the structure and function of regulatory networks in mammalian cellular and multi-cellular complex systems.
Avi Ma'ayan's Publications on PubMed | Google Scholar | ResearchGate
Featured Software Tools Developed by the Ma'ayan Laboratory:
- Rummagene: Massive mining of gene sets from supporting materials of biomedical research publications
- RummaGEO: Massive mining of gene expression signatures from the Gene Expression Omnibus (GEO)
- Playbook Workflow Builder: Interactive platform to construct bioinformatics workflows
- D2H2: Platform to facilitate data-driven hypotheses for the diabetes research community
- Enrichr: Comprehensive search engine for gene sets
- Enrichr-KG: Knowledge graph implementation of Enrichr
- Harmonizome: Uniformly processed datasets for biological knowledge discovery
- Appyters: Collection of web-based applications to execute bioinformatics workflows
- ARCHS4: Uniform alignment of all human and mouse RNA-seq samples from the Gene Expression Omnibus (GEO)
- BioJupies: Automatically generates RNA-seq data analysis notebooks
- ChEA3: ChIP-X enrichment analysis version 3
- KEA3: Kinase enrichment analysis version 3
- TargetRanger: Tool to identify cell surface immunotherapeutic targets
- GeneRanger: Expression of human genes and proteins across human cell types, tissues, and cell lines across multiple atlases
- Geneshot: Search engine for ranking genes from arbitrary text queries
- SigCom LINCS: Comprehensive search engine and data portal for selected datasets from the LINCS program
- L1000FWD: Large-scale visualization of drug-induced transcriptomic signatures
- Clustergrammer: Visualization and analysis tool for high-dimensional biological data
- lncHUB2: Functional predictions of human long non-coding RNAs based on lncRNA-gene co-expression correlations
- FAIRshake: Platform for evaluating the adherence of digital objects with the Findable, Accessible, Interoperable, and Reusable (FAIR) principles
For a complete list of bioinformatics software applications developed by the Ma'ayan Lab, please visit the Resources page.
NIH-funded Centers:
- Data Resource Center (DRC) for the Common Fund Data Ecosystem (CFDE) (2023-2028)
- Mount Sinai's Proteogenomic Data Analysis Center (PGDAC) (2022-2027)
- ARCHS4, an Informatics Technology for Cancer Research (ITCR) Resource (2022-2027)
- Diabetes Data and Hypothesis Hub (D2H2) (2022-2025)
In the News:
- Mount Sinai Researchers Mined 800,000 Gene Sets by Scanning Supporting Materials of 6.4 Million Research Publications
- Icahn School of Medicine at Mount Sinai and the University of California San Diego Receive $8.5 Million Award to Establish a Data Integration Hub for NIH Common Fund Supported Programs
- Researchers Develop AI Model to Better Predict which Drugs May Cause Birth Defects
- Genes to Potentially Diagnose Long-Term Lyme Disease Identified
- Mount Sinai Designated as National Cancer Institute Proteogenomics Data Analysis Center
- Mount Sinai Lab Creates Shared Database to Help Scientists Find Drugs That Can Be Used to Treat COVID-19
- Ten Renowned Mount Sinai Faculty Members Honored at Convocation
- Mount Sinai Researchers Develop Software to Measure the Findability, Accessibility, Interoperability, and Reusability of Biomedical Digital Research Objects
- Mount Sinai Researchers Develop Tool that Analyzes Biomedical Data within Minutes
- Mount Sinai Researchers Receive NIH Grant to Develop New Ways to Share and Reuse Research Data
- Students Harness Big Data to Help Solve Medical Challenges
- Crowdsourcing for Scientific Discovery
- Genetics: Big Hopes for Big Data
Research Topics
Addiction, Aging, Bioinformatics, Biomedical Sciences, Biostatistics, Cancer, Computational Biology, Diabetes, Drug Design and Discovery, Gene Expressions, Gene Regulation, Genetics, Genomics, Kidney, Mass Spectrometry, Mathematical Modeling of Biomedical Systems, Mathematical and Computational Biology, Personalized Medicine, Pharmacogenomics, Pharmacology, Protein Complexes, Protein Kinases, Proteomics, Reprogramming, Signal Transduction, Stem Cells, Systems Biology, Systems Pharmacology, Technology & Innovation, Theoretical Biology, Transcription Factors, Viruses and Virology
Multi-Disciplinary Training Areas
Artificial Intelligence and Emerging Technologies in Medicine [AIET], Disease Mechanisms and Therapeutics [DMT], Genetics and Genomic Sciences [GGS]
Education
BSc, Fairleigh Dickinson University
MS, Fairleigh Dickinson University
PhD, Mount Sinai School of Medicine
Awards
2020
Mount Sinai Graduate School Alumni Award
Icahn School of Medicine at Mount Sinai
2013
Irma T. Hirschl Career Scientist Award
2011
Dr. Harold and Golden Lamport Research Award
Mount Sinai School of Medicine
2006
Doctoral Dissertation Award in the Graduate School of Biological Sciences
Mount Sinai School of Medicine
2006
Graduate School of Biological Sciences Award for Research Achievement
Mount Sinai School of Medicine
Research
Research Team:
Program Director: Sherry Jenkins, MS
Research Assistant Professor: Alexander Lachmann, PhD
Data Scientist: Daniel Clarke, MS
Bioinformatician: John Erol Evangelista, MS
Bioinformatics Software Engineers: Anna Byrd, MEng; Ido Diamant, BS; Andrew Lutsky, MS; Giacomo Marino, ScB, AB
Graduate Student: Abinanda Prabhakaran
2024 Undergrad and Post-bac Research Trainees: Bilal Ali, Eugenia Ampofo BA, Andrew Chung, Sophie Gideon, Eric Lee, Kareena Legare, Nathania Lingam, Tejal Nair, Lucas Sasaya, Andrew Stein
Summary of Research Studies:
Largest and Most Diverse Collection of Annotated Gene Sets
Gene set enrichment analysis is central to many biological and biomedical projects that measure mRNA and protein expression at the whole-genome scale. It is typically limited to a few literature-based background knowledge libraries, such as those created from the Gene Ontology and from pathway databases such as KEGG, WikiPathways, and Reactome. We have demonstrated that enrichment analysis can be expanded to use data from many other biological domains. In developing the tools Enrichr, Enrichr-KG, Rummagene, RummaGEO, kinase enrichment analysis (KEA), ChIP-X enrichment analysis (ChEA), and Harmonizome, we have integrated data from many key biomedical resources into useful gene set libraries. These libraries better inform enrichment analyses from omics studies. So far, over 2 million unique users have used these bioinformatics software applications, with a current rate of ~4,000 unique users per day.
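To illustrate the general idea of testing a query gene list against a gene set library, the following is a minimal sketch of a generic over-representation test based on the hypergeometric distribution. It is not taken from any of the tools listed above and is not a description of how they are implemented; the function name, the gene symbols, and the background size are hypothetical and chosen only for illustration.

```python
from math import comb


def enrichment_p_value(query, gene_set, background_size):
    """Return (overlap, p) for a right-tailed hypergeometric test.

    p is the probability of seeing at least the observed overlap when
    len(query) genes are drawn at random from a background of
    background_size genes, of which len(gene_set) belong to the set.
    """
    query, gene_set = set(query), set(gene_set)
    overlap = len(query & gene_set)
    n = len(query)        # genes drawn (the query list)
    K = len(gene_set)     # annotated genes in the library term
    N = background_size   # assumed size of the gene universe

    total = comb(N, n)
    # Sum P(X = i) for i from the observed overlap up to the maximum possible.
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(overlap, min(n, K) + 1)
    )
    return overlap, tail / total


# Hypothetical toy example: 3 query genes, a 4-gene set, 10 background genes.
overlap, p = enrichment_p_value(
    query=["GENE_A", "GENE_B", "GENE_C"],
    gene_set=["GENE_A", "GENE_B", "GENE_D", "GENE_E"],
    background_size=10,
)
print(overlap, round(p, 4))
```

In practice a test like this would be repeated for every term in a library, followed by a multiple-testing correction; both steps are omitted here to keep the sketch short.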
Original Methods to Identify Differentially Expressed Genes, Perform Gene Set Enrichment Analyses, and Benchmark these Data Analysis Methods
One of the key statistical tests in the field of transcriptomics is the identification of differentially expressed genes. We developed a multivariate method called the Characteristic Direction to better identify the “correct” differentially expressed genes. The Characteristic Direction method was extended to also perform improved enrichment analysis using a similar concept. Using a unique benchmarking strategy, we can objectively evaluate the Characteristic Direction method and many other leading methods for differential expression and enrichment analyses, such as limma, GSEA, and DESeq.
Translational Computational Research in Cancer and Kidney Disease
In collaboration with other experimental and computational biology laboratories, we have made great strides in the past several years in studying kidney disease, diabetes, HIV, and cancer. We have developed unique computational methods that led to the identification of potential targets and drugs for attenuating kidney fibrosis, diabetic kidney disease, and HIVAN. Our collaborative work also proposed treatment combinations for early-stage kidney disease intervention. These advances were made possible by applying algorithms that we developed, including Expression2Kinases, SigCom LINCS, and TargetRanger.
Innovative Bioinformatics Software Infrastructure
To lower the barrier to entry for bioinformaticians and to streamline the development of bioinformatics software applications, we developed Appyters. With Appyters, bioinformaticians can rapidly develop full-stack web-based bioinformatics applications from their Jupyter Notebooks. Currently, over 100 Appyters are available from the Appyters Catalog. For a CFDE Partnership project, our team developed the Playbook Workflow Builder, a platform that facilitates the visual, dynamic construction of bioinformatics workflows. Alongside these efforts, we also created FAIRshake, a flexible framework for performing manual and automated evaluation of digital objects for adherence to defined, community-established standards.
For more information, please visit the Ma'ayan Laboratory website.
Publications
Selected Publications
- Harmonizome 3.0: Integrated knowledge about genes and proteins from diverse multi-omics resources. Ido Diamant, Daniel J.B. Clarke, John Erol Evangelista, Nathania Lingam, Avi Ma'ayan. Nucleic Acids Research
- lncRNAlyzr: Enrichment Analysis for lncRNA Sets. John Erol Evangelista, Tahleel Ali-Nasser, Lauren E. Malek, Zhuorui Xie, Giacomo B. Marino, Assaf C. Bester, Avi Ma'ayan. Journal of Molecular Biology
- Protocol for using Multiomics2Targets to identify targets and driver kinases for cancer cohorts profiled with multi-omics assays. Giacomo B. Marino, Eden Z. Deng, Daniel J.B. Clarke, Ido Diamant, Adam C. Resnick, Weiping Ma, Pei Wang, Avi Ma'ayan. STAR Protocols