RENCI at UNC-Chapel Hill

The Carolina Center for Exploratory Genetic Analysis

Overview
The Carolina Center for Exploratory Genetic Analysis (CCEGA) is developing an interdisciplinary infrastructure to identify the complex genetic traits that underlie human diseases, bringing together data from clinical studies, population studies and model systems. CCEGA believes the next breakthroughs in our understanding of biology and disease will be made possible by the integrated analysis of genetic data and its expression as phenotypes. CCEGA's work centers on enabling this kind of multidisciplinary, multi-investigator research. The center involves three complementary groups of scientists at the University of North Carolina at Chapel Hill: (a) experimental geneticists, (b) quantitative experts in statistics and biostatistics, and (c) computer scientists with expertise in algorithm development, software construction, and high-performance computing.

Phase one of CCEGA focuses on building a community of investigators and deploying a prototype infrastructure for analyzing relationships among genotypes and phenotypes in three contexts:

  • Family linkage studies, which examine the relationship between genotypes and susceptibility to specific diseases and conditions, in this case alcohol addiction.
  • Gene expression profile studies, which develop a picture of genes and cellular activity in order to identify patterns and signatures related to disease, in this case breast cancer.
  • Public health studies, which look at communities and their risk factors for diseases, in this case atherosclerosis.

The RENCI Contribution
To accommodate the diverse, multi-investigator databases necessary to answer these complex questions, RENCI is working with scientists to develop a prototype, extensible data model and provide access to data via a portal constructed using the Open Grid Computing Environment toolkit. The newest methods of integrated data analysis will be incorporated into a portal-based workflow. These include new techniques in linkage analysis (oligogenic analysis, multivariate linkage analysis, epistasis, and genotype by environment interaction), subspace clustering, and association analysis (quantitative trait and nucleotide analysis).

RENCI and its scientific partners are also exploring new visualization techniques for examining and interacting with large data sets, as well as high-performance computing for implementing computationally intensive analysis techniques. To reduce the barriers between data providers and data analyzers, CCEGA and RENCI conduct intensive, specialized workshops, colloquia and intramural meetings.

Funding
National Institutes of Health/National Center for Research Resources, Grant Number 5-P20-RR020751-01-02

Project Leaders
    Daniel A. Reed, RENCI (Principal Investigator)
Co-Principal Investigators at UNC Chapel Hill
    James Evans, Terry Magnuson, Karen Mohlke, Fernando Manuel Pardo, Charles Perou, Patrick Sullivan, David Threadgill, Kirk Wilhelmsen, Department of Genetics
    Susan Paulsen, Jan Prins, Wei Wang, Department of Computer Science
    Fred Wright, Fei Zou, Department of Biostatistics
    Bradley Hemminger, School of Information and Library Science
    Andrew Nobel, Department of Statistics
    Kari North, Department of Epidemiology
    Alexander Tropsha, School of Pharmacy
    K.T.L. Vaughan, Health Sciences Library
RENCI Team
    Xiaojun Guan
    Kevin Gamiel
    Clark Jeffries
    Jeff Tilson

Publications

Jeffries, C. Hairpin Database: Why and How? Genomic Impact of Eukaryotic Transposable Elements conference, Asilomar, CA, April 2006.

Jeffries, C. Bipartite and tripartite systems and matrices from genetic control research, Linear Algebra and its Applications 409 (2005) 70-78.

Jeffries, C., Jarstfer, M., Perkins, D.: Folded RNA from an intron of one gene might inhibit expression of a competing gene, in silico Biology 5 (2005), 0037.

Jeffries, C., Perkins, D., Jarstfer, M.: Systematic discovery of the grammar of translational inhibition by RNA hairpins, Journal of Theoretical Biology (accepted for publication).

J. Liu, S. Paulsen, X. Sun, W. Wang, A. Nobel, J. Prins, "Mining Approximate Frequent Itemsets In the Presence of Noise: Algorithm and Analysis", SIAM Conference on Data Mining (SDM), 2006.

J. Liu, S. Paulsen, W. Wang, A. Nobel, J. Prins, "Mining Approximate Frequent Itemsets from Noisy Data", Proceedings of the 5th IEEE International Conference on Data Mining (ICDM), 2005.

Hemminger BM, Saelim B, Sullivan PF. TAMAL: An integrated approach to choosing SNPs for genetic studies of human complex traits. Bioinformatics 2006.

Presentations

From the First CCEGA Workshop, January 21, 2005
Introduction and Context Dan Reed, Chancellor's Eminent Professor, Vice-Chancellor for Information Technology and CIO, Director, Renaissance Computing Institute (RENCI)

Workshop Format Kirk Wilhelmsen, Department of Genetics

Addiction Family Study Kirk Wilhelmsen, Department of Genetics

Strong Heart Kari North, Epidemiology

Diabetes, Fusion Karen Mohlke, Department of Genetics

CATIE (Clinical Antipsychotic Trial of Intervention Effectiveness), Schizophrenia Pat Sullivan, Department of Genetics

Cystic Fibrosis Mike Knowles, Department of Medicine

Cancer Epidemiology Bob Millikan, Epidemiology

Head and Neck Epidemiology Andy Olshan, Epidemiology

Renal Disease Gene Expression Ron Falk, Department of Medicine

ELSI/Prospective Studies Jim Evans, Department of Genetics


CCEGA Analysis Methods Workshop, May 4, 2005
Introduction NIH Site Visit, May 4, 2005

Linkage analysis / family-based association studies Kari North, Epidemiology

Model system for evaluation of data mining techniques Susan Paulsen, Computer Science

Subspace clustering methods Wei Wang, Computer Science

Visualization of high-dimensional data Leonard McMillan, Computer Science

Complex phenotypes: schizophrenia and ventricle morphology Guido Gerig, Psychiatry and Computer Science

Realistic simulation of genotypes Fred Wright, Biostatistics

Genetics viewpoint Pat Sullivan, Genetics


NIH Site Visit, May 4, 2005
Introduction Dan Reed, Chancellor's Eminent Professor, Vice-Chancellor for Information Technology and CIO, Director, Renaissance Computing Institute (RENCI)

Project Overview Kirk Wilhelmsen, Department of Genetics

ELSI Working Group Jim Evans, Department of Genetics

Informatics Working Group Brad Hemminger, Information and Library Science

Analysis Working Group Jan Prins, Department of Computer Science

NIH Roadmap Program Greg Farber, NIH


CCEGA Workshop, Feb 2, 2007
Introduction Kirk Wilhelmsen, Department of Genetics

Data Modeling, Informatics Working Group Brad Hemminger, School of Information and Library Science

Realistic Simulation of Genotypes Fred Wright, William Barry, Department of Biostatistics

Random Forest on a Culled Set of SNPs Susan Paulsen, Jan Prins, Department of Computer Science

Preliminary Statistical Analysis of Bakeoff Data Fei Zou, Seunggeun Lee, Department of Biostatistics

Analysis of Simulated Genetic Data Based on Goodness of Fit Chi-square Test Alex Tropsha, Alexander Golbraikh, School of Pharmacy, Steve Marron, Department of Statistics

Bakeoff Summary Fred Wright, William Barry, Department of Biostatistics


Partners
RENCI
University of North Carolina at Chapel Hill

Links
CCEGA Website
Working Groups

RENCI @ UNC-Chapel Hill | ITS Manning | Chapel Hill, North Carolina 27117