World's Largest Hub for Cancer Genomes Opens
1 May 2012 1:46 pm
Researchers in California today unveiled what they describe as the world's largest repository for cancer genomes. The database will make it easier for scientists to analyze the vast amounts of sequencing data pouring out of the U.S. National Cancer Institute's (NCI's) genome projects.
Cancer Genomics Hub (CGHub), built by a team at the University of California, Santa Cruz (UCSC), will hold raw sequencing data from The Cancer Genome Atlas (TCGA). The atlas is NCI's mammoth effort to sequence the DNA of normal cells and tumor cells from 10,000 people with 20 types of cancer. (In some cases the project is sequencing whole genomes; in other cases, only the 1% of the genome that codes for proteins.) CGHub will also hold data from NCI's childhood- and HIV-associated cancer genome projects. It will take over from NIH's National Center for Biotechnology Information, which had been collecting cancer sequencing data through last August.
Physically based at the San Diego Supercomputer Center, the CGHub computer system is ready to store 5 petabytes of DNA and RNA data from cancer patients. (TCGA is generating 10 terabytes of data a month, and will eventually produce 10 petabytes [10,000 terabytes] of data.)
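For a sense of scale, the figures above can be put through a quick back-of-the-envelope calculation. The Python sketch below is illustrative only: it assumes decimal units (1 petabyte = 1,000 terabytes, as in the bracketed conversion above) and, more importantly, a constant output of 10 terabytes a month, which the article does not claim. The function name `months_to_reach` is made up for this example.

```python
TB_PER_PB = 1_000  # decimal units, matching "10 petabytes [10,000 terabytes]"


def months_to_reach(target_pb: float, rate_tb_per_month: float) -> float:
    """Months needed to accumulate target_pb at a constant monthly rate.

    Assumption: the rate never changes. Real sequencing output may not
    stay flat, so treat the result as a rough bound, not a forecast.
    """
    if rate_tb_per_month <= 0:
        raise ValueError("rate_tb_per_month must be positive")
    return target_pb * TB_PER_PB / rate_tb_per_month


if __name__ == "__main__":
    rate = 10  # terabytes per month, as quoted for TCGA
    for label, target_pb in (("CGHub initial capacity", 5), ("TCGA projected total", 10)):
        months = months_to_reach(target_pb, rate)
        print(f"{label}: {target_pb} PB takes {months:.0f} months "
              f"(about {months / 12:.1f} years) at {rate} TB/month")
```

At a flat 10 terabytes a month, 5 petabytes would take 500 months to fill, which suggests the 10-petabyte projection assumes output will rise well above the current rate.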
TCGA is building a catalog of key cancer-driving genetic changes that researchers can use to develop treatments tailored to the genetics of an individual's tumor. A central database will allow researchers to compare mutations and miswired pathways across cancer types, says UCSC bioinformatician David Haussler, who is leading the project, which is funded with a $10.3 million contract from NCI: "What's very important is to gather the data in one place and make it easy for researchers to do cross-dataset comparisons." CGHub will not hold data from other international cancer genome projects, however.
For now, researchers will only be able to download the data. But sending genome data across the Internet is becoming impractical as datasets balloon in size (see our 2011 story "Will Computers Crash Genomics?"). Haussler says that eventually, researchers will be able to work on the data remotely on CGHub's servers through cloud computing, as NIH is doing with Amazon for data from its 1000 Genomes Project.
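The download bottleneck can be illustrated with a rough transfer-time estimate. The Python sketch below is a simplification under stated assumptions: the 1 gigabit-per-second link speed is a hypothetical figure chosen for illustration, not one reported for CGHub or its users, and the calculation ignores protocol overhead, congestion, and disk speed. The function name `transfer_hours` is likewise invented for this example.

```python
def transfer_hours(size_tb: float, link_gbit_per_s: float) -> float:
    """Idealized hours to move size_tb terabytes over a sustained link.

    Assumptions: decimal units (1 TB = 1e12 bytes, 1 Gbit = 1e9 bits)
    and a link that runs at full speed the whole time.
    """
    if link_gbit_per_s <= 0:
        raise ValueError("link_gbit_per_s must be positive")
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds / 3600


if __name__ == "__main__":
    # One month of TCGA output (10 TB, per the article) over a
    # hypothetical 1 Gbit/s connection.
    print(f"{transfer_hours(10, 1):.1f} hours")
```

Even under these generous assumptions, a single month of TCGA output would occupy such a link for most of a day, and the time grows linearly with dataset size.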