World's Largest Hub for Cancer Genomes Opens
1 May 2012 1:46 pm
Researchers in California today unveiled what they describe as the world's largest repository for cancer genomes. The database will make it easier for scientists to analyze the vast amounts of sequencing data pouring out of the U.S. National Cancer Institute's (NCI's) genome projects.
Cancer Genomics Hub (CGHub), built by a team at the University of California, Santa Cruz (UCSC), will hold raw sequencing data from The Cancer Genome Atlas (TCGA). The atlas is NCI's mammoth effort to sequence the DNA of normal cells and tumor cells from 10,000 people with 20 types of cancer. (In some cases the project is sequencing whole genomes; in other cases, only the 1% of the genome that codes for proteins.) CGHub will also hold data from NCI's childhood- and HIV-associated cancer genome projects. It will take over for NIH's National Center for Biotechnology Information, which had been collecting cancer sequencing data through last August.
Physically based at the San Diego Supercomputer Center, the CGHub computer system is ready to store 5 petabytes of DNA and RNA data from cancer patients. (TCGA is generating 10 terabytes of data a month, and will eventually produce 10 petabytes [10,000 terabytes] of data.)
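As a rough back-of-the-envelope check on those figures, the sketch below works out how long the quoted rate would take to reach each total. It assumes a constant 10 terabytes a month and decimal units (1 petabyte = 1,000 terabytes, as in the bracketed conversion above); the function name is illustrative only and is not part of CGHub.

```python
# Decimal units, matching the article's "10 petabytes [10,000 terabytes]".
TB_PER_PB = 1_000


def months_to_reach(target_pb: float, rate_tb_per_month: float) -> float:
    """Months needed to accumulate target_pb at a constant monthly rate (an assumption)."""
    if rate_tb_per_month <= 0:
        raise ValueError("rate must be positive")
    return target_pb * TB_PER_PB / rate_tb_per_month


# Figures quoted in the article: 5 PB capacity, 10 PB eventual total, 10 TB/month.
for label, petabytes in [("CGHub capacity", 5), ("TCGA eventual total", 10)]:
    months = months_to_reach(petabytes, 10)
    print(f"{label}: {months:.0f} months (about {months / 12:.1f} years)")
```

At a flat 10 terabytes a month, even the 5-petabyte capacity would take 500 months to fill, so the 10-petabyte projection implies that the monthly output is expected to grow well beyond the current rate.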
TCGA is building a catalog of key cancer-driving genetic changes that researchers can use to develop treatments tailored to the genetics of an individual's tumor. A central database will allow researchers to compare mutations and miswired pathways across cancer types, says UCSC bioinformatician David Haussler, who is leading the project, which is funded with a $10.3 million contract from NCI: "What's very important is to gather the data in one place and make it easy for researchers to do cross-dataset comparisons." CGHub will not, however, hold data from other international cancer genome projects.
For now, researchers will only be able to download the data. But sending genome data across the Internet is becoming impractical as datasets balloon in size (see our 2011 story "Will Computers Crash Genomics?"). Haussler says that eventually, researchers will be able to work on the data remotely on CGHub's servers through cloud computing, as NIH is doing with Amazon for data from its 1000 Genomes Project.