BDGP: Drosophila Gene Collection

Drosophila Gene Collection

After careful analysis of over 80,000 ESTs to find full-length cDNA clones, including verification of the 3' ends of these clones, we have selected 5,849 non-redundant cDNA clones for single colony purification and re-arraying (see What is the Drosophila Gene Collection? for more information). This is not a complete set of genes for the Drosophila genome, which has been reported to have 13,601 genes. Version 1.0 has been released as of August 2000 as bacterial glycerol stocks in 384-well plates; it has also been released to others as plasmid DNA stocks in 96-well plates. The stocks have been released to sites chosen by the Drosophila Board, as well as to distributors.

With the completion of the 2001 EST project, we have now analyzed over 240,000 ESTs. Using new computational methods based upon the annotated genomic sequence, we have identified more than 5,000 new cDNA clones to add to the DGC. DGCr2.0 has been released as of February 2003 as bacterial glycerol stocks in 384-well plates. Here is the list of labs and distributors.

Generation of the Release 3 annotation of the genome made extensive use of our full-insert sequence data. In the course of that effort, human curators identified a total of 1,860 clones that have become the DGCr3. The DGCr3 currently includes clones chosen to replace clones with truncated ORFs, clones for genes that are not currently represented in the DGC, and clones that represent putative alternative splicing forms.

A directed screen for clones capturing unrepresented genes in the collection has led to a set of 2,148 non-redundant clones. This set represents rare transcripts, many of which alter current annotations of the genome. Note that because of the nature of this screen and the rarity of the transcripts, some of these clones represent only fragments of the presumed transcripts or may have other artifacts.

The DGC Gold collection comprises clones with verified full-length ORFs and will be useful for proteomics and functional genomics.

Single flat file in multiple FASTA format of the EST sequences of the cDNA clones comprising the DGC. File size: 35.0 MB.

Single flat file in multiple FASTA format of all currently available full-insert sequences of the cDNA clones comprising the DGC. File size: 20.7 MB.
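
Because both downloads are single files holding many sequences in FASTA format, a small reader can split them into individual records. The following is a minimal sketch, not a BDGP tool: it assumes the conventional FASTA layout (a header line starting with ">" followed by one or more sequence lines), and the function name read_fasta and the sample headers are hypothetical. The header layout of the actual DGC files is not described here and may differ.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from multi-FASTA text, one pair per record."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines of one record are concatenated.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical records for illustration; real DGC headers may look different.
example = """>clone_A example
ACGTAC
GTTA
>clone_B example
GGCC
"""
records = list(read_fasta(example.splitlines()))
for name, seq in records:
    print(name, len(seq))
```

Since the function accepts any iterable of lines, an open file handle can be passed in directly, so a file of tens of megabytes is processed one record at a time rather than loaded whole.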

M Stapleton, J Carlson, P Brokstein, C Yu, M Champe, R George, H Guarin, B Kronmiller, J Pacleb, S Park, K Wan, GM Rubin and SE Celniker Genome Biology (2002) 3 (12):research0080.1-0080.8

M Stapleton, G Liao, P Brokstein, L Hong, P Carninci, T Shiraki, Y Hayashizaki, M Champe, J Pacleb, K Wan, C Yu, J Carlson, R George, S Celniker, and GM Rubin Genome Research (2002) 12 (8):1294-1300

G Rubin et al., Science (2000) 287: 2222-2224.