Date: 14 Aug 1997

ftp://www-hgc.lbl.gov/pub/genesets/Drosophila/GENIE_96/

This directory contains a set of GenBank flat-file format entries to be
used to test and train gene-finding algorithms for Drosophila
melanogaster.

UCSC and LBNL provide these data sets "as is" in an effort to create a
common data set to be used by all gene-finding algorithms. We encourage
others to compare their results using these data sets.

Accompanying each data set is a ".sets" file listing 5 test/train
subsets. These subsets can be used for cross-validation.

The data sets are:

  Multiple exon genes:
    multi_exon_GB.dat.gz
    multi_exon_GB.sets

In addition, several Perl and shell routines are provided in the
scripts directory (ftp://www-hgc.lbl.gov/pub/genesets/scripts/).

=====
Martin Reese (LBNL)
mgreese@lbl.gov