v1 - 12/17/2005
This dataset contained 5707 repeats from the MIPS repeat set, 137 repeats from the unirepeats set that did not match the MIPS set, and 7 GenBank sequences of known tomato repeats (TGR repeats). The unirepeats set was derived from BAC ends using RepeatScout.

v2 - 08/06/2008
A combination of the PRI set from 2008-07-23 and a BAC-end-derived repeat set (unirepeats).

v3 - 08/12/2008
A combination of the PRI set from 2008-08-12 and a BAC-end-derived repeat set (unirepeats).

v4 - 11/04/2008
A combination of RepBase Viridiplantae and the PRI set. The unirepeats set was removed because it was too sensitive and interfered with annotation (it possibly contained large gene family sequences as well).

draft v5 - 02/25/2010
A combination of the v4 repeats plus the results of a RepeatScout analysis of tomato genome scaffolds v.1.03. The RepeatScout results were filtered: all sequences with gene homology (blastx matches against the Arabidopsis protein dataset with e-value < 1e-10), low complexity (detected with the nseg program), or tandem simple repeats (detected with the trf program) were removed from the dataset.

v5 - 03/23/2010
A combination of the v4 repeats plus the results of a RepeatScout analysis of tomato genome scaffolds v.1.03. The RepeatScout results were filtered: all sequences with gene homology (blastx matches against the Arabidopsis protein dataset with e-value < 1e-10), low complexity (detected with the nseg program), or tandem simple repeats (detected with the trf program) were removed from the dataset. Sequences redundant between repeat set v4 and the RepeatScout results were also removed. Finally, a RepeatMasker analysis was run using this dataset, and all sequences repeated fewer than 10 times were removed.
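
The v5 filtering criteria can be illustrated with a minimal Python sketch. It is not the script used to build the dataset. The Candidate record, its field names, and the filter_candidates helper are hypothetical, and the sketch assumes the blastx, nseg, trf, and RepeatMasker results have already been parsed into one record per RepeatScout sequence. Only the two thresholds (e-value < 1e-10 and fewer than 10 copies) come from the notes above. It also assumes the copy-count cutoff applies to the RepeatScout-derived candidates, which the notes do not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional

EVALUE_CUTOFF = 1e-10  # from the v5 notes: blastx e-value < 1e-10 counts as gene homology
MIN_COPIES = 10        # from the v5 notes: sequences repeated fewer than 10 times are removed


@dataclass
class Candidate:
    """Evidence collected for one RepeatScout sequence (hypothetical layout)."""
    seq_id: str
    best_blastx_evalue: Optional[float]  # None when there is no Arabidopsis protein hit
    low_complexity: bool                 # flagged by nseg
    tandem_repeat: bool                  # flagged by trf
    copies: int                          # number of RepeatMasker hits for this sequence


def keep(c: Candidate) -> bool:
    """Return True if the candidate passes every filter described for v5."""
    # Gene homology: drop anything with a blastx hit below the e-value cutoff.
    if c.best_blastx_evalue is not None and c.best_blastx_evalue < EVALUE_CUTOFF:
        return False
    # Low-complexity sequences and tandem simple repeats are dropped.
    if c.low_complexity or c.tandem_repeat:
        return False
    # Copy-number filter: keep only sequences found at least MIN_COPIES times.
    return c.copies >= MIN_COPIES


def filter_candidates(candidates: List[Candidate]) -> List[str]:
    """Return the IDs of the candidates that survive all filters, in input order."""
    return [c.seq_id for c in candidates if keep(c)]
```

Keeping the four kinds of evidence as separate fields makes it easy to report why a given sequence was dropped, which helps when a cutoff such as the copy-number threshold needs to be revisited.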