About tomato unigene build 2
May 2008
A new unigene build for tomato has been assembled from the following data:
- 323,277 ESTs from the tomato species
  - Solanum lycopersicum with 307,350 sequences
  - Solanum habrochaites with 8,255 sequences
  - Solanum pennellii with 7,812 sequences
  - Solanum pimpinellifolium with 8 sequences
  - Solanum peruvianum with 42 sequences
  - Solanum cheesmaniae with 4 sequences
  - Solanum lycopersicoides with 2 sequences
- New EST sequences were obtained from:
  - GenBank (dbEST, plus mRNA entries from the nucleotide database)
- The new build contains 42,257 unigenes, of which 24,020 are contigs and 18,237 are singletons.
- Analyses performed on the unigenes:
  - ESTScan and Longest6frame.pl, to predict peptides (39,967 and 43,366 peptides predicted, respectively)
  - InterProScan on the predicted peptides, to predict protein domains and associate Gene Ontology codes (6,626 and 1,482 distinct domains were associated with the two peptide datasets produced by the two prediction methods)
  - BLAST against GenBank NR, Arabidopsis and Swiss-Prot (30,791, 28,656 and 19,886 unigenes, respectively, have at least one match in these protein datasets)
- The range of unigene ids for this build is: SGN-U562593 through SGN-U604849.
- Sequence homology search using SGN BLAST.
- Bulk download for a unigene accession (or a list of accessions) using the SGN bulk download tool.
- Complete download of all the unigene sequences and annotations from the SGN FTP site.
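
The figures quoted above can be cross-checked with a little arithmetic. The following Python sketch is illustrative only: the helper names `unigene_number` and `in_build` are hypothetical and not part of any SGN tool, and the range check assumes that accessions are numbered consecutively within the build, which the text above implies but does not state outright.

```python
import re

# Figures quoted in the build description above.
CONTIGS = 24_020
SINGLETONS = 18_237
TOTAL_UNIGENES = 42_257
FIRST_ID = "SGN-U562593"
LAST_ID = "SGN-U604849"

# Accession format as it appears in the text: "SGN-U" followed by digits.
_ID_PATTERN = re.compile(r"^SGN-U(\d+)$")


def unigene_number(accession: str) -> int:
    """Return the numeric part of an accession such as 'SGN-U562593'."""
    match = _ID_PATTERN.match(accession.strip())
    if match is None:
        raise ValueError(f"not a unigene accession: {accession!r}")
    return int(match.group(1))


def in_build(accession: str) -> bool:
    """True if the accession number lies inside this build's ID range.

    Assumption: every number in the range belongs to this build.
    """
    low = unigene_number(FIRST_ID)
    high = unigene_number(LAST_ID)
    return low <= unigene_number(accession) <= high


if __name__ == "__main__":
    # Contigs plus singletons should give the stated unigene total.
    assert CONTIGS + SINGLETONS == TOTAL_UNIGENES

    # The inclusive size of the ID range.
    span = unigene_number(LAST_ID) - unigene_number(FIRST_ID) + 1
    print(span)  # 42257

    print(in_build("SGN-U600000"))  # True
```

The inclusive size of the ID range equals the stated number of unigenes (42,257), which is consistent with one accession per unigene. A check like `in_build` could be used to filter a list of accessions before submitting it to the bulk download tool.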

