The Nicotiana benthamiana genome data:

  \annotation  directory with the gene model annotation files.
  \assembly    directory with the genome assembly files.
  \repeats     directory with the repeats dataset.

Currently there are three genome assembly versions and one genome annotation.

============================================================
About genome assembly:
============================================================

Niben.v0.3, assembly made with 1 Illumina HiSeq2000 run: 5 lanes of
PE-500bp, 1 lane of MP-2Kb and 1 lane of MP-5Kb, using ABySS as the
assembler (k-mer size 63).

Niben.v0.4.3, the same as v0.3 but using SOAPdenovo as the assembler,
with the SOAP scripts used to correct the reads and to fill the gaps.

Niben.v0.4.4, the same assembly as v0.4.3 with chloroplast sequence
filtering applied.

For more information:
Bombarely A. et al. 2012
A draft genome sequence of Nicotiana benthamiana to enhance molecular
plant-microbe biology research
(Accepted for publication)
http://dx.doi.org/10.1094/MPMI-06-12-0148-TA

About sequence IDs and assembly files.

There are two assembly files per version:

  Niben.genome.vX.X.scaffolds.nrcontigs.fasta, with the scaffolds and the
  contigs that are not included in the scaffolds (nrcontigs).

  Niben.genome.vX.X.contigs.fasta, with all the contigs.

ID for scaffolds: NibenXXXScfYYYYYYYY
where XXX is the assembly version and YYYYYYYY is the scaffold number.

ID for nrcontigs: NibenXXXCtgYYYYYYYY
where XXX is the assembly version and YYYYYYYY is the contig number.

ID for contigs from scaffolds: NibenXXXScfYYYYYYYYCtgZZZ
where XXX is the assembly version, YYYYYYYY is the scaffold number and
ZZZ is the contig number, counted from the 5-prime end of the scaffold.

============================================================
About the genome annotation:
============================================================

The current version was produced on the assembly version Niben.v0.4.4
(scaffolds and nrcontigs) using Maker (Cantarel B. et al.
2008) and RNAseq data supplied by Prof. Gregory Martin.

ID for gene models: NbXYYYYYYYYgZZZZ
where X is 'S' for scaffolds and 'C' for nrcontigs, YYYYYYYY is the
number of the scaffold or the contig and ZZZZ is the gene number,
counted from the 5-prime end of the scaffold or nrcontig sequence.

See the README file in the annotation folder for more information.

============================================================

For any questions, please contact Prof. Gregory Martin (gbm7@cornell.edu)

Ithaca, NY, Oct. 17th, 2012
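
============================================================
Appendix: parsing the sequence IDs (illustrative sketch)
============================================================

The ID conventions described in this README can be checked mechanically.
The Python sketch below is only an illustration and is not part of the
data release. It assumes that each placeholder letter stands for exactly
one digit (XXX = 3 digits, YYYYYYYY = 8 digits, ZZZ = 3 digits,
ZZZZ = 4 digits) and that the assembly version is written as digits
without dots (for example 044 for v0.4.4). This README does not state
either point explicitly. The names PATTERNS and classify_id are
hypothetical, and the IDs in the usage lines are made up.

```python
import re

# Digit widths are an assumption: one digit per placeholder letter
# in the ID templates given in this README.
PATTERNS = [
    # Contig that belongs to a scaffold: NibenXXXScfYYYYYYYYCtgZZZ
    ("scaffold_contig",
     re.compile(r"^Niben(?P<version>\d{3})Scf(?P<scaffold>\d{8})Ctg(?P<contig>\d{3})$")),
    # Scaffold: NibenXXXScfYYYYYYYY
    ("scaffold",
     re.compile(r"^Niben(?P<version>\d{3})Scf(?P<scaffold>\d{8})$")),
    # Contig not included in any scaffold (nrcontig): NibenXXXCtgYYYYYYYY
    ("nrcontig",
     re.compile(r"^Niben(?P<version>\d{3})Ctg(?P<contig>\d{8})$")),
    # Gene model: NbXYYYYYYYYgZZZZ, X is 'S' (scaffold) or 'C' (nrcontig)
    ("gene_model",
     re.compile(r"^Nb(?P<source>[SC])(?P<number>\d{8})g(?P<gene>\d{4})$")),
]


def classify_id(seq_id):
    """Return a dict with the ID kind and its fields, or None if no pattern fits."""
    for kind, pattern in PATTERNS:
        match = pattern.match(seq_id)
        if match:
            return {"kind": kind, **match.groupdict()}
    return None


if __name__ == "__main__":
    # Made-up IDs that follow the templates above.
    for example in ("Niben044Scf00000001",
                    "Niben044Scf00000001Ctg002",
                    "Niben044Ctg00000150",
                    "NbS00000012g0003"):
        print(example, classify_id(example))
```

All fields are kept as strings so that leading zeros are preserved. If
the real IDs use different digit widths, only the regular expressions
need to change.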