PAA/SOL Meeting Genomics Session

Meeting called to order: 8:32 am

LUKAS: (opening remarks)

Presentation - Jim Giovannoni, chromosomes 1, 10, and 11
There are 3 BAC libraries being used for the sequencing project. They have been BAC-end sequenced in order to help with assembling the tiling path. Roughly 20% of the genome sequence is already contained in the BAC ends. In Steve Tanksley's lab, the HindIII BACs have been anchored to the physical map via overgo probes. There is also fingerprint contigging data for the HindIII library. My (JG's) lab makes these libraries available to whoever wants them. Filters are also available for them. Keep in mind that there will of course be some errors in these libraries. Please be sure to verify your BACs before doing any extensive work. Also, be sure to check the SGN website regularly for news, and for new tools being developed for the sequencing project.
(slide: summary of new US tomato seq project)
Coming up, SGN will be expanded, with emphasis on development of 'end game' tools for completing sequencing. To help fill in gaps, a fosmid library (approx 40 kb/clone) is being developed.

Presentation - Sung-Hwan Cho (KRIBB), chromosome 2
Using pachytene chromosomes, we measured the heterochromatic region on chromosome 2 and found it to be 2 Mb shorter than displayed on the SGN website. We have identified 40 seed BACs on chromosome 2 for sequencing. We did 4 rounds of BAC extensions, identifying 42 more BACs for sequencing. 76 BACs are now in finishing, 6 in the pipeline.
We have some problems: at this point, we have not been able to identify any more extension BACs from where we are now. Also, there is a big gap (about 2.29 Mb) in the chromosome in which we cannot find any anchored BACs. We are eagerly awaiting the UK's new Mbo FPC map.
JG: One thing that might help you in the meantime would be some new anchoring data on SGN from aligning markers to BAC sequences.
Also, another possibility might be to take the appropriate pennellii introgression line and try to identify some new markers from that.

-- (9:02 pause for group picture) --

Presentation - Eileen Wang, chromosome 3
Right now, we are sequencing the chr3 euchromatic region. We are still doing the first of the seed BACs. We are also working toward constructing a genome-wide physical map, using manual editing of FPC results and PCR screening of anchor BACs. Like Korea, we also have a huge anchoring gap in our chromosome. On the long arm, we started with a BAC anchored with T0772, which turned out to be still in the heterochromatin.
Progress: 20 BACs confirmed by FISH, 18 in euchromatin, 2 in heterochromatin. 9 have been finished, 11 are at stage 2. There are about 5 BACs that have a lot of repeats and are difficult to finish. Through manual editing of the FPC map, we reduced the contig number from 6794 to 3000. We have it displayed on the Chinese site, http://tomato.genetics.ac.cn/TomatoFPC/
We are also eager to see the Mbo FPC results from the UK.

Presentation - Christine Nicholson and Karen McLaren (finishing group), chromosome 4
By the end of the year, we hope to have 80% of our BACs in sequencing. Our FISH results are coming up with different marker orders from the tomato-expen2000 map around the centromere. We are also finding euchromatic regions within the heterochromatin.
With our FPC contigging, we've generated over 43,000 fingerprints. We incorporated these into the original AGI build. We also confirmed our fingerprint results with BLAST searches against the BAC ends. Fingerprints are available on our ftp site. We are right now working on increasing our FPC results coverage of markers, currently at 57.
Some questions for discussion:
How do we determine gene space has been sequenced?
How might we harmonize our HTGS phases with NCBI?
What methods should we use to report clone order and orientation?

Presentation - Jitendra Khurana, chromosome 5
3 centers are participating in chromosome 5 in India. We found and sequenced 2 BACs that ultimately turned out to be on chromosome 7.
An announcement: we plan to organize a workshop on metagenomics, 1-14th of November this year.

Presentation - Roeland van Hamm, chromosome 6
New developments: only 23 seed BACs anchored, anchoring problems shown by FISH analysis. Our sequencing costs have gone down, so we will soon be sequencing much more. We have adjusted our goals and will now sequence the complete euchromatic part. With our FISH pipeline, we are preparing a multi-FISH experiment to try to determine whether we also face large gaps in our seed BAC layout. We are experimenting with 454 sequencing right now. We are also moving from AFLP to snapshot fingerprinting. We'll be starting closure sequencing in Q4 2006.

Presentation - Erika Asamizu, chromosome 8
Status: 21 BACs finished so far. We found an interesting problem: we sequenced a 40 kb BAC, but the marker sequence used to anchor it was not found in the finished sequence (though it was found in the shotgun reads). We think this may have been due to a deletion happening inside the bacterium used for cloning, with the deleted clone preferentially replicating.

Presentation - Farid Regad, chromosome 7
Funding is now secure, started Jan. 2006. 1 BAC finished, 9 being subcloned, 17 in phase 1, 3 in phase 2. We hope to have finished 21 by Sept., 70 by next Jan., 150 by next June, 277 by March 2008.

Presentation - Antonio Granell, chromosome 9
Finished 9 BACs, 11 more in pipeline. Some clones are contaminated with other clones; some clones have no restriction digest data available.

Presentation - Silvana Grandillo, chromosome 12
Moving from using PCR to using plasmid mini-preparations, avg insert 2 kb. 18 seed BACs selected to date. Total of 23 BACs in the seq pipeline. 4 seed BACs have been finished and submitted. A new BAC extension tool is available at the CRIBI site.

DISCUSSION

LM: Back to the discussion points raised by the chr4 team. How do we determine gene space has been sequenced? We don't really have the tools right now. We're working on a test set for training gene predictors.
D. Buchan: How do we make the distinction between a BAC being heterochromatic and euchromatic?
?: If you really wanted to, you can sequence heterochromatin.
R. van Hamm: Potato is sequencing the whole thing.
Eileen Wang: We did some analysis of 5 euchromatic BACs and 16 heterochromatic BACs, and the results are striking in that the heterochromatic BACs are really full of repeats. Before I left Cornell, we identified some BACs around the boundaries. You can look at the BACs already done to see the gene and repeat density.
LM: We are probably relatively close to the truth with our various estimates of gene density and total gene content. On SGN we have a paper up about this, and the other estimates put forth here seem to also agree with those numbers. Also, if we estimate how many of our unigenes are in the sequenced set, we can make some estimates that way.

SOL STEERING

Dani Zamir: Description of future meeting sites. Description of Latin SOL. Description of Andre Kessler's EcoSol.

== Sol 100 ==

Chris Comer[sp?]: How to coordinate groups doing sequencing (of cDNA?) in other 3000 Sol. species? The costs of genome sequencing make it feasible to consider sequencing all Sol. Start with 100, called Sol 100. "Let's do it!" Big round of applause.
Sandy Knapp: Issues: what kind of plants, plant material is needed? Would sequencing 100 genomes (for a start) be a good idea?
DZ: We need to deliver results before people will invest in this (in the next year). Of the 100 species selected, 80 will be Sol., 20 Asterids (including coffee). Selected species must be systematically correct. Must contact people who know about the species and what's interesting about them.
Rob Last: One possible criterion for interestingness might be which plants can be useful for energy, as in biofuels.
SK: Criteria must come from the community. Another prospective criterion is medicinal use.
DZ: How will this be funded? Funding agencies (e.g. the DoE) can perhaps be sold both on the practical benefits of sequencing prospective biofuel species and on funding networks of knowledge about the family. By next year in Korea a better picture is needed.
Roeland ??: What time frame?
DZ: By 2008 the euchromatin in Sol. lyco. will be done. By 2007 in Korea, a paper describing the goals of Sol 100 should be prepared. Part of this project should include sequencing of heterochromatin in tomato.
RL: How quickly will the costs of sequencing come down?
SK: We have to watch the costs, and make sure that the grant proposals are put in such that the funding arrives when the costs bottom out.
DZ: We should prepare the plan irrespective of the cost of sequencing.
SK: We could do things incrementally as things get cheaper. This requires a stratified sampling scheme.
Giovanni Giuliano: A scaffold with Sanger will be necessary to take advantage of the new technologies for de novo sequencing.
RL: In addition to a sampling strategy, tactics for getting funding will be needed.
SK: In addition, we'll need a sampling strategy.
DZ: Is Sol 100 how the Sol project will have to go after 2008? Show of hands, majority said yes.
RL: What about functional genomics?
DZ: EU-Sol is a functional genomics project. UC Davis has a project submitted. Latin Sol is also a functional genomics project.
(potato person 1): What are the plans for coordinating tomato and potato sequencing projects?
DZ: Once sequencing has gotten far enough for both, then things can be joined.
Heiko Schoof: Now is the right time.
DZ: Maybe now is the right time. This would demonstrate that genomes can be integrated, which would help for Sol 100's credibility.
GG: How is the potato sequencing project structured?
Robin Buell from TIGR: BAC by BAC, funding has been obtained, international effort. Goal to finish by 2009. Whole chromosome, not just euchromatin.
RB: Current limiting factor is unavailability of BAC sequence. Nicotiana sequence will help.
GG: How will the potato-tomato-nicotiana integration be done?
DZ: SK, SZ, others will figure it out.
HS: Invite the potato people to the tomato BAC annotation meeting in October.

Break for lunch.