Initial data release and announcement of the 10,000 Fish ...


Initial data release and announcement of the 10,000 Fish Genomes Project (Fish10K)

Authors (affiliation numbers in brackets; ORCID where given):

Guangyi Fan [1, 2, 3], https://orcid.org/0000-0001-7365-1590
Yue Song [1], https://orcid.org/0000-0002-2974-6442
Liandong Yang [4], https://orcid.org/0000-0001-7570-0296
Xiaoyun Huang [1]
Suyu Zhang [1], https://orcid.org/0000-0002-0653-9846
Mengqi Zhang [1], https://orcid.org/0000-0002-5641-0557
Xianwei Yang [1], https://orcid.org/0000-0003-4388-9674
Yue Chang [1], https://orcid.org/0000-0002-6902-9931
He Zhang [1, 2], https://orcid.org/0000-0001-9294-1403
Yongxin Li [5]
Shanshan Liu [1], https://orcid.org/0000-0002-5756-1728
Lili Yu [1], https://orcid.org/0000-0003-0435-0385
Jeffery Chu [6]
Inge Seim [7, 8], https://orcid.org/0000-0001-8594-7217
Chenguang Feng [5]
Thomas J Near [9]
Rod A Wing [10], https://orcid.org/0000-0001-6633-6226
Wen Wang [5], https://orcid.org/0000-0002-7801-2066
Kun Wang [5], https://orcid.org/0000-0001-6059-6529
Jing Wang [11, 12, 13]
Xun Xu [2], https://orcid.org/0000-0002-5338-5173
Huanming Yang [1, 2], https://orcid.org/0000-0002-0858-3410
Xin Liu [1, 2, 3], https://orcid.org/0000-0003-3256-2940
Nansheng Chen [11, 12, 13, 14], https://orcid.org/0000-0002-6361-964X
Shunping He [4], https://orcid.org/0000-0001-9087-7890

Affiliations:

[1] BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China
[2] BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
[3] State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
[4] Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, No. 7 Donghu South Road, Wuchang District, Wuhan, Hubei Province, China
[5] Center for Ecological and Environmental Sciences, Northwestern Polytechnical University, 1 Dongxiang Road, Chang'an District, Xi'an, Shaanxi, 710129, China
[6] Frasergen, Donghu High-Tech Development Zone, Wuhan, Hubei Province, China
[7] Integrative Biology Laboratory, College of Life Sciences, Nanjing Normal University, No. 1 Wenyuan Road, Qixia District, Nanjing, 210023, China
[8] Comparative and Endocrine Biology Laboratory, Translational Research Institute-Institute of Health and Biomedical Innovation, School of Biomedical Sciences, Queensland University of Technology, Brisbane 4102, Queensland, Australia
[9] Department of Ecology & Evolutionary Biology, Yale University, New Haven, CT 06511, USA
[10] Biological and Environmental Sciences & Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Kingdom of Saudi Arabia
[11] Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, 7 Nanhai Road, Qingdao, Shandong 266071, China
[12] Marine Ecology and Environmental Science Laboratory, Pilot National Laboratory for Marine Science and Technology, 1 Wenhai Road, Aoshanwei, Jimo, Qingdao, Shandong, 266237, China
[13] Center for Ocean Mega-Science, Chinese Academy of Sciences, No. 7, Nanhai Road, Qingdao City, 266400, China
[14] Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada

Correspondence address. Shunping He. E-mail: [email protected].

Author notes: Equal contribution.

GigaScience, Volume 9, Issue 8, August 2020, giaa080, https://doi.org/10.1093/gigascience/giaa080

Received: 13 April 2020. Revision received: 23 June 2020. Accepted: 03 July 2020. Published: 18 August 2020.

Abstract

Background

With more than 30,000 species, fish (including bony, jawless, and cartilaginous fish) are the largest vertebrate group, and include some of the earliest vertebrates. Despite their critical roles in many ecosystems and human society, fish genomics lags behind work on birds and mammals. This severely limits our understanding of evolution and hinders progress on the conservation and sustainable utilization of fish.

Results

Here, we announce the Fish10K project, a portion of the Earth BioGenome Project aiming to sequence 10,000 representative fish genomes in a systematic fashion within 10 years, and we officially welcome collaborators to join this effort. As a step towards this goal, we herein describe a feasible workflow for the procurement and storage of biospecimens, as well as sequencing and assembly strategies.

Conclusions

To illustrate, we present the genomes of 10 fish species from a cohort of 93 species chosen for technology development.

Introduction

Fish genomes sequenced to date

As of the time of this writing, genome assemblies are publicly available for fewer than 1% of fish species (244 species, as sourced from NCBI when accessed on 21 April 2020; Supplementary Table 1). Their assembly lengths range from 302.36 Mb (Diretmus argenteus) to 4.47 Gb (Scyliorhinus torazame), with an average length of 872.64 Mb. The average scaffold N50 and contig N50 values are 8.82 Mb and 914.18 Kb, respectively, while the median scaffold N50 and contig N50 are 613.59 Kb and 20.82 Kb, respectively. There are 112 species with a scaffold N50 of more than 1 Mb, of which 43 have a contig N50 above 1 Mb (Fig. 1).

These genomes have fueled a number of studies on the phylogeny and evolution of fish (e.g., the African coelacanth), evolutionary processes of specific fish subgroups (e.g., the elephant shark genome illustrating the phylogenetic relationship of Chondrichthyes as a sister group to bony vertebrates), genetic mechanisms of adaptation to different environments (e.g., the deep-sea Mariana Trench snailfish and cave-dwelling fish), and specific biological processes (e.g., the evolutionary process of ZW sex chromosomes). Nevertheless, the current fish genome sequencing results are only a drop in the ocean, and numerous critical research questions remain to be resolved. A non-exhaustive list includes gaining comprehensive and clear understandings of fish phylogeny, genome size diversity and chromosome evolution, diverse environmental adaptations, morphology evolution, respiratory system, immune system, and the evolution and function of ultraconserved elements and conserved nonexonic elements.

Figure 1: Assembly statistics of fish genomes in public databases. (a) Summary of genome size. (b, c) N50 statistics. A scaffold is a set of contigs linked together with gaps introduced in between. N50 is the length-weighted median contig size of the genomic assembly: contigs of this length or longer contain at least half of the assembled bases. It is a metric that can be used to evaluate the quality of a genome assembly.

The era of genome consortiums

With the rapid development of DNA sequencing technology, this is the time for large-scale, collaborative genomic studies to map the vertebrate tree of life. The first such project was the Genome 10K, established in 2009, which aimed to sequence and assemble genomes of about 10,000 vertebrate species [1]. The Earth BioGenome Project aims to sequence, catalog, and characterize the genomes of all of Earth's eukaryotic biodiversity [2]. The Vertebrate Genomes Project (VGP) was launched in 2017 to generate chromosome-level, haplotype-phased genome assemblies of all vertebrate species [3]. The Bird 10,000 Genomes Project (B10K) was initiated [4] after a successful phylogenomic study on 45 avian genomes in 2014. The B10K project aims to sequence and assemble all known bird species in 3 phases. Despite current challenges in funding, sampling, sequencing, assembly, and data analysis, these projects have already made substantial progress.

For fish, which make up more than half of all vertebrate species, no exclusive fish genome projects have been initiated at a similar scale. To our knowledge, the only large-scale genomic study was Fish-T1K (Transcriptomes of 1,000 Fishes), which aimed to sequence the transcriptomes (RNA sequence) of ray-finned fish [5]. However, the insights gained from transcriptome data alone are relatively limited. Accelerating fish genomics by large-scale genome sequencing efforts would undoubtedly boost research into fish biodiversity, speciation, and adaptation, as well as aiding the conservation and sustainable utilization of fish.

The Fish10K Genome Project

We here announce the Fish10K Genome Project, a sub-project of the Earth BioGenome Project aiming to sample, sequence, assemble, and analyze genomes of 10,000 fish species. We are proposing an effective and integrated workflow in which major challenges are addressed and in which high-quality reference genomes are constructed (chromosome level in Phases I and II and a scaffold-level assembly with a scaffold N50 larger than 1 Mb in Phase III). Through developing and applying effective analysis methods, we will be able to address critical evolutionary and biological research questions related to fish. In order to prove the efficiency of our workflow and the feasibility of this large-scale genome project, 10 species from 93 collected samples are used to validate our new sequencing technology, and these genomes have been released as part of a pilot project.

Materials and Methods

Feasibility test and the release of 10 fish genomes

In order to establish cost-effective strategies and assess the feasibility of a large-scale genome project, we initiated a pilot study in June 2017. Over the next 2 years, we went on 4 expeditions across lakes, rivers, and coastal waters of China, collecting 324 fish species. After careful documentation of sample information and species identification, the tissues of 93 species were selected for DNA extraction, and 10 of these species were used for sequencing. We used single-tube long fragment reads technology (stLFR) [6] and the DNBSEQ platform to sequence the species, generating long-read (Nanopore or PacBio) and Hi-C data for a subset of the species. In this way, we were able to test the feasibility of 3 different sequencing and assembly strategies (Fig. 2): stLFR data alone (synthetic long reads generated using a second-generation sequencing platform; Strategy I); stLFR data combined with low-depth, long reads (∼10× raw Nanopore data to fill in the gaps; Strategy II); and high-depth, long reads (∼80× raw Nanopore data) combined with second-generation short reads (either short insert size libraries or stLFR; Strategy III; Table 1 and Table 2).

To date, we have assembled the genomes of the first 10 species. For the 10 assembled fish genomes, the average contig N50 and the average scaffold N50 are 2.83 Mb and 7.59 Mb, respectively. The average benchmarking universal single-copy orthologs (BUSCO) completeness estimate is 96%. A comparison of assembly statistics revealed that assemblies generated with Strategy II and Strategy III were more continuous. To illustrate our effort, we are releasing the genomes of 10 representative bony fish species covering the 3 assembly strategies. The contig N50s of 7 of these genomes are more than 1 Mb and a minimum of 93% of BUSCO genes were found, indicating the genome assemblies are of high quality. There were 3 genomes assembled at the chromosome level, with more than 92% of scaffolds anchored using Hi-C data.
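
As a rough illustration of the statistics quoted above, the following Python sketch first shows how an N50 value is computed and then recomputes the reported averages from the per-species values listed in Table 1. The `n50` helper, the variable names, and the toy contig lengths are assumptions made for illustration only; they are not part of the Fish10K pipeline.

```python
from statistics import mean


def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy example with hypothetical contig lengths (not real data).
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)  # 80 + 70 = 150 >= 290 / 2, so N50 is 70

# Per-species values copied from Table 1:
# (scaffold N50 in bp, contig N50 in bp, BUSCO completeness in %).
table1 = {
    "Diodon holocanthus": (6_098_089, 2_149_931, 95.7),
    "Heterotis niloticus": (9_615_753, 2_307_881, 97.6),
    "Oxyeleotris marmorata": (13_190_768, 1_270_297, 92.9),
    "Datnioides undecimradiatus": (9_741_635, 2_175_996, 97.2),
    "Chaetodon trifasciatus": (9_974_986, 1_859_054, 97.3),
    "Naso vlamingii": (5_736_754, 182_642, 97.8),
    "Chelmon rostratus": (2_627_953, 294_414, 98.4),
    "Helostoma temminckii": (913_351, 95_536, 96.3),
    "Pseudobrama simoni": (13_799_189, 13_799_189, 95.7),
    "Rhodeus ocellatus": (4_198_183, 4_198_183, 94.5),
}

mean_scaffold_n50_mb = mean(s for s, _, _ in table1.values()) / 1e6
mean_contig_n50_mb = mean(c for _, c, _ in table1.values()) / 1e6
mean_busco = mean(b for _, _, b in table1.values())
n_contig_over_1mb = sum(c > 1_000_000 for _, c, _ in table1.values())

print(f"toy N50: {toy_n50}")
print(f"mean scaffold N50: {mean_scaffold_n50_mb:.2f} Mb")
print(f"mean contig N50: {mean_contig_n50_mb:.2f} Mb")
print(f"mean BUSCO: {mean_busco:.0f}%")
print(f"genomes with contig N50 > 1 Mb: {n_contig_over_1mb}")
```

Run on the Table 1 values, the recomputed means (7.59 Mb scaffold N50, 2.83 Mb contig N50, 96% BUSCO) and the count of 7 genomes with a contig N50 above 1 Mb agree with the figures given in the text.
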

Figure 2: The sequencing and assembly strategies. In the preferred strategy (Strategy II), high-quality DNA fragments (≥40 Kb) are used to construct a stLFR library, which is sequenced using the DNBSEQ platform. Low-sequencing-depth long reads are only used to improve the continuity of highly complex regions (increase the contig N50). In the alternative Strategy I, high-depth long reads are used to construct contigs, while low-depth stLFR reads are used to polish the contig and link the scaffolds. Hi-C data are used to generate a chromosome-level assembly.

Table 1: Assembly statistics of the 10 released genome assemblies

| Strategy | Species | Common name | Estimated genome size, Mb | Assembly size, Mb | Scaffold N50, bp | Contig N50, bp | BUSCO, % | Anchored, % |
|---|---|---|---|---|---|---|---|---|
| I | Diodon holocanthus | Longspined porcupinefish | 722.9 | 643.4 | 6,098,089 | 2,149,931 | 95.7 | – |
| I | Heterotis niloticus (a) | African bonytongue | 743.4 | 669.7 | 9,615,753 | 2,307,881 | 97.6 | 96.8 |
| I | Oxyeleotris marmorata | Marble goby | 589.7 | 502.6 | 13,190,768 | 1,270,297 | 92.9 | – |
| I | Datnioides undecimradiatus | Mekong tiger perch | 623.1 | 595.7 | 9,741,635 | 2,175,996 | 97.2 | – |
| I | Chaetodon trifasciatus | Melon butterflyfish | 698.5 | 668.3 | 9,974,986 | 1,859,054 | 97.3 | – |
| II | Naso vlamingii | Bignose unicornfish | 961.4 | 861.3 | 5,736,754 | 182,642 | 97.8 | – |
| II | Chelmon rostratus (a) | Copperband butterflyfish | 711.4 | 638.9 | 2,627,953 | 294,414 | 98.4 | 94.4 |
| II | Helostoma temminckii (a) | Kissing gourami | 729.7 | 635.4 | 913,351 | 95,536 | 96.3 | 91.8 |
| III | Pseudobrama simoni | – | 940.9 | 929.1 | 13,799,189 | 13,799,189 | 95.7 | – |
| III | Rhodeus ocellatus | Rosy bitterling | 850.5 | 902.4 | 4,198,183 | 4,198,183 | 94.5 | – |

The common names were obtained from the FishBase website (https://www.fishbase.se/search.php).
(a) Chromosome-level genome assembly (Hi-C data generated).

Table 2: Sample collection template.

| Species | Length, cm | Weight, g | Sex | Meta | Intestinal | Muscle | Liver | Time | Place | Longitude and latitude | Photo | Sampling person | Identification person | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sebastiscus marmoratus | 11.5 | 27 | ♂ | √ | √ | √ | √ | 20190421 | Xiamen | N24°11′59.58″ E118°25′1.92″ | √ | Dr. Meng | Prof. He | Living |
| Pisodonophis cancrivorus | 12 | 31 | ♂ | √ | √ | √ | √ | 20190421 | Ningde | N24°11′59.58″ E118°25′1.92″ | √ | Dr. Meng | Prof. He | Fresh |
| Odontobutis obscura | 13.3 | 35 | ♀ | √ | √ | √ | √ | 20190421 | Hangzhou | N24°11′59.58″ E118°25′1.92″ | √ | Dr. Meng | Prof. He | Frozen |

The Fish10K Genome Project: from 100 to 10,000

With the experience gained in the Fish10K pilot study and our published results (e.g., the genome of Mekong tiger perch [Datnioides undecimradiatus] provides insights into the phylogenetic position of Lobotiformes and biological conservation), we believe that the project can scale up. Thus, we are proposing a roadmap (Fig. 3) in which we will construct high-quality reference genomes for representative species in all orders (Phase I) and families (Phase II), in concert with the generation of draft genome sequences for additional related species (Phase III).

An interrogation of FishBase (https://www.fishbase.se) and Fishes of the World [7] revealed information on 34,115 fish species from ∼5,000 genera, ∼529 families, and ∼80 orders (Supplementary Table 2). The species were divided into 6 lineages (Elasmobranchii, Holocephali, Actinopterygii, Sarcopterygii, Cyclostomes, and Myxini), in which Elasmobranchii and Holocephali belong to Chondrichthyes (cartilaginous fish) and Actinopterygii and Sarcopterygii belong to Osteichthyes (bony fish). As mentioned above, there are reference genomes available for at least 1 species each of 56 orders, while for the rest of the orders reference genomes are required (Fig. 4). Also, there are fish orders with a large number of species (e.g., Perciformes has 62 families; Siluriformes has 40 families; and Scorpaeniformes has 39 families), suggesting that additional high-quality reference genomes are required to represent the diverse biological characteristics.

Thus, in Phase I we aim to sequence 450 bony fish and 50 cartilaginous fish species, covering all 80 orders (Supplementary Table 3). In Phase II, we aim to sequence approximately 3,000 species, covering almost all ∼500 fish families. In Phase III, we will sequence ∼6,500 fish genomes, covering ∼5,000 genera.

Figure 3: The roadmap and organization of Fish10K. Fish10K is divided into 3 phases, based on the evolutionary relationship of fish, and 3 working groups (steering committee, scientific groups, and species groups).
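
The phase targets described in the roadmap can be tallied as a quick consistency check against the 10,000-genome goal. The sketch below is illustrative only: the dictionary layout and names are assumptions, and the Phase II and Phase III counts are the approximate figures given in the text.

```python
# Species targets per phase, as stated in the roadmap text.
# Phase II and Phase III values are approximate in the source.
phase_targets = {
    "Phase I (orders)": {"bony fish": 450, "cartilaginous fish": 50},
    "Phase II (families)": {"all fish": 3_000},
    "Phase III (genera)": {"all fish": 6_500},
}

# Number of fish species found in FishBase and Fishes of the World.
known_species = 34_115

phase_totals = {
    phase: sum(groups.values()) for phase, groups in phase_targets.items()
}
grand_total = sum(phase_totals.values())
fraction_of_known = grand_total / known_species

for phase, total in phase_totals.items():
    print(f"{phase}: {total:,} species")
print(f"Total: {grand_total:,} species")
print(f"Share of known fish species: {fraction_of_known:.1%}")
```

Under these figures, the three phases sum to the 10,000 genomes in the project name, which is a little under a third of the 34,115 species catalogued.
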

Figure 4: Phylogenetic tree of fish. Jawed vertebrates (gnathostomes) are divided into 2 major groups: cartilaginous fish (Chondrichthyes; in orange) and bony vertebrates (Osteichthyes; in blue and green). Bony fish are grouped into 2 subgroups: Sarcopterygii (green) and Actinopterygii (blue). The number of families and species in the 5 largest orders are labeled. The remaining 10 orders of bony fish (Caproiformes, Callionymiformes, Gobiesociformes, Icosteiformes, Lepisosteiformes, Moroniformes, Scombrolabraciformes, Scorpaeniformes, Trachichthyiformes, and Trachiniformes) and 2 orders of cartilaginous fish (Rhinopristiformes and Squatiniformes) are not included in the phylogenetic tree, due to their uncertain positions.

Sampling, sequencing, assembly, and annotation

Sampling is a critical challenge in any large-scale genome consortium. We propose a centralized sampling mode (i.e., mirroring our 93-species pilot phase) with several centers set up to collect samples. In addition to these sampling centers, we would like to obtain further samples from across the world. We intend to make sure all samples are taken and transported with the full capture, licensing, and legal permits required from the appropriate authorities, in compliance with the permits of endangered and non-endangered species. Additionally, we will obtain prior informed consent for accessing genetic resources and sharing the benefits arising from this project (following the obligations of the Nagoya protocol). To make sure we have enough information for further analysis and to maximize the value of these genome data, we propose a sampling standard for the project. With associated metadata designed to include as much information as possible (including the source and geo-location), we will be stressing the importance of collecting images of each specimen, and of adequate storage conditions (frozen or voucher specimen).

For sequencing, we propose to use both second- and third-generation sequencing technologies to generate high-quality genome assemblies. Based on our pilot study, and considering the feasibility of obtaining the required amount of high-molecular-weight DNA, for the majority of the species we have chosen a strategy combining stLFR data, low-depth Nanopore data, and Hi-C data (Strategy II in Fig. 2). For more complex genomes, we will generate high-depth Nanopore sequence data to ensure that good assemblies can be achieved (Strategy III, stLFR data + high-depth Nanopore data + Hi-C data; Fig. 2). For key species (to be determined by the working groups; see below), we will employ the Pacific Biosciences circular consensus sequencing (long, high-fidelity read) approach, allowing the generation of highly accurate long reads. For the large-scale sequencing of 6,000 species in Phase III, we propose to employ stLFR alone (Strategy I in Fig. 2). The genome assembly criteria will refer to the metric standard of reference genomes, for which we have grouping with the VGP (NCBI Bioproject PRJ489243).

Fish10K data sharing

As per the Fort Lauderdale [8] and Toronto International Data Release Workshop guidelines [9], all sequencing data (including raw data, assemblies, and annotations) will be deposited in NCBI, as well as the GigaDB and China National GeneBank repositories. The website of Fish10K (http://icg-ocean.genomics.cn/index.php/fish10kintroduction) will provide detailed information on the project status, as well as continuously updated information on the sequenced species. It also provides a portal for data downloads (particularly for assembled genomes).

Organization of Fish10K consortium

Fish10K has been initiated by a core group of researchers that forms the steering committee of Fish10K (Fig. 3). The steering committee oversees the project and is responsible for fundraising, expanding the steering committee, organizing the scientific groups and sampling groups, coordinating sampling, identifying the sequencing centers where the sequencing will be done, assigning responsibilities (e.g., sequencing, assembly) to those centers, and creating analysis strategies. The steering committee is also responsible for the generation of genomic data. Various scientific groups will focus on technical and scientific questions related to this project. We wish to receive proposals from researchers who would like to take part in these scientific groups. We also invite researchers who are studying fish species which are rare or extinct to join Fish10K as members in the sampling group (with or without associated funding for sequencing). In addition to obtaining the genome sequences of their area of interest, joining the consortium provides immediate access to all genomes currently being assembled by Fish10K [10].

Conclusion

Fish10K will generate an unprecedented, comprehensive data set of fish: the largest and most diverse vertebrate group. Our effort will allow us to complete the genomic tree for fish and, in concert with other projects, such as VGP and B10K, complete trees for vertebrates in general.

Availability of supporting data and materials

The 10 fish genome assemblies in the pilot study have been deposited in the China National GeneBank (https://db.cngb.org/cnsa) with accession codes CNP0000597 and CNP0000691. The sequencing data of 10 fish are also deposited in NCBI with bioproject numbers PRJNA597275 and PRJNA558872. The individual genomes also all have individual DOIs in GigaDB, linked from a project page [10].

Abbreviations

B10K: Bird 10,000 Genomes Project; BUSCO: benchmarking universal single-copy orthologs; Fish10K: 10,000 Fish Genomes Project; stLFR: single-tube long fragment reads technology; VGP: Vertebrate Genomes Project.

Competing interests

Some of the authors are employees of BGI Group. The authors otherwise declare that they have no competing interests.

Funding

This work was supported by the special funding of “Blue granary” scientific and technological innovation of China (2018YFD0900301-05).

Authors' contributions

S.H., N.C., X.X., X.L., W.W., and G.F. conceived and designed the study. L.Y., M.Z., and S.L. performed sample collection and sequencing. Y.S., S.Z., and X.H. performed the assembly and annotation. X.L., Y.S., and G.F. wrote the manuscript. N.C. and all other authors revised and read the manuscript.

Supplementary data

Supplementary Tables are available online.

Acknowledgements

The work also received technical support from China National GeneBank.

References

1. Koepfli KP, Paten B. The Genome 10K Project: a way forward. Annu Rev Anim Biosci. 2015;3(1):57–111.
2. Lewin HA, Robinson GE, Kress WJ, et al. Earth BioGenome Project: sequencing life for the future of life. Proc Natl Acad Sci U S A. 2018;115(17):4325–33.
3. Vertebrate Genomes Project. Vertebrate Genomes Project (VGP). https://genome10k.soe.ucsc.edu/vertebrate-genomes-project/. Accessed 10 September 2019.
4. Zhang G, Rahbek C, Graves GR, et al. Genomics: bird sequencing project takes off. Nature. 2015;522(7554):34, doi:10.1038/522034d.
5. Sun Y, Huang Y, Li X, et al. Fish-T1K (transcriptomes of 1,000 fishes) Project: large-scale transcriptome data for fish evolution studies. GigaSci. 2016;5(1):18–22.
6. Wang O, Chin R, Cheng X, et al. Efficient and unique cobarcoding of second-generation sequencing reads from long DNA molecules enabling cost-effective and accurate sequencing, haplotyping, and de novo assembly. Genome Res. 2019;29(5):798–808.
7. Nelson JS, Grande TC, Wilson MV. Fishes of the World, Edmonton, Canada. 5th ed. John Wiley & Sons, 2016.
8. The Fort Lauderdale Agreement. Reaffirmation and extension of NHGRI rapid data release policies: large-scale sequencing and other community resource projects. https://www.genome.gov/10506537/reaffirmation-and-extension-of-nhgri-rapid-data-release-policies. Accessed 10 September 2019.
9. Toronto International Data Release Workshop Authors, Birney E, Hudson TJ, Green ED, et al. Prepublication data sharing. Nature. 2009;461(7261):168–70.
10. Guangyi F, Yue S, Liandong Y, et al. Supporting data for "Fish10K pilot project data". GigaScience Database. 2020, 10.5524/100661.

© The Author(s) 2020. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Issue Section: Data Note


