Volume 9, Issue 8, August 2020
Initial data release and announcement of the 10,000 Fish Genomes Project (Fish10K)
Guangyi Fan
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China; BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
https://orcid.org/0000-0001-7365-1590
Yue Song
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China
https://orcid.org/0000-0002-2974-6442
Liandong Yang
Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, No. 7 Donghu South Road, Wuchang District, Wuhan, Hubei Province, China
https://orcid.org/0000-0001-7570-0296
Xiaoyun Huang
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China
Suyu Zhang
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China
https://orcid.org/0000-0002-0653-9846
Mengqi Zhang
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China
https://orcid.org/0000-0002-5641-0557
Xianwei Yang
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China
https://orcid.org/0000-0003-4388-9674
Yue Chang
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China
https://orcid.org/0000-0002-6902-9931
He Zhang
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China; BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
https://orcid.org/0000-0001-9294-1403
Yongxin Li
Center for Ecological and Environmental Sciences, Northwestern Polytechnical University, 1 Dongxiang Road, Chang'an District, Xi'an Shaanxi, 710129, China
Shanshan Liu
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China
https://orcid.org/0000-0002-5756-1728
Lili Yu
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China
https://orcid.org/0000-0003-0435-0385
Jeffery Chu
Frasergen, Donghu High-Tech Development Zone, Wuhan, Hubei Province, China
Inge Seim
Integrative Biology Laboratory, College of Life Sciences, Nanjing Normal University, No. 1 Wenyuan Road Qixia District, Nanjing, 210023, China; Comparative and Endocrine Biology Laboratory, Translational Research Institute-Institute of Health and Biomedical Innovation, School of Biomedical Sciences, Queensland University of Technology, Brisbane 4102, Queensland, Australia
https://orcid.org/0000-0001-8594-7217
Chenguang Feng
Center for Ecological and Environmental Sciences, Northwestern Polytechnical University, 1 Dongxiang Road, Chang'an District, Xi'an Shaanxi, 710129, China
Thomas J Near
Department of Ecology & Evolutionary Biology, Yale University, New Haven, CT 06511, USA
Rod A Wing
Biological and Environmental Sciences & Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Kingdom of Saudi Arabia
https://orcid.org/0000-0001-6633-6226
Wen Wang
Center for Ecological and Environmental Sciences, Northwestern Polytechnical University, 1 Dongxiang Road, Chang'an District, Xi'an Shaanxi, 710129, China
https://orcid.org/0000-0002-7801-2066
Kun Wang
Center for Ecological and Environmental Sciences, Northwestern Polytechnical University, 1 Dongxiang Road, Chang'an District, Xi'an Shaanxi, 710129, China
https://orcid.org/0000-0001-6059-6529
Jing Wang
Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, 7 Nanhai Road, Qingdao, Shandong 266071, China; Marine Ecology and Environmental Science Laboratory, Pilot National Laboratory for Marine Science and Technology, 1 Wenhai Road, Aoshanwei, Jimo, Qingdao, Shandong, 266237, China; Center for Ocean Mega-Science, Chinese Academy of Sciences, No. 7, Nanhai Road, Qingdao City, 266400, China
Xun Xu
BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
https://orcid.org/0000-0002-5338-5173
Huanming Yang
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China; BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
https://orcid.org/0000-0002-0858-3410
Xin Liu
BGI-Qingdao, 2 Hengyunshan Road, West Coast New Area, 266426, Qingdao, China; BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China; State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
https://orcid.org/0000-0003-3256-2940
Nansheng Chen
Key Laboratory of Marine Ecology and Environmental Sciences, Institute of Oceanology, Chinese Academy of Sciences, 7 Nanhai Road, Qingdao, Shandong 266071, China; Marine Ecology and Environmental Science Laboratory, Pilot National Laboratory for Marine Science and Technology, 1 Wenhai Road, Aoshanwei, Jimo, Qingdao, Shandong, 266237, China; Center for Ocean Mega-Science, Chinese Academy of Sciences, No. 7, Nanhai Road, Qingdao City, 266400, China; Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
https://orcid.org/0000-0002-6361-964X
Shunping He
Key Laboratory of Aquatic Biodiversity and Conservation, Institute of Hydrobiology, Chinese Academy of Sciences, No. 7 Donghu South Road, Wuchang District, Wuhan, Hubei Province, China
Correspondence address. Shunping He. E-mail: [email protected].
https://orcid.org/0000-0001-9087-7890
Equal contribution.
GigaScience, Volume 9, Issue 8, August 2020, giaa080, https://doi.org/10.1093/gigascience/giaa080
Received: 13 April 2020
Revision received: 23 June 2020
Accepted: 03 July 2020
Published: 18 August 2020
Abstract
Background
With more than 30,000 species, fish (including bony, jawless, and cartilaginous fish) are the largest vertebrate group, and include some of the earliest vertebrates. Despite their critical roles in many ecosystems and human society, fish genomics lags behind work on birds and mammals. This severely limits our understanding of evolution and hinders progress on the conservation and sustainable utilization of fish.
Results
Here, we announce the Fish10K project, a portion of the Earth BioGenome Project aiming to sequence 10,000 representative fish genomes in a systematic fashion within 10 years, and we officially welcome collaborators to join this effort. As a step towards this goal, we herein describe a feasible workflow for the procurement and storage of biospecimens, as well as sequencing and assembly strategies.
Conclusions
To illustrate, we present the genomes of 10 fish species from a cohort of 93 species chosen for technology development.
Introduction
Fish genomes sequenced to date
As of the time of this writing, genome assemblies are publicly available for fewer than 1% of fish species (244 species, as sourced from NCBI when accessed on 21 April 2020; Supplementary Table 1). Their assembly lengths range from 302.36 Mb (Diretmus argenteus) to 4.47 Gb (Scyliorhinus torazame), with an average length of 872.64 Mb. The average scaffold N50 and contig N50 values are 8.82 Mb and 914.18 Kb, respectively, while the median scaffold N50 and contig N50 are 613.59 Kb and 20.82 Kb, respectively. There are 112 species with a scaffold N50 of more than 1 Mb, of which 43 have a contig N50 above 1 Mb (Fig. 1). These genomes have fueled a number of studies on the phylogeny and evolution of fish (e.g., the African coelacanth), evolutionary processes of specific fish subgroups (e.g., the elephant shark genome illustrating the phylogenetic relationship of Chondrichthyes as a sister group to bony vertebrates), genetic mechanisms of adaptation to different environments (e.g., the deep-sea Mariana Trench snailfish and cave-dwelling fish), and specific biological processes (e.g., the evolutionary process of ZW sex chromosomes). Nevertheless, the current fish genome sequencing results are only a drop in the ocean, and numerous critical research questions remain to be resolved. A non-exhaustive list includes gaining comprehensive and clear understandings of fish phylogeny, genome size diversity and chromosome evolution, diverse environmental adaptations, morphology evolution, respiratory system, immune system, and the evolution and function of ultraconserved elements and conserved nonexonic elements.
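The scaffold and contig N50 values quoted above follow the usual definition of N50: sort the sequences from longest to shortest and report the length of the sequence at which the running total first reaches half of the assembly length. The following is a minimal sketch of that calculation; the function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from the Fish10K data.

```python
from statistics import median
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that brings the running total to half of the
        # assembly length defines the N50.
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for non-empty input


# Hypothetical toy assembly (lengths in bp), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print("N50:", n50(toy_contigs))
print("median length:", median(toy_contigs))
```

On this toy input the N50 (70) differs from the median contig length (40), which is why the two statistics should not be used interchangeably when comparing assemblies.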
Figure 1: Assembly statistics of fish genomes in public databases. (a) Summary of genome size. (b, c) N50 statistics. A scaffold is a set of contigs linked together with gaps introduced in between. N50 is the length of the shortest contig (or scaffold) among the longest sequences that together cover at least half of the assembly. It is a metric that can be used to evaluate the contiguity of a genome assembly.
The era of genome consortiums
With the rapid development of DNA sequencing technology, this is the time for large-scale, collaborative genomic studies to map the vertebrate tree of life. The first such project was the Genome 10K, established in 2009, which aimed to sequence and assemble genomes of about 10,000 vertebrate species [1]. The Earth BioGenome Project aims to sequence, catalog, and characterize the genomes of all of Earth's eukaryotic biodiversity [2]. The Vertebrate Genomes Project (VGP) was launched in 2017 to generate chromosome-level, haplotype-phased genome assemblies of all vertebrate species [3]. The Bird 10,000 Genomes Project (B10K) was initiated [4] after a successful phylogenomic study on 45 avian genomes in 2014. The B10K project aims to sequence and assemble all known bird species in 3 phases. Despite current challenges in funding, sampling, sequencing, assembly, and data analysis, these projects have already made substantial progress. For fish, which make up more than half of all vertebrate species, no exclusive fish genome projects have been initiated at a similar scale. To our knowledge, the only large-scale genomic study was Fish-T1K (Transcriptomes of 1,000 Fishes), which aimed to sequence the transcriptomes (RNA sequence) of ray-finned fish [5]. However, the insights gained from transcriptome data alone are relatively limited. Accelerating fish genomics by large-scale genome sequencing efforts would undoubtedly boost research into fish biodiversity, speciation, and adaptation, as well as aiding the conservation and sustainable utilization of fish.
The Fish10K Genome Project
We here announce the Fish10K Genome Project, a sub-project of the Earth BioGenome Project aiming to sample, sequence, assemble, and analyze genomes of 10,000 fish species. We are proposing an effective and integrated workflow in which major challenges are addressed and in which high-quality reference genomes are constructed (chromosome level in Phases I and II and a scaffold-level assembly with a scaffold N50 larger than 1 Mb in Phase III). Through developing and applying effective analysis methods, we will be able to address critical evolutionary and biological research questions related to fish. In order to prove the efficiency of our workflow and the feasibility of this large-scale genome project, 10 species from 93 collected samples were used to validate our new sequencing technology, and these genomes have been released as part of a pilot project.
Materials and Methods
Feasibility test and the release of 10 fish genomes
In order to establish cost-effective strategies and assess the feasibility of a large-scale genome project, we initiated a pilot study in June 2017. Over the next 2 years, we went on 4 expeditions across lakes, rivers, and coastal waters of China, collecting 324 fish species. After careful documentation of sample information and species identification, the tissues of 93 species were selected for DNA extraction, and 10 of these species were used for sequencing. We used single-tube long fragment reads technology (stLFR) [6] and the DNBSEQ platform to sequence the species, and additionally generated long-read (Nanopore or PacBio) and Hi-C data for a subset of the species. In this way, we were able to test the feasibility of 3 different sequencing and assembly strategies (Fig. 2): stLFR data alone (synthetic long reads generated using a second-generation sequencing platform; Strategy I); stLFR data combined with low-depth, long reads (∼10× raw Nanopore data to fill in the gaps; Strategy II); and high-depth, long reads (∼80× raw Nanopore data) combined with second-generation short reads (either short insert size libraries or stLFR; Strategy III; Table 1 and Table 2). To date, we have assembled the genomes of the first 10 species. For the 10 assembled fish genomes, the average contig N50 and the average scaffold N50 are 2.83 Mb and 7.59 Mb, respectively. The average benchmarking universal single-copy orthologs (BUSCO) completeness estimate is 96%. A comparison of assembly statistics revealed that assemblies generated with Strategy II and Strategy III were more continuous. To illustrate our effort, we are releasing the genomes of 10 representative bony fish species covering the 3 assembly strategies. The contig N50s of 7 of these genomes are more than 1 Mb and a minimum of 93% of BUSCO genes were found, indicating the genome assemblies are of high quality. There were 3 genomes assembled at the chromosome level, with more than 92% of scaffolds anchored using Hi-C data.
Figure 2: The sequencing and assembly strategies. In the preferred strategy (Strategy II), high-quality DNA fragments (≥40 Kb) are used to construct an stLFR library, which is sequenced using the DNBSEQ platform. Low-depth long reads are only used to improve the continuity of highly complex regions (increase the contig N50). In the alternative Strategy I, high-depth long reads are used to construct contigs, while low-depth stLFR reads are used to polish the contig and link the scaffolds. Hi-C data are used to generate a chromosome-level assembly.
Table 1: Assembly statistics of the 10 released genome assemblies

| Strategy | Species | Common name | Estimated genome size, Mb | Assembly size, Mb | Scaffold N50, bp | Contig N50, bp | BUSCO, % | Anchored, % |
|---|---|---|---|---|---|---|---|---|
| I | Diodon holocanthus | Longspined porcupinefish | 722.9 | 643.4 | 6,098,089 | 2,149,931 | 95.7 | – |
| I | Heterotis niloticus (a) | African bonytongue | 743.4 | 669.7 | 9,615,753 | 2,307,881 | 97.6 | 96.8 |
| I | Oxyeleotris marmorata | Marble goby | 589.7 | 502.6 | 13,190,768 | 1,270,297 | 92.9 | – |
| I | Datnioides undecimradiatus | Mekong tiger perch | 623.1 | 595.7 | 9,741,635 | 2,175,996 | 97.2 | – |
| I | Chaetodon trifasciatus | Melon butterflyfish | 698.5 | 668.3 | 9,974,986 | 1,859,054 | 97.3 | – |
| II | Naso vlamingii | Bignose unicornfish | 961.4 | 861.3 | 5,736,754 | 182,642 | 97.8 | – |
| II | Chelmon rostratus (a) | Copperband butterflyfish | 711.4 | 638.9 | 2,627,953 | 294,414 | 98.4 | 94.4 |
| II | Helostoma temminckii (a) | Kissing gourami | 729.7 | 635.4 | 913,351 | 95,536 | 96.3 | 91.8 |
| III | Pseudobrama simoni | – | 940.9 | 929.1 | 13,799,189 | 13,799,189 | 95.7 | – |
| III | Rhodeus ocellatus | Rosy bitterling | 850.5 | 902.4 | 4,198,183 | 4,198,183 | 94.5 | – |

The common names were obtained from the FishBase website (https://www.fishbase.se/search.php).
(a) Chromosome-level genome assembly (Hi-C data generated).
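The summary figures reported for the 10 assemblies (average contig N50, average scaffold N50, average BUSCO completeness, and the number of genomes with a contig N50 above 1 Mb) can be re-derived from Table 1. The sketch below transcribes the relevant columns by hand; the variable names are illustrative assumptions, and it is intended only as a consistency check rather than as the authors' analysis code.

```python
from statistics import mean

# (strategy, species, scaffold N50 in bp, contig N50 in bp, BUSCO %),
# transcribed from Table 1.
TABLE1 = [
    ("I", "Diodon holocanthus", 6_098_089, 2_149_931, 95.7),
    ("I", "Heterotis niloticus", 9_615_753, 2_307_881, 97.6),
    ("I", "Oxyeleotris marmorata", 13_190_768, 1_270_297, 92.9),
    ("I", "Datnioides undecimradiatus", 9_741_635, 2_175_996, 97.2),
    ("I", "Chaetodon trifasciatus", 9_974_986, 1_859_054, 97.3),
    ("II", "Naso vlamingii", 5_736_754, 182_642, 97.8),
    ("II", "Chelmon rostratus", 2_627_953, 294_414, 98.4),
    ("II", "Helostoma temminckii", 913_351, 95_536, 96.3),
    ("III", "Pseudobrama simoni", 13_799_189, 13_799_189, 95.7),
    ("III", "Rhodeus ocellatus", 4_198_183, 4_198_183, 94.5),
]

avg_scaffold_mb = mean(row[2] for row in TABLE1) / 1e6
avg_contig_mb = mean(row[3] for row in TABLE1) / 1e6
avg_busco = mean(row[4] for row in TABLE1)
contig_over_1mb = sum(1 for row in TABLE1 if row[3] > 1_000_000)

print(f"average scaffold N50: {avg_scaffold_mb:.2f} Mb")  # 7.59 Mb
print(f"average contig N50:   {avg_contig_mb:.2f} Mb")    # 2.83 Mb
print(f"average BUSCO:        {avg_busco:.1f} %")         # 96.3 %
print(f"contig N50 > 1 Mb:    {contig_over_1mb} of {len(TABLE1)}")  # 7 of 10
```

These values agree with the averages of 2.83 Mb, 7.59 Mb, and 96% stated in the text, and with the count of 7 genomes whose contig N50 exceeds 1 Mb.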
Table 2: Sample collection template.

| Species | Length, cm | Weight, g | Sex | Meta | Intestinal | Muscle | Liver | Time | Place | Longitude and latitude | Photo | Sampling person | Identification person | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sebastiscus marmoratus | 11.5 | 27 | ♂ | √ | √ | √ | √ | 20190421 | Xiamen | N24°11′59.58″ E118°25′1.92″ | √ | Dr. Meng | Prof. He | Living |
| Pisodonophis cancrivorus | 12 | 31 | ♂ | √ | √ | √ | √ | 20190421 | Ningde | N24°11′59.58″ E118°25′1.92″ | √ | Dr. Meng | Prof. He | Fresh |
| Odontobutis obscura | 13.3 | 35 | ♀ | √ | √ | √ | √ | 20190421 | Hangzhou | N24°11′59.58″ E118°25′1.92″ | √ | Dr. Meng | Prof. He | Frozen |
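A template like Table 2 is easier to audit at scale if each row is parsed into a typed record as it is entered. The sketch below shows one possible way to do this, assuming the Time column is a `YYYYMMDD` date and the coordinates use the degree, minute, second notation shown in the table. The names `SampleRecord`, `make_record`, and `dms_to_decimal` are hypothetical and are not part of any Fish10K tooling.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime

# One coordinate such as N24°11′59.58″: hemisphere, degrees, minutes, seconds.
_DMS = re.compile(r"([NSEW])\s*(\d+)°(\d+)′(\d+(?:\.\d+)?)″")
# Status values that appear in the template rows.
ALLOWED_STATUS = {"Living", "Fresh", "Frozen"}


def dms_to_decimal(text: str) -> float:
    """Convert a degree/minute/second coordinate to signed decimal degrees."""
    match = _DMS.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    hemisphere, degrees, minutes, seconds = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    return -value if hemisphere in "SW" else value


@dataclass(frozen=True)
class SampleRecord:
    species: str
    length_cm: float
    weight_g: float
    collected: date
    place: str
    latitude: float
    longitude: float
    status: str


def make_record(species, length_cm, weight_g, time, place, lat, lon, status):
    """Validate one template row and return it as a SampleRecord."""
    if status not in ALLOWED_STATUS:
        raise ValueError(f"unexpected status: {status!r}")
    return SampleRecord(
        species=species,
        length_cm=float(length_cm),
        weight_g=float(weight_g),
        collected=datetime.strptime(time, "%Y%m%d").date(),
        place=place,
        latitude=dms_to_decimal(lat),
        longitude=dms_to_decimal(lon),
        status=status,
    )


record = make_record(
    "Sebastiscus marmoratus", 11.5, 27, "20190421", "Xiamen",
    "N24°11′59.58″", "E118°25′1.92″", "Living",
)
print(record.collected, round(record.latitude, 4), round(record.longitude, 4))
```

Storing coordinates as decimal degrees and dates as real date values makes it simpler to spot copy-paste errors, such as several localities sharing identical coordinates.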
The Fish10K Genome Project: from 100 to 10,000
With the experience gained in the Fish10K pilot study and our published results (e.g., the genome of Mekong tiger perch [Datnioides undecimradiatus] provides insights into the phylogenetic position of Lobotiformes and biological conservation), we believe that the project can scale up. Thus, we are proposing a roadmap (Fig. 3) in which we will construct high-quality reference genomes for representative species in all orders (Phase I) and families (Phase II), in concert with the generation of draft genome sequences for additional related species (Phase III). An interrogation of FishBase (https://www.fishbase.se) and Fishes of the World [7] revealed information on 34,115 fish species from ∼5,000 genera, ∼529 families, and ∼80 orders (Supplementary Table 2). The species were divided into 6 lineages (Elasmobranchii, Holocephali, Actinopterygii, Sarcopterygii, Cyclostomes, and Myxini), in which Elasmobranchii and Holocephali belong to Chondrichthyes (cartilaginous fish) and Actinopterygii and Sarcopterygii belong to Osteichthyes (bony fish). As mentioned above, there are reference genomes available for at least 1 species each of 56 orders, while for the rest of the orders reference genomes are required (Fig. 4). Also, there are fish orders with a large number of species (e.g., Perciformes has 62 families; Siluriformes has 40 families; and Scorpaeniformes has 39 families), suggesting that additional high-quality reference genomes are required to represent the diverse biological characteristics. Thus, in Phase I we aim to sequence 450 bony fish and 50 cartilaginous fish species, covering all 80 orders (Supplementary Table 3). In Phase II, we aim to sequence approximately 3,000 species, covering almost all ∼500 fish families. In Phase III, we will sequence ∼6,500 fish genomes, covering ∼5,000 genera.
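The phase targets above can be checked with simple arithmetic: the three phases together should account for the 10,000 genomes in the project name. The short sketch below uses only numbers stated in the text; the dictionary layout and variable names are illustrative assumptions, and the order counts are approximate because the text gives them as ∼80.

```python
# Species targets per phase, as stated in the roadmap.
PHASE_TARGETS = {
    "Phase I": {"bony fish": 450, "cartilaginous fish": 50},
    "Phase II": {"species": 3_000},
    "Phase III": {"species": 6_500},
}

phase_totals = {phase: sum(groups.values()) for phase, groups in PHASE_TARGETS.items()}
grand_total = sum(phase_totals.values())

# Orders still lacking any reference genome (approximate: ~80 orders, 56 covered).
TOTAL_ORDERS = 80
ORDERS_WITH_REFERENCE = 56
orders_without_reference = TOTAL_ORDERS - ORDERS_WITH_REFERENCE

for phase, total in phase_totals.items():
    print(f"{phase}: {total} species")
print(f"all phases: {grand_total} species")
print(f"orders without a reference genome: about {orders_without_reference}")
```

Under these figures, Phase I covers 500 species and the three phases sum to 10,000, with roughly 24 orders still needing a first reference genome.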
Figure 3: The roadmap and organization of Fish10K. Fish10K is divided into 3 phases, based on the evolutionary relationship of fish, and 3 working groups (steering committee, scientific groups, and species groups).
Figure 4: Phylogenetic tree of fish. Jawed vertebrates (gnathostomes) are divided into 2 major groups: cartilaginous fish (Chondrichthyes; in orange) and bony vertebrates (Osteichthyes; in blue and green). Bony fish are grouped into 2 subgroups: Sarcopterygii (green) and Actinopterygii (blue). The numbers of families and species in the 5 largest orders are labeled. The remaining 10 orders of bony fish (Caproiformes, Callionymiformes, Gobiesociformes, Icosteiformes, Lepisosteiformes, Moroniformes, Scombrolabraciformes, Scorpaeniformes, Trachichthyiformes, and Trachiniformes) and 2 orders of cartilaginous fish (Rhinopristiformes and Squatiniformes) are not included in the phylogenetic tree, due to their uncertain positions.
Sampling, sequencing, assembly, and annotation
Sampling is a critical challenge in any large-scale genome consortium. We propose a centralized sampling mode (i.e., mirroring our 93-species pilot phase) with several centers set up to collect samples. In addition to these sampling centers, we would like to obtain further samples from across the world. We intend to make sure all samples are taken and transported with the full capture, licensing, and legal permits required from the appropriate authorities, in compliance with the permits of endangered and non-endangered species. Additionally, we will obtain prior informed consent for accessing genetic resources and sharing the benefits arising from this project (following the obligations of the Nagoya protocol). To make sure we have enough information for further analysis and to maximize the value of these genome data, we propose a sampling standard for the project. With associated metadata designed to include as much information as possible (including the source and geo-location), we will be stressing the importance of collecting images of each specimen, and of adequate storage conditions (frozen or voucher specimen). For sequencing, we propose to use both second- and third-generation sequencing technologies to generate high-quality genome assemblies. Based on our pilot study, and considering the feasibility of obtaining the required amount of high-molecular-weight DNA, for the majority of the species we have chosen a strategy combining stLFR data, low-depth Nanopore data, and Hi-C data (Strategy II in Fig. 2). For more complex genomes, we will generate high-depth Nanopore sequence data to ensure that good assemblies can be achieved (Strategy III, stLFR data + high-depth Nanopore data + Hi-C data; Fig. 2). For key species (to be determined by the working groups; see below), we will employ a Pacific Biosciences circular consensus sequencing, long, high-fidelity approach, allowing the generation of highly accurate long reads. For the large-scale sequencing of ∼6,500 species in Phase III, we propose to employ stLFR alone (Strategy I in Fig. 2). The genome assembly criteria will follow the metric standard for reference genomes, for which we have aligned with the VGP (NCBI Bioproject PRJ489243).
Fish10K data sharing
As per the Fort Lauderdale [8] and Toronto International Data Release Workshop guidelines [9], all sequencing data (including raw data, assemblies, and annotations) will be deposited in NCBI, as well as the GigaDB and China National GeneBank repositories. The website of Fish10K (http://icg-ocean.genomics.cn/index.php/fish10kintroduction) will provide detailed information on the project status, as well as continuously updated information on the sequenced species. It also provides a portal for data downloads (particularly for assembled genomes).
Organization of Fish10K consortium
Fish10K has been initiated by a core group of researchers that forms the steering committee of Fish10K (Fig. 3). The steering committee oversees the project and is responsible for fundraising, expanding the steering committee, organizing the scientific groups and sampling groups, coordinating sampling, identifying the sequencing centers where the sequencing will be done, assigning responsibilities (e.g., sequencing, assembly) to those centers, and creating analysis strategies. The steering committee is also responsible for the generation of genomic data. Various scientific groups will focus on technical and scientific questions related to this project. We wish to receive proposals from researchers who would like to take part in these scientific groups. We also invite researchers who are studying fish species that are rare or extinct to join Fish10K as members in the sampling group (with or without associated funding for sequencing). In addition to obtaining the genome sequences of their area of interest, joining the consortium provides immediate access to all genomes currently being assembled by Fish10K [10].
Conclusion
Fish10K will generate an unprecedented, comprehensive data set of fish: the largest and most diverse vertebrate group. Our effort will allow us to complete the genomic tree for fish and, in concert with other projects, such as VGP and B10K, complete trees for vertebrates in general.
Availability of supporting data and materials
The 10 fish genome assemblies in the pilot study have been deposited in the China National GeneBank (https://db.cngb.org/cnsa) with accession codes CNP0000597 and CNP0000691. The sequencing data of the 10 fish are also deposited in NCBI with bioproject numbers PRJNA597275 and PRJNA558872. The individual genomes also all have individual DOIs in GigaDB, linked from a project page [10].
Abbreviations
B10K: Bird 10,000 Genomes Project; BUSCO: benchmarking universal single-copy orthologs; Fish10K: 10,000 Fish Genomes Project; stLFR: single-tube long fragment reads technology; VGP: Vertebrate Genomes Project.
Competing interests
Some of the authors are employees of BGI Group. The authors otherwise declare that they have no competing interests.
Funding
This work was supported by the special funding of “Blue granary” scientific and technological innovation of China (2018YFD0900301-05).
Authors' contributions
S.H., N.C., X.X., X.L., W.W., and G.F. conceived and designed the study. L.Y., M.Z., and S.L. performed sample collection and sequencing. Y.S., S.Z., and X.H. performed the assembly and annotation. X.L., Y.S., and G.F. wrote the manuscript. N.C. and all other authors revised and read the manuscript.
Supplementary data
Supplementary Tables are available online.
Acknowledgements
This work also received technical support from the China National GeneBank.
References
1. Koepfli KP, Paten B. The Genome 10K Project: a way forward. Annu Rev Anim Biosci. 2015;3(1):57–111.
2. Lewin HA, Robinson GE, Kress WJ, et al. Earth BioGenome Project: sequencing life for the future of life. Proc Natl Acad Sci U S A. 2018;115(17):4325–33.
3. Vertebrate Genomes Project. Vertebrate Genomes Project (VGP). https://genome10k.soe.ucsc.edu/vertebrate-genomes-project/. Accessed 10 September 2019.
4. Zhang G, Rahbek C, Graves GR, et al. Genomics: bird sequencing project takes off. Nature. 2015;522(7554):34, doi:10.1038/522034d.
5. Sun Y, Huang Y, Li X, et al. Fish-T1K (transcriptomes of 1,000 fishes) Project: large-scale transcriptome data for fish evolution studies. GigaSci. 2016;5(1):18–22.
6. Wang O, Chin R, Cheng X, et al. Efficient and unique cobarcoding of second-generation sequencing reads from long DNA molecules enabling cost-effective and accurate sequencing, haplotyping, and de novo assembly. Genome Res. 2019;29(5):798–808.
7. Nelson JS, Grande TC, Wilson MV. Fishes of the World, Edmonton, Canada. 5th ed. John Wiley & Sons, 2016.
8. The Fort Lauderdale Agreement. Reaffirmation and extension of NHGRI rapid data release policies: large-scale sequencing and other community resource projects. https://www.genome.gov/10506537/reaffirmation-and-extension-of-nhgri-rapid-data-release-policies. Accessed 10 September 2019.
9. Toronto International Data Release Workshop A, Birney E, Hudson TJ, Green ED, et al. Prepublication data sharing. Nature. 2009;461(7261):168–70.
10. Guangyi F, Yue S, Liandong Y, et al. Supporting data for "Fish10K pilot project data". GigaScience Database. 2020, 10.5524/100661.
© The Author(s) 2020. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Issue Section: Data Note