Giant lungfish genome elucidates the conquest of land by vertebrates

Subjects: Evolutionary biology, Evolutionary genetics, Genome evolution, Phylogenetics

Abstract

Lungfishes belong to lobe-finned fish (Sarcopterygii) that, in the Devonian period, ‘conquered’ the land and ultimately gave rise to all land vertebrates, including humans [1,2,3]. Here we determine the chromosome-quality genome of the Australian lungfish (Neoceratodus forsteri), which is known to have the largest genome of any animal. The vast size of this genome, which is about 14× larger than that of humans, is attributable mostly to huge intergenic regions and introns with high repeat content (around 90%), the components of which resemble those of tetrapods (comprising mainly long interspersed nuclear elements) more than they do those of ray-finned fish. The lungfish genome continues to expand independently (its transposable elements are still active), through mechanisms different to those of the enormous genomes of salamanders. The 17
fully assembled lungfish macrochromosomes maintain synteny to other vertebrate chromosomes, and all microchromosomes maintain conserved ancient homology with the ancestral vertebrate karyotype. Our phylogenomic analyses confirm previous reports that lungfish occupy a key evolutionary position as the closest living relatives to tetrapods [4,5], underscoring the importance of lungfish for understanding innovations associated with terrestrialization. Lungfish preadaptations to living on land include the gain of limb-like expression in developmental genes such as hoxc13 and sall1 in their lobed fins. Increased rates of evolution and the duplication of genes associated with obligate air-breathing, such as lung surfactants and the expansion of odorant receptor gene families (which encode proteins involved in detecting airborne odours), contribute to the tetrapod-like biology of lungfishes. These findings advance our understanding of this major transition during vertebrate evolution.

Main

Lungfish (Dipnoi) share with land-dwelling vertebrates the ability to breathe air through lungs, which are homologous to our own. Since their discovery in the nineteenth century, lungfish have attracted scientific interest and were initially thought to be amphibians [6,7]. We now know that they are more closely related to tetrapods than to ray-finned fish. Of the extant lungfish species (of which there are only six), four live in Africa, one in South America and one (N. forsteri) in Australia. Lungfish appeared in the fossil record in the Devonian period, around 400 million years ago (Ma) [1]. Some scholarship has discussed lungfish as ‘living fossils’, because their morphology barely changed over millions of years: for example, >100-million-year-old fossils from Australia strongly resemble the surviving species (which represents one of the oldest known animal genera, discovered exactly 150 years ago) [2]. Owing to the ancestral characters (such as body shape, large scales and paddle-shaped fins) of N.
forsteri, it resembles ‘archetypal’ extinct lungfish much more than the two other lineages of extant lungfish. The South American and, in particular, the African lungfish have almost completely lost their scales secondarily and have simplified their fin morphology into thin filaments, although they do show the alternating gaits that are typical of terrestrial locomotion.

Together with the coelacanths and tetrapods, lungfish are members of the Sarcopterygii (lobe-finned fish); however, owing to the short branch that separates these three ancient lineages it has remained difficult to resolve their relationships. Developments of powerful DNA sequencing and computational methods enable us to now revisit long-standing evolutionary questions regarding these relationships using whole-genome-derived datasets with more robust orthology inferences than have hitherto been possible. Previous analyses using large transcriptomic datasets have tended to support the hypothesis that lungfish are the closest living relatives of tetrapods [4,5]. Lungfish are therefore crucial for understanding the evolution and preadaptations that accompanied the transition of vertebrate life from water to land. This major evolutionary event required a number of evolutionary innovations, including in respiration, limbs, posture, the prevention of desiccation, nitrogen excretion, reproduction and olfaction. Lungfish are known to have the largest animal genome (http://www.genomesize.com/search.php), but the mechanisms that led to and maintained their genome sizes are poorly understood. Therefore, the Australian lungfish might provide insights both into tetrapod innovations and evolution, and the structure of giant genomes.

Genome sequencing, assembly and annotation

The largest animal genome sequenced so far is the 32-Gb [8] genome of the axolotl salamander (Ambystoma mexicanum). To overcome the challenges of sequencing and assembling the even-larger genomes of lungfish, we used long- and ultra-long-read Nanopore technology to generate 1.2 Tb in 3 batches: 601 Gb with an N50 read-length of 9 kb; 532 Gb with an N50 of 27 kb; and 1.5 Gb with an N50 of 46
kb, all from a juvenile Australian lungfish. We assembled these three batches into contigs using the MARVEL assembler [8] (Extended Data Fig. 1a, Methods). This yielded a 37-Gb assembly with an N50 contig size of 1.86 Mb (Supplementary Table 1). To correct for insertions and/or deletions, gaps, single-nucleotide polymorphisms and small local misalignments in the primary assembly, we used 1.4-Tb DNA and 499.8-Gb RNA Illumina reads. The genome-correction DNA data, sequenced at more than 30× coverage, were used to estimate genome size through frequencies of k-mers (Extended Data Fig. 2). We ascertained the high completeness of the 37-Gb assembly by observing that 88.2% of the DNA and 84% of the RNA sequencing (RNA-seq) reads aligned to the genome, which gives an estimated total genome size of 43 Gb (about 30% larger than the axolotl [8]). This matches the k-mer value but is smaller than that predicted by flow cytometry (52 Gb [9]) and Feulgen photometry (75 Gb [10]).

Next, we scaffolded the contigs using 271-Gb chromosome conformation capture (Hi-C) Illumina PE250 reads to a chromosome-scale assembly with an N50 of 1.75 Gb (Extended Data Fig. 1d, Methods). We also used Hi-C data to detect misjoins, by binning Hi-C contacts along the diagonal and identifying points that were depleted of contacts (Extended Data Fig. 1e). The largest scaffolds correspond to the 17 macrochromosome arms of the karyotype of N.
forsteri. We also assembled all ten microchromosomes into single scaffolds (Supplementary Information).

We constructed a comprehensive multi-tissue de novo transcriptome assembly (BUSCO score of over 98% core vertebrate genes) using RNA extracted from the same individual lungfish. For annotation of protein-coding genes, we combined evidence from transcript alignments and homology-based gene prediction. This resulted in 31,120 high-fidelity gene models. We assessed the completeness of the genome assembly using the predicted gene set and the BUSCO pipeline, detecting 91.4% of core vertebrate genes (233 genes) and 90.9% of vertebrate conserved genes (2,586 genes) (Supplementary Table 2). We predicted 17,095 noncoding RNAs (ncRNAs), including 1,042 transfer RNAs (tRNAs), 1,771 ribosomal RNAs (rRNAs) and 3,974 microRNAs (Supplementary Table 3, Supplementary Information).

Phylogeny of lungfish, coelacanth and tetrapods

Phylogenetic relationships among coelacanths, lungfishes and tetrapods have been debated [4,5,11]. We used Bayesian phylogenomics (Fig. 1) with 697 one-to-one orthologues for 10 vertebrates, with a complex mixture model that can overcome long-branch attraction artefacts [4] and also used noncoding conserved genomic elements (96,601 aligned sites) (Extended Data Fig. 3a). Both datasets unequivocally support lungfish [4,5] as the closest living relatives of land vertebrates, with which they shared a last common ancestor around 420 Ma (Extended Data Fig. 3b).

Fig. 1: Bayesian phylogeny based on 697 one-to-one orthologues. This analysis used the CAT-GTR model in PhyloBayes MPI. All branches were supported by posterior probabilities of 1. The protein and a noncoding conserved genomic element datasets (Extended Data Fig. 3a) recovered identical and highly supported vertebrate relationships (posterior probability =
1.0 and 100% bootstrap for all branches). Scale bar is expected amino acid replacements per site.

Conserved synteny of macro- and microchromosomes

Lineage-specific polyploidy events are important evolutionary forces [12] that can also lead to genome expansions in lungfish [9,13]. Despite the massive genome expansion in lungfish relative to other animals, the lungfish chromosomal scaffolds strongly resemble the ancestral chordate karyotype (Fig. 2a, Extended Data Fig. 4a,b). On the basis of 17 chordate linkage groups (CLGs) [14,15] and 6,337 markers mapped onto the lungfish genome, we uncovered conserved syntenic correspondence between lungfish chromosomes and CLGs (Fig. 2a). The ancestor of vertebrates underwent two rounds of whole-genome duplication. Lungfish also retained more ancient CLG chromosomal fusions through these two rounds of vertebrate duplication [15]. In lungfish, CLG fusions from before the second round of whole-genome duplications are preserved intact but substantially expanded (Fig. 2b). Almost all additional CLG fusions happened recently, as indicated by sharp syntenic boundaries (Fig. 2b). This, along with the ‘vertebrate-typical’ gene number of N.
forsteri, confirms the diploidy of the genome.

Fig. 2: Conserved synteny and chromosomal expansion in lungfish. a, Mapping of CLGs onto lungfish chromosomes. Orthologous gene family numbers are shown. Each dot represents an orthologous gene family, CLGs are as previously defined [15]. Scaffolds 01–17 represent lungfish macrochromosomes, and scaffolds 18–27 represent microchromosomes. Significantly enriched CLGs on lungfish chromosomes indicated by rectangles (for raw data, see Extended Data Fig. 4f). b, Expansion of homologous chromosomes in lungfish (left), compared to spotted gar (right) (here only LG8 is shown; the other chromosomes are in Extended Data Fig. 4a). Chromosomes are partitioned into bins and CLG content is profiled; chromosomal position is plotted next to each chromosome. LG8 in gar has a prominent jawed-vertebrate-specific fusion of the CLGs E and O, which is retained throughout the whole chromosome in lungfish (despite the latter being >30-fold larger). The small box in the middle is the unexpanded LG8 of spotted gar. c, Preservation of microchromosomes. Chicken microchromosomes are plotted (for gar, see Extended Data Fig. 4d) along with their lungfish homologues with >50 orthologues. Scaffolds 01–17 represent lungfish macrochromosomes, and scaffolds 18–27 represent microchromosomes. For chicken, only microchromosomes are shown. Significantly enriched chicken microchromosomes on lungfish chromosomes indicated by rectangles (for raw data, see Fig. 4e). Most chicken microchromosomes are in one-to-one correspondence with lungfish, but some lungfish microchromosomes have recently been incorporated into macrochromosomes. These lungfish macrochromosomes (for example, scaffold 01 or scaffold 02) have significant association with both chicken macro- and microchromosomes. However, those fusions are recent in lungfish, because the positions of chicken orthologues are restricted to specific areas of the lungfish chromosomes, as is evident from the sharp syntenic boundaries (indicated by pink arrows on scaffold 01, scaffold 02 and scaffold 06). Silhouettes are from a previous publication [36]. Significances were determined by Fisher’s exact test, P value ≤
0.01.

All ten lungfish microchromosomes (inferred from karyotype [9] and our assembly (Extended Data Fig. 4)) could be homologized to the microchromosomes of chicken and gar (Fig. 2c, Extended Data Fig. 4c,d), and even they mostly retained their co-linearity. This, along with the conservation of some microchromosomes in gar, chicken and green anole [15,16], suggests that microchromosomes may date back to the earliest vertebrates. The complete retention of microchromosomes in the massively expanded lungfish genome suggests that stabilizing selection maintains these ancestral units. In support of this, lungfish microchromosomes show, on average, higher gene densities and a lower density of long interspersed nuclear elements (LINEs), which are the major contributors to genome size (Extended Data Fig. 4b); this also suggests different expansion dynamics of vertebrate micro- and macrochromosomes.

Hallmarks of the giant lungfish genome

A maximum likelihood reconstruction of the ancestral genome sizes of vertebrates shows 2 major independent genome-expansion events in lungfish and salamander lineages (Extended Data Fig. 3c), initially at similar rates in both lineages (161–165 Mb per million years) but subsequently at slower rates in the Australian lungfish (about 39 Mb per million years), but possibly not in the other lineages of extant lungfishes. The genome expansion happened in early lungfishes (around 400–200 Ma), and slowed during the breakup of Gondwana (from around 200 Ma to present) (Extended Data Fig. 3c). Independently, genome size increased in salamanders in two independent waves of DNA-repeat expansion (Fig. 3b, Extended Data Figs. 3c, 5). LINEs make up much of the recent genome growth of the lungfish (<15% divergence, around 9% (4
Gb), also in an earlier burst in lungfish but not axolotl) (Extended Data Fig. 5a). Because mobilized transposable elements can interrupt gene function, one might speculate that such bursts of activity of transposable elements might have caused novel gene functions.

Fig. 3: Composition of repetitive elements in the lungfish genome. a, The pie charts show overall composition of repetitive elements from unmasked assembly (first transposable element annotation) (left), together with the annotation from the hard-masked genome (second transposable element annotation) (right). The bar chart shows the landscape of major classes of transposable elements. Kimura substitution level (%) for each copy against its consensus sequence used as proxy for expansion history of the transposable elements. Older copies (old expansion) accumulated more mutations and show higher divergence from the consensus sequences. RC, rolling-circle transposons; SINE, short interspersed nuclear element; TE, transposable element. b, Principal component (PC) analysis of composition of repetitive elements (LTR, LINE, SINE, DNA and unknown, filtered by 80/80 rule) of vertebrates.

Although syntenically highly conserved, the lungfish genome has undergone extreme expansion through the accumulation of transposable elements. We performed standard repeat-masking procedures on the 37-Gb genome assembly, which identified 67.3% (24.65 Gb) as repetitive (Fig. 3a, Supplementary Table 4). To our knowledge, this is the highest repetitive DNA content in a genome found in the animal kingdom. We tested whether the remaining 13
Gb of the genome have signatures of repetitiveness that are obscured by genome size by applying a second round of repeat annotation on the hard-masked genome. This revealed an additional 23.92% of repetitive DNA (Fig. 3a), which was mostly classified as ‘unknown’ (adding 11% to the unknown portion of repetitive DNA) or ‘LINE’ (8.5%) (Supplementary Tables 5, 6). In total, around 90% of the lungfish genome is repetitive, and it expanded in two waves (Fig. 3a, Extended Data Fig. 5).

To investigate whether transposable elements are still active, we analysed poly(A)-RNA-derived RNA-seq data that probably relates to proteins relevant for transposition activity. All major categories of transposable elements (1,106 out of 1,821 (60.7%)) were expressed (Extended Data Fig. 6a). Transposable element families with higher copy numbers were also highly expressed in all three tissues we tested. This, and the finding of similar copies for many transposable element families, suggests that several types of transposable element remain active and contribute to the ongoing expansion of the lungfish genome. Identification of insertion polymorphisms between two, ideally relatively closely related lungfish species (such as Protopterus from Africa) is necessary to confirm transposable element activity. Apparently, the transposon silencing machinery did not adapt to reduce overabundant transposable elements by copy number expansion or structural changes (Supplementary Table 7).

The repeat landscape (proportions of major classes of transposable element) of lungfish resembles tetrapods (including axolotl), whereas the third extant sarcopterygian lineage (the coelacanths) is more ‘fish’-like (Fig. 3b). The two largest animal genomes yet sequenced expanded through different temporal dynamics. Whereas long terminal repeat (LTR) elements are the most abundant class of transposable element (59%) in axolotl [8], LINEs (25.7%; mostly CR1 and L2 elements) dominate in lungfish (Extended Data Figs. 5, 6). These two retrotransposon classes belong to the same copy-and-paste (and not cut-and-paste) category, but propagate via different mechanisms [17]. Although global repeat compositions differ between lungfish and axolotl, the same LTR class affects their genic regions (Extended Data Fig. 6, Supplementary Information).

To further understand genome growth in lungfish, we compared the genome structure of N. forsteri with that of other genomes (Extended Data Figs. 6c,d, 7). Although compact genomes have small introns, intragenic noncoding regions usually increase with genome size [18]. The largest intron of the lungfish is 5.8 Mb (in the dmbt1 gene) and average intron size is 50 kb as in axolotl, compared to 1 kb in fugu and 6 kb in human. Introns in the N. forsteri genome comprise about 8 Gb (21% of genome), a similar proportion to that in human (21%), but half that of fugu (40%). This suggests that similar mechanisms affect the genic and intergenic compartments, following expectations for genome size evolution [19]. In most genes, the first intron typically is the largest. The biological relevance of this remains unclear. The first introns in lungfish and axolotl are also much larger than downstream introns (Extended Data Fig. 7), which indicates that the relatively larger first introns in smaller genomes are probably not due to the space requirements of regulatory or structural motifs [20]. It has previously been suggested that the size of intragenic noncoding sequences and the extent of intron expansion are associated with organismal features (such as metabolic rate [18]) or functional categories of gene [8] (for example, developmental or nondevelopmental genes). Similar to axolotl [8], the introns in developmental genes in lungfish are smaller than in nondevelopmental genes (P = 2.166 × 10^-8, Mann–Whitney U test) (Supplementary Table 8).

Genomic preadaptations in fish–tetrapod transition

Positive selection analysis uncovered 259 genes, many of which are related to oestrogen and categories related to female reproduction (Supplementary Information, Supplementary Table 9). We compared these rate dynamics (16,471 gene families) (Supplementary Tables 10, 11), and found that in the lungfish lineage 24 families have contracted and 107 families have expanded, possibly related to evolutionary innovations.

Air breathing and the evolution of lungs

All land-living vertebrates and adult lungfish are air breathers. The pulmonary surfactant protein
B family of genes has expanded considerably in the lungfish genome. Surfactants are necessary components of the lipoprotein mixture that covers the lung surface and ensures proper pulmonary function. In lungfish, the number of surfactant genes increased to a number typical for tetrapods (2–3× more than in cartilaginous and bony fish) (Supplementary Table 12). This may indicate an adaptation to air breathing in lungfish. We further investigated the expression of shh, which encodes an important regulator of lung development [21], during lungfish embryogenesis (Extended Data Fig. 8a). shh is strongly expressed in the developing lungs (embryos at stages 43–48), visualizing the development of the right-sided lung (Neoceratodus has a unilateral lung). This lung develops in a manner notably similar to those of amphibians [22]. Altogether, this highlights molecular signatures of lungs that were necessary for the conquest of land by sarcopterygians.

Olfaction and evolution of the vomeronasal organ

We also noted expansions of genes involved in olfaction. The gene complement of receptors for airborne odorants (which is large and complex in tetrapods and small in fish) is considerably expanded in lungfish, whereas several receptor classes for waterborne odours have shrunk; in particular, zeta and eta receptors, which abound in teleost fishes (Supplementary Table 13). The vomeronasal organ (VNO) is present in most tetrapods [23,24], being linked to pheromone reception and expressing a large repertoire of vomeronasal receptor genes (particularly in amphibians). In N. forsteri, the vomeronasal receptor gene family (known from fish and even lampreys, although its function in these species is unknown) has expanded considerably. Lungfish possess a ‘VNO primordium’ [25]. The notable expansion of the vomeronasal receptor gene family (especially V2R genes) in N.
forsteri (Supplementary Table 14) shows that the VNO is a tetrapod innovation, which emerged in the water-to-land transition.

Lobed fins and evolution of terrestrial locomotion

Sarcopterygians have elaborated endochondral skeletons: lobed fins that are distally branched, forming digits that are suitable for substrate-based locomotion. Our analysis indicates sarcopterygian origins for 31 conserved tetrapod limb-enhancer elements [26] (Fig. 4a, Extended Data Fig. 8b). The hs72 (refs. 27,28) enhancer (related to sall1) drives autopodal expression (Fig. 4b). We found sall1 strongly expressed in lungfish embryos, in expression patterns similar to those reported for tetrapods [29] (Fig. 4b) but absent during zebrafish fin development [30]. Similar functions of sall1 during mouse limb development [29] suggest that this gene contributed to the acquisition of sarcopterygian lobed fins already in lungfish.

Fig. 4: Regulatory preadaptation of lobed fin and hoxd gene regulation. a, Analysis of 330 validated mouse and human limb enhancers shows deep evolutionary origin of the limb regulatory program; 31 enhancers are associated with the emergence of the lobed fin. b, The hs72 enhancer located near the Sall1 [27,28] gene drives strong LacZ in mouse autopods (n = 3 out of 3 embryos, LacZ-stained embryos courtesy of VISTA enhancer [26]) (top). sall1 is expressed in a similar autopodial-like domain in lungfish pectoral fins (n = 2 out of 2 fins) (bottom). dpf, days post-fertilization. c, Left, hoxc13 is expressed in a distal lungfish area that overlaps with the central metapterygial axis (sox9) and fin fold (and1) (arrowheads) (n = 2 out of 2 fins). Right, similar expression present in axolotl limbs (arrowhead) (n = 4 out of 4 limbs), indicating a deep sarcopterygian origin for this expression domain. d, During lungfish fin development, hoxd11 and hoxd13 are expressed in mostly nonoverlapping proximal and posterior–distal fin domains (n = 4 out of 4 fins each). e, The lungfish hoxd cluster has increased in size compared to mouse and Xenopus, but may be smaller than the axolotl hoxd cluster. In lungfish and axolotl expansion has occurred in the 3′ and 5′ regions of the cluster, whereas the central hoxd8, hoxd9, hoxd10 and hoxd11 region (lilac box) remained stable at approximately 25
kb, forming a separate ‘minicluster’. The hoxd cluster is regulated by 3′ and long-range enhancers. hoxd9, hoxd10 and hoxd11 (lilac), and hoxd13 (green), are subject to enhancer sharing [33] and co-expressed in the distal limb in mouse and Xenopus [33,37], whereas the increased genomic distance between hoxd13 and hoxd9, hoxd10 and hoxd11 has disrupted their co-expression in the distal appendages of lungfish and axolotl. The preserved clustering of hoxd8, hoxd9, hoxd10 and hoxd11 can be explained by enhancer sharing 3′ of the cluster [33], which probably places constraints on their intergenic distances. Axolotl and Xenopus hoxd11 and hoxd13 after ref. 37; lungfish hoxd11 and hoxd13 domains after ref. 36 and d (Supplementary Table 16 lists primers for probes). Scale bars, 0.2 mm. Silhouettes are from ref. 36.

Hox clusters and the fin-to-limb transition

The 4 clusters of hox genes in Neoceratodus (hoxa, hoxb, hoxc and hoxd) comprise 43 genes (Extended Data Fig. 9); the presence of hoxb10 and hoxa14 in lungfish confirms their loss at the fish-to-tetrapod transition [11]. Our RNA-seq analysis of the expression of hox genes in the fins of larval Neoceratodus (Extended Data Fig. 8c) showed an unexpected expression of hoxc genes. The expression of hoxc genes in paired fins or limbs has previously been reported only for mammals [31], related to the nail bed. We observed hoxc13 expression in axolotl limbs (Fig. 4c), but it was absent in the pectoral fins of ray-finned fish (Extended Data Fig. 8d). Transcript localization in Neoceratodus embryos showed expression of hoxc13 in the distal fin (Fig. 4c). This indicates an early gain of hoxc13 expression in sarcopterygians, suggesting co-option of this domain in tetrapods to pattern dermal limb elements (such as nails, hooves and claws). Together with sall1, this demonstrates an early sarcopterygian origin of limb-like gene expression that was ready for tetrapod co-option, facilitating the fin-to-limb transition and colonization of the land.

Hox cluster expansion versus regulation

Consistent with the overall genome expansion, the hox clusters of Neoceratodus are larger than in mouse, chicken and Xenopus, but have an uneven pattern of expansion (Extended Data Fig. 9). The clustering of hoxd genes results in their coregulation by enhancers 3′ and 5′ of the cluster, leading to co-expression of hoxd9, hoxd10, hoxd11, hoxd12 and hoxd13 in the distal appendages [32,33,34,35]. During fin development in Neoceratodus, expression of hoxd11 is nearly absent from the hoxd13 territory [36] (Fig. 4d) whereas in axolotl hoxd9, hoxd10 and hoxd11 are excluded from the hoxd13 digit domain [37] (Extended Data Fig. 8e). Such apparent loss of coregulation between hoxd13 and hoxd9, hoxd10 and hoxd11 is similar to that caused by experimentally increased distances in the hoxd cluster [32], and suggests a disruption of enhancer sharing caused by the expansion of the intergenic regions between hoxd11 and hoxd13 (Fig. 4e). We performed additional analyses in mouse, Xenopus, lungfish and axolotl, which showed that, despite 5–10× differences in the size of the hoxd cluster, the region comprising hoxd8, hoxd9, hoxd10 and hoxd11 remained fixed at around 25 kb (Fig. 4e). This apparent constraint is probably due to sharing of enhancers located at the 3′ end of the cluster [33]. Altogether, this indicates that hoxd expansion has partially disrupted long-range enhancer sharing, but that, conversely, such mechanisms have locally also constrained intergenic distances.

We have sequenced and assembled at the chromosome level (Supplementary Table 15) the largest animal genome, and have substantiated the hypothesis that lungfish are the closest living relatives of tetrapods. Despite the unique genome expansion history of lungfish, genic organization and chromosomal homology is maintained even at the level of microchromosomes. Genomic preadaptations in lungfish for the water-to-land transition of vertebrates include a larger complement of lung-expressed surfactant genes, which might have facilitated the evolution of air-breathing through a lung. In addition, the number of VNO olfactory receptors (as well as other receptor gene families that permit detection of airborne odours) increased in the lineage that led to air-breathing lungfish. The uneven expansion of hox clusters demonstrates the regulatory consequences of, and constraints on, genome expansion. The evolutionary trajectory of limb enhancers shows an early-fish origin of the limb regulatory program, with important changes towards preadaptations for terrestrialization preceding the fin-to-limb transition. Gene expression domains that characterize the tetrapod limb, but which were previously presumed to be absent from fins (such as those of sall1 and hoxc13), appeared in the lobe-finned lineage. Such novelties might have predisposed the sarcopterygians to conquer the land, demonstrating how the lungfish genome can contribute to a better understanding of this major transition in vertebrate evolution.

Methods

No statistical methods were used to predetermine sample size. The experiments were not randomized and investigators were not blinded to allocation during experiments and outcome assessment.

Biological materials

Biopsy material for DNA and RNA isolation was obtained from a juvenile Australian lungfish (N. forsteri) imported from Australia (CITES permit no.: PWS2017-AU-000242). Owing to the immature status of the gonad, the sex could not be determined. The same specimen was used for genome sequencing (muscle), construction of the Hi-C library (spleen) and transcriptome sequencing of brain, gonad and liver. The second set of reads was generated from lungfish embryos (embryonic stage 52, GenBank accession numbers SRR6297462–6297470) [36]. Embryos were bred and collected under permit ARA 2009.039 at Macquarie University.

DNA extraction, genome sequencing and assembly

High molecular weight (HMW) and ultra-HMW DNA was prepared by Future Genomics and Nextomics, and sequenced using Nanopore technology (for statistics, see Supplementary Table 1). gDNA for genome correction from snap-frozen lungfish muscle tissue (0.3 g) was isolated by a standard gDNA isolation protocol. Library preparation was performed using the Westburg NGS DNA library kit. The final library was excised by Pippin prep with 400-bp DNA size and sequenced (Illumina Nova-seq S2; PE150) at Vienna BioCenter NGS facility. Hi-C library was generated as previously described [38,39], with modifications detailed in Supplementary Methods. Final Hi-C libraries were sequenced (Illumina Nova-seq SP; PE150) at Vienna BioCenter NGS facility.

Genome assembly

Ninety-six million reads comprising 1.2 Tb were assembled using the MARVEL genome assembler [8]. We first aligned 1% of the reads against all other reads. From these 1%-against-all alignments, we derived information on the repetitive elements present in the reads and used transitive transfer to repeat-annotate all reads used in the assembly. Regions were deemed repetitive when the depth of the alignments for a given read exceeded the expected depth fourfold. Given the alignment of the 1% against every other read in the assembly, we then transferred the repeat annotation of the 1% using the alignments to the respective position in the aligned reads. Here, the assumption is that when region (a, b) in read A aligns to (c, d) in read B and for a ≤ r_b ≤ r_e ≤ b (in which r_b and r_e are repetitive elements), this then can be mapped using the alignment to a corresponding region in B, which then can be tagged as repetitive as well. The final repeat-masking track covered 28.7% of the 1.2 Tb. We then proceeded with an all-against-all alignment with repeat masking in place, yielding five billion alignments. On the basis of these alignments, we derived read qualities at 100-bp resolution, highlighting low sequencing quality regions in the reads. Using the alignments and the read qualities, structural weaknesses (chimeric breaks, high-noise regions and other sequencing artefacts) in the reads were repaired (Supplementary Methods, Extended Data Fig. 10). Repaired reads were then used for a new round of alignments, again with repeat masking in place. After alignment, the default MARVEL assembly pipeline proceeded as shown in the included examples of the source distribution (Extended Data Fig. 1). For the current MARVEL source code repository, see https://github.com/schloi/MARVEL. For sample execution scripts, see https://github.com/schloi/MARVEL/tree/master/examples.

Scaffolding

We used an agglomerative hierarchical-clustering-based scaffolding approach using various normalizations (Extended Data Fig. 1). For details, see Supplementary Methods. We created initial clusters by selecting the largest contigs with the fewest contacts between them, each contig serving as a single cluster. We then added contigs on the basis of unique assignability to clusters. This was followed by scaffolding the clusters separately, visual inspection of an approximate contact map derived during the scaffolding process and return of wrongly assigned contigs to the set of unassigned contigs. We created contact maps for all clusters and merged or split clusters on the basis of the signal within those. The process of assigning contigs, scaffolding, merging and splitting clusters was repeated until no more useful changes could be made to the clusters (Supplementary Table 15 for comparison of chromosome and scaffold DNA content). For the public source code repository, see https://github.com/schloi/MARVEL/. The MARVEL assembler and scaffolder has previously been used to obtain a chromosome-scale axolotl genome assembly, which has been validated in comparison to the previously published chromosome-scale meiotic scaffolding [40] and is available as previously described [41].

Genome assembly correction

For correction of errors (insertions and/or deletions (indels), base substitutions and small gaps) remaining after the genome assembly, we applied a two-step procedure using DNA-sequencing and RNA-seq reads separately. In brief, we sequenced the same genomic DNA sample and generated 4,693,324,032 high-quality read pairs (2 × 150 bp) (30× coverage). Additionally, we used the RNA-seq reads from the de novo transcriptome assembly to correct indels, but not base substitutions, in transcribed regions (Supplementary Methods, Supplementary Results, Extended Data Fig. 10).

Transcriptome assembly

RNA was isolated from brain, spinal cord, eyes, gut, gonad, liver, jaw, gills, pectoral fin, caudal fin, trunk muscles and larval fin. Libraries were constructed using NEBNext Ultra II Directional RNA library preparation kit (New England Biolabs), Illumina TruSeq RNA sample preparation kit (Illumina) or Lexogen Total RNA-seq Library Prep Kit V2 (Lexogen). Paired-end sequencing, performed with Illumina platforms, yielded approximately 1,150 million raw reads. Raw reads, filtered and corrected using Trimmomatic v.0.36 [42] and RCorrector v.1.0.2 [43], were assembled using de novo and reference-guided approaches. For de novo assembly, only reads derived from poly(A)-selected RNA were processed using the Oyster River Protocol (ORP) v.2.2.8 [44]. In brief, reads were assembled using Trinity v.2.8.4 (k-mer = 25), SPAdes v.3.13.3 [45] (k-mer = 55), SPAdes (k-mer = 75) and Trans-Abyss v.2.0.1 [46] (k-mer =
32). The four different assemblies were then merged using the OrthoFuser module [47,48] implemented in ORP. Completeness of the de novo-assembled transcriptome was assessed with BUSCO v.3 [49] using core vertebrate genes and Vertebrata genes (vertebrata_odb9 database) in the gVolante webserver [50]. For reference-guided assembly, all reads were aligned to the N. forsteri genome (each sample independently) using the program HISAT2 v.2.1.0 [51] (maximum intron length set to 3 Mb). The resulting mapping files were parsed by StringTie v.1.3.6 [52] and transcripts reconstructed from each aligned sample were merged in a single consensus .gtf file.

Repeats and transposable elements annotation

Neoceratodus forsteri repeat sequences were predicted using RepeatMasker (v.4.0.7) with default transposable element Dfam database and a de novo repeat library constructed using RepeatModeler (v.1.0.10), including the RECON (v.1.0.8), RepeatScout (v.1.0.5) and rmblast (v.2.6.0), with default parameters. Transposable elements not classified by RepeatModeler were analysed using PASTEC (https://urgi.versailles.inra.fr/Tools/) and DeepTE [53]. Repeat sequences of A. mexicanum (AmexG_v3.0.0, https://www.axolotl-omics.org/) were predicted using the same approach. Repetitive sequences of Anolis carolinensis (GenBank accession GCA_000090745.2), Xenopus tropicalis (GCA_000004195.4), Rhinatrema bivittatum (GCA_901001135.1), Latimeria chalumnae (GCA_000325985.2), Lepisosteus oculatus (GCA_000242695.1), Danio rerio (GCA_000002035.4) and Amblyraja radiata (GCF_010909815.1) were identified using Dfam TE Tools Container (https://github.com/Dfam-consortium/TETools) including RepeatModeler (v.2.0.1) and RepeatMasker (v.4.1.0). To further examine the remaining intergenic sequences, we predicted repetitive sequences again using the same workflow on the genome hard-masked with repeats already predicted by RepeatMasker.

Kimura distance-based distribution analysis and transposable-element-composition principal component analysis

Kimura substitution levels between the repeat consensus and its copies were calculated using a utility script calcDivergenceFromAlign.pl bundled in RepeatMasker. Repeat landscape plots were produced with the R scripts nf_all_age_plot.R and nf_am_rb_age_plots.R, using the divsum output from calcDivergenceFromAlign.pl. Principal component analysis on repetitive element composition was performed in R (v.3.6) using factoextra package (v.1.0.6). Repetitive element compositions (SINE, LINE, DNA, LTR and unknown) were calculated from the predicted libraries. Repetitive element copies were filtered by the 80/80 rule (equal or longer than 80 bp, equal or more than 80 percent identity compared with the consensus sequence). Repetitive element composition of other vertebrates was obtained from ref. 54.

Transposable element composition by gene length and LTR family analysis

Repetitive sequence composition within genes (grouped by length) was examined by calculating the coverage (in bp) of each class of repetitive element, normalized by gene length. We examined LTR family enrichment in genic regions. All calculations and visualizations are summarized in the Jupyter notebook file te_general_analysis.ipynb. All Python scripts ran on Python ≥3.7 and used the package gffutils (v.0.10.1) (https://github.com/daler/gffutils) to operate large gene and repetitive element annotation files from large genomes. Plots were generated using Plotly Python API (https://plot.ly).

Transposable element content in genic regions

Intron position was calculated by GenomeTools (v.1.5.9). The sum of the coverage of the repetitive element (for example, LINE CR1) was normalized by the length of the genic feature considered (Supplementary Table 17) (for example, intron 8) using Python script te_cnt_class.py.

Transposable element expression

Transposable element expression was assessed with TEtools [55] on gonad, brain and liver poly(A)-RNA data. Because of the large size of lungfish genome, a random subset of 10% of all transposable element copies was used. Transposable-element-family counts were normalized by transposable-element-family consensus length (count ×
106/consensuslength)andlibrarysize.Normalizedcountswereplottedagainsttransposable-element-familycopynumbers.Annotationofprotein-codinggenesProtein-codinggeneswerepredictedbycombiningtranscriptandhomology-basedevidence.Fortranscriptevidence,assembledtranscripts(asdescribedin‘Transcriptomeassembly’)weremappedtotheassemblyusingGmaplv.2019-05-1256andthegenestructurewasinferredusingthePASApipelinev.2.2.357.ExpressionofeachtranscriptwasmeasuredusingthewholeRNA-seqdataset(asdescribedin‘Transcriptomeassembly’)andthepseudoalignmentalgorithmimplementedinKallistov.0.46.158.Forhomologyevidence,wecollectedmanuallycuratedproteinsfromUniProtKB/SWISSPROTdatabase(UniProtKB/Swiss-Prot2020_03)59andproteinsequencesofCallorhinchusmilii,L.chalumnae,L. oculatusandX.tropicalisfromEnsembl(http://www.ensembl.org)andNCBI(https://www.ncbi.nlm.nih.gov/genome),andalignedthemtotherepeat-maskedassemblyusingExoneratev.2.260.Transcriptandhomology-basedevidencewerethencombinedbyprioritizingtheformer(homology-basedpredictedgeneswereremovedwhenintersectingagenepredictedusingthereconstructedtranscripts).Thecombinedgenesetwasthenprocessedbytworoundsof‘PASAcompare’toadduntranslatedregion(UTR)annotationsandmodelsforalternativelysplicedisoforms.Low-qualitygenemodelswereremovedbyapplyingthreefurtherquality-filteringstepsinaniterativefashion:(1)single-exongeneswereretainedonlywhennosimilaritywithexonsofmulti-exonicgeneswasfound(similaritywasidentifiedwiththeglsearch36moduleimplementedintheFASTAv.36.3.8gpackage61withe-valuecut-offsof1 × 
10−10andidentitycu-toffsof80);(2)genesintersectingrepeatelementswereremovedwhen>50%(single-exonicgenes)and>90%(multi-exonicgenes)werecoveredbyrepeats;and(3)geneswithinternalstopcodon(s)wereremoved.Thecompletenessofthepredictedprotein-codinggenesetwasassessedwithBUSCOusingthecorevertebrategenesandtheVertebratagenes(vertebrata_odb9database)inthegVolantewebserver.Toannotatethelungfishhoxclusters,hoxgeneswerefirstidentifiedusingBLASTwithvertebrateorthologuesasquery(SupplementaryMethods).AnnotationofncRNAgenesncRNAgeneswereannotatedusingtRNAscan-s.e.v.2.0.362andInfernalv.1.1.263.Thesameprocedurewasappliedtothegenomesofthenineotherfocalspecies.Foreachofthetenspecies,thecorrespondingmicroRNAsets(obtainedfrommiRBasev.2264database)wereusedtopredictmicroRNAtargetsiteson3′UTRsofcanonicalmRNAsusingmiRandav.3.365.FurtherdetailsareprovidedinSupplementaryInformation.AnnotationofconservednoncodingelementsWhole-genomealignmentsThemaskedversionsofthegenomeassembliesofthetenspeciesusedforthephylogenetictree(Fig.1)wereusedtobuildawhole-genomealignmentwiththehumangenomeasreference(ten-waywhole-genomealignment).Inbrief,eachpairwisealignmentwasconstructedusingLastzv.1.03.7366andfurtherprocessedusingUCSCGenomeBrowsertools67.Multiplealignmentsweregeneratedusingasinputtheninepairwisealignmentsin.mafformatwiththeprogramsMultizv.11.2andRoast.v.3.068.DetectionofconservedelementsThephylogenetichiddenMarkovmodel(phylo-HMM)implementedinphastCons69(runinrho-estimationmode)wasusedtopredictaconsistentsetofconservedgenomicelementsintheten-specieswholegenomealignment.AneutralmodelofsubstitutionswascalculatedusingphyloFit69withthegeneralreversiblesubstitutionmodelfromfourfolddegeneratesites.Rawconservednoncodingelements(CNEs)detectedbyphastConsweremergedwhentheirdistancewas<10bp,andsubsequentlyCNEs<50bpwereremoved.Protein-codingCNEsandthoseintersectingncRNAgenes,pseudogenes,retrotransposedelementsandantisensegenes(annotatedinthehumangenome)wereremoved.ExpansionofthegenomeinintergenicregionsThefinalfilte
redsetofCNEswasusedtoinvestigateexpansionofintergenicspaces.Wecomparedthedistanceofnonexonicelementsthatareconservedinlungfishandthreetetrapods(human,chickenandaxolotl).ToobtaininformativeCNEpairs,weselectedthoseCNEsthat:(1)werepresentinallfourgenomes;(2)werelocatedinintergenicspace;(3)werelocatedinthesamecontigorchromosomeineachspecies;and(4)didnothaveageneinbetweenthem.Theremainingsetof223CNEpairswereusedtocalculateintergenicdistanceandregion-specificexpansionofthelungfishgenome(SupplementaryTable18).Lineage-specificaccelerationofCNEsTheprogramphyloPwasusedtotesteachCNEforlineage-specificacceleratedevolution69,70inthelungfishbranch.AlikelihoodratiotesttocomputethePvalueofaccelerationwithrespecttoaneutralmodelofevolutionforeachoftheconservedelementsinthealignmentwasused.CNEsshowingfalse-discovery-rate(FDR)-adjustedP values 80%)trimmedwithBMGE75.Orthologywasensuredbymanualinspectionofmaximumlikelihoodgenetrees(IQ-TREE)andalignments(MAFFTginsi)forlocishowinghighbranch-lengthdisparity,andfiveindividualsequenceswereremoved.Lociwereconcatenatedintoafinalmatrixcontaining10taxaand697loci,totalling383,894alignedaminoacidpositions,ofwhich208,588(54%)werevariable.PhylogenywasinferredusingPhyloBayesMPIv.1.776underthesite-heterogeneousCAT-GTRmodel,showntoavoidphylogeneticartefactswhenreconstructingbasalsarcopterygianrelationships4.TwoindependentMarkovchainMonteCarlochainswererununtilconvergence(>4,000cycles),assessedaposterioriusingPhyloBayes’built-infunctions(maxdiff = 0,meandiff = 
0, ESS > 100 for all parameters, after discarding the first 25% cycles as burn-in). Post-burn-in trees were summarized into a fully resolved consensus tree with posterior probabilities of 1 for all bipartitions.

Whole-genome-alignment-based phylogeny

The ten-species whole genome alignment was processed by MafFilter v.1.3.0⁷⁷ to keep only alignment blocks >300 bp that were present in all species. Filtered noncoding blocks were then concatenated and exported in .phylip format. Poorly aligned regions were removed using trimAl v.1.2 with option ‘-automated1’. The final dataset (99,601 aligned nucleotides) was used to reconstruct the phylogeny with RAxML v.8.2.4 under the GTRGAMMA model and 1,000 bootstrap replicates.

Genome size evolution

Genome size evolution was modelled by maximum likelihood using the ‘fastAnc’ function in the phytools R package⁷⁸. We used a time-calibrated tree representing all major jawed vertebrate lineages obtained from the phylotranscriptomic tree of ref. 5; ages are genome-wide estimates across 100 time-calibrated trees inferred from 100 independent gene jackknife replicates inferred in PhyloBayes v.4.1⁷⁹ under a log-normal autocorrelated clock model with 16 cross-validated fossils as uniform calibrations with soft bounds, the CAT-GTR substitution model and a birth–death tree prior. Genome size data (haploid DNA content or c-value) were obtained from ref. 80. Genome size estimates were averaged per species (if several were available) and, in six species, genome size was approximated as the average of closely related species within the same genus. For Neoceratodus, the k-mer-based estimation was used (43 Gb; c-value =
43.97 pg). Ancestral genome sizes were used to calculate the rates of genome evolution for selected branches.

Molecular clock analyses

Divergence times were inferred with a relaxed molecular clock with autocorrelated rates, as implemented in MCMCTree within the PAML package v.4.9h⁸¹. A total of six fossil calibrations were used as uniform priors⁸². For further details, see Supplementary Methods.

Dynamics of gene family size

CAFE⁸³ was used to infer gene birth and death rates (lambda) and retrieve gene families under significant dynamics. As input, we took the species tree with divergence times from the output of MCMCTree and the gene clusters from Hcluster_sg. Each gene cluster was deemed to be a gene family. We ran CAFE under a model in which a global lambda was set across the whole tree. To symbolize each gene family, we took the longest member as representative and BLAST-searched it with DIAMOND⁸⁴ against the SWISSPROT and NR databases. The best hit from both was retained. To compare the repertoire of olfactory receptors, taste receptors and pulmonary surfactant proteins across all studied species, we followed the same procedure for each species. First, we collected sequences of olfactory receptors, taste receptors and pulmonary surfactant proteins from the Swiss-Prot and NR databases as query. For sequences from the NR database, we only kept those with identifiers starting with ‘NP_’, which are supported by the RefSeq eukaryotic curation group. Second, we mapped the query set to each genome using Exonerate in server mode (max intron set to six million for lungfish and axolotl). The alignment was extended to start and stop codon when possible. Third, we BLAST-searched all retrieved sequences against the NR database and removed those with a best hit that was not an olfactory receptor, taste receptor or pulmonary surfactant. The final result sequences had alignment coverage ranging from 32% to 100% (first quartile 95%), and percentage of identity from 17% to 100% (first quartile 62%) to their query. Following a previous study⁸⁵, we separated the final sequences into three categories on the basis of their alignment to their query: (1) pseudogene, sequences with premature stop codon or frameshift; (2) truncated gene, sequences without premature stop codon and frameshift but broken open reading frame (ORF) (start or stop codon missing); and (3) intact gene, sequences with intact ORF.

Positive selection analysis

Two models were calculated. Model 1 was used to find genes positively selected in lungfish and model 2 was used for genes commonly positively selected in tetrapods and lungfish. Genomes included were N. forsteri and A. mexicanum from this study, and the Ensembl genomes D. rerio (Danio_rerio.GRCz11), A. carolinensis (Anolis_carolinensis.AnoCar2.0), L. oculatus (Lepisosteus_oculatus.LepOcu1), L. chalumnae (Latimeria_chalumnae.LatCha1), C. milii (Callorhinchus_milii.Callorhinchus_milii-6.1.3), X. tropicalis (GCF_001663975.1_Xenopus_laevis_v2), G. gallus (Gallus_gallus.GRCg6a) and H. sapiens (Homo_sapiens.GRCh38). The X. tropicalis genome (GCF_001663975.1_Xenopus_laevis_v2) was downloaded from NCBI. Protein and cDNA files from all species were downloaded. To identify orthologous proteins, all protein sequences were compared to lungfish using Inparanoid⁸⁶ (default settings). To match protein and cDNA, sequences were searched by TBLASTN and only 100% hits were kept. Codon alignments for the protein and cDNA sequence pairs were constructed using pal2nal v.14⁸⁷. Resulting sequences were aligned by MUSCLE⁸⁸ (option: -fastaout) and poorly aligned positions and divergent regions of cDNA were eliminated by Gblocks v.0.91b⁸⁹ (options: -b4=10 -b5=n -b3=5 -t=c). An in-house script was used to convert the Gblocks output to PAML format. As a phylogenetic tree, we took the species tree with divergence times from MCMCTree as input for detection of positive selection with C.
milii as outgroup. For the phylogenetic analyses by maximum likelihood, the ‘Environment for Tree Exploration’ (ETE3) toolkit⁹⁰, which automates CodeML and Slr analyses by using preconfigured evolutionary models, was used. For detection of genes under positive selection in lungfish, we compared the branch-specific model bsA1 (neutral) with model bsA (positive selection) using a likelihood ratio test (FDR ≤ 0.05). To detect sites under positive selection, naive empirical Bayes probabilities for all four classes were calculated for each site. Sites with a probability >0.95 for either site class 2a (positive selection in marked branch and conserved in rest) or 2b (positive selection in marked branch and relaxed in rest) were considered. Two models were calculated. In model 1, only the branch for lungfish was marked; in model 2, all tetrapods and lungfish were marked for positive selection. Functional clustering was done with IPA (Qiagen, www.qiagenbioinformatics.com/products/ingenuity-pathway-analysis) and DAVID (https://david.ncifcrf.gov/home.jsp) using human homologues with default settings.

In situ hybridization

In situ hybridization was performed as previously described³⁶,⁹¹, with modifications (Supplementary Methods).

hox gene RNA-seq analysis

hox gene RNA-seq analysis was performed on a stage-52 lungfish larva RNA-seq dataset (SRR6297462–SRR6297470)³⁹ (Supplementary Methods).

Limb enhancer analysis

Three hundred and thirty nonredundant VISTA enhancer elements²⁶,⁹² were searched by BLASTN against X. laevis, X. tropicalis, Nanorana parkeri, axolotl, reedfish, sterlet, gar, elephant shark, coelacanth (LatCha1) and Neoceratodus genomes to determine conservation (Supplementary Methods).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Data availability

Data are available from NCBI Bioproject under accession code PRJNA644903. All other relevant data are available from the corresponding authors upon reasonable request.
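For readers who wish to reproduce the CNE post-processing described under ‘Detection of conserved elements’ (raw elements merged when separated by <10 bp, then merged elements <50 bp discarded), the rule can be sketched in a few lines of Python. This is an illustrative sketch only and is not taken from the authors' deposited code: the function name `merge_and_filter`, the tuple layout and the assumption of 0-based, half-open coordinates are all hypothetical choices made for the example.

```python
from typing import Iterable, List, Tuple

# Hypothetical interval layout: (sequence name, start, end), 0-based half-open.
Interval = Tuple[str, int, int]


def merge_and_filter(elements: Iterable[Interval],
                     max_gap: int = 10,
                     min_len: int = 50) -> List[Interval]:
    """Merge elements on the same sequence whose gap is < max_gap bp,
    then drop merged elements shorter than min_len bp."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(elements):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] < max_gap:
            # Close enough to the previous element: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    # Keep only elements of at least min_len bp (i.e. remove those < min_len).
    return [iv for iv in merged if iv[2] - iv[1] >= min_len]


raw = [
    ("chr1", 0, 30), ("chr1", 35, 60),   # gap of 5 bp: merged into one 60-bp element
    ("chr1", 100, 120),                  # isolated 20-bp element: removed
    ("chr2", 0, 49), ("chr2", 59, 70),   # gap of exactly 10 bp: not merged, both removed
]
filtered = merge_and_filter(raw)
```

Sorting first means that a single pass suffices; whether a gap of exactly 10 bp should be merged depends on the coordinate convention, which the Methods text does not specify.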
Code availability

Custom code has been deposited at https://github.com/labtanaka/meyer_lungfish. For the current MARVEL source code repository, see https://github.com/schloi/MARVEL. For sample execution scripts, see https://github.com/schloi/MARVEL/tree/master/examples.

References

1. Clack, J., Sharp, E. & Long, J. in The Biology of Lungfishes (eds Jorgensen, J. M. & Joss, J.) 1–42 (CRC, 2011).
2. Kemp, A. The biology of the Australian lungfish, Neoceratodus forsteri (Krefft 1870). J. Morphol. 190, 181–198 (1986).
3. Carroll, R. L. Vertebrate Paleontology and Evolution (W. H. Freeman, 1988).
4. Irisarri, I. & Meyer, A. The identification of the closest living relative(s) of tetrapods: phylogenomic lessons for resolving short ancient internodes. Syst. Biol. 65, 1057–1075 (2016).
5. Irisarri, I. et al. Phylotranscriptomic consolidation of the jawed vertebrate timetree. Nat. Ecol. Evol. 1, 1370–1378 (2017).
6. Krefft, G. Description of a giant amphibian allied to the genus Lepidosiren from the Wide Bay district, Queensland. Proc. Zool. Soc. Lond. 1870, 221–224 (1870).
7. Gunther, A. XIX. Description of Ceratodus, a genus of ganoid fishes, recently discovered in rivers of Queensland, Australia. Phil. Trans. R. Soc. B 161, 511–571 (1871).
8. Nowoshilow, S. et al. The axolotl genome and the evolution of key tissue formation regulators. Nature 554, 50–55 (2018).
9. Rock, J., Eldridge, M., Champion, A., Johnston, P. & Joss, J. Karyotype and nuclear DNA content of the Australian lungfish, Neoceratodus forsteri (Ceratodidae: Dipnoi). Cytogenet. Cell Genet. 73, 187–189 (1996).
10. Pedersen, R. A. DNA content, ribosomal gene multiplicity, and cell size in fish. J. Exp. Zool. 177, 65–78 (1971).
11. Amemiya, C. T. et al. The African coelacanth genome provides insights into tetrapod evolution. Nature 496, 311–316 (2013).
12. Fox, D. T., Soltis, D. E., Soltis, P. S., Ashman, T.-L. & Van de Peer, Y. Polyploidy: a biological force from cells to ecosystems. Trends Cell Biol. 30, 688–694 (2020).
13. Vervoort, A. Tetraploidy in Protopterus (Dipnoi). Experientia 36, 294–296 (1980).
14. Putnam, N. H. et al. The amphioxus genome and the evolution of the chordate karyotype. Nature 453, 1064–1071 (2008).
15. Simakov, O. et al. Deeply conserved synteny resolves early events in vertebrate evolution. Nat. Ecol. Evol. 4, 820–830 (2020).
16. Braasch, I. et al. The spotted gar genome illuminates vertebrate evolution and facilitates human–teleost comparisons. Nat. Genet. 48, 427–437 (2016).
17. Jurka, J., Kapitonov, V. V., Kohany, O. & Jurka, M. V. Repetitive sequences in complex genomes: structure and evolution. Annu. Rev. Genomics Hum. Genet. 8, 241–259 (2007).
18. Zhang, Q. & Edwards, S. V. The evolution of intron size in amniotes: a role for powered flight? Genome Biol. Evol. 4, 1033–1043 (2012).
19. Lynch, M. & Conery, J. S. The origins of genome complexity. Science 302, 1401–1404 (2003).
20. Bradnam, K. R. & Korf, I. Longer first introns are a general property of eukaryotic gene structure. PLoS ONE 3, e3093 (2008).
21. Kugler, M. C., Joyner, A. L., Loomis, C. A. & Munger, J. S. Sonic hedgehog signaling in the lung. From development to disease. Am. J. Respir. Cell Mol. Biol. 52, 1–13 (2015).
22. Rankin, S. A. et al. A molecular atlas of Xenopus respiratory system development. Dev. Dyn. 244, 69–85 (2015).
23. Døving, K. B. & Trotier, D. Structure and function of the vomeronasal organ. J. Exp. Biol. 201, 2913–2925 (1998).
24. Syed, A. S., Sansone, A., Hassenklöver, T., Manzini, I. & Korsching, S. I. Coordinated shift of olfactory amino acid responses and V2R expression to an amphibian water nose during metamorphosis. Cell. Mol. Life Sci. 74, 1711–1719 (2017).
25. Nakamuta, S., Nakamuta, N., Taniguchi, K. & Taniguchi, K. Histological and ultrastructural characteristics of the primordial vomeronasal organ in lungfish. Anat. Rec. (Hoboken) 295, 481–491 (2012).
26. Visel, A., Minovitsky, S., Dubchak, I. & Pennacchio, L. A. VISTA enhancer browser: a database of tissue-specific human enhancers. Nucleic Acids Res. 35, D88–D92 (2007).
27. Pennacchio, L. A. et al. In vivo enhancer analysis of human conserved non-coding sequences. Nature 444, 499–502 (2006).
28. Dickel, D. E., Visel, A. & Pennacchio, L. A. Functional anatomy of distant-acting mammalian enhancers. Philos. Trans. R. Soc. Lond. B 368, 20120359 (2013).
29. Kawakami, Y. et al. Sall genes regulate region-specific morphogenesis in the mouse limb by modulating Hox activities. Development 136, 585–594 (2009).
30. Camp, E., Hope, R., Kortschak, R. D., Cox, T. C. & Lardelli, M. Expression of three spalt (sal) gene homologues in zebrafish embryos. Dev. Genes Evol. 213, 35–43 (2003).
31. Fernandez-Guerrero, M. et al. Mammalian-specific ectodermal enhancers control the expression of Hoxc genes in developing nails and hair follicles. Proc. Natl Acad. Sci. USA 117, 30509–30519 (2020).
32. Spitz, F., Herkenne, C., Morris, M. A. & Duboule, D. Inversion-induced disruption of the Hoxd cluster leads to the partition of regulatory landscapes. Nat. Genet. 37, 889–893 (2005).
33. Andrey, G. et al. A switch between topological domains underlies HoxD genes collinearity in mouse limbs. Science 340, 1234167 (2013).
34. Montavon, T. & Duboule, D. Chromatin organization and global regulation of Hox gene clusters. Philos. Trans. R. Soc. Lond. B 368, 20120367 (2013).
35. Woltering, J. M., Noordermeer, D., Leleu, M. & Duboule, D. Conservation and divergence of regulatory strategies at Hox loci and the origin of tetrapod digits. PLoS Biol. 12, e1001773 (2014).
36. Woltering, J. M. et al. Sarcopterygian fin ontogeny elucidates the origin of hands with digits. Sci. Adv. 6, eabc3510 (2020).
37. Woltering, J. M., Holzem, M. & Meyer, A. Lissamphibian limbs and the origins of tetrapod hox domains. Dev. Biol. 456, 138–144 (2019).
38. Nagano, T. et al. Comparison of Hi-C results using in-solution versus in-nucleus ligation. Genome Biol. 16, 175 (2015).
39. Wutz, G. et al. Topologically associating domains and chromatin loops depend on cohesin and are regulated by CTCF, WAPL, and PDS5 proteins. EMBO J. 36, 3573–3599 (2017).
40. Smith, J. J. et al. A chromosome-scale assembly of the axolotl genome. Genome Res. 29, 317–324 (2019).
41. Nowoshilow, S. & Tanaka, E. M. Introducing www.axolotl-omics.org – an integrated -omics data portal for the axolotl research community. Exp. Cell Res. 394, 112143 (2020).
42. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
43. Song, L. & Florea, L. Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads. Gigascience 4, 48 (2015).
44. MacManes, M. D. The Oyster River Protocol: a multi-assembler and kmer approach for de novo transcriptome assembly. PeerJ 6, e5428 (2018).
45. Chikhi, R. & Medvedev, P. Informed and automated k-mer size selection for genome assembly. Bioinformatics 30, 31–37 (2014).
46. Robertson, G. et al. De novo assembly and analysis of RNA-seq data. Nat. Methods 7, 909–912 (2010).
47. Emms, D. M. & Kelly, S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16, 157 (2015).
48. Li, W., Jaroszewski, L. & Godzik, A. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics 17, 282–283 (2001).
49. Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
50. Nishimura, O., Hara, Y. & Kuraku, S. gVolante for standardizing completeness assessment of genome and transcriptome assemblies. Bioinformatics 33, 3635–3637 (2017).
51. Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).
52. Pertea, M., Kim, D., Pertea, G. M., Leek, J. T. & Salzberg, S. L. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat. Protocols 11, 1650–1667 (2016).
53. Yan, H., Bombarely, A. & Li, S. DeepTE: a computational method for de novo classification of transposons with convolutional neural network. Bioinformatics 36, 4269–4275 (2020).
54. Chalopin, D. & Volff, J.-N. Analysis of the spotted gar genome suggests absence of causative link between ancestral genome duplication and transposable element diversification in teleost fish. J. Exp. Zoolog. B Mol. Dev. Evol. 328, 629–637 (2017).
55. Lerat, E., Fablet, M., Modolo, L., Lopez-Maestre, H. & Vieira, C. TEtools facilitates big data expression analysis of transposable elements and reveals an antagonism between their activity and that of piRNA genes. Nucleic Acids Res. 45, e17 (2017).
56. Wu, T. D. & Watanabe, C. K. GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21, 1859–1875 (2005).
57. Haas, B. J. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666 (2003).
58. Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).
59. UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47, D506–D515 (2019).
60. Slater, G. S. C. & Birney, E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6, 31 (2005).
61. Pearson, W. R. Finding protein and nucleotide similarities with FASTA. Curr. Protoc. Bioinformatics 53, 3.9.1–3.9.25 (2016).
62. Chan, P. P. & Lowe, T. M. tRNAscan-SE: searching for tRNA genes in genomic sequences. Methods Mol. Biol. 1962, 1–14 (2019).
63. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).
64. Kozomara, A., Birgaoanu, M. & Griffiths-Jones, S. miRBase: from microRNA sequences to function. Nucleic Acids Res. 47, D155–D162 (2019).
65. Enright, A. J. et al. MicroRNA targets in Drosophila. Genome Biol. 5, R1 (2003).
66. Harris, R. Improved Pairwise Alignment of Genomic DNA (Pennsylvania State Univ., 2007).
67. Kent, W. J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).
68. Blanchette, M. et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 14, 708–715 (2004).
69. Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005).
70. Cooper, G. M. et al. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 15, 901–913 (2005).
71. Cho, Y. S. et al. The tiger genome and comparative analysis with lion and snow leopard genomes. Nat. Commun. 4, 2433 (2013).
72. Ruan, J. et al. TreeFam: 2008 update. Nucleic Acids Res. 36, D735–D740 (2008).
73. Whelan, S., Irisarri, I. & Burki, F. PREQUAL: detecting non-homologous characters in sets of unaligned homologous sequences. Bioinformatics 34, 3929–3930 (2018).
74. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
75. Criscuolo, A. & Gribaldo, S. BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol. Biol. 10, 210 (2010).
76. Lartillot, N., Rodrigue, N., Stubbs, D. & Richer, J. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment. Syst. Biol. 62, 611–615 (2013).
77. Dutheil, J. Y., Gaillard, S. & Stukenbrock, E. H. MafFilter: a highly flexible and extensible multiple genome alignment files processor. BMC Genomics 15, 53 (2014).
78. Revell, L. J. phytools: An R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 3, 217–223 (2012).
79. Lartillot, N., Lepage, T. & Blanquart, S. PhyloBayes 3: a Bayesian software package for phylogenetic reconstruction and molecular dating. Bioinformatics 25, 2286–2288 (2009).
80. Gregory, T. R. Animal genome size database, http://www.genomesize.com (2020).
81. Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
82. Marjanović, D. The making of calibration sausage exemplified by recalibrating the transcriptomic timetree of jawed vertebrates. Preprint at https://doi.org/10.1101/2019.12.19.882829 (2019).
83. De Bie, T., Cristianini, N., Demuth, J. P. & Hahn, M. W. CAFE: a computational tool for the study of gene family evolution. Bioinformatics 22, 1269–1271 (2006).
84. Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
85. Niimura, Y. Olfactory receptor multigene family in vertebrates: from the viewpoint of evolutionary genomics. Curr. Genomics 13, 103–114 (2012).
86. O’Brien, K. P., Remm, M. & Sonnhammer, E. L. L. Inparanoid: a comprehensive database of eukaryotic orthologs. Nucleic Acids Res. 33, D476–D480 (2005).
87. Suyama, M., Torrents, D. & Bork, P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 34, W609–W612 (2006).
88. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
89. Castresana, J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552 (2000).
90. Huerta-Cepas, J., Serra, F. & Bork, P. ETE 3: reconstruction, analysis, and visualization of phylogenomic data. Mol. Biol. Evol. 33, 1635–1638 (2016).
91. Woltering, J. M. et al. Axial patterning in snakes and caecilians: evidence for an alternative interpretation of the Hox code. Dev. Biol. 332, 82–89 (2009).
92. Monti, R. et al. Limb-Enhancer Genie: an accessible resource of accurate enhancer predictions in the developing limb. PLoS Comput. Biol. 13, e1005720 (2017).
93. Osterwalder, M. et al. HAND2 targets define a network of transcriptional regulators that compartmentalize the early limb bud mesenchyme. Dev. Cell 31, 345–357 (2014).
94. Osterwalder, M. et al. Enhancer redundancy provides phenotypic robustness in mammalian development. Nature 554, 239–243 (2018).
95. Bickelmann, C. et al. Noncanonical Hox, Etv4, and Gli3 gene activities give insight into unique limb patterning in salamanders. J. Exp. Zoolog. B Mol. Dev. Evol. 330, 138–147 (2018).
96. Du, K. et al. The sterlet sturgeon genome sequence and the mechanisms of segmental rediploidization. Nat. Ecol. Evol. 4, 841–852 (2020).

Acknowledgements

We thank the late J. Clack and R. L. Carroll for their contribution to our understanding of the water–land transition of vertebrates. This work was supported by the German Science Foundation (DFG) through a grant to A.M., T.B. and M.S. (Me1725/24-1, Bu956/23-1, Scha408/16-1) and to J.M.W. (Wo2165/2-1), and core funding from the IMP to E.M.T. J.-N.V. and M.S. were supported by a joint grant of the French Research Agency (ANR Evobooster) and DFG (SCHA408/13-1). I.I. was supported by the Spanish Ministry of Economy and Competitiveness (MINECO) (Juan de la Cierva-Incorporación fellowship IJCI-2016-29566) and the European Research Council (grant agreement no. 852725; ERC-StG ‘TerreStriAL’ to J. de Vries (University of Göttingen)). W.Y.W. and O.S. were supported by the Austrian Science Fund grants P3219 and I4353. W.Y.W. is supported by Croucher Scholarships for Doctoral Study. A.K. was supported by a fellowship from the Japanese Society for the Promotion of Science (JSPS) postdoctoral fellowship for Overseas Researchers Program. We thank D. Ocampo Daza (http://www.egosumdaniel.se/) for generously sharing his vertebrate illustrations, J. Joss and P. Sordino for the gift of lungfish embryos, and L. Pennacchio for Vista enhancer images.

Author information

Author notes

Iker Irisarri. Present address: Department of Applied Bioinformatics, Institute for Microbiology and Genetics, University of Goettingen, Goettingen, Germany

These authors contributed equally: Axel Meyer, Siegfried Schloissnig, Paolo Franchini, Kang Du, Joost M. Woltering

These authors jointly supervised this work: Axel Meyer, Oleg Simakov, Thorsten Burmester, Elly M. Tanaka, Manfred Schartl

Affiliations

Department of Biology, University of Konstanz, Konstanz, Germany
Axel Meyer, Paolo Franchini, Joost M. Woltering & Peiwen Xiong

Research Institute of Molecular Pathology (IMP), Vienna, Austria
Siegfried Schloissnig, Sergej Nowoshilow, Akane Kawaguchi & Elly M. Tanaka

Developmental Biochemistry, Biocenter, University of Würzburg, Würzburg, Germany
Kang Du & Manfred Schartl

The Xiphophorus Genetic Stock Center, Texas State University, San Marcos, TX, USA
Kang Du & Manfred Schartl

Department of Biodiversity and Evolutionary Biology, Museo Nacional de Ciencias Naturales (MNCN-CSIC), Madrid, Spain
Iker Irisarri

Department of Neuroscience and Developmental Biology, University of Vienna, Vienna, Austria
Wai Yee Wong & Oleg Simakov

Biochemistry and Cell Biology, Biocenter, University of Würzburg, Würzburg, Germany
Susanne Kneitz

Institut für Zoologie, Universität Hamburg, Hamburg, Germany
Andrej Fabrizius & Thorsten Burmester

Institut de Génomique Fonctionnelle, École Normale Superieure, Université Claude Bernard, Lyon, France
Corentin Dechaud & Jean-Nicolas Volff

Faculty of Science, Universiteit Leiden, Leiden, The Netherlands
Herman P. Spaink
Contributions
A.M., T.B. and M.S. conceived the study and coordinated the work. A.M. and M.S. wrote the manuscript with contributions from all other authors. S.S. performed genome assembly into contigs and Hi-C scaffolding. P.F. undertook transcriptome analysis, annotation and CNE analyses. K.D. performed genome annotation, analysis of gene family dynamics and genome expansion. J.M.W. analysed and annotated hox clusters, and performed embryonal RNA-seq and in situ hybridization. I.I. generated phylogenetic analyses, and molecular clock and ancestral character state reconstruction. W.Y.W. performed repeat and syntenic analysis. S.N. undertook genome correction and initial transcript alignment. S.K. performed positive selection analysis. A.K. undertook Hi-C library preparation and library preparation for genome correction. A.F. performed transcriptome generation. P.X. annotated ncRNAs. C.D. and J.-N.V. performed transposon and repeat analysis. H.P.S. contributed resources. O.S. performed syntenic analyses. E.M.T. supervised Hi-C and genomic sequencing, and analysed data.
Corresponding authors
Correspondence to Axel Meyer, Oleg Simakov, Thorsten Burmester, Elly M. Tanaka or Manfred Schartl.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information: Nature thanks Marina Haukness, Ryan Lorig-Roach, Benedict Paten, Igor Schneider and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1: Schematic overview of the scaffolding procedure. a, Scaffolding consists conceptually of two nested loops. The inner loop, depicted on the right, takes a list of contigs and their contact information, and iteratively performs a global agglomerative clustering until convergence or until no more contigs can be joined. This loop is nested in the main procedure, which takes as input a list of seed contigs, assigns contigs to these initial clusters, scaffolds these and allows for visual inspection and merging or splitting of the clusters. b, N(x) plot of the assembled contigs. The y axis shows the contig length for which the collection of all contigs of that length or longer covers at least x percent (x axis) of the assembly. c, N(x) plot of the scaffolded genome. The y axis shows the scaffold length for which the collection of all scaffolds of that length or longer covers at least x percent (x axis) of the assembly. d, Hi-C contact heat map of the scaffolded portion of the lungfish genome assembly, ordered by scaffold length. Blue boxes indicate the scaffold boundaries. The four largest scaffolds represent both chromosome arms on a single scaffold. Remaining scaffolds are split into chromosome arms or represent microchromosomes. e, Schema illustrating the contig misjoin detection process. Hi-C contacts are binned along the diagonal. Points that are not crossed by a sufficient number of contacts are deemed potential misjoins and are thus separated (dotted line).
Extended Data Fig. 2: k-mer frequency analysis and transcript coverage by genomic sequences. a, The Illumina dataset was used to generate the spectra of k-mer abundances using seven k-mer sizes. b–e, Transcript coverage by genomic sequences. b, Histogram of the proportion of all transcript lengths covered by the alignment to contigs. c, Histogram of the proportion of all transcript lengths covered by the alignment to scaffolds. d, e, Histogram of the proportion of the transcript lengths covered by the alignment to contigs (d) or to scaffolds (e) of those transcripts with alignments that were improved after scaffolding.
Extended Data Fig. 3: CNE-based phylogeny, divergence times and rates of genome evolution. a, Maximum likelihood phylogeny from noncoding conserved alignment blocks totalling 99,601 informative sites (using RAxML; GTRGAMMA model). All branches were supported by 100% bootstrap value; scale bar is in expected nucleotide replacements per site. Branch lengths of the trees obtained by the CNE method or from the protein sequences show a high correlation (R² = 0.84, P …
… 3 s.d. from the mean. Overall, these regions represent 0.09% of the genome. b, Representative region showing read pile-up with coverage in excess of 3 s.d. from the mean. The entire region is contained within a region annotated as repetitive by RepeatMasker (red interval).
Supplementary information
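The N(x) statistic defined in the legend of Extended Data Fig. 1b, c can be illustrated with a minimal Python sketch. The function name n_x and the toy lengths are assumptions made for illustration; they are not taken from the authors' pipeline, and the sketch only restates the definition given in the legend.

```python
def n_x(lengths, x):
    """Return N(x): the length L such that all sequences of length >= L
    together cover at least x percent of the total assembly length.

    `lengths` holds contig or scaffold lengths (hypothetical input),
    `x` is a percentage in the interval (0, 100].
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")

    total = sum(lengths)
    covered = 0
    # Walk from the longest sequence downwards, accumulating coverage.
    for length in sorted(lengths, reverse=True):
        covered += length
        # Cross-multiplied comparison: covered / total >= x / 100.
        if covered * 100 >= total * x:
            return length
    # Not reachable for valid input: the last iteration always covers 100%.
    raise RuntimeError("coverage threshold was never reached")


# Toy example with made-up lengths (total = 300).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50 = n_x(toy_lengths, 50)
n90 = n_x(toy_lengths, 90)
```

Evaluating n_x over x = 1..100 gives the curve that such an N(x) plot displays; N(50) is the familiar N50 value.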
Supplementary Information: This file contains Supplementary Methods, Supplementary Results, Supplementary Table legends, and Supplementary References.
Reporting Summary
Peer Review File
Supplementary Table 1: Basic statistics for the lungfish genome long-read sequencing and final assembly.
Supplementary Table 2: Assessment of the completeness of the genome assembly after annotation. The orthology search pipeline BUSCO was used with the Core Vertebrate Genes (CVG) and Vertebrata conserved genes (vertebrata_odb9) gene sets.
Supplementary Table 3: Comparison of numbers and structural features of different non-coding RNA classes and regions in lungfish and other vertebrates. The spreadsheet has the following sections: (1) Number of different types of ncRNAs in ten focal genomes. The lungfish genome contains 17,095 ncRNA genes, including 1,042 tRNA genes, 1,771 rRNA genes, and 3,974 microRNA genes. Length of 5′ UTR, 3′ UTR, CDS, and introns of canonical mRNAs. (2) Predicted miRNA target sites in eight genomes. Compared to other species, lungfish does not show a significant difference in miRNA target density, suggesting the potentially neutral evolution of the 3′ UTR in lungfish. (3) Length of 5′ UTR, 3′ UTR, CDS, and intron in eight focal genomes. Lungfish has longer non-coding regions in the genes than other species.
Supplementary Table 4: Lungfish repetitive element statistics after the first round of masking. The table reports the repetitive element, number of elements, length (bp) occupied in the whole genome, percentage of sequence (%), average_copy_length (bp).
Supplementary Table 5: TE statistics after double masking, merged with results from the first round of masking. The table reports the repetitive element, number of elements, length (bp) occupied in the whole genome, percentage of sequence (%), average_copy_length (bp).
Supplementary Table 6: Classification of consensus sequences from RepeatModeler by DeepTE, PASTEC and blast. The table shows the further classification result of each repetitive element consensus sequence from other annotators. "NA" refers to no matching result from the tool. The column "merge_strategy" suggests the best way to merge annotations from different tools.
Supplementary Table 7: Repertoire of the small non-coding RNA processing machinery genes in vertebrates. Presence or absence of genes was taken from ref. 31, and data for Australian lungfish and axolotl were added. Presence (green) or absence (red) is indicated.
Supplementary Table 8: Comparison of intron sizes (in bp) between developmental and non-developmental genes in the lungfish genome.
Supplementary Table 9: List of genes under positive selection in model 1 and model 2 and functional clustering. The spreadsheet has the following sections: (1) Positively selected genes in the lungfish genome (model 1) and in the common lineage of lungfish and tetrapods (model 2). (2) Functional clustering by the DAVID and Ingenuity software for genes identified for model 1. (3) Functional clustering by the DAVID and Ingenuity software for genes identified for model 2.
Supplementary Table 10: Numbers of gene families that are significantly expanded or contracted in lungfish and other vertebrates. Results are from analyses using the CAFE program, version 4. Numbers are given for each branch of the phylogeny depicted in Figure 1.
Supplementary Table 11: Gene family dynamics in lungfish and other vertebrates. The spreadsheet has the following sections: (1) Gene families that were significantly (p<0.01) expanded or contracted in the lungfish branch. (2) Gene families that were significantly (p<0.01) expanded or contracted on ancestral and terminal branches in 10 vertebrate species.
Supplementary Table 12: Number of functional pulmonary surfactant genes compared among species. The gene number is the sum of intact and truncated predictions.
Supplementary Table 13: Repertoire of olfactory and taste receptors. The numbers of functional olfactory receptor and taste receptor genes are given for lungfish and nine representative aquatic, amphibian or terrestrial species. Numbers are the sum of intact and truncated predictions. Odorant receptors are assigned to groups relating to the origin of the respective odors according to ref. 32.
Supplementary Table 14: Number and length of the regions in the genome that are not annotated as repetitive by RepeatMasker but have a coverage in excess of 3 standard deviations.
Supplementary Table 15: Rank order list of estimated chromosome DNA content and of DNA content in scaffolds. Left column: List of estimated chromosome DNA content. Chromosomal DNA content was calculated by measuring chromosome area from Rock et al. (1996) and determining the fraction of the total with a genome size of 43 Gb. Right column: List of DNA content in scaffolds. This list is ordered by size and does not imply any relationship to the chromosomes listed on the left.
Supplementary Table 16: List of oligonucleotides. This table lists the oligonucleotide sequences used for in situ hybridization probe synthesis for axolotl hoxd9, hoxc13, hoxd9, lungfish sal1, hoxc13, sox9, and cichlid hoxc13.
Supplementary Table 17: Counts of repetitive elements in genic regions. The table reports the genomic features (i.e. intron and exon), subfeatures (i.e. UTR, intron/exon number), repetitive element classes, number of elements (bp), length of features (bp), percentage of feature occupied by repetitive element (%) and numerical order of subfeatures (used to generate the plot).
Supplementary Table 18: Distance between CNE pairs in human, chicken, axolotl and lungfish. The table reports the 223 pairs of non-exonic conserved elements (CNE) that were identified in lungfish and three tetrapods (human, chicken and axolotl), and used to calculate the intergenic distance and the region-specific expansion of the lungfish genome. The selected informative CNE pairs (1) were present in the four species genomes, (2) were located in intergenic space of the same contig/chromosome in each species and (3) did not have a gene in between them. Mean and median expansion in comparison to axolotl and lungfish (the lineages that have undergone drastic genome expansion) are shown.
Supplementary Table 19: CNE showing accelerated evolution in lungfish. The program phyloP was used to test the non-coding conserved elements (CNE) for lineage-specific accelerated evolution in the lungfish lineage, using as complementary tree the other nine lineages in our multispecies alignment. The p-value for each CNE was computed using a likelihood ratio test using the "ACC" mode implemented in phyloP and corrected with the Benjamini–Hochberg false discovery rate (FDR) multiple test correction procedure. CNE ID, size and location in the human genome chromosome are shown. The last column indicates whether the focal CNE is located in intergenic or in genic space (UTR or intron).
Rights and permissions
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article: Meyer, A., Schloissnig, S., Franchini, P. et al. Giant lungfish genome elucidates the conquest of land by vertebrates. Nature 590, 284–289 (2021). https://doi.org/10.1038/s41586-021-03198-8
Received: 13 July 2020. Accepted: 06 January 2021. Published: 18 January 2021. Issue Date: 11 February 2021.
Further reading
Comparative analysis reveals within-population genome size variation in a rotifer is driven by large genomic elements with highly abundant satellite DNA repeat elements. C. P. Stelzer, J. Blommaert, D. B. Mark Welch. BMC Biology (2021)
Investigation of the activity of transposable elements and genes involved in their silencing in the newt Cynops orientalis, a species with a giant genome. Federica Carducci, Elisa Carotti, Maria Assunta Biscotti. Scientific Reports (2021)
Giant genomes of lungfish. Grant Otto. Nature Reviews Genetics (2021)
Earliest migratory cephalic NC cells are potent to differentiate into dental ectomesenchyme of the two lungfish dentitions: tetrapodomorph ancestral condition of unconstrained capability of mesencephalic NC cells to form oral teeth. Martin Kundrát. The Science of Nature (2021)
Fish genomics and its impact on fundamental and applied research of vertebrate biology. Syed Farhan Ahmad, Maryam Jehangir, Cesar Martins. Reviews in Fish Biology and Fisheries (2021)