What is SRE (site reliability engineering)? - Red Hat

Site reliability engineering (SRE) is a software engineering approach to IT operations. SRE teams use software as a tool to manage systems, solve problems, and automate operations tasks. SRE takes the tasks that have historically been done by operations teams, often manually, and instead gives them to engineers or ops teams who use software and automation to solve problems and manage production systems.

SRE is a valuable practice when creating scalable and highly reliable software systems. It helps you manage large systems through code, which is more scalable and sustainable for sysadmins managing thousands or hundreds of thousands of machines.

The concept of site reliability engineering comes from the Google engineering team and is credited to Ben Treynor Sloss.

SRE helps teams find a balance between releasing new features and making sure that they are reliable for users. Standardization and automation are two important components of the SRE model. Site reliability engineers should always be looking for ways to enhance and automate operations tasks. In this way, SRE helps to improve the reliability of a system today, while also improving it as it grows over time.

SRE supports teams who are moving from a traditional approach to IT operations to a cloud-native approach.

A site reliability engineer is a unique role that requires either a background as a software developer with additional operations experience, or a background as a sysadmin or in an IT operations role with additional software development skills.

SRE teams are responsible for how code is deployed, configured, and monitored, as well as the availability, latency, change management, emergency response, and capacity management of services in production. Site reliability engineering helps teams to determine what new features can be launched and when by using service-level agreements (SLAs) to define the required reliability of the system through service-level indicators (SLIs) and service-level objectives (SLOs).

An SLI is a defined measure of specific aspects of provided service levels. Key SLIs include request latency, availability, error rate, and system throughput. An SLO is the target value or range for a specified service level, as measured by an SLI. An SLO for the required system reliability is then determined based on the downtime agreed upon as acceptable. This downtime level is referred to as an error budget, the maximum allowable threshold for errors and outages.

With SRE, 100% reliability is not expected; failure is planned for and accepted.

The development team is able to "spend" the error budget when releasing a new feature. Using the SLO and error budget, the development team can determine whether or not a product or service is able to launch. If a service is running within the error budget, then the development team can launch whenever they want. If the system currently has too many errors or goes down for longer than the error budget allows, then no new launches can take place until the errors are within budget.

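The relationship between an SLI, an SLO, and the error budget can be made concrete with a little arithmetic. The following is a minimal sketch, not a definitive implementation: it assumes a request-based availability SLI (successful requests divided by total requests) and an error budget counted in failed requests over one measurement window. The function names and the 99.9% target are hypothetical and chosen only for illustration.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: the fraction of requests in the window that succeeded."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return good_requests / total_requests


def allowed_failures(slo_target: float, total_requests: int) -> int:
    """Error budget: how many requests may fail before the SLO is missed."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    return round((1.0 - slo_target) * total_requests)


def can_launch(slo_target: float, total_requests: int, failed_requests: int) -> bool:
    """Launch gate: releases are allowed only while errors stay within budget."""
    return failed_requests <= allowed_failures(slo_target, total_requests)


if __name__ == "__main__":
    target = 0.999          # hypothetical SLO: 99.9% of requests succeed
    total = 1_000_000       # requests served in the window
    failed = 500            # requests that failed in the window

    print("SLI:", availability_sli(total - failed, total))
    print("Error budget (failed requests):", allowed_failures(target, total))
    print("Launch allowed:", can_launch(target, total, failed))
```

Under these assumptions, a service that has used only part of its budget can keep shipping, while one that has exceeded it is blocked until its errors are back within budget. A time-based budget (minutes of downtime per window) would follow the same pattern with durations in place of request counts.
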
The development team conducts automated operations tests to demonstrate reliability.

Site reliability engineers split their time between operations tasks and project work. According to SRE best practices from Google, a site reliability engineer should spend a maximum of 50% of their time on operations, and this share should be monitored to ensure they don't go over. The rest of the time should be spent on development tasks like creating new features, scaling the system, and implementing automation. Excess operational work and poorly performing services can be redirected back to the dev team to run, instead of the site reliability engineer spending too much time on the operations of an application or service.

Automation is an important part of the site reliability engineer's role. If they are dealing with a problem repeatedly, they will automate a solution. This also helps ensure that operations work stays at or below half of their workload.

Maintaining the balance between operations and development work is a key component of SRE.

DevOps is an approach to culture, automation, and platform design intended to deliver increased business value and responsiveness through rapid, high-quality service delivery. SRE can be considered an implementation of DevOps. Like DevOps, SRE is about team culture and relationships. Both SRE and DevOps work to bridge the gap between development and operations teams to deliver services faster.

Faster application development life cycles, improved service quality and reliability, and reduced IT time per application developed are benefits that can be achieved by both DevOps and SRE practices. SRE is different because it relies on site reliability engineers within the development team who also have an operations background to remove communication and workflow problems. The site reliability engineer role itself combines the skill sets of dev teams and operations teams by requiring an overlap in responsibilities.

SRE can help DevOps teams whose developers are overwhelmed by operations tasks and need someone with more specialized ops skills.

In terms of code and new features, DevOps focuses on moving through the development pipeline efficiently, while SRE is focused on balancing site reliability with creating new features.

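Returning to the 50% cap on operations work described earlier, monitoring it comes down to comparing logged operations hours against total hours. This is a rough sketch under stated assumptions: the hour counts are taken from some time-tracking source that is not specified in the article, and the names `ops_fraction` and `over_ops_cap` are hypothetical.

```python
OPS_CAP = 0.5  # the 50% ceiling on operations work described above


def ops_fraction(ops_hours: float, project_hours: float) -> float:
    """Share of an engineer's logged time that went to operations tasks."""
    total = ops_hours + project_hours
    if total <= 0:
        raise ValueError("no hours logged")
    return ops_hours / total


def over_ops_cap(ops_hours: float, project_hours: float, cap: float = OPS_CAP) -> bool:
    """True when operations work exceeds the cap and should be redirected or automated."""
    return ops_fraction(ops_hours, project_hours) > cap


if __name__ == "__main__":
    # Hypothetical week: 25 hours of operations work, 15 hours of project work.
    print("Ops share:", ops_fraction(25, 15))
    print("Over cap:", over_ops_cap(25, 15))
```

When the check returns true, the article's remedy applies: hand excess operational work back to the dev team, or automate the recurring problem.
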
Modern application platforms based on container technology, Kubernetes, and microservices are critical to DevOps practices, helping deliver secure and innovative software services.

SRE relies on automating routine operational tasks and standardization across an app's lifecycle. Linux® containers give your team the underlying technology needed for a cloud-native development style. Containers support a unified environment for development, delivery, integration, and automation.

And Kubernetes is the modern way to automate Linux container operations. Kubernetes helps you easily and efficiently manage clusters running Linux containers across public, private, or hybrid clouds.

With the right platform, you can best take advantage of the culture and process changes you've implemented. Red Hat® OpenShift® is the enterprise-ready Kubernetes platform to support SRE initiatives.