Table of Contents - Google - Site Reliability Engineering


Foreword
Preface

Part I - Introduction
Chapter 1 - Introduction
Chapter 2 - The Production Environment at Google, from the Viewpoint of an SRE

Part II - Principles
Chapter 3 - Embracing Risk
Chapter 4 - Service Level Objectives
Chapter 5 - Eliminating Toil
Chapter 6 - Monitoring Distributed Systems
Chapter 7 - The Evolution of Automation at Google
Chapter 8 - Release Engineering
Chapter 9 - Simplicity

Part III - Practices
Chapter 10 - Practical Alerting
Chapter 11 - Being On-Call
Chapter 12 - Effective Troubleshooting
Chapter 13 - Emergency Response
Chapter 14 - Managing Incidents
Chapter 15 - Postmortem Culture: Learning from Failure
Chapter 16 - Tracking Outages
Chapter 17 - Testing for Reliability
Chapter 18 - Software Engineering in SRE
Chapter 19 - Load Balancing at the Frontend
Chapter 20 - Load Balancing in the Datacenter
Chapter 21 - Handling Overload
Chapter 22 - Addressing Cascading Failures
Chapter 23 - Managing Critical State: Distributed Consensus for Reliability
Chapter 24 - Distributed Periodic Scheduling with Cron
Chapter 25 - Data Processing Pipelines
Chapter 26 - Data Integrity: What You Read Is What You Wrote
Chapter 27 - Reliable Product Launches at Scale

Part IV - Management
Chapter 28 - Accelerating SREs to On-Call and Beyond
Chapter 29 - Dealing with Interrupts
Chapter 30 - Embedding an SRE to Recover from Operational Overload
Chapter 31 - Communication and Collaboration in SRE
Chapter 32 - The Evolving SRE Engagement Model

Part V - Conclusions
Chapter 33 - Lessons Learned from Other Industries
Chapter 34 - Conclusion

Appendix A - Availability Table
Appendix B - A Collection of Best Practices for Production Services
Appendix C - Example Incident State Document
Appendix D - Example Postmortem
Appendix E - Launch Coordination Checklist
Appendix F - Example Production Meeting Minutes
Bibliography


