Table of Contents - Google - Site Reliability Engineering
Foreword
Preface
Part I - Introduction
Chapter 1 - Introduction
Chapter 2 - The Production Environment at Google, from the Viewpoint of an SRE
Part II - Principles
Chapter 3 - Embracing Risk
Chapter 4 - Service Level Objectives
Chapter 5 - Eliminating Toil
Chapter 6 - Monitoring Distributed Systems
Chapter 7 - The Evolution of Automation at Google
Chapter 8 - Release Engineering
Chapter 9 - Simplicity
Part III - Practices
Chapter 10 - Practical Alerting
Chapter 11 - Being On-Call
Chapter 12 - Effective Troubleshooting
Chapter 13 - Emergency Response
Chapter 14 - Managing Incidents
Chapter 15 - Postmortem Culture: Learning from Failure
Chapter 16 - Tracking Outages
Chapter 17 - Testing for Reliability
Chapter 18 - Software Engineering in SRE
Chapter 19 - Load Balancing at the Frontend
Chapter 20 - Load Balancing in the Datacenter
Chapter 21 - Handling Overload
Chapter 22 - Addressing Cascading Failures
Chapter 23 - Managing Critical State: Distributed Consensus for Reliability
Chapter 24 - Distributed Periodic Scheduling with Cron
Chapter 25 - Data Processing Pipelines
Chapter 26 - Data Integrity: What You Read Is What You Wrote
Chapter 27 - Reliable Product Launches at Scale
Part IV - Management
Chapter 28 - Accelerating SREs to On-Call and Beyond
Chapter 29 - Dealing with Interrupts
Chapter 30 - Embedding an SRE to Recover from Operational Overload
Chapter 31 - Communication and Collaboration in SRE
Chapter 32 - The Evolving SRE Engagement Model
Part V - Conclusions
Chapter 33 - Lessons Learned from Other Industries
Chapter 34 - Conclusion
Appendix A - Availability Table
Appendix B - A Collection of Best Practices for Production Services
Appendix C - Example Incident State Document
Appendix D - Example Postmortem
Appendix E - Launch Coordination Checklist
Appendix F - Example Production Meeting Minutes
Bibliography
Related articles
- 1. Site Reliability Engineering: How Google Runs Production ...
Chris Jones is a Site Reliability Engineer for Google App Engine, a cloud platform-as-a-service p...
- 2. SRE Book (@srebook) / Twitter
SRE Book. @srebook. Documenting the care and feeding of production software systems ... Google's ...
- 3. Table of Contents - Google - Site Reliability Engineering
... Part I - Introduction · Chapter 1 - Introduction · Chapter 2 - The Production Environment at ...
- 4. Site Reliability Engineering: How Google Runs Production ...
This book is a series of essays written by members and alumni of Google's Site Reliability Engine...
- 5. SRE books - Google - Site Reliability Engineering
Discover Site Reliability Engineering, learn about building and maintaining reliable engineering ...