C-Learning: Horizon-Aware Cumulative Accessibility Estimation

Article recommendation score: 80%
Number of votes: 10

Panteha Naderian, Gabriel Loaiza-Ganem, Harry J. Braviner, Anthony L. Caterini, Jesse C. Cresswell, Tong Li, Animesh Garg
Sep 28, 2020 (edited Jan 26, 2021)
ICLR 2021 Poster
Readers: Everyone
Keywords: reinforcement learning, goal reaching, Q-learning

Abstract: Multi-goal reaching is an important problem in reinforcement learning needed to achieve algorithmic generalization. Despite recent advances in this field, current algorithms suffer from three major challenges: high sample complexity, learning only a single way of reaching the goals, and difficulties in solving complex motion planning tasks. In order to address these limitations, we introduce the concept of cumulative accessibility functions, which measure the reachability of a goal from a given state within a specified horizon. We show that these functions obey a recurrence relation, which enables learning from offline interactions. We also prove that optimal cumulative accessibility functions are monotonic in the planning horizon. Additionally, our method can trade off speed and reliability in goal-reaching by suggesting multiple paths to a single goal depending on the provided horizon. We evaluate our approach on a set of multi-goal discrete and continuous control tasks. We show that our method outperforms state-of-the-art goal-reaching algorithms in success rate, sample complexity, and path optimality. Our code is available at https://github.com/layer6ai-labs/CAE, and additional visualizations can be found at https://sites.google.com/view/learning-cae/.

One-sentence Summary: We introduce C-learning, a Q-learning inspired method to learn horizon-dependent policies for goal reaching.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
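
To make the idea of a horizon-dependent accessibility function more concrete, here is a minimal tabular sketch in Python. It is not the authors' implementation (see the linked repository for that). The abstract does not spell out the recurrence, so the form used here is an assumption for illustration: the value is 1 at the goal, 0 elsewhere when the horizon is zero, and otherwise the expected best value at the next state with one fewer step. The 1-D chain environment, the function name `cumulative_accessibility`, and the `slip` parameter are likewise hypothetical.

```python
import numpy as np


def cumulative_accessibility(n_states, goal, max_horizon, slip=0.0):
    """Tabular C[h, s, a] on a hypothetical 1-D chain (illustrative only).

    C[h, s, a] is read as the probability of reaching `goal` from state `s`
    within `h` steps when taking action `a` first and acting greedily after.
    Actions: 0 = move left, 1 = move right. With probability `slip` the
    action has no effect and the agent stays where it is.
    """
    n_actions = 2
    C = np.zeros((max_horizon + 1, n_states, n_actions))
    # Assumed boundary condition: the goal counts as reached for any horizon.
    C[:, goal, :] = 1.0
    for h in range(1, max_horizon + 1):
        for s in range(n_states):
            if s == goal:
                continue
            for a in range(n_actions):
                step = 1 if a == 1 else -1
                s_next = min(max(s + step, 0), n_states - 1)
                # Assumed recurrence: expectation over next states of the
                # best accessibility with one fewer step remaining.
                C[h, s, a] = ((1.0 - slip) * C[h - 1, s_next].max()
                              + slip * C[h - 1, s].max())
    return C


# Example: 5-state chain with the goal at the right end.
C_det = cumulative_accessibility(n_states=5, goal=4, max_horizon=6)
C_slip = cumulative_accessibility(n_states=5, goal=4, max_horizon=6, slip=0.5)
```

In this toy setting, `C_det[h, 0].max()` switches from 0 to 1 once `h` reaches the four steps needed to cross the chain, and `np.diff(C_slip, axis=0)` is non-negative everywhere, which matches the monotonicity in the horizon that the abstract describes. With `slip > 0` the values grow gradually with `h`, a small-scale picture of the speed versus reliability trade-off.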


