Raft Structure
A Raft instance has to deal with the arrival of external events (Start() calls, AppendEntries and RequestVote RPCs, and RPC replies), and it has to execute periodic tasks (elections and heart-beats). There are many ways to structure your Raft code to manage these activities; this document outlines a few ideas.
Each Raft instance has a bunch of state (the log, the current index, etc.) which must be updated in response to events arising in concurrent goroutines. The Go documentation points out that goroutines can perform the updates directly using shared data structures and locks, or by passing messages on channels. Experience suggests that for Raft it is most straightforward to use shared data and locks.
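The shared-data-and-locks approach might look roughly like the sketch below. It is a minimal illustration, not the lab's actual code: the field names, the simplified Start signature, and the observeTerm helper are all assumptions made for this example. The point is only that every goroutine touching Raft state takes the same mutex first.

```go
package main

import "sync"

// Raft holds the per-peer state. The fields are illustrative assumptions;
// a real implementation carries much more state.
type Raft struct {
	mu          sync.Mutex // guards every field below
	currentTerm int
	log         []string
}

// Start appends a command under the lock and returns its 1-based index
// and the current term. (Simplified signature, assumed for this sketch.)
func (rf *Raft) Start(cmd string) (index int, term int) {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	rf.log = append(rf.log, cmd)
	return len(rf.log), rf.currentTerm
}

// observeTerm is a hypothetical helper that an RPC handler goroutine
// might call when it sees a term in a request or reply.
func (rf *Raft) observeTerm(term int) {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	if term > rf.currentTerm {
		rf.currentTerm = term
	}
}
```

Because both methods hold rf.mu for their whole body, calls arriving from different goroutines are serialized and never see half-updated state.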
A Raft instance has two time-driven activities: the leader must send heart-beats, and others must start an election if too much time has passed since hearing from the leader. It’s probably best to drive each of these activities with a dedicated long-running goroutine, rather than combining multiple activities into a single goroutine.
The management of the election timeout is a common source of headaches. Perhaps the simplest plan is to maintain a variable in the Raft struct containing the last time at which the peer heard from the leader, and to have the election timeout goroutine periodically check to see whether the time since then is greater than the timeout period. It’s easiest to use time.Sleep() with a small constant argument to drive the periodic checks. Don’t use time.Ticker or time.Timer; they are tricky to use correctly.
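A sketch of that plan, under stated assumptions: the names lastHeard, resetElectionTimer, checkElectionTimeout and electionLoop are invented for this example, the stop channel and startElection callback are only there to keep the sketch self-contained, and the timeout is a fixed constant (a real peer would normally randomize it).

```go
package main

import (
	"sync"
	"time"
)

const (
	checkInterval   = 10 * time.Millisecond  // the small constant sleep
	electionTimeout = 150 * time.Millisecond // fixed here; assumed value
)

type Raft struct {
	mu        sync.Mutex
	lastHeard time.Time // last time this peer heard from the leader
}

// resetElectionTimer is what an AppendEntries handler would call
// after accepting a message from the current leader.
func (rf *Raft) resetElectionTimer() {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	rf.lastHeard = time.Now()
}

// checkElectionTimeout reports whether the timeout has elapsed as of now.
// When it has, the clock is restarted so the next attempt waits a full period.
func (rf *Raft) checkElectionTimeout(now time.Time) bool {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	if now.Sub(rf.lastHeard) <= electionTimeout {
		return false
	}
	rf.lastHeard = now
	return true
}

// electionLoop is the dedicated long-running goroutine. It polls with
// time.Sleep and calls startElection whenever the timeout has expired.
func (rf *Raft) electionLoop(stop <-chan struct{}, startElection func()) {
	for {
		select {
		case <-stop:
			return
		default:
		}
		time.Sleep(checkInterval)
		if rf.checkElectionTimeout(time.Now()) {
			startElection()
		}
	}
}
```

The loop never holds the lock while sleeping or while running startElection, so RPC handlers can keep resetting the timer in the meantime.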
You’ll want to have a separate long-running goroutine that sends committed log entries in order on the applyCh. It must be separate, since sending on the applyCh can block; and it must be a single goroutine, since otherwise it may be hard to ensure that you send log entries in log order. The code that advances commitIndex will need to kick the apply goroutine; it’s probably easiest to use a condition variable (Go’s sync.Cond) for this.
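One way the apply goroutine and its condition variable could fit together is sketched below. The ApplyMsg fields, the newRaft constructor, and the advanceCommit helper are simplified assumptions for illustration; lastApplied follows the naming of Figure 2 in the Raft paper. Shutdown handling is left out to keep the sketch short.

```go
package main

import "sync"

// ApplyMsg is a simplified stand-in for the message sent on applyCh.
type ApplyMsg struct {
	Index   int
	Command string
}

type Raft struct {
	mu          sync.Mutex
	applyCond   *sync.Cond // signaled whenever commitIndex advances
	log         []string   // log[i] holds the entry with index i+1
	commitIndex int
	lastApplied int
	applyCh     chan ApplyMsg
}

func newRaft(applyCh chan ApplyMsg) *Raft {
	rf := &Raft{applyCh: applyCh}
	rf.applyCond = sync.NewCond(&rf.mu)
	return rf
}

// advanceCommit is a hypothetical helper: whatever code moves commitIndex
// forward also kicks the apply goroutine.
func (rf *Raft) advanceCommit(newCommit int) {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	if newCommit > rf.commitIndex && newCommit <= len(rf.log) {
		rf.commitIndex = newCommit
		rf.applyCond.Broadcast()
	}
}

// applier is the single long-running goroutine that sends on applyCh,
// one entry at a time and strictly in log order.
func (rf *Raft) applier() {
	for {
		rf.mu.Lock()
		for rf.lastApplied >= rf.commitIndex {
			rf.applyCond.Wait() // releases rf.mu while waiting
		}
		rf.lastApplied++
		msg := ApplyMsg{Index: rf.lastApplied, Command: rf.log[rf.lastApplied-1]}
		rf.mu.Unlock()
		rf.applyCh <- msg // may block, so the lock is not held here
	}
}
```

The send happens outside the critical section, so a slow reader on applyCh stalls only this goroutine and not the RPC handlers or timers.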
Each RPC should probably be sent (and its reply processed) in its own goroutine, for two reasons: so that unreachable peers don’t delay the collection of a majority of replies, and so that the heartbeat and election timers can continue to tick at all times. It’s easiest to do the RPC reply processing in the same goroutine, rather than sending reply information over a channel.
Keep in mind that the network can delay RPCs and RPC replies, and when you send concurrent RPCs, the network can re-order requests and replies. Figure 2 is pretty good about pointing out places where RPC handlers have to be careful about this (e.g. an RPC handler should ignore RPCs with old terms). Figure 2 is not always explicit about RPC reply processing. The leader has to be careful when processing replies; it must check that the term hasn’t changed since sending the RPC, and must account for the possibility that replies from concurrent RPCs to the same follower have changed the leader’s state (e.g. nextIndex).
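The reply-processing checks could be sketched as follows. Everything here is an assumption made for illustration: the struct fields are trimmed down (NumEntries stands in for the entries actually carried), matchIndex follows Figure 2 of the Raft paper, the simple one-step back-off is just one possible policy, and the call parameter is a hypothetical stand-in for whatever RPC library is in use.

```go
package main

import "sync"

type AppendEntriesArgs struct {
	Term         int
	PrevLogIndex int
	NumEntries   int // hypothetical stand-in for the entries carried
}

type AppendEntriesReply struct {
	Term    int
	Success bool
}

type Raft struct {
	mu          sync.Mutex
	currentTerm int
	isLeader    bool
	nextIndex   []int
	matchIndex  []int
}

// handleAppendEntriesReply processes one reply under the lock.
func (rf *Raft) handleAppendEntriesReply(peer int, args AppendEntriesArgs, reply AppendEntriesReply) {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	if reply.Term > rf.currentTerm {
		rf.currentTerm = reply.Term // a newer term exists: step down
		rf.isLeader = false
		return
	}
	if !rf.isLeader || rf.currentTerm != args.Term {
		return // the term changed since the RPC was sent: ignore the reply
	}
	if reply.Success {
		// Derive the result from args, not from the current nextIndex,
		// which a concurrent reply may already have changed.
		match := args.PrevLogIndex + args.NumEntries
		if match > rf.matchIndex[peer] {
			rf.matchIndex[peer] = match
			rf.nextIndex[peer] = match + 1
		}
	} else if rf.nextIndex[peer] == args.PrevLogIndex+1 && rf.nextIndex[peer] > 1 {
		rf.nextIndex[peer]-- // back off only if no other reply moved it
	}
}

// sendToPeer issues the RPC in its own goroutine and processes the reply
// in that same goroutine, so an unreachable peer delays nobody else.
func (rf *Raft) sendToPeer(peer int, args AppendEntriesArgs,
	call func(peer int, args AppendEntriesArgs) (AppendEntriesReply, bool)) {
	go func() {
		if reply, ok := call(peer, args); ok {
			rf.handleAppendEntriesReply(peer, args, reply)
		}
	}()
}
```

Comparing against args rather than against live leader state is what makes delayed or re-ordered replies harmless in this sketch: an old success can never move matchIndex backwards, and an old failure cannot decrement a nextIndex that has already moved.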