Brewer's CAP Theorem
Original article: Brewer’s (CAP) Theorem
So what exactly is Brewer’s Theorem, and why does it warrant comparison with a 1976 punk gig in Manchester?
Brewer’s 2000 talk was based on his theoretical work at UC Berkeley and observations from running Inktomi, though Brewer and others had been talking about the trade-off decisions that need to be made in highly scalable systems for years before that (e.g. “Cluster-Based Scalable Network Services” from SOSP in 1997 and “Harvest, Yield, and Scalable Tolerant Systems” in 1999), so the contents of the presentation weren’t new and, like many of these ideas, they were the work of many smart people (as I am sure Brewer himself would be quick to point out).
What he said was that there are three core systemic requirements that exist in a special relationship when it comes to designing and deploying applications in a distributed environment (he was talking specifically about the web, but so many corporate businesses are multi-site/multi-country these days that the effects could equally apply to your data-centre/LAN/WAN arrangement).
The three requirements are: Consistency, Availability and Partition Tolerance, giving Brewer’s Theorem its other name - CAP.
To give these some real-world meaning let’s use a simple example: you want to buy a copy of Tolstoy’s War and Peace to read on a particularly long vacation you’re starting tomorrow. Your favourite web bookstore has one copy left in stock. You do your search, check that it can be delivered before you leave and add it to your basket. You remember that you need a few other things so you browse the site for a bit (have you ever bought just one thing online? Gotta maximise the parcel dollar). While you’re reading the customer reviews of a suntan lotion product, someone, somewhere else in the country, arrives at the site, adds a copy to their basket and goes right to the checkout process (they need an urgent fix for a wobbly table with one leg much shorter than the others).
Consistency
A service that is consistent operates fully or not at all. Gilbert and Lynch use the word “atomic” instead of consistent in their proof, which makes more sense technically because, strictly speaking, consistent is the C in ACID as applied to the ideal properties of database transactions and means that data will never be persisted that breaks certain pre-set constraints. But if you consider it a pre-set constraint of distributed systems that multiple values for the same piece of data are not allowed, then I think the leak in the abstraction is plugged (plus, if Brewer had used the word atomic, it would be called the AAP theorem and we’d all be in hospital every time we tried to pronounce it).
In the book buying example you can add the book to your basket, or fail. Purchase it, or not. You can’t half-add or half-purchase a book. There’s one copy in stock and only one person will get it the next day. If both customers can continue through the order process to the end (i.e. make payment) the lack of consistency between what’s in stock and what’s in the system will cause an issue. Maybe not a huge issue in this case - someone’s either going to be bored on vacation or spilling soup - but scale this up to thousands of inconsistencies and give them a monetary value (e.g. trades on a financial exchange where there’s an inconsistency between what you think you’ve bought or sold and what the exchange record states) and it’s a huge issue.
We might solve consistency by utilising a database. At the correct moment in the book order process the number of War and Peace books-in-stock is decremented by one. When the other customer reaches this point, the cupboard is bare and the order process will alert them to this without continuing to payment. The first operates fully, the second not at all.
Databases are great at this because they focus on ACID properties and give us Consistency by also giving us Isolation, so that when Customer One is reducing books-in-stock by one, and simultaneously increasing books-in-basket by one, any intermediate states are isolated from Customer Two, who has to wait a few milliseconds while the data store is made consistent.
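To make that isolation concrete, here is a minimal sketch in Python with SQLite. The books table, its columns and the purchase helper are invented for illustration, but any ACID-compliant store would give the same either-fully-or-not-at-all behaviour:

```python
import sqlite3

conn = sqlite3.connect("bookstore.db")
conn.execute("CREATE TABLE IF NOT EXISTS books (title TEXT PRIMARY KEY, in_stock INTEGER)")
conn.execute("INSERT OR IGNORE INTO books VALUES ('War and Peace', 1)")
conn.commit()

def purchase(conn, title):
    """Operate fully (decrement stock and commit) or not at all (roll back)."""
    cur = conn.cursor()
    # The WHERE clause guarantees two concurrent customers cannot both
    # take the last copy: the second UPDATE simply matches zero rows.
    cur.execute(
        "UPDATE books SET in_stock = in_stock - 1 "
        "WHERE title = ? AND in_stock > 0",
        (title,),
    )
    if cur.rowcount == 0:
        conn.rollback()   # the cupboard is bare; alert the customer
        return False
    conn.commit()
    return True

print(purchase(conn, "War and Peace"))  # True  - the first customer gets the book
print(purchase(conn, "War and Peace"))  # False - the second is stopped before payment
```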
Availability
Availability means just that - the service is available (to operate fully or not as above). When you buy the book you want to get a response, not some browser message about the web site being uncommunicative. Gilbert & Lynch in their proof of CAP Theorem make the good point that availability most often deserts you when you need it most - sites tend to go down at busy periods precisely because they are busy. A service that’s available but not being accessed is of no benefit to anyone.
Translator’s note: the point above is that adding the book to your basket is a transaction that must either happen entirely or not at all, which means locking the stock and making other customers wait a few milliseconds for a response. During those few milliseconds the service is unavailable, so guaranteeing consistency costs you some availability.
Partition Tolerance
If your application and database runs on one box then (ignoring scale issues and assuming all your code is perfect) your server acts as a kind of atomic processor in that it either works or doesn’t (i.e. if it has crashed it’s not available, but it won’t cause data inconsistency either).
Once you start to spread data and logic around different nodes then there’s a risk of partitions forming. A partition happens when, say, a network cable gets chopped, and Node A can no longer communicate with Node B. With the kind of distribution capabilities the web provides, temporary partitions are a relatively common occurrence and, as I said earlier, they’re also not that rare inside global corporations with multiple data centres.
Gilbert & Lynch defined partition tolerance as:
“No set of failures less than total network failure is allowed to cause the system to respond incorrectly”
and noted Brewer’s comment that a one-node partition is equivalent to a server crash, because if nothing can connect to it, it may as well not be there.
The Significance of the Theorem
CAP Theorem comes to life as an application scales. At low transactional volumes, the small latencies needed to let databases get consistent have no noticeable effect on either overall performance or the user experience. Any load distribution you do undertake, therefore, is likely to be for systems management reasons.
But as activity increases, these pinch-points in throughput will begin to limit growth and create errors. It’s one thing having to wait for a web page to come back with a response and another experience altogether to enter your credit card details to be met with “HTTP 500 java.lang.schrodinger.purchasingerror” and wonder whether you’ve just paid for something you won’t get, not paid at all, or maybe the error is immaterial to this transaction. Who knows? You are unlikely to continue, more likely to shop elsewhere, and very likely to phone your bank.
Either way this is not good for business. Amazon claimed that just an extra one tenth of a second on their response times would cost them 1% in sales. Google said they noticed that just a half a second increase in latency caused traffic to drop by a fifth.
I’ve written a little about scalability before, so I won’t repeat all that here except to make two points: the first is that whilst addressing the problems of scale might be an architectural concern, the initial discussions are not. They are business decisions. I get very tired of hearing, from techies, that such-and-such an approach is not warranted because current activity volumes don’t justify it. It’s not that they’re wrong; more often than not they’re quite correct, it’s that to limit scale from the outset is to implicitly make revenue decisions - a factor that should be made explicit during business analysis.
The second point is that once you embark on discussions around how to best scale your application the world falls broadly into two ideological camps: the database crowd and the non-database crowd.
The database crowd, unsurprisingly, like database technology and will tend to address scale by talking of things like optimistic locking and sharding, keeping the database at the heart of things.
The non-database crowd will tend to address scale by managing data outside of the database environment (avoiding the relational world) for as long as possible.
I think it’s fair to say that the former group haven’t taken to CAP Theorem with quite the same gusto as the latter (though they are talking about it). This is because if you have to drop one of consistency, availability, or partition tolerance, many opt to drop consistency, which is the raison d’être of the database. The logic, no doubt, is that availability and partition-tolerance keep your money-making application alive, whereas inconsistency just feels like one of those things you can work around with clever design.
Like so much else in IT, it’s not as black and white as this. Eric Brewer, on slide 13 of his PODC talk, when comparing ACID and its informal counterpart BASE, even says “I think it’s a spectrum”. And if you’re interested in this as a topic (it’s slightly outside of what I want to talk about here) you could do worse than start with a paper called “Design and Evaluation of a Continuous Consistency Model for Replicated Services” by Haifeng Yu and Amin Vahdat. Nobody should interpret CAP as implying the database is dead.
Translator’s note: the author’s view is that the non-database crowd has embraced the CAP theorem more keenly, reasoning that partition tolerance and availability bear directly on the user experience, and even on the company’s survival, while consistency can be handled by other means. It is much like the way many companies take the order first: the job can always be done and the contract terms met, and if small problems crop up along the way, there will be plenty of time to fix them later.
Where both sides agree though is that the answer to scale is distributed parallelisation not, as was once thought, supercomputer grunt. Eric Brewer’s influence on the Network of Workstations projects of the mid-nineties led to the architectures that exposed CAP theorem, because as he says in another presentation, on Inktomi and the Internet Bubble, the answer has always been processors working in parallel:
“If they’re not working in parallel you have no chance to get the problem done in a reasonable amount of time. This is a lot like anything else. If you have a really big job to do you get lots of people to do it. So if you are building a bridge you have lots of construction workers. That’s parallel processing also. So a lot of this will end up being ‘how do we mix parallel processing and the internet?’”
The Proof in Pictures
Here’s a simplified proof, in pictures because I find it much easier to understand that way. I’ve mostly used the same terms as Gilbert and Lynch so that this ties up with their paper.
The diagram above shows two nodes in a network, N1 and N2. They both share a piece of data V (how many physical copies of War and Peace are in stock), which has a value V0. Running on N1 is an algorithm called A which we can consider to be safe, bug free, predictable and reliable. Running on N2 is a similar algorithm called B. In this experiment, A writes new values of V and B reads values of V.
In a sunny-day scenario this is what happens: (1) First A writes a new value of V, which we’ll call V1. (2) Then a message (M) is passed from N1 to N2 which updates the copy of V there. (3) Now any read by B of V will return V1.
If the network partitions (that is messages from N1 to N2 are not delivered) then N2 contains an inconsistent value of V when step (3) occurs.
Hopefully that seems fairly obvious. Scale this up to even a few hundred transactions and it becomes a major issue. If M is an asynchronous message then N1 has no way of knowing whether N2 gets the message. Even with guaranteed delivery of M, N1 has no way of knowing if a message is delayed by a partition event or something failing in N2. Making M synchronous doesn’t help because that treats the write by A on N1 and the update event from N1 to N2 as an atomic operation, which gives us the same latency issues we have already talked about (or worse). Gilbert and Lynch also prove, using a slight variation on this, that even in a partially-synchronous model (with ordered clocks on each node) atomicity cannot be guaranteed.
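The picture is small enough to run as a toy. In this Python sketch the names N1, N2, V, A, B and M follow the proof; the rest (the partitioned flag, the dictionaries standing in for nodes) is scaffolding invented for illustration:

```python
n1 = {"V": "V0"}          # node N1's copy of V
n2 = {"V": "V0"}          # node N2's copy of V
partitioned = False       # set True to sever the N1 -> N2 link

def send_m(value):
    """The replication message M; silently lost during a partition."""
    if not partitioned:
        n2["V"] = value

def a_writes(value):      # algorithm A, running on N1
    n1["V"] = value       # step (1): write the new value
    send_m(value)         # step (2): propagate it to N2

def b_reads():            # algorithm B, running on N2
    return n2["V"]        # step (3)

a_writes("V1")
assert b_reads() == "V1"  # sunny day: B sees V1

n1["V"] = n2["V"] = "V0"  # reset the experiment
partitioned = True
a_writes("V1")            # M never arrives
assert b_reads() == "V0"  # B reads stale data: the nodes are now inconsistent
```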
So what CAP tells us is that if we want A and B to be highly available (i.e. working with minimal latency) and we want our nodes N1 to Nn (where n could be hundreds or even thousands) to remain tolerant of network partitions (lost messages, undeliverable messages, hardware outages, process failures) then sometimes we are going to get cases where some nodes think that V is V0 (one copy of War and Peace in stock) and other nodes will think that V is V1 (no copies of War and Peace in stock).
We’d really like everything to be structured, consistent and harmonious, like the music of a band from the early seventies, but what we are faced with is a little bit of punk-style anarchy. And actually, whilst it might scare our grandmothers, it’s OK once you know this, because both can work together quite happily.
Let’s quickly analyse this from a transactional perspective.
If we have a transaction (i.e. unit of work based around the persistent data item V) called α, then α1 could be the write operation from before and α2 could be the read. On a local system this would easily be handled by a database with some simple locking, isolating any attempt to read in α2 until α1 completes safely. In the distributed model though, with nodes N1 and N2 to worry about, the intermediate synchronising message has also to complete. Unless we can control when α2 happens, we can never guarantee it will see the same data values α1 writes. All methods to add control (blocking, isolation, centralised management, etc) will impact either partition tolerance or the availability of α1 (A) and/or α2 (B).
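On a single node, that “simple locking” is easy; here is a sketch using Python’s threading module (the α names are kept from the text, everything else is hypothetical):

```python
import threading

v = "V0"
v_lock = threading.Lock()

def alpha1_write(new_value):
    global v
    with v_lock:          # isolate the write...
        v = new_value

def alpha2_read():
    with v_lock:          # ...so the read blocks until alpha1 completes
        return v
```

Across N1 and N2, the equivalent of v_lock is a synchronising message, and a partition can delay or lose exactly that message, which is why every distributed form of this control ends up costing availability or partition tolerance.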
Dealing with CAP
You’ve got a few choices when addressing the issues thrown up by CAP. The obvious ones are:
Drop Partition Tolerance
If you want to run without partitions you have to stop them happening. One way to do this is to put everything (related to that transaction) on one machine, or in one atomically-failing unit like a rack. It’s not 100% guaranteed because you can still have partial failures, but you’re less likely to get partition-like side-effects. There are, of course, significant scaling limits to this.
Translator’s note: putting everything on one server removes the possibility of a network split, but even if you handle power failures with a UPS you cannot avoid server shutdowns (planned ones included); in a high-availability cluster you need another way to roll out updates node by node. Alternatively you can keep users out while you update, the way many online games have scheduled maintenance windows, which amounts to giving up availability instead.
Drop Availability
This is the flip side of the drop-partition-tolerance coin. On encountering a partition event, affected services simply wait until data is consistent and therefore remain unavailable during that time. Controlling this could get fairly complex over many nodes, with re-available nodes needing logic to handle coming back online gracefully.
Translator’s note: one way to sacrifice availability is for services to communicate synchronously rather than via asynchronous MQ messages; another is to keep communication asynchronous but logically freeze and later unfreeze the affected account, so that all the other accounts remain highly available.
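Continuing the toy node model from the proof, a hypothetical sketch of dropping availability: a write either reaches every replica or is rejected outright, so the data stays consistent but the service goes dark during a partition.

```python
class Unavailable(Exception):
    """Raised instead of serving a possibly-inconsistent write."""

def write_everywhere(replicas, reachable, key, value):
    # If any replica is cut off we cannot stay consistent, so we choose
    # to stop serving rather than let the copies diverge.
    if not all(reachable(r) for r in replicas):
        raise Unavailable("replica unreachable; retry after the partition heals")
    for r in replicas:
        r[key] = value

n1, n2 = {"V": "V0"}, {"V": "V0"}
write_everywhere([n1, n2], lambda r: True, "V", "V1")       # both nodes now hold V1
# write_everywhere([n1, n2], lambda r: r is n1, "V", "V2")  # raises Unavailable
```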
Drop Consistency
Or, as Werner Vogels puts it, accept that things will become “eventually consistent” (updated Dec 2008). Vogels’ article is well worth a read. He goes into a lot more detail on operational specifics than I do here.
Lots of inconsistencies don’t actually require as much work as you’d think (meaning continuous consistency is probably not something we need anyway). In my book order example if two orders are received for the one book that’s in stock, the second just becomes a back-order. As long as the customer is told of this (and remember this is a rare case) everybody’s probably happy.
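The back-order compromise fits in a few lines; a hypothetical sketch:

```python
def place_order(stock, orders, customer):
    """Always accept the order; reconcile with the real stock afterwards."""
    if stock.get("in_stock", 0) > 0:
        stock["in_stock"] -= 1
        orders.append((customer, "ships tomorrow"))
    else:
        # The inconsistency surfaces here, after acceptance; rather than
        # failing the purchase we tell the customer it is back-ordered.
        orders.append((customer, "back-ordered"))

stock, orders = {"in_stock": 1}, []
place_order(stock, orders, "customer-1")   # gets the last copy
place_order(stock, orders, "customer-2")   # the rare case: told it's a back-order
```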
The BASE Jump
The notion of accepting eventual consistency is supported via an architectural approach known as BASE (Basically Available, Soft-state, Eventually consistent). BASE, as its name indicates, is the logical opposite of ACID, though it would be quite wrong to imply that any architecture should (or could) be based wholly on one or the other. This is an important point to remember, given our industry’s habit of “oooh shiny” strategy adoption.
And here I defer to Professor Brewer himself who emailed me some comments on this article, saying:
“the term “BASE” was first presented in the 1997 SOSP article that you cite. I came up with acronym with my students in their office earlier that year. I agree it is contrived a bit, but so is “ACID” – much more than people realize, so we figured it was good enough. Jim Gray and I discussed these acronyms and he readily admitted that ACID was a stretch too – the A and D have high overlap and the C is ill-defined at best. But the pair connotes the idea of a spectrum, which is one of the points of the PODC lecture as you correctly point out.”
Dan Pritchett of EBay has a nice presentation on BASE.
Design around it
Guy Pardon, CTO of atomikos, wrote an interesting post which he called “A CAP Solution (Proving Brewer Wrong)”, suggesting an architectural approach that would deliver Consistency, Availability and Partition-tolerance, though with some caveats (notably that you don’t get all three guaranteed in the same instant).
It’s worth a read as Guy eloquently represents an opposing view in this area.
Summary
That you can only guarantee two of Consistency, Availability and Partition Tolerance is real and evidenced by the most successful websites on the planet. If it works for them I see no reason why the same trade-offs shouldn’t be considered in everyday design in corporate environments. If the business explicitly doesn’t want to scale then fine, simpler solutions are available, but it’s a conversation worth having. In any case these discussions will be about appropriate designs for specific operations, not the whole shebang. As Brewer said in his email “the only other thing I would add is that different parts of the same service can choose different points in the spectrum”. Sometimes you absolutely need consistency whatever the scaling cost, because the risk of not having it is too great.
These days I’d go so far as to say that Amazon and EBay don’t have a scalability problem. I think they had one and now they have the tools to address it. That’s why they can freely talk about it. Any scaling they do now (given the size they already are) is really more of the same. Once you’ve scaled, your problems shift to those of operational maintenance, monitoring, rolling out software updates etc. - tough to solve, certainly, but nice to have when you’ve got those revenue streams coming in.