1 概述与背景
2 ANSI事务隔离级别
- P1 脏读 (“Dirty read”): SQL-transaction T1 modifies a row. SQL- transaction T2 then reads that row before T1 performs a COMMIT. If T1 then performs a ROLLBACK, T2 will have read a row that was never committed and that may thus be considered to have never existed.
- P2 不可重复读 (“Non-repeatable read”): SQL-transaction T1 reads a row. SQL- transaction T2 then modifies or deletes that row and performs a COMMIT. If T1 then attempts to reread the row, it may receive the modified value or discover that the row has been deleted.
- P3 幻读 (“Phantom”): SQL-transaction T1 reads the set of rows N that satisfy some <search condition>. SQL-transaction T2 then executes SQL-statements that generate one or more rows that satisfy the <search condition> used by SQL-transaction T1. If SQL-transaction T1 then repeats the initial read with the same <search condition>, it obtains a different collection of rows.
级别 | P1(脏读) | P2(不可重复读) | P3(幻读) |
Read Uncommitted | 允许 | 允许 | 允许 |
Read Committed | 禁止 | 允许 | 允许 |
Repeatable Read | 禁止 | 禁止 | 允许 |
(Anomaly) Serializable | 禁止 | 禁止 | 禁止 |
Note: The exclusion of these penomena or SQL-transactions executing at isolation level SERIALIZABLE is a consequence of the requirement that such transactions be serializable. |
如标准文档所述,禁止了P1/P2/P3异象的事务即满足Serializable级别,但矛盾的是,标准文档中对Serializable又做了如下说明:
The execution of concurrent SQL-transactions at isolation level SERIALIZABLE is guaranteed to be serializable. A serializable execution is defined to be an execution of the operations of concurrently executing SQL-transactions that produces the same effect as some serial execution of those same SQL-transactions |
它要求多个并发事务执行的效果与某种串行化执行的效果一致,但是仅仅禁止P1/P2/P3异象,并不一定能够保证“serial execution”的效果,因此论文中将ANSI Serializable称为Anomaly Serializable。
P1/P2/P3的形式化描述
根据标准文档的定义,可以将这三种异象使用形式化语言描述如下,称为A1/A2/A3(其中w1[x]表示事务1写入记录x,r1表示事务1读取记录x,c1表示事务1提交,a1表示事务1回滚,r1[P]表示事务1按照谓词P的条件读取若干条记录,w1[y in P]表示事务1写入记录y满足谓词P的条件):
- A1 脏读:w1[x] … r2[x] … (a1 and c2 in any order)
- A2 不可重复读:r1[x] … w2[x] … c2 … r1[x] … c1
- A3 幻读:r1[P] … w2[y in P] … c2 … r1[P] … c1
上述A1/A2/A3形式化描述,根据标准定义的P1/P2/P3异象的自然语言描述转化而来,但是ANSI标准定义的异象只针对了单个记录或谓词描述,对于多条记录需满足业务一致性的场景并未能覆盖(比如两个账户间转账要求余额总和不变),举例如下:
- H1:r1[x=50]w1[x=10] r2[x=10]r2[y=50] c2 r1[y=50]w1[y=90] c1
- 事务1执行账户x向账户y转账40,事务2读取到了进行到了一半的事务1(Read Uncommitted),破坏了余额总和的一致性
- 因为事务1并未回滚,H1的行为并不符合A1的形式化定义
- H2:r1[x=50] r2[x=50]w2[x=10]r2[y=50]w2[y=90] c2 r1[y=90] c1
- 事务2执行账户x向账户y转账40,事务1在事务2提交前后读取到了破坏余额总和一致性的数据(Unrepeatable Read)
- 因为事务1并未重复读取记录x,H2的行为并不符合A2的形式化定义
- H3:r1[P] w2[insert y to P] r2[z] w2[z] c2 r1[z] c1
- 事务2增加新雇员并更新雇员总数z,事务1在事务2提交前后读取到了破坏雇员列表与雇员总数的一致性的数据(Phantom)
- 因为事务1并未重复读取谓词P指定的数据集合,H3的行为并不符合A3的形式化定义
因为要增强对上述H1/H2/H3异象的约束,论文将A1/A2/A3的形式化描述称为“狭义的描述(strict interpretations)”,然后增加了“广义的描述(broad interpretation)”,去除了strict interpretations中对事务提交、回滚和数据读取范围的约束,只保留事务之间读写的时序关系,即事务之间只要包含如下时序的操作,即可能产生包含H1/H2/H3在内的异象,如下:
- P1 脏读:w1[x] … r2[x] … ((c1 or a1) and (c2 or a2) in any order)
- P2 不可重复读:r1[x] … w2[x] … ((c1 or a1) and (c2 or a2) in any order)
- P3 幻读:r1[P] … w2[y in P] … ((c1 or a1) and (c2 or a2) in any order)
在上述形式化描述下,禁止P1即可禁止H1,禁止P1/P2即可禁止H2,禁止P1/P2/P3即可禁止H3。至此,ANSI标准隔离级别定义的三种异象,可以被扩展为适用范围更广的的P1/P2/P3的形式化定义,这种隔离级别定义被论文称之为“phenomena-based”,即基于“异象”的隔离级别定义。
3 基于锁的事务隔离
- Predicate lock 谓词锁(gap锁):Locks on all data items satisfying the search condition
- Well-formed Writes 合法write:Requests a Write(Exclusive) lock on each data item or predicate before writing
- Well-formed Reads 合法read:Requests a Read(share) lock on each data item or predicate before reading
- Long duration locks 长周期锁:Locks are held until after the transaction commits or aborts
- Shord duration locks 短周期锁:Locks are released immediately after the action completes
Read Lock | Write Lock | |
Locking Read Uncommited |
none required
|
Well-formed Writes,
Long duration Write locks
|
Locking Read Commited |
Well-formed Reads, Short duration read lock |
Well-formed Writes,
Long duration Write locks
|
Locking Repeatable Read |
Well-formed Reads, Long duration data-item Read locks, Short duration Read Predicate locks |
Well-formed Writes
Long duration Write locks
|
Locking Serializable |
Well-formed Reads, Long duration Read locks |
Well-formed Writes
Long duration Write locks
|
- Well-formed Reads, Short duration read lock 禁止了 P1发生,r2[x]将被读锁阻塞,直到事务1提交或回滚
- Well-formed Reads, Long duration data-item Read locks, Short duration Read Predicate locks 禁止了P2发生,w2[x]将被写锁阻塞,直到事务1提交或回滚
- 其中Short duration Read Predicate locks的作用论文中并没有说明,实际上它保护了r[P]的一致性,保证r[P]读取到的多行数据是一个“well-formed history”
- Well-formed Reads, Long duration Read locks 禁止了P3发生,w2[y in P]将被谓词写锁阻塞,直到事务1提交或回滚
4 基于快照的事务隔离
- P1描述的w1[x] … r2[x] …操作时序不可能出现,因为在快照隔离下,实际的操作时序为w1[x] … r2[last version of x] …,因此可知快照隔离禁止P1
- P2描述的r1[x] … w2[x] … 它实际的操作时序为r1[x] … w2[new version of x] …,可以知道快照隔离也禁止了P2。至此,我们可以确定快照隔离的效果至少大于Read Committed
- P3描述的r1[P] … w2[y in P] … 它实际的操作时序为r1[P] … w2[new version of y in P] …,可以知道快照隔离也禁止了P3,达到了第2小节中ANSI的Anomaly Serializable级别
- A5A Read Skew: r1[x]…w2[x]…w2[y]…c2…r1[y]…(c1 or a1)
- A5B Writer Skew: r1[x]…r2[y]…w1[y]…w2[x]…(c1 and c2 occur)
- 扩展的Write Skew(并非来自原文):r1[P]…r2[P]…w1[x]…w2[y]…(c1 and c2 occur)
5 总结
Read Uncommitted < Read Committed < (Repeatable Read >< Snapshot) < Serializable |