This is a follow-up discussion to my adviser's original idea blog, proactive serialization. The general idea is that serialization does not need to be a Paxos-like total order: with a lock service, it can be proactive and anticipatory. The lock service can anticipate a future request and hand out some locks ahead of time, so a worker can proceed without an explicit request. Latency is reduced and throughput can be improved.
However,
with a slightly different design, anticipation is no longer obvious.
Problem statement: Given a star graph $G = (V, E)$ with center node $k$, where $|V| = 1 + |E|$ and $E = \{(k, j) : j \in V \setminus \{k\}\}$. There are $n$ data items shared between all nodes, and each data item is associated with one token. A node $j$ must hold all tokens of the data it accesses in one transaction in order to commit that transaction. What is the most efficient location for each token? In other words, given the setup and the application's access pattern, the goal is to find the placement of tokens with minimum cost, where the cost is the communication between sites.
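To make the setup concrete, here is a minimal sketch of the topology and the objective. The cost function, the uniform per-link cost, and all names are my own assumptions for illustration, not part of the original problem statement.

```python
# Illustrative model of the setup: a star graph centered at k, shared data
# items with one token each, and a communication-cost objective.

def star_edges(center, nodes):
    """Edges of the star graph: the center k is connected to every other node."""
    return {(center, j) for j in nodes if j != center}

def placement_cost(placement, accesses, cost):
    """Total communication cost of a token placement.

    placement: dict item -> site holding the token
    accesses:  list of (site, item) pairs (transactions simplified to one item each)
    cost:      dict (a, b) -> one-way communication cost between sites
    """
    total = 0
    for site, item in accesses:
        holder = placement[item]
        if holder != site:                     # remote token: pay a round trip
            total += cost[(site, holder)] + cost[(holder, site)]
    return total

nodes = ["k", "i", "j"]
cost = {(a, b): 100 for a in nodes for b in nodes if a != b}   # assumed uniform cost
accesses = [("i", "D"), ("j", "D"), ("i", "D")]
print(star_edges("k", nodes))
print(placement_cost({"D": "k"}, accesses, cost))   # every access pays a round trip to k
```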
If node $k$ is a simple token manager / lock broker, then the problem is easy to solve: $k$ keeps the access history of all other nodes and assigns each token to a node upon request or by anticipation.
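A rough sketch of that broker role follows. The data structures and the "most frequent recent accessor" anticipation policy are my own guesses, not the design from the original blog.

```python
from collections import defaultdict, deque

class TokenBroker:
    """Node k as a plain token manager: it only tracks history and moves tokens."""

    def __init__(self, history_len=8):
        self.holder = {}   # item -> node currently holding the token
        # item -> recent accessors, bounded window
        self.history = defaultdict(lambda: deque(maxlen=history_len))

    def request(self, node, item):
        """Explicit token request from a node."""
        self.history[item].append(node)
        self.holder[item] = node
        return node

    def anticipate(self, item):
        """Proactively ship the token to the node most likely to need it next.
        Here: the most frequent recent accessor (an assumed, very naive policy)."""
        hist = self.history[item]
        if not hist:
            return None
        guess = max(set(hist), key=list(hist).count)
        self.holder[item] = guess
        return guess

broker = TokenBroker()
broker.request("i", "D"); broker.request("i", "D"); broker.request("j", "D")
print(broker.anticipate("D"))   # -> "i", the most frequent recent accessor
```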
However, suppose node $k$ is the elected leader responsible for exchanging tokens (locks) between the other nodes in the cluster, while at the same time $k$ is like any other node: it also accepts requests from clients and proposes/commits transactions. Then the problem becomes: does $k$ move the token, or keep it and commit at $k$? Should node $j$ block a transaction while waiting for the token, or forward the transaction to $k$ instead?
When two nodes $i$ and $j$ are competing for the same data item, it is better to keep the token at $k$ instead of moving it back and forth all the time. A simple way to decide when to move a token is to remember the last $N$ consecutive accesses for each item. Say that at a certain point in time $i$ holds the token; now, if $j$ forwards its transaction to $k$, should $k$ wait until $i$'s lease times out, which blocks $j$'s request, or revoke the token back to $k$ immediately?
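One possible encoding of the "remember the last $N$ consecutive accesses" idea is sketched below; the window size, the all-or-nothing rule, and the choice to keep contended tokens at $k$ are assumptions on my part.

```python
from collections import deque

def place_token(recent_accessors, center="k", n=4):
    """Decide where a token should live based on the last n accessors of an item.

    If the recent accesses all come from one node, move the token there;
    if two or more nodes are interleaving, keep the token at the center k.
    (Threshold and policy are illustrative, not from the original post.)
    """
    window = list(recent_accessors)[-n:]
    if not window:
        return center
    if len(set(window)) == 1:        # one node dominates the recent history
        return window[0]
    return center                    # contended: keep it at k

# Alternating accesses keep the token at k; a burst from i moves it to i.
print(place_token(deque(["i", "j", "i", "j"])))   # -> "k"
print(place_token(deque(["i", "i", "i", "i"])))   # -> "i"
```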
Let's try to set up some rules.
Rule #1: node $k$ should be one of the nodes geographically located in the middle of all nodes, so that it is relatively fair for every site to communicate with $k$: choose $k$ such that $\sum_{j} \mathrm{cost}(j, k)$ is minimized.
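This amounts to picking the medoid of the sites under the communication-cost metric. A minimal sketch, with a made-up latency table:

```python
def pick_center(nodes, cost):
    """Pick the node k that minimizes sum over j of cost(j, k)."""
    return min(nodes, key=lambda k: sum(cost[(j, k)] for j in nodes if j != k))

# Hypothetical one-way latencies between three sites (illustrative numbers).
cost = {("a", "b"): 30, ("b", "a"): 30,
        ("a", "c"): 80, ("c", "a"): 80,
        ("b", "c"): 60, ("c", "b"): 60}
print(pick_center(["a", "b", "c"], cost))   # -> "b", the most central site
```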
Rule #2: a token should remain at one site while there are still outstanding transactions on it. This rule should guarantee the safety property, hopefully by the following two means:
(1) tokens are sent within a replication message;
(2) every token includes the last transaction id, and node $k$ only releases a token after it has seen that transaction (seen by the state machine??).
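A minimal sketch of the second point, assuming that "seen" means the local state machine has applied the token's last transaction id; that interpretation, and all names below, are mine.

```python
from dataclasses import dataclass

@dataclass
class Token:
    item: str
    last_txn_id: int      # id of the last transaction committed under this token

class Node:
    """Sketch of Rule #2: a token travels inside replication messages and is
    only handed out once the releasing side has applied last_txn_id."""

    def __init__(self, name):
        self.name = name
        self.applied_txn = 0   # highest transaction id applied by the state machine

    def can_release(self, token: Token) -> bool:
        # Only release the token after the state machine has seen its last transaction.
        return self.applied_txn >= token.last_txn_id

k = Node("k")
k.applied_txn = 41
print(k.can_release(Token("D", last_txn_id=42)))   # False: txn 42 not applied yet
k.applied_txn = 42
print(k.can_release(Token("D", last_txn_id=42)))   # True
```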
Rule #3: a create operation creates a new data item and thus a new token, which is initially unknown to $k$. This special operation immediately leads to three ways of creation: (1) create the data globally, which requires that the creating site hold the parent data item in a tree data structure, so that no other site can create a conflicting key; (2) forward all create requests to site $k$ to commit; or (3) any site may create a local data item that can only be accessed by itself, in which case no token is needed.
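The three options can be viewed as a routing decision for create requests. A hypothetical sketch; the names and the convention of returning "which site commits the create" are my own.

```python
from enum import Enum

class CreateMode(Enum):
    GLOBAL_UNDER_PARENT = 1   # creating site must hold the parent item's token
    FORWARD_TO_K = 2          # all creates are committed at the leader k
    LOCAL_ONLY = 3            # item is private to the creating site, no token needed

def handle_create(mode, site, parent_token_holder=None, leader="k"):
    """Return the site at which the create request should commit."""
    if mode is CreateMode.GLOBAL_UNDER_PARENT:
        # Commit where the parent's token lives; ideally that is the creating site.
        return site if parent_token_holder == site else parent_token_holder
    if mode is CreateMode.FORWARD_TO_K:
        return leader
    return site                     # LOCAL_ONLY: commit locally, no token to assign

print(handle_create(CreateMode.FORWARD_TO_K, site="i"))   # -> "k"
```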
Rule #4: assume each site-to-site one-way communication takes $T$ seconds and the time to commit a transaction is $t$, where $t \ll T$. If data $D$ has the following access pattern from sites $i$ and $j$: [i, j, i, j, i, j, …], the most efficient choice is to keep token(D) at site $k$, so the average time per transaction is $(2T + 2t)/2 = T + t$.
However, if the access pattern is [iii, j, iii, j, iii, j, …], keeping the token at $k$ still costs the same per transaction, $(4T + 4t)/4 = T + t$. But sending the token to $i$ first, letting it commit three transactions locally, then revoking it to $j$ for one transaction, and then sending it back to $i$, takes about $(T + 3t + T + t + T)/4 = 0.75T + t$ per transaction, which is cheaper since $t \ll T$.
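The comparison can also be checked with a tiny cost model in code. This is my own sketch, not the original calculation: I count one forwarding hop per remote commit and one hop per token movement, and I amortize over many repetitions of the pattern, so the constant for the bursty case comes out near $T/2 + t$ rather than the single-window $0.75T + t$ above; the qualitative conclusion is the same.

```python
def avg_cost_keep_at_center(pattern, T, t):
    """Token stays at k: every transaction pays one forwarding hop (T) plus a commit (t)."""
    return (len(pattern) * (T + t)) / len(pattern)

def avg_cost_follow_accessor(pattern, T, t, repeats=1000):
    """Token follows the accessing site: pay one hop (T) whenever the accessor
    changes, plus a local commit (t) for every transaction. Amortized over many
    repetitions of the pattern so the initial shipment from k washes out."""
    stream = pattern * repeats
    moves = sum(1 for a, b in zip(stream, stream[1:]) if a != b)
    return (moves * T + len(stream) * t) / len(stream)

T, t = 100.0, 1.0                                            # assumed values, t << T
print(avg_cost_keep_at_center(["i", "j"], T, t))             # T + t
print(avg_cost_follow_accessor(["i", "j"], T, t))            # ~T + t: moving buys nothing
print(avg_cost_follow_accessor(["i", "i", "i", "j"], T, t))  # ~T/2 + t: moving wins for bursts
```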
Such patterns cannot be detected and anticipated by looking at only two consecutive accesses.