The correct implementation of Redis distributed locks

Author: Wu Shan
blog: wudashan.cn

Foreword

Distributed locks are commonly implemented in one of three ways: 1. optimistic locking in a database; 2. a lock based on Redis; 3. a lock based on ZooKeeper. This post covers the second approach. Many blog posts describe Redis distributed locks, but their implementations often have flaws. To help you avoid those mistakes, this post explains in detail how to implement a Redis distributed lock correctly.


Reliability

For a distributed lock to be usable, the implementation must satisfy at least the following four conditions:

  • Mutual exclusion. At any moment, only one client can hold the lock.
  • No deadlock. Even if a client crashes while holding the lock and never releases it, other clients must still be able to acquire the lock later.
  • Fault tolerance. As long as the majority of Redis nodes are running, clients can lock and unlock.
  • Same-client release. The lock must be released by the client that acquired it; a client cannot release a lock held by another client.

Code

Component dependency

First, add the open-source Jedis client as a Maven dependency by adding the following to pom.xml:

<dependency>
    <groupId>redis.clients</groupId>
    <artifactId>jedis</artifactId>
    <version>2.9.0</version>
</dependency>

Lock code

The correct approach

Talk is cheap, show me the code. Here is the code first; the explanation follows:

import redis.clients.jedis.Jedis;

public class RedisTool {

    private static final String LOCK_SUCCESS = "OK";
    private static final String SET_IF_NOT_EXIST = "NX";
    private static final String SET_WITH_EXPIRE_TIME = "PX";

    /**
     * Try to acquire the distributed lock.
     * @param jedis Redis client
     * @param lockKey the lock key
     * @param requestId identifier of the requesting client
     * @param expireTime expiration time in milliseconds
     * @return whether the lock was acquired
     */
    public static boolean tryGetDistributedLock(Jedis jedis, String lockKey, String requestId, int expireTime) {

        // A single atomic command: set the key only if it is absent, with an expiration
        String result = jedis.set(lockKey, requestId, SET_IF_NOT_EXIST, SET_WITH_EXPIRE_TIME, expireTime);

        return LOCK_SUCCESS.equals(result);

    }

}

As you can see, locking takes a single line of code: jedis.set(String key, String value, String nxxx, String expx, int time). This set() method takes five parameters:

The first is key. We use the key as the lock, because keys are unique.

The second is value, for which we pass requestId. Some readers may wonder: isn't the key enough to act as the lock, so why use a value? The reason is the fourth reliability condition above: the lock must be released by the client that acquired it. By setting the value to requestId, we record which request acquired the lock, and have something to check against when unlocking. The requestId can be generated with UUID.randomUUID().toString().

The third is nxxx. We pass NX, meaning SET IF NOT EXIST: if the key does not exist, the set is performed; if the key already exists, nothing is done.

The fourth is expx. We pass PX, meaning that an expiration is set on the key. The actual duration is given by the fifth parameter.

The fifth is time, which goes with the fourth parameter and gives the expiration time of the key. With PX, it is in milliseconds.
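As a small illustration of the requestId mentioned above, the following sketch generates a fresh identifier for each lock attempt. The class and method names (RequestIds, newRequestId) are hypothetical and not part of the original RedisTool.

```java
import java.util.UUID;

// Hypothetical helper: produces the value stored under the lock key.
final class RequestIds {

    private RequestIds() {
    }

    // Each call returns a new random UUID string, so two lock attempts
    // will not share an identifier in practice.
    static String newRequestId() {
        return UUID.randomUUID().toString();
    }
}
```

The same requestId has to be kept by the caller and passed again when unlocking, since it is what identifies the owner of the lock.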

Overall, executing the set() call above has only two possible outcomes:

  1. No lock exists yet (the key does not exist): the lock is acquired with an expiration time, and the value identifies the client that acquired it.
  2. A lock already exists: nothing is done.

Attentive readers will notice that this locking code satisfies three of the reliability conditions described above. First, set() is called with NX, so it does not succeed if the key already exists; only one client can hold the lock, which gives mutual exclusion. Second, because the lock has an expiration time, even if the holder crashes without unlocking, the lock is released automatically when it expires (that is, the key is deleted), so no deadlock occurs.

Finally, because the value is set to requestId, which identifies the request that acquired the lock, the client can verify on unlock that it is the same client. Since we only consider a single-node Redis deployment here, fault tolerance is out of scope.
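To make the two outcomes and the expiration behaviour concrete, here is a minimal in-memory model of SET with NX and PX. It is only a sketch for reasoning about the semantics: the class SetNxPxModel, its methods, and the explicit nowMillis clock parameter are all assumptions made for illustration, and it does not talk to Redis or use Jedis.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of "SET key value NX PX ttl"; a teaching aid, not a Redis client.
final class SetNxPxModel {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();

    // Returns "OK" if the key was absent (or already expired) and is now set; otherwise null.
    String setNxPx(String key, String value, long ttlMillis, long nowMillis) {
        expireIfDue(key, nowMillis);
        if (values.containsKey(key)) {
            return null; // a lock already exists: do nothing
        }
        values.put(key, value);
        expiresAt.put(key, nowMillis + ttlMillis);
        return "OK";
    }

    // Returns the current value, or null if the key is missing or has expired.
    String get(String key, long nowMillis) {
        expireIfDue(key, nowMillis);
        return values.get(key);
    }

    // Drops the key once the simulated clock reaches its deadline.
    private void expireIfDue(String key, long nowMillis) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && nowMillis >= deadline) {
            values.remove(key);
            expiresAt.remove(key);
        }
    }

    public static void main(String[] args) {
        SetNxPxModel redis = new SetNxPxModel();
        System.out.println(redis.setNxPx("lock", "client-A", 1000, 0));    // prints OK
        System.out.println(redis.setNxPx("lock", "client-B", 1000, 500));  // prints null
        System.out.println(redis.setNxPx("lock", "client-B", 1000, 1000)); // prints OK
    }
}
```

Client B is refused while client A's lock is live, and succeeds once the lock has expired even though client A never unlocked.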

Error example 1

A common mistake is to combine jedis.setnx() and jedis.expire() to implement locking. The code is as follows:

public static void wrongGetLock1(Jedis jedis, String lockKey, String requestId, int expireTime) {

    Long result = jedis.setnx(lockKey, requestId);
    if (result == 1) {
        // If the program crashes here, the expiration time is never set and a deadlock occurs
        jedis.expire(lockKey, expireTime);
    }

}

setnx() is SET IF NOT EXIST, and expire() adds an expiration time to the lock. At first glance this looks equivalent to the set() call above. However, these are two separate Redis commands, so together they are not atomic. If the program crashes right after setnx(), the lock never gets an expiration time, and a deadlock results. People write it this way because older versions of Jedis do not support the multi-parameter set() method.
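The crash window can be replayed with a small in-memory model. This is a sketch under stated assumptions: SetnxThenExpireModel and crashBetweenCommands are hypothetical names, the model does not count time down, and only the return conventions of SETNX (1 or 0) and TTL (-2 for a missing key, -1 for a key without expiration) are taken from Redis.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of SETNX / EXPIRE / TTL; a teaching aid, not a Redis client.
final class SetnxThenExpireModel {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> ttlSeconds = new HashMap<>();

    // Returns 1 if the key was set, 0 if it already existed.
    long setnx(String key, String value) {
        return values.putIfAbsent(key, value) == null ? 1L : 0L;
    }

    // Attaches an expiration to an existing key.
    void expire(String key, long seconds) {
        if (values.containsKey(key)) {
            ttlSeconds.put(key, seconds);
        }
    }

    // Mirrors the TTL command: -2 if the key is missing, -1 if it has no expiration.
    long ttl(String key) {
        if (!values.containsKey(key)) {
            return -2L;
        }
        return ttlSeconds.getOrDefault(key, -1L);
    }

    // Client A acquires the lock but "crashes" before it can call expire().
    static SetnxThenExpireModel crashBetweenCommands() {
        SetnxThenExpireModel redis = new SetnxThenExpireModel();
        redis.setnx("lock", "client-A");
        // crash here: redis.expire("lock", 30) is never sent
        return redis;
    }
}
```

After crashBetweenCommands(), ttl("lock") reports -1 and every later setnx("lock", ...) returns 0, so no other client can ever take the lock.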

Error example 2

This kind of mistake is harder to spot, and the implementation is more complicated. The idea is to lock with the jedis.setnx() command, where the key is the lock and the value is the lock's expiration time.

The process:

  1. Try to lock with setnx(). If the lock does not exist yet, return success.
  2. If the lock already exists, read its expiration time and compare it with the current time. If the lock has expired, set a new expiration time and return success.

The code is as follows:

public static boolean wrongGetLock2(Jedis jedis, String lockKey, int expireTime) {

    long expires = System.currentTimeMillis() + expireTime;
    String expiresStr = String.valueOf(expires);

    // If the lock does not exist, locking succeeds
    if (jedis.setnx(lockKey, expiresStr) == 1) {
        return true;
    }

    // If the lock exists, read its expiration time
    String currentValueStr = jedis.get(lockKey);
    if (currentValueStr != null && Long.parseLong(currentValueStr) < System.currentTimeMillis()) {
        // The lock has expired: fetch the previous expiration time and set the new one
        String oldValueStr = jedis.getSet(lockKey, expiresStr);
        if (oldValueStr != null && oldValueStr.equals(currentValueStr)) {
            // Under concurrency, only the thread whose getSet() returned the value it read earlier may take the lock
            return true;
        }
    }

    // In all other cases, locking fails
    return false;

}

So what is wrong with this code?

  1. The expiration time is generated by the client, so the clocks of all clients in the distributed system must be synchronized.
  2. When the lock expires and several clients call jedis.getSet() at the same time, only one of them ends up acquiring the lock, but its expiration time may be overwritten by the other clients.
  3. The lock carries no owner identifier, so any client can unlock it.
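The second problem is easy to miss, so here is one interleaving replayed step by step against a plain map. This is an illustrative sketch only: GetSetRaceModel, replay(), and the timestamps are made up for the example, and the map stands in for Redis.

```java
import java.util.HashMap;
import java.util.Map;

// Replays two clients racing on an expired lock; a teaching aid, not a Redis client.
final class GetSetRaceModel {
    private final Map<String, String> store = new HashMap<>();

    String get(String key) {
        return store.get(key);
    }

    // Mirrors GETSET: store the new value and return the previous one.
    String getSet(String key, String value) {
        return store.put(key, value);
    }

    // Returns {aWon, bWon, finalValue} for one fixed interleaving.
    static String[] replay() {
        GetSetRaceModel redis = new GetSetRaceModel();
        redis.store.put("lock", "1000"); // stale lock whose expiration time has passed

        // Both clients read the expired timestamp before either writes.
        String seenByA = redis.get("lock");
        String seenByB = redis.get("lock");

        // A writes first and wins, because getSet returns the value A read.
        String oldForA = redis.getSet("lock", "6000");
        boolean aWon = seenByA.equals(oldForA);

        // B writes second and loses, but its timestamp replaces A's.
        String oldForB = redis.getSet("lock", "6005");
        boolean bWon = seenByB.equals(oldForB);

        return new String[] {String.valueOf(aWon), String.valueOf(bWon), redis.get("lock")};
    }
}
```

Client A holds the lock and client B does not, yet the stored expiration time is the one client B wrote (6005 rather than 6000).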

Unlock code

The correct approach

Again, here is the code first; the explanation follows:

import java.util.Collections;

import redis.clients.jedis.Jedis;

public class RedisTool {

    private static final Long RELEASE_SUCCESS = 1L;

    /**
     * Release the distributed lock.
     * @param jedis Redis client
     * @param lockKey the lock key
     * @param requestId identifier of the requesting client
     * @return whether the lock was released
     */
    public static boolean releaseDistributedLock(Jedis jedis, String lockKey, String requestId) {

        // Delete the key only if its value still equals requestId
        String script = "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end";
        Object result = jedis.eval(script, Collections.singletonList(lockKey), Collections.singletonList(requestId));

        return RELEASE_SUCCESS.equals(result);

    }

}

As you can see, unlocking takes only two lines of code. The first line defines a short Lua script; the last time I came across this language was in the book "Hackers & Painters", and I did not expect to actually use it here. The second line passes the script to jedis.eval(), binding KEYS[1] to lockKey and ARGV[1] to requestId. The eval() method sends the Lua script to the Redis server for execution.

So what does this Lua script do? It is quite simple: it gets the value stored under the lock key, checks whether it equals requestId, and deletes the key (unlocks) only if it does. Why use Lua for this? Because these steps must execute atomically. For what goes wrong without atomicity, see [Unlock code - Error example 2]. As for why eval() guarantees atomicity, that follows from how Redis works, as the documentation for the eval command on the official website explains.

In simple terms, when the eval command runs a Lua script, the whole script is executed as a single command, and Redis does not execute any other command until eval has finished.
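For readers more at home in Java than in Lua, the script behaves like an atomic compare-and-delete. A rough in-process analogy, and only an analogy, is ConcurrentHashMap.remove(key, value), which removes an entry only if it is still mapped to the given value. CompareAndDeleteModel and its methods are hypothetical names, and nothing here involves Redis.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-process analogy for the Lua script's compare-and-delete; not a Redis client.
final class CompareAndDeleteModel {
    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    // Acquires the lock only if nobody holds it (no expiration in this model).
    boolean lock(String key, String requestId) {
        return store.putIfAbsent(key, requestId) == null;
    }

    // Deletes the key only if it still maps to requestId, as one atomic step.
    boolean unlock(String key, String requestId) {
        return store.remove(key, requestId);
    }
}
```

With this model, unlock("lock", "client-B") returns false while client A holds the lock, and only unlock("lock", "client-A") removes it. No other operation can slip in between the comparison and the deletion.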

Error example 1

The most common unlock code simply calls jedis.del() to delete the lock. Because it never checks who owns the lock, any client can unlock at any time, even when the lock belongs to another client.

public static void wrongReleaseLock1(Jedis jedis, String lockKey) {
    jedis.del(lockKey);
}

Error example 2

At first glance this unlock code looks fine, and I almost wrote it this way myself. It resembles the correct approach; the only difference is that it is split into two commands. The code is as follows:

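public static void wrongReleaseLock2(Jedis jedis, String lockKey, String requestId) {

    // Check that the client unlocking is the one that locked
    if (requestId.equals(jedis.get(lockKey))) {
        // If the lock stops belonging to this client at this point, another client's lock is deleted by mistake
        jedis.del(lockKey);
    }

}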

As the code comment says, the problem is that by the time jedis.del() is called, the lock may no longer belong to the current client, in which case it removes a lock held by someone else. Can this really happen? Yes. For example, client A acquires the lock and, some time later, starts to unlock. Its ownership check passes, but before it executes jedis.del() the lock expires. Client B then acquires the lock successfully, and when client A goes on to execute del(), it removes client B's lock.
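The scenario with clients A and B can be replayed deterministically against a plain map. This is a sketch for illustration: CheckThenDeleteRace and replay() are hypothetical names, the map stands in for Redis, and expiration is simulated by removing the key by hand.

```java
import java.util.HashMap;
import java.util.Map;

// Replays the check-then-delete race from the text; a teaching aid, not a Redis client.
final class CheckThenDeleteRace {

    // Returns the lock holder after the interleaving, or null if nobody holds it.
    static String replay() {
        Map<String, String> redis = new HashMap<>();
        redis.put("lock", "client-A"); // A holds the lock

        // A: requestId.equals(jedis.get(lockKey)) passes
        boolean aPassedCheck = "client-A".equals(redis.get("lock"));

        redis.remove("lock");          // the lock expires
        redis.put("lock", "client-B"); // B acquires it

        if (aPassedCheck) {
            redis.remove("lock");      // A: jedis.del(lockKey) removes B's lock
        }
        return redis.get("lock");
    }
}
```

replay() returns null: client B believes it holds the lock, but the key is gone, so a third client could now lock as well.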


Summary

This article showed how to implement a Redis distributed lock correctly in Java, and gave two classic wrong examples each for locking and unlocking. Implementing a distributed lock with Redis is not hard, as long as the four reliability conditions are met. The Internet is convenient: whenever we hit a problem, we can google it. But is the answer found online always correct? It is not, so we should stay skeptical, think things through, and verify.

If your project uses a multi-node Redis deployment, you can try Redisson for distributed locks. It is a Java client for Redis that is listed among the implementations in the official Redis distributed lock documentation. The links are given in the Reference reading section.


Reference reading

redis.io/topics/distloc
redis.io/commands/eval
github.com/redisson/red