文章

03Kotlin协程异常处理

03Kotlin协程异常处理

协程异常处理

协程异常的传播

协程构建器有两种形式:⾃动传播异常launchactor)或向⽤户暴露异常asyncproduce)。当这些构建器⽤于创建⼀个根协程时,即该协程不是另⼀个协程的⼦协程时:

  • launchactor 发生异常且非 CancellationException 异常的话,会传递给 UncaughtExceptionHandler 处理,默认处理异常的方式只是打印堆栈信息,可以自定义 CoroutineExceptionHandler 来处理异常
  • asyncproduce 协程本身不会处理异常,自定义 CoroutineExceptionHandler 也无效,依赖用户调用 await/receive 恢复调用者协程时重新抛出异常
  • 协程中抛出 CancellationException 异常会被忽略掉
  • 抛出未捕获的非 CancellationException 异常会取消子协程和自己,也会取消父协程,一直取消 root 协程,异常也会由 root 协程处理
  • 如果使用了 SupervisorJobsupervisorScope,子协程抛出未捕获的非 CancellationException 异常不会取消父协程,异常也会由子协程自己处理。
  • 当协程出现异常时,会根据当前作用域触发异常传递。在 coroutineScope 当中协程异常会触发父协程的取消,进而将整个协程作用域取消掉,如果对 coroutineScope 整体进行捕获,也可以捕获到该异常;如果是 supervisorScope,那么子协程的异常不会向上传递,由子协程自己处理
  • 通过 GlobeScope 启动的协程单独启动一个协程作用域,内部的子协程遵从默认的作用域规则。通过 GlobeScope 启动的协程 “ 自成一派
  • 案例 1(未设置 UncaughtExceptionHandler,默认输出到 console):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
private fun test1() = runBlocking {
    val job = GlobalScope.launch { // launch 根协程
        println("Throwing exception from launch")
        throw IndexOutOfBoundsException() // 传递给Thread.defaultUncaughtExceptionHandler,默认在控制台输出
//        throw CancellationException("this is a CancellationException") // 不会传递给Thread.defaultUncaughtExceptionHandler
    }
    job.join()

    println("Joined failed job")

    val deferred = GlobalScope.async { // async 根协程
        println("Throwing exception from async")
        throw ArithmeticException() // 没有在控制台打印任何东⻄,依赖⽤户去处理
    }
    try {
        deferred.await()
        println("Unreached")
    } catch (e: ArithmeticException) {
        println("Caught ArithmeticException")
    }
    println("main end")
}

输出:

1
2
3
4
5
6
7
Throwing exception from launch
Exception in thread "DefaultDispatcher-worker-1" java.lang.IndexOutOfBoundsException // 省略异常堆栈
    ......
Joined failed job
Throwing exception from async
Caught ArithmeticException
main end

可以看到 launch 抛出的 IndexOutOfBoundsException 异常默认交给被打印到控制台了;

  • 案例 2(设置了 UncaughtExceptionHandler):
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
private fun test2() = runBlocking {

    Thread.setDefaultUncaughtExceptionHandler { t, e ->
        println("uncaughtException ${t.name} $e")
    }

    val job = GlobalScope.launch { // launch 根协程
        println("Throwing exception from launch")
        throw IndexOutOfBoundsException() // 传递给Thread.defaultUncaughtExceptionHandler,默认在控制台输出
//        throw CancellationException("this is a CancellationException") // 不会传递给Thread.defaultUncaughtExceptionHandler
    }
    job.join()

    println("Joined failed job")

    val deferred = GlobalScope.async { // async 根协程
        println("Throwing exception from async")
        throw ArithmeticException() // 没有在控制台打印任何东⻄,依赖⽤户去处理
    }
    try {
        deferred.await()
        println("Unreached")
    } catch (e: ArithmeticException) {
        println("Caught ArithmeticException")
    }
    println("main end")
}

输出:

1
2
3
4
5
6
Throwing exception from launch
uncaughtException DefaultDispatcher-worker-1 java.lang.IndexOutOfBoundsException
Joined failed job
Throwing exception from async
Caught ArithmeticException
main end
  • 案例:当⼀个协程使⽤ Job.cancel 取消的时候,它会被终⽌,但是它不会取消它的⽗协程
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
val job = launch {
    val child = launch {
        try {
            delay(Long.MAX_VALUE)
        } finally {
            println("Child is cancelled")
        }
    }
    yield()
    println("Cancelling child")
    child.cancel()
    child.join()
    yield()
    println("Parent is not cancelled")
}
job.join()

输出:

1
2
3
Cancelling child
Child is cancelled
Parent is not cancelled
  • 案例:当⽗协程的所有⼦协程都结束后,原始的异常才会被⽗协程处理:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
fun test4() = runBlocking {
    val handler = CoroutineExceptionHandler { _, exception ->
        println("CoroutineExceptionHandler got $exception")
    }
    val job = GlobalScope.launch(handler) {
        launch { // the first child
            try {
                delay(Long.MAX_VALUE)
            } finally {
                withContext(NonCancellable) {
                    println("Children are cancelled, but exception is not handled until all children terminate")
                    delay(100)
                    println("The first child finished its non cancellable block")
                }
            }
        }
        launch { // the second child
            delay(10)
            println("Second child throws an exception")
            throw ArithmeticException()
        }
    }
    job.join()
}

输出:

1
2
3
4
Second child throws an exception
Children are cancelled, but exception is not handled until all children terminate
The first child finished its non cancellable block
CoroutineExceptionHandler got java.lang.ArithmeticException
  • 案例:launch 和 async 处理异常区别
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
@Test
fun `test exception propagation`() = runBlocking<Unit> {
    val job = GlobalScope.launch {
        try {
            throw IndexOutOfBoundsException()
        } catch (e: Exception) {
            println("Caught IndexOutOfBoundsException")
        }
    }
    job.join()

    val deferred = GlobalScope.async {
        println("async")
        throw ArithmeticException()
    }

    try {
        deferred.await() // async不调用await的话,async中的异常不会被抛出
    } catch (e: Exception) {
        println("Caught ArithmeticException")
    }

    delay(1000)
}

异常捕获

异常是被自动抛出异常的协程抛出的(使用 launch,而不是 async),异常会被捕获

CoroutineExceptionHandler

将未捕获异常(除 CancellationException 了)打印到控制台的默认⾏为是可⾃定义的。CoroutineExceptionHandler 类似于 Thread.setDefaultUncaughtExceptionHandler 全局异常处理者。

  • CoroutineExceptionHandler 仅在未捕获的异常上调⽤ — 没有以其他任何⽅式处理的异常
  • async 构建器始终会捕获所有异常并将其表示在结果 Deferred 对象中,因此它的 CoroutineExceptionHandler ⽆效
  • supervision scope 上运行的协程,异常不会传递到父协程

案例,CoroutineExceptionHandler 在 launch 和 async 区别:

1
2
3
4
5
6
7
8
9
10
11
12
fun test3() = runBlocking {
    val handler = CoroutineExceptionHandler { _, exception ->
        println("CoroutineExceptionHandler got $exception")
    }
    val job = GlobalScope.launch(handler) { // 根协程,运⾏在GlobalScope中
        throw AssertionError()
    }
    val deferred = GlobalScope.async(handler) { // 同样是根协程,但使⽤async代替了launch
        throw ArithmeticException() // 没有打印任何东⻄,依赖⽤户去调⽤deferred.await()
    }
    joinAll(job, deferred)
}

输出:

1
CoroutineExceptionHandler got java.lang.AssertionError
  • CoroutineExceptionHandler 要放在最外层的作用域,否则无效
1
2
3
4
5
6
7
8
9
10
11
12
13
@Test
fun `test CoroutineExceptionHandler3`() = runBlocking {
    val handler = CoroutineExceptionHandler { _, exception ->
        println("Caught $exception")
    }
    val scope = CoroutineScope(Job())
    val job = scope.launch {
        launch(handler) {
            throw IllegalArgumentException()
        }
    }
    job.join()
}

输出异常:

1
2
Exception in thread "DefaultDispatcher-worker-1 @coroutine#3" java.lang.RuntimeException: Exception while trying to handle coroutine exception
	...

正确:

1
2
3
4
5
6
7
8
9
10
11
12
13
@Test
fun `test CoroutineExceptionHandler3`() = runBlocking<Unit> {
    val handler = CoroutineExceptionHandler { _, exception ->
        println("Caught $exception")
    }
    val scope = CoroutineScope(Job())
    val job = scope.launch(handler) {
        launch {
            throw IllegalArgumentException()
        }
    }
    job.join()
}

Android 中全局异常处理

全局异常处理器可以获取到所有协程未处理的未捕获异常,不过它并不能对异常进行捕获,虽然不能阻止程序崩溃,但在程序调试和异常上报等场景中有很大用处

在 classpath 下创建 META/services 目录,并在其中创建一个名为 kotlinx.coroitines.CoroutineExceptionHandler 文件,文件内容就是我们的全局异常处理器的全类名

异常聚合

当协程的多个⼦协程因异常⽽失败时,⼀般规则是 “ 取第⼀个异常 “,因此将处理第⼀个异常。在第⼀个异常之后发⽣的所有其他异常都作为被抑制的异常绑定⾄第⼀个异常。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
fun test5() = runBlocking {
    val handler = CoroutineExceptionHandler { _, exception ->
        println("CoroutineExceptionHandler got $exception with suppressed ${exception.suppressed.contentToString()}")
    }
    val job = GlobalScope.launch(handler) {
        launch {
            try {
                delay(Long.MAX_VALUE) // it gets cancelled when another sibling fails with IOException
            } finally {
                throw ArithmeticException() // the second exception
            }
        }
        launch {
            delay(100)
            throw IOException() // the first exception
        }
        delay(Long.MAX_VALUE)
    }
    job.join()
}

输出:

1
CoroutineExceptionHandler got java.io.IOException with suppressed [java.lang.ArithmeticException]

注意,这个机制当前只能在 Java 1.7 以上的版本中使⽤。 在 JS 和原⽣环境下暂时会受到限制,但将来会取消。

Supervision 监督

Supervision job

SupervisorJob 的取消只会向下传播。一个子协程异常,不会影响其他子协程,SupervisorJob 不会传播异常给它的父级,它会让子协程自己处理异常。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
fun testSupervisorJob() = runBlocking {
    val supervisor = SupervisorJob()
    with(CoroutineScope(coroutineContext + supervisor)) {
        // launch the first child -- its exception is ignored for this example (don't do this in practice!)
        val firstChild = launch(CoroutineExceptionHandler { _, _ -> }) {
            println("The first child is failing")
            throw AssertionError("The first child is cancelled")
        }
        // launch the second child
        val secondChild = launch {
            // Cancellation of the first child is not propagated to the second child
            println("The first child is cancelled: ${firstChild.isCancelled}, but the second one is still active")
            try {
                delay(Long.MAX_VALUE)
            } finally {
                // But cancellation of the supervisor is propagated
                println("The second child is cancelled because the supervisor was cancelled")
            }
        }
        // wait until the first child fails & completes
        firstChild.join()
        println("Cancelling the supervisor")
        supervisor.cancel()
        secondChild.join()
    }
}

输出:

1
2
3
4
The first child is failing
The first child is cancelled: true, but the second one is still active
Cancelling the supervisor
The second child is cancelled because the supervisor was cancelled

firstChild 协程异常不会取消 parent 协程,secondChild 也就不会被取消;取消 supervisor 会把 secondChild 取消

Supervision scope

supervisorScope 只会单向的传播并且当作业⾃身执⾏失败的时候将所有⼦作业全部取消。作业⾃身也会在所有的⼦作业执⾏结束前等待。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
fun test_supervisorScope() = runBlocking {
    try {
        supervisorScope {
            val child = launch {
                try {
                    println("The child is sleeping")
                    delay(Long.MAX_VALUE)
                } finally {
                    println("The child is cancelled")
                }
            }
            // 使⽤ yield 来给我们的⼦作业⼀个机会来执⾏打印
            yield()
            println("Throwing an exception from the scope")
            throw AssertionError()
        }
    } catch (e: AssertionError) {
        println("Caught an assertion error")
    }
}

输出:

1
2
3
4
The child is sleeping
Throwing an exception from the scope
The child is cancelled
Caught an assertion error

Exceptions in supervised coroutines

supervisor 协程中的每个子 Job 应该通过异常处理机制处理⾃身的异常,子 Job 的异常不会传递给父协程

  • supervisorScope 处理异常案例
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
fun test_supervisor_exception() = runBlocking {
    val handler = CoroutineExceptionHandler { _, exception ->
        println("CoroutineExceptionHandler got $exception")
    }
    supervisorScope {
        val child = launch(handler) {
            println("The child throws an exception")
            throw AssertionError()
        }
        val second = launch {
            println("second launch ...")
            delay(100)
            println("second launch completed")
        }
    }
    println("The scope is completed")
}

输出:

1
2
3
4
5
The child throws an exception
CoroutineExceptionHandler got java.lang.AssertionError
second launch ...
second launch completed
The scope is completed
  • coroutineScope 处理异常案例:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
fun test_coroutineScope_exception() = runBlocking {
    val handler = CoroutineExceptionHandler { _, exception ->
        println("CoroutineExceptionHandler got $exception")
    }
    coroutineScope {
        val child = launch(handler) {
            println("The child throws an exception")
            throw AssertionError()
        }
        val second = launch {
            println("second launch ...")
            delay(100)
            println("second launch completed")
        }
    }
    println("The scope is completed")
}

输出:

1
2
The child throws an exception
Exception in thread "main" java.lang.AssertionError

repeat 导致的死循环

  • 问题代码:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    setContentView(R.layout.activity_lifecycle_scope_repeat_demo)
    val interval = 4000L

    Looper.myQueue().addIdleHandler {
        lifecycleScope.launchWhenResumed {
            Log.i("hacket", "IdleHandler")
            repeatWhenever(Int.MAX_VALUE) {
                LogUtils.logi(
                    "hacket",
                    "repeatWhenever",
                    "ovoAnchorOnlineReport(), interval=$interval."
                )
                ovoOnlineReport()
                delay(interval)
            }
        }
        false
    }
}
inline fun repeatWhenever(times: Int, action: (Int) -> Unit) {
    repeat(times) {
        try {
            action.invoke(times)
        } catch (e: java.lang.Exception) {
            // ignore
            e.printStackTrace()
            LogUtils.w("[Exception]repeatWhenever ${e.message}")
        }
    }
}
  • 按下 Home 键 App 到后台时,App 一直在不断的输出 log,delay 也失效了,repeat 也没有被 cancel
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
2022-03-01 17:57:07.641 31451-31451/me.hacket.assistant.samples I/hacket.hacket: 【repeatWhenever】ovoAnchorOnlineReport(), interval=4000.,线程:main,日期:2022-03-01 17:57:07
2022-03-01 17:57:07.641 31451-31451/me.hacket.assistant.samples W/hacket.hacket: [Exception]repeatWhenever UndispatchedCoroutine was cancelled
2022-03-01 17:57:07.642 31451-31451/me.hacket.assistant.samples I/hacket.hacket: 【repeatWhenever】ovoAnchorOnlineReport(), interval=4000.,线程:main,日期:2022-03-01 17:57:07
2022-03-01 17:57:07.642 31451-31451/me.hacket.assistant.samples W/hacket.hacket: [Exception]repeatWhenever UndispatchedCoroutine was cancelled
2022-03-01 17:57:07.642 31451-31451/me.hacket.assistant.samples I/hacket.hacket: 【repeatWhenever】ovoAnchorOnlineReport(), interval=4000.,线程:main,日期:2022-03-01 17:57:07
2022-03-01 17:57:07.643 31451-31451/me.hacket.assistant.samples W/hacket.hacket: [Exception]repeatWhenever UndispatchedCoroutine was cancelled
2022-03-01 17:57:07.644 31451-31451/me.hacket.assistant.samples I/hacket.hacket: 【repeatWhenever】ovoAnchorOnlineReport(), interval=4000.,线程:main,日期:2022-03-01 17:57:07
2022-03-01 17:57:07.645 31451-31451/me.hacket.assistant.samples W/hacket.hacket: [Exception]repeatWhenever UndispatchedCoroutine was cancelled
2022-03-01 17:57:07.645 31451-31451/me.hacket.assistant.samples I/hacket.hacket: 【repeatWhenever】ovoAnchorOnlineReport(), interval=4000.,线程:main,日期:2022-03-01 17:57:07
2022-03-01 17:57:07.646 31451-31451/me.hacket.assistant.samples W/hacket.hacket: [Exception]repeatWhenever UndispatchedCoroutine was cancelled
2022-03-01 17:57:07.646 31451-31451/me.hacket.assistant.samples I/hacket.hacket: 【repeatWhenever】ovoAnchorOnlineReport(), interval=4000.,线程:main,日期:2022-03-01 17:57:07
2022-03-01 17:57:07.652 31451-31451/me.hacket.assistant.samples W/hacket.hacket: [Exception]repeatWhenever UndispatchedCoroutine was cancelled
2022-03-01 17:57:07.653 31451-31451/me.hacket.assistant.samples I/hacket.hacket: 【repeatWhenever】ovoAnchorOnlineReport(), interval=4000.,线程:main,日期:2022-03-01 17:57:07
2022-03-01 17:57:07.656 31451-31451/me.hacket.assistant.samples W/hacket.hacket: [Exception]repeatWhenever UndispatchedCoroutine was cancelled
2022-03-01 17:57:07.664 31451-31451/me.hacket.assistant.samples I/hacket.hacket: 【repeatWhenever】ovoAnchorOnlineReport(), interval=4000.,线程:main,日期:2022-03-01 17:57:07
2022-03-01 17:57:07.665 31451-31451/me.hacket.assistant.samples W/hacket.hacket: [Exception]repeatWhenever UndispatchedCoroutine was cancelled
2022-03-01 17:57:07.665 31451-31451/me.hacket.assistant.samples I/hacket.hacket: 【repeatWhenever】ovoAnchorOnlineReport(), interval=4000.,线程:main,日期:2022-03-01 17:57:07
2022-03-01 17:57:07.683 31451-31451/me.hacket.assistant.samples W/hacket.hacket: [Exception]repeatWhenever UndispatchedCoroutine was cancelled
2022-03-01 17:57:07.684 31451-31451/me.hacket.assistant.samples I/hacket.hacket: 【repeatWhenever】ovoAnchorOnlineReport(), interval=4000.,线程:main,日期:2022-03-01 17:57:07
2022-03-01 17:57:07.686 31451-31451/me.hacket.assistant.samples W/hacket.hacket: [Exception]repeatWhenever UndispatchedCoroutine was cancelled
2022-03-01 17:57:07.687 31451-31451/me.hacket.assistant.samples I/hacket.hacket: 【repeatWhenever】ovoAnchorOnlineReport(), interval=4000.,线程:main,日期:2022-03-01 17:57:07
2022-03-01 17:57:07.688 31451-31451/me.hacket.assistant.samples W/hacket.hacket: [Exception]repeatWhenever UndispatchedCoroutine was cancelled
  • 原因分析:
    lifecycleScope.launchWhenResumed,在 App 按 home 键到后台时,会取消协程,delay 会抛出 CancellationException,而在 repeatWhenever 中进行了 try{}catch{},导致 delay 抛出的 CancellationException 被捕获了;继而 repeat 继续执行,遇到 delay 继续抛出 CancellationException,就这样往复,导致死循环,导致 App ANR 了。
  • 解决
1
2
3
4
5
6
7
8
9
10
11
12
13
14
lifecycleScope.launchWhenResumed {
    repeat(Int.MAX_VALUE) {
        Log.i(
            "hacket",
            "[fix]repeatWhenever ovoAnchorOnlineReport(), interval=$interval."
        )
        try {
            ovoOnlineReport()
        } catch (e: Exception) {
            e.printStackTrace()
        }
        delay(interval)
    }
}

Ref

本文由作者按照 CC BY 4.0 进行授权