Coroutines in C++/Boost (2)

Also see my previous article: Coroutines in C++/Boost.

C++ finally has native coroutine support as of C++20. The principal difference between a coroutine and an ordinary routine is that a coroutine can be explicitly suspended and resumed: additional operations preserve its execution state across suspensions, which provides an enhanced control flow (the execution context is maintained).

1. Asymmetric vs Symmetric

From the Boost documentation:

An asymmetric coroutine knows its invoker, using a special operation to implicitly yield control specifically to its invoker.

By contrast, all symmetric coroutines are equivalent; one symmetric coroutine may pass control to any other symmetric coroutine. Because of this, a symmetric coroutine must specify the coroutine to which it intends to yield control.

So C++20 coroutines are asymmetric: a coroutine only knows its parent. With this dependency, asymmetric coroutines can be chained, just as a normal function calls another one. There are no goto-like semantics as with symmetric coroutines.

C++23 generators are also asymmetric. They are resumed repeatedly to generate a series of return values.
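The asymmetric, generator-style control flow described above can be sketched in a few lines. Python is used here only for brevity (it is not C++ code), and the name countdown is made up for illustration:

```python
def countdown(n):
    """Asymmetric coroutine: every yield hands control back to whoever resumed us."""
    while n > 0:
        yield n      # suspend here; the caller decides when (and whether) to resume
        n -= 1


gen = countdown(3)
first = next(gen)    # runs the body up to the first yield
rest = list(gen)     # resumes repeatedly until the generator finishes
```

The generator never names the routine it yields to; control always returns to its invoker, which is what makes it asymmetric.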

2. Stackless vs Stackful

Again, from Boost:

In contrast to a stackless coroutine, a stackful coroutine can be suspended from within a nested stackframe. Execution resumes at exactly the same point in the code where it was suspended before.

With a stackless coroutine, only the top-level routine may be suspended. Any routine called by that top-level routine may not itself suspend. This prohibits providing suspend/resume operations in routines within a general-purpose library.

Well, these two terms are confusing, and tutorials and blogs describe them differently. To keep it simple: if the language has await/yield keywords, its coroutines are stackless; if it has something called a Fiber, they are stackful.

Fibers are just like threads: they can be suspended in any stack frame. A stackless coroutine, by contrast, can suspend only at an explicit await/yield suspension point.

Stackless coroutines all share one default stack, while each stackful coroutine is assigned its own separate stack. With stackless coroutines, the code is transformed at compile time into event handlers (in effect a state machine), which are driven at run time by an event engine, i.e. the scheduler of the stackless coroutines. Transferring control of the CPU to a stackless coroutine is merely a function call with an argument pointing to its context. Conversely, transferring control to a stackful coroutine requires a context switch.
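To make the "function call with an argument pointing to its context" idea concrete, here is a hand-written sketch of what such a transformation could produce for a tiny counting coroutine. It is written in Python for brevity, and the names CounterFrame, resume and drain are hypothetical; real compilers generate something more elaborate:

```python
class CounterFrame:
    """Hand-written stand-in for the context of a coroutine equivalent to:

        def counter(limit):
            i = 0
            while i < limit:
                yield i
                i += 1
    """

    def __init__(self, limit):
        self.limit = limit
        self.i = 0
        self.state = 0   # 0 = not started, 1 = suspended at the yield, 2 = finished


def resume(frame):
    """Resuming is an ordinary call that receives the context; returns (value, done)."""
    if frame.state == 2:
        return None, True
    if frame.state == 1:
        frame.i += 1             # the code that follows the yield
    if frame.i < frame.limit:
        frame.state = 1          # remember where we suspended
        return frame.i, False
    frame.state = 2
    return None, True


def drain(frame):
    """A trivial driver: keep resuming until the coroutine reports completion."""
    values = []
    while True:
        value, done = resume(frame)
        if done:
            return values
        values.append(value)
```

All locals live in the frame object rather than on a private stack, so no stack switch is needed to suspend or resume.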

Here’s a summary of how coroutines are implemented in popular programming languages.

| Language | Stackful coroutines (Fibers) | Stackless coroutines (await/yield) |
|---|---|---|
| Java | (Y2023) Virtual threads in Java 21 | n/a |
| C | n/a | n/a |
| C++ | n/a | (Y2020) co_await, co_yield, co_return in C++ 20 |
| Python | n/a | (Y2015) async, await/yield in Python 3.5 |
| C# | n/a | (Y2012) async, await/yield in C# 5.0 |
| Javascript | n/a | (Y2017) async, await/yield in ES 2017 |
| PHP | (Y2021) Fiber in PHP 8.1 | n/a |
| Go | (Y2012) Goroutine in Go 1.0; (Y2020) asynchronously preemptible in 1.14 | n/a |
| Objective-C | n/a | n/a |
| Swift | n/a | (Y2021) async, await/yield in Swift 5.5 |
| Rust | n/a | (Y2019) async, await in Rust 1.39 |

Reference

Boost.Coroutine2
Fibers under the magnifying glass
Stackful Coroutine Made Fast

Coroutines in C++/Boost

Starting with Boost 1.56, Boost.Asio provides asio::spawn() to work with coroutines. Here is the sample code, with minor modifications:

The Python code from my previous article can be used to work with the code above. I also tried to write a TCP server using only the boost::coroutines classes. select() is used because I want the code to be platform independent. NOTE: with coroutines, we have only _one_ thread.

Coroutines in Python

Python 3.5 added native support for coroutines. There were actually several steps towards the current implementation, and the history seems a bit messy to me. From Wikipedia:

  • Python 2.5 implements better support for coroutine-like functionality, based on extended generators (PEP 342).
  • Python 3.3 improves this ability, by supporting delegating to a subgenerator (PEP 380).
  • Python 3.4 introduces a comprehensive asynchronous I/O framework as standardized in PEP 3156, which includes coroutines that leverage subgenerator delegation.
  • Python 3.5 introduces explicit support for coroutines with async/await syntax (PEP 492).

Before Python 2.5, there were only generators.

In Python 2.5, yield was redefined to be an expression rather than a statement, which made it possible to implement simple coroutines. But a lot of work was still left to programmers. For instance, a simple coroutine scheduler was required.
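As a rough illustration of that extra work, the sketch below shows yield used as an expression together with a tiny round-robin scheduler. It is written for Python 3, and worker and run_round_robin are hypothetical names, not part of any library:

```python
from collections import deque


def worker(name, steps, log):
    """Generator-based coroutine: each yield hands control back to the scheduler."""
    for step in range(steps):
        tick = yield               # yield is an expression: it evaluates to the value sent in
        log.append((name, step, tick))


def run_round_robin(coroutines):
    """Minimal scheduler: resume each coroutine in turn until all have finished."""
    ready = deque()
    for coro in coroutines:
        try:
            next(coro)             # prime the coroutine: run it up to its first yield
        except StopIteration:
            continue               # it finished without ever suspending
        ready.append(coro)
    tick = 0
    while ready:
        coro = ready.popleft()
        tick += 1
        try:
            coro.send(tick)        # resume, passing a value into the suspended yield
        except StopIteration:
            continue               # finished: do not reschedule
        ready.append(coro)
```

Calling run_round_robin([worker("a", 2, log), worker("b", 1, log)]) interleaves the two workers on a single thread.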

In Python 3.3, yield from was added to support delegating to subgenerators. By itself it has nothing to do with coroutines, although asyncio later builds on it.
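A small sketch of subgenerator delegation follows; read_chunk and read_all are made-up names for illustration:

```python
def read_chunk(label):
    """Subgenerator: yields intermediate items, then returns a summary value."""
    yield label + ":start"
    yield label + ":end"
    return 2                       # becomes the value of the yield from expression


def read_all(labels):
    """Delegating generator: yield from forwards every item of the subgenerator."""
    count = 0
    for label in labels:
        count += yield from read_chunk(label)
    yield "items=%d" % count
```

The delegating generator stays suspended while the subgenerator runs, and picks up the subgenerator's return value when it finishes.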

In Python 3.4, the Father of Python (Guido van Rossum) wrote a PEP himself to add an asyncio module to simplify coroutine usage in Python. An official scheduler was added. We can use @asyncio.coroutine to decorate a function. We can use yield from expressions to yield to a specific coroutine.

In Python 3.5, async/await syntax was added, borrowed from C#. The newest PEP made coroutines a native Python language feature and clearly separated them from generators. A native coroutine is now declared with async def syntax, and yield from is replaced with an await expression. This removes the generator/coroutine ambiguity. So in Python 3.5, coroutines used with asyncio may be implemented using the async def statement, or by using generators. Generator-based coroutines should be decorated with @asyncio.coroutine, although this is not strictly enforced. The decorator enables compatibility with async def coroutines, and also serves as documentation. See the Python documentation here.
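A minimal sketch of native coroutines is shown below. The names fetch, main and run_demo are hypothetical, and asyncio.run() assumes Python 3.7 or later (on 3.5 one would drive the event loop by hand instead):

```python
import asyncio


async def fetch(name, delay, log):
    """Native coroutine: declared with async def, suspends at await."""
    log.append(name + " started")
    await asyncio.sleep(delay)     # suspension point: the event loop runs other tasks
    log.append(name + " finished")
    return name.upper()


async def main(log):
    # gather runs both coroutines concurrently and keeps results in argument order
    return await asyncio.gather(fetch("slow", 0.1, log), fetch("fast", 0.01, log))


def run_demo():
    log = []
    results = asyncio.run(main(log))
    return results, log
```

Both coroutines start before either finishes, and the one with the shorter sleep completes first, all on one thread.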

The implementation can be found in this commit.

I wrote an echo server/client sample to try coroutines. Server code first:

Client code here, or you can simply use telnet command:

Server output:

Client output:

With Python 3.5 on Ubuntu 16.04, we can also use async/await:
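The original listing is not reproduced in this text. As a stand-in, here is a minimal sketch of what an async/await echo server and client could look like with asyncio streams. It is not the author's code: handle_echo and echo_once are made-up names, the line-based protocol is an assumption, and writer.wait_closed() assumes Python 3.7 or later:

```python
import asyncio


async def handle_echo(reader, writer):
    """Echo every received line back until the client closes the connection."""
    while True:
        line = await reader.readline()
        if not line:               # empty bytes means EOF
            break
        writer.write(line)
        await writer.drain()
    writer.close()


async def echo_once(message, host="127.0.0.1"):
    """Start the server on an OS-chosen port, send one line as a client, return the reply."""
    server = await asyncio.start_server(handle_echo, host, 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(message)
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    server.close()                 # stop accepting new connections
    return reply
```

Running asyncio.run(echo_once(b"hello\n")) exercises both sides in a single thread. A long-running server would instead keep the server object alive, for example with serve_forever().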