Cpp Notes

Introduction to C++ Coroutines - James McNellis

Why coroutine

Motivation for Coroutines:

  • Coroutines aim to simplify asynchronous programming in C++.
  • Traditional synchronous calls (such as TCP connect and read) block the calling thread until they complete, which is inefficient for tasks that spend most of their time waiting (e.g., network communication).

Example Explained:

int64_t tcp_reader(int64_t total) {
  std::array<char, 4096> buffer;

  tcp::connection the_connection = tcp::connect("127.0.0.1", 1337);

  int64_t remaining = total;
  for (;;) {
    int64_t bytes_read = the_connection.read(buffer.data(), buffer.size());
    remaining -= bytes_read;
    if (remaining <= 0 || bytes_read == 0) {
      return remaining;
    }
  }
}
  • Synchronous TCP function example:
    • Connects to a server.
    • Reads data in chunks until all data is read or the end of the stream is reached.
    • Operations like connecting and reading block further execution until completed.

Transition to Asynchronous Programming:

  • Change the function's return type to std::future<int64_t> so that the result can be delivered asynchronously.
  • Encapsulate state in a structure to persist state across asynchronous calls, requiring dynamic memory allocation.
  • Use asynchronous APIs provided by TCP library for non-blocking operations.
future<void> do_while(function<future<bool>()> body) {
  return body().then([=](future<bool> not_done) {
    return not_done.get() ? do_while(body) : make_ready_future();
  });
}

std::future<int64_t> tcp_reader(int64_t total) {
  struct reader_state {
    std::array<char, 4096> _buffer;
    int64_t _remaining;
    tcp::connection _connection;
    explicit reader_state(int64_t total) : _remaining(total) {}
  };

  auto state = std::make_shared<reader_state>(total);

  return tcp::connect("127.0.0.1", 1337)
      .then([state](std::future<tcp::connection> the_connection) {
        state->_connection = std::move(the_connection.get());
        return do_while([state]() -> std::future<bool> {
          if (state->_remaining <= 0) {
            return std::make_ready_future(false);
          }
          return state->_connection.read(state->_buffer.data(), state->_buffer.size())
              .then([state](std::future<int64_t> bytes_read_future) {
                int64_t bytes_read = bytes_read_future.get();
                if (bytes_read == 0) {
                  return std::make_ready_future(false);
                }
                state->_remaining -= bytes_read;
                return std::make_ready_future(true);
              });
        });
      })
      .then([state] { return std::make_ready_future(state->_remaining); });
}

Challenges:

  • Asynchronous code is more complex to write and understand compared to synchronous code.
  • Maintenance, modification, and debugging of asynchronous code are challenging.

Transition to Coroutines:

auto tcp_reader(int64_t total) -> std::future<int64_t> {
  std::array<char, 4096> buffer;

  tcp::connection the_connection = co_await tcp::connect("127.0.0.1", 1337);

  int64_t remaining = total;
  for (;;) {
    int64_t bytes_read =
        co_await the_connection.read(buffer.data(), buffer.size());
    remaining -= bytes_read;
    if (remaining <= 0 || bytes_read == 0) {
      co_return remaining;  // a coroutine must use co_return, not return
    }
  }
}
  • Objective: Simplify asynchronous code to make it readable and maintainable while retaining asynchronous characteristics.

Key Changes with Coroutines:

  1. Return Type Modification:

    • Changed the function's return type to std::future<int64_t> to indicate that the function is asynchronous and delivers its result later rather than immediately.
  2. Introduction of co_await:

    • Added the co_await keyword before potentially blocking calls (e.g., the TCP connect and read operations).
    • co_await suspends the coroutine at that point until the awaited operation completes, so the calling thread is free to do other work in the meantime.
  3. Simplified Code Structure:

    • Despite changes, the code structure remains similar to the synchronous version, enhancing readability.
    • Reduces boilerplate code associated with managing asynchronous logic and state continuation manually.

Benefits Highlighted:

  • Readability: Coroutine version is significantly clearer compared to traditional asynchronous code using futures and callbacks.
  • Maintainability: Easier to write, modify, and debug due to reduced complexity and increased clarity.
  • Performance: Threads are not blocked while an operation is pending, so they stay available for other work.

Basics of Coroutines

Definitions and Comparisons:

  • Coroutine: A generalization of a subroutine that can be suspended and resumed.
  • Subroutine: A function in C++ that can be invoked and returns control to the caller after execution.

Key Properties:

  1. Invocation:

    • Subroutine: Invoked through a regular function call.
    • Coroutine: Also invoked through a function call, similarly to subroutines.
  2. Control Return:

    • Subroutine: Returns control via a return statement or by reaching the end of the function.
    • Coroutine: Uses a co_return statement to return control and values.
  3. Execution Suspension:

    • Subroutine: Does not support suspension.
    • Coroutine: Can suspend execution at certain points using the co_await keyword.
  4. Execution Resumption:

    • Subroutine: Not applicable.
    • Coroutine: After suspension, it can be resumed, continuing from the point of suspension.

How Coroutines Work:

  • A function becomes a coroutine when its body contains co_return (to return a value), co_await (to await an awaitable expression), or co_yield (to yield one value of a sequence). The Coroutines TS also allowed co_await in a range-based for loop (for co_await), but that form was not adopted into C++20.

Visual Representation (Table):

  • Invoke:
    • Subroutine: Function call
    • Coroutine: Function call
  • Return:
    • Subroutine: Return statement or end of function
    • Coroutine: co_return statement
  • Suspend:
    • Subroutine: Not applicable
    • Coroutine: co_await expression
  • Resume:
    • Subroutine: Not applicable
    • Coroutine: coroutine_handle<>::resume(), called by whoever holds the handle

Practical Example Comparison:

  • Standard Function:
    • Uses std::async to perform a task asynchronously on a background thread, returning a std::future representing the result.
    • Example Code (Pre-C++20):
std::future<int> compute_value() {
  return std::async([] { return 30; });
}
  • Coroutine Function:
    • Implements asynchronous behavior natively using co_await to suspend and resume around potentially blocking or long-running operations, simplifying code.
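    • Example Code (coroutine version as shown in the talk; standard C++20 does not make std::future awaitable, so this relies on an implementation extension such as MSVC's experimental coroutine support):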
// Note: function signature the same as before!!!
std::future<int> compute_value() {
  int result = co_await std::async([] { return 30; });
  co_return result;
}

Key Takeaways:

  • Flexibility: Functions can transition between being subroutines and coroutines without affecting their interfaces or caller code.
  • Maintainability and Readability: Coroutines reduce complexity in asynchronous programming, making code more intuitive and easier to maintain.
  • Performance: By suspending and resuming operations, coroutines can handle long-running tasks more efficiently without blocking threads, thus better utilizing system resources.

Understanding the co_await Keyword in C++ Coroutines:

The co_await keyword is pivotal in implementing asynchronous logic in C++. It manages asynchronous operations by suspending and resuming functions efficiently. Here is a detailed breakdown of how co_await works and interacts with asynchronous tasks:

Process Breakdown:

  1. Expression Evaluation:

    • The expression following co_await (e.g., an asynchronous task returning a future or similar object) is evaluated first and its result is stored in a temporary variable.
  2. Check Readiness:

    • The await_ready() function is called on the awaited object to check if the result is already available.
      • If true, the coroutine continues execution without suspending, bypassing the costly suspension process.
      • If false, the coroutine needs to be suspended as the result is not yet ready.
  3. Handle Suspension:

    • await_suspend() is invoked to manage what happens right before the coroutine suspends.
    • This function typically handles attaching the coroutine's continuation to the awaited object's completion event (e.g., linking the coroutine to the completion of an asynchronous I/O operation).
    • The coroutine suspends after this function call, yielding control back to the caller.
  4. Resumption of Execution:

    • Once the awaited condition is fulfilled (e.g., data is available or an event is completed), the coroutine is eligible to resume.
    • await_resume() is executed upon resumption to:
      • Perform any necessary operations right after resuming (custom actions).
      • Obtain the result of the awaited expression, which becomes the result of the co_await expression.

What happens at a co_await?

Illustration with Code Transformation:

For a coroutine that awaits an asynchronous operation, the compiler transforms co_await roughly as follows:

auto result = co_await expression;

// ----> compiler translates to ...

auto&& _a = expression;

if (!_a.await_ready()) {
  _a.await_suspend(coroutine_handle);

  // ... suspend/resume point ...
}

auto result = _a.await_resume();

// As the expansion shows, an "awaiter" (the object that co_await
// operates on) has to support these three functions:

struct awaitable_concept {
  bool await_ready();
  void await_suspend(coroutine_handle<>);
  auto await_resume();
};

Simplest awaitables: suspend_always and suspend_never

1. suspend_always

  • Purpose: Guarantees the suspension of the coroutine every time co_await is used with it.
  • Behavior:
    • await_ready(): Returns false consistently, indicating that the coroutine should always suspend.
    • await_suspend(): Typically does nothing but returns control to the coroutine scheduler or caller.
    • await_resume(): Also does nothing; it is simply there to fulfill the coroutine’s awaitable interface requirements.
struct suspend_always {
  bool await_ready() noexcept { return false; }

  void await_suspend(coroutine_handle<>) noexcept {}

  void await_resume() noexcept {}
};

//...

return_type my_coroutine() {
  cout << "my_coroutine about to suspend\n";

  co_await suspend_always{};  // This will suspend the coroutine and return
                              // control back to its caller

  cout << "my_coroutine was resumed\n";
}
  • Explanation:
    • When my_coroutine is called, it prints a message, suspends, and then yields control back to the caller.
    • Upon resumption (triggered externally), it continues from where it left off and prints another message.

2. suspend_never

  • Purpose: Prevents the coroutine from suspending.
  • Behavior:
    • await_ready(): Returns true consistently, indicating that there is no need to suspend because the awaited operation is considered complete.
    • await_suspend(): Not invoked because await_ready returns true.
    • await_resume(): Not necessary to do anything special, as suspension never occurs.
struct suspend_never {
  bool await_ready() noexcept { return true; }

  void await_suspend(coroutine_handle<>) noexcept {}

  void await_resume() noexcept {}
};

//...

return_type my_coroutine() {
  cout << "my_coroutine before 'no-op' await\n";

  co_await suspend_never{}; // This will not suspend the coroutine and will
                            // allow the coroutine to continue execution.

  cout << "my_coroutine after 'no-op' await\n";
}
  • Explanation:
    • my_coroutine starts execution, prints a message, attempts to co_await on suspend_never, which does not cause suspension, and immediately continues execution to print the final message.

How does the caller resume a coroutine

Coroutine Invocation and Suspension:

  1. Invocation:

    • Similar to functions, invoking a coroutine involves setting up an execution context. However, instead of a traditional stack frame, coroutines utilize a coroutine frame which is more complex.
  2. Coroutine Frame Construction: what does it need to maintain?

    • Components:
      • Formal parameters and local variables.
      • Temporaries that the coroutine might need to resume.
      • Execution state to remember the point of suspension.
      • A promise object, which facilitates communication between the coroutine and its caller, potentially holding the eventual return value or the state of the coroutine.
    • Memory Allocation:
      • Typically, coroutine frames are dynamically allocated using operator new to ensure they persist after the coroutine suspends, independent of the stack which might be reused.
      • Customization is possible by overloading operator new for specific coroutines.
  3. Optimization:

    • If the compiler determines that the coroutine will not escape the caller’s context, it may optimize by allocating the coroutine frame on the stack.
    • These optimizations are critical for performance and are aggressively pursued by compilers.

Resuming a Coroutine:

  • Resumption of a coroutine is handled by the caller, which holds a handle to the coroutine frame. This handle is essential for managing the life cycle of the coroutine.

Understanding Coroutine Handles

A coroutine handle is an abstraction used to control the execution of a coroutine; it can be likened to a pointer or a reference to the coroutine's frame.

Types of Coroutine Handles:

template <>
struct coroutine_handle<void> {
  // ...
};

template <typename Promise>
struct coroutine_handle : coroutine_handle<void> {
  // ...
};
  1. coroutine_handle<void>:

    • Type-erased handle that can refer to any coroutine, regardless of its promise type.
    • Provides the foundational operations needed to manage a coroutine's execution, such as resumption and destruction.
  2. coroutine_handle<Promise>:

    • Inherits from coroutine_handle<void> in the design shown here (in C++20 it converts implicitly to coroutine_handle<void> instead).
    • Refers to a coroutine whose promise type is known to be Promise. The promise type is determined by the coroutine's return type; it is not the return type itself.
    • Adds functionality to access the promise object, which might hold the return value or other state.

Details of coroutine_handle<void>

template <>
struct coroutine_handle<void> {
  coroutine_handle() noexcept = default;  // Default constructor
  coroutine_handle(nullptr_t) noexcept;   // Construct from nullptr
  coroutine_handle& operator=(nullptr_t) noexcept;  // Assign from nullptr
  explicit operator bool() const noexcept;  // Test if handle is non-empty

  static coroutine_handle from_address(void* a) noexcept;  // Convert void* to coroutine handle
  void* to_address() const noexcept;  // Convert coroutine handle to void* (named address() in C++20)

  void operator()() const;  // Resume coroutine (alternative to resume())
  void resume() const;  // Resume coroutine

  void destroy();  // Destroy coroutine

  bool done() const;  // Check if coroutine is finished
};
  1. Default Construction and Nullability:

    • Default Constructor: Initializes the handle to an empty state, referring to no coroutine.
    • Nullptr Construction and Assignment:
      • Construct and assign from nullptr to explicitly set the handle to an empty state.
      • These operations ensure that the handle can be safely declared without immediately referring to an active coroutine.
  2. Boolean State Testing:

    • Explicit Operator Bool: Allows the coroutine handle to be tested as a boolean expression, which is true if the handle refers to a valid coroutine, and false if it is empty. This is useful for checking the validity of the handle before attempting operations like resumption or destruction.
  3. Pointer Interoperability:

    • Conversion to and from void*:
      • to_address() (address() in C++20): Converts the coroutine handle to a void*. This is particularly useful for interoperating with C APIs, where a coroutine handle needs to be passed as a context pointer.
      • from_address(void* a): Static method to convert a void* back to a coroutine handle, facilitating the re-establishment of control over a coroutine when passed across API boundaries.
  4. Execution Control:

    • Resumption:
      • resume(): Resumes the coroutine if it is suspended, allowing it to continue execution from where it last left off.
      • operator()(): An alternative to resume() using the function call operator. Simplifies resuming the coroutine and enhances readability.
    • Destruction:
      • destroy(): Explicitly destroys the suspended coroutine. The local variables in scope at the current suspension point are destructed and the coroutine frame is freed.
  5. Completion Testing:

    • done(): Checks whether the coroutine has completed its execution and no further resumption is possible. This function is vital for loops or control structures that depend on the coroutine's lifecycle.

coroutine_handle<Promise>

template <typename Promise>
struct coroutine_handle : coroutine_handle<void> {
  Promise& promise() const noexcept;

  static coroutine_handle from_promise(Promise&) noexcept;
};
  1. Accessing the Promise:

    • promise() Method:
      • Returns a reference to the promise object associated with the coroutine.
      • This method is crucial for accessing the result set by the coroutine and for performing operations that depend on the coroutine's completion status.
  2. Constructing from a Promise:

    • from_promise(Promise&) Static Method:
      • Creates a coroutine_handle that corresponds to the coroutine associated with a given promise.
      • This is particularly useful when you have a promise and need to control the coroutine that fulfills this promise (e.g., to resume or destroy the coroutine).

A simple example

resumable_thing counter() {
  cout << "counter: called\n";
  for (unsigned i = 1;; ++i) {
    co_await suspend_always{};
    cout << "counter: resumed (#" << i << ")\n";
  }
}

int main() {
  cout << "main: calling counter\n";
  resumable_thing the_counter = counter();
  cout << "main: resuming counter\n";
  the_counter.resume();
  the_counter.resume();
  cout << "main: done\n";
}
  1. Coroutine Function (counter):

    • Prints a message when called.
    • Enters an infinite loop, suspending itself each iteration and printing the iteration count upon resumption.
  2. Main Function Usage:

    • Announces the calling of counter.
    • Calls counter, resuming it twice.
    • Ends by printing a termination message.

Coroutine Implementation (resumable_thing)

// Definition of coroutine-based class
struct resumable_thing {
  struct promise_type; // Promise type defined later

  coroutine_handle<promise_type> _coroutine = nullptr; // Handle to manage the coroutine

  explicit resumable_thing(coroutine_handle<promise_type> coroutine)
      : _coroutine(coroutine) {} // Constructor initializing handle

  ~resumable_thing() {
    if (_coroutine) {
      _coroutine.destroy(); // Destroy coroutine if it exists
    }
  }

  resumable_thing() = default; // Default constructor
  resumable_thing(const resumable_thing&) = delete; // No copy initialization
  resumable_thing& operator=(const resumable_thing&) = delete; // No copy assignment

  resumable_thing(resumable_thing&& other) noexcept : _coroutine(other._coroutine) {
    other._coroutine = nullptr; // Move constructor
  }

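  resumable_thing& operator=(resumable_thing&& other) noexcept {
    if (this != &other) {
      if (_coroutine) {
        _coroutine.destroy(); // Release the coroutine currently owned, if any
      }
      _coroutine = other._coroutine; // Move assignment
      other._coroutine = nullptr;
    }
    return *this;
  }

  void resume() { _coroutine.resume(); } // Resume the suspended coroutine (used by main)
};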
  • Structure Definition:

    • Manages the coroutine's lifecycle using a promise_type.
    • Uses coroutine_handle to control the coroutine (similar to a raw pointer).
  • Constructors and Destructor:

    • Default constructor.
    • Explicit constructor to initialize with a coroutine_handle.
    • Destructor to destroy the coroutine if it still exists.
    • Move constructor and move assignment for handling ownership transfer.
  • Deleted Operations:

    • Copy constructor and copy assignment are deleted to prevent unintended copying.

Promise Type Implementation (promise_type)

// Promise type for the coroutine
struct resumable_thing::promise_type {
  resumable_thing get_return_object() {
    return resumable_thing(coroutine_handle<promise_type>::from_promise(*this));
  }

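  auto initial_suspend() noexcept { return suspend_never{}; } // Does not suspend initially
  auto final_suspend() noexcept { return suspend_never{}; } // Does not suspend finally (C++20 requires noexcept here)
  void return_void() {} // No return value
  void unhandled_exception() { std::terminate(); } // Required by C++20; called if the body throws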
};
  • Key Functions:
    • get_return_object: Converts the promise into the coroutine's return type.
    • initial_suspend & final_suspend: Control coroutine suspension points (both set to never suspend in this example).
    • return_void: Placeholder function as the coroutine does not return a value.

What happened when coroutine run

Coroutine Structure

  • Coroutine Context (__counter_context):

    • Stores the coroutine's promise type, local variables, and an instruction pointer to manage execution state.
    • May include storage for registers and temporary variables introduced by the compiler.
  • Automatic Code Injection by Compiler:

    • The compiler automatically constructs the coroutine context at the start using new.
    • Retrieves the return object (resumable_thing) from the promise.
    • Manages execution flow using initial_suspend() and final_suspend() to control suspension points.

Implementation Steps

  1. Context and Promise Initialization:

    • A new context (__counter_context) is created, initializing storage for the promise and local variables.
    • The return object is fetched using the promise's get_return_object() method.
  2. Execution Control:

    • initial_suspend() is called to decide if the coroutine should suspend immediately before executing further (depends on whether suspend_always or suspend_never is used).
  3. Coroutine Body Execution:

    • Upon resuming (either initially or from a suspension), prints "counter: called".
    • Enters an infinite loop that suspends on suspend_always at the top of each iteration and prints a message after each resumption, with i counting the resumptions.
  4. Clean-up:

    • After the loop (theoretically never in this infinite case), final_suspend() is called to potentially suspend one last time before coroutine destruction.
    • The context is deleted to free up resources, ensuring no memory leaks.

Example Code with Annotations

// Compiler implicitly generates definition of the coroutine context structure
struct __counter_context {
  resumable_thing::promise_type _promise; // Promise object for coroutine management
  unsigned i;                            // Local variable `i` used within the loop
  void* _instruction_pointer;            // Pointer to manage where to resume execution
};

// Enhanced coroutine function with compiler injections simulated
resumable_thing counter() {
  __counter_context* __context = new __counter_context{};  // Create new coroutine context
  auto __return = __context->_promise.get_return_object(); // Get the return object from the promise
  co_await __context->_promise.initial_suspend();         // Potentially suspend before execution starts

  cout << "counter: called\n";                            // Initial print statement
  for (unsigned i = 1;; ++i) {
    co_await suspend_always{};                            // Suspend after each iteration
    cout << "counter: resumed\n";                         // Print statement after each resume
  }

final_suspend_label:
  co_await __context->_promise.final_suspend();           // Potentially suspend before destruction
  delete __context;                                       // Clean up coroutine context
}
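
To make the role of the context and the saved resume point concrete, the same counter can be written by hand as a small state machine without any coroutine support. This is only an illustrative sketch: counter_frame, counter_resume, and run_manual_counter are hypothetical names, the log string stands in for cout, and real compilers generate different code.

```cpp
#include <cassert>
#include <string>

// Hand-written stand-in for the coroutine frame.
struct counter_frame {
  unsigned i = 0;        // the loop variable lives in the frame
  int resume_point = 0;  // which suspension point to continue from
  std::string log;       // stands in for cout so the result can be inspected
};

// Each call runs the "coroutine" until its next suspension point.
void counter_resume(counter_frame& f) {
  switch (f.resume_point) {
    case 0:  // first entry: corresponds to calling counter()
      f.log += "counter: called\n";
      f.i = 1;
      f.resume_point = 1;
      return;  // co_await suspend_always{}
    case 1:  // resumed inside the loop
      f.log += "counter: resumed (#" + std::to_string(f.i) + ")\n";
      ++f.i;
      return;  // next co_await suspend_always{}
    default:
      assert(false && "invalid resume point");
  }
}

std::string run_manual_counter() {
  counter_frame frame;
  counter_resume(frame);  // corresponds to calling counter()
  counter_resume(frame);  // corresponds to the_counter.resume()
  counter_resume(frame);
  return frame.log;
}
```

The frame object plays the part of __counter_context: it holds the local variable and the position to continue from, which is why a coroutine's state survives after control returns to the caller.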

Coroutines can be used in different ways

resumable_thing named_counter(std::string name) {
  cout << "counter(" << name << ") was called\n";
  for (unsigned i = 1;; ++i) {
    co_await suspend_always{};
    cout << "counter(" << name << ") resumed #" << i << '\n';
  }
}

int main() {
  resumable_thing counter_a = named_counter("a");
  resumable_thing counter_b = named_counter("b");
  counter_a.resume();
  counter_b.resume();
  counter_b.resume();
  counter_a.resume();
}

The following breakdown explains the named-counter example, which illustrates interleaved execution of two coroutines:

Extended Coroutine Usage: Named Counters

Code Overview

  • Function named_counter:

    • Takes a string name and prints it alongside counter messages to uniquely identify coroutine instances.
    • Utilizes an infinite loop where it suspends after each iteration and prints the iteration count upon resumption.
  • Main Function:

    • Creates two named coroutine instances, counter_a and counter_b, corresponding to "a" and "b".
    • Resumes them in an interleaved manner to demonstrate that each keeps its own independent state (both run on the calling thread, not in parallel).

Code Execution

  1. Instantiation:

    • Two instances of named_counter are created with names "a" and "b".
    • Both coroutines start and print their initialization messages.
  2. Interleaved Resumption:

    • counter_a is resumed first, prints its first resumed message, then suspends.
    • counter_b follows, displaying its first resumed count and suspends.
    • Resumption continues in an interleaved fashion showing the independent states of each coroutine.

Run 2 coroutines

  • Independent Operation: The coroutines run independently of each other, demonstrated by the interleaved calls to resume(), affecting only the targeted instance.
  • Infinite Operation: The use of an infinite loop with suspend_always allows these coroutines to run "forever" until explicitly destroyed, suitable for continuous tasks.
    • The use of suspend_always showcases a simple way to handle asynchronous waits, making it a practical example for tasks such as event handling or asynchronous I/O operations where tasks might wait for external triggers to resume execution.
// Coroutine that prints a counter with a given name
resumable_thing named_counter(std::string name) {
  cout << "counter(" << name << ") was called\n";  // Announce coroutine call with name
  for (unsigned i = 1;; ++i) {  // Infinite loop
    co_await suspend_always{};  // Suspend and wait to be resumed
    cout << "counter(" << name << ") resumed #" << i << '\n';  // Print resume count with name
  }
}

int main() {
  resumable_thing counter_a = named_counter("a");  // Create coroutine for "a"
  resumable_thing counter_b = named_counter("b");  // Create coroutine for "b"
  counter_a.resume();  // Resume "a", prints "counter(a) resumed #1"
  counter_b.resume();  // Resume "b", prints "counter(b) resumed #1"
  counter_b.resume();  // Resume "b", prints "counter(b) resumed #2"
  counter_a.resume();  // Resume "a", prints "counter(a) resumed #2"
}

// Interleaved Execution Output
/*
counter(a) was called
counter(b) was called
counter(a) resumed #1
counter(b) resumed #1
counter(b) resumed #2
counter(a) resumed #2
*/

Some discussion

  • Multithreading: Coroutines can be resumed on any thread, showcasing their flexibility in asynchronous programming environments.
  • Automatic Capture: Unlike lambdas, where captured variables are specified explicitly, a coroutine automatically keeps its parameters and local variables in the coroutine frame.
    • Storage Location: Local variables are constructed directly within the coroutine's frame, not on the stack first. This is efficient as it avoids copying or moving them unnecessarily.
    • Handling of Non-Movable Types: Even non-copyable or non-movable types are supported as they are constructed in-place within the coroutine frame.
  • Coroutine as Class Member Functions
    • this Pointer Capture: When a coroutine is a member function of a class, it captures the this pointer implicitly.
    • Lifetime Management: It is crucial to manage the lifetime of the class instance when its member function is a coroutine to prevent dangling pointer issues.
  • Lambdas can indeed be used as coroutines, allowing for even more compact and inline coroutine definitions.
  • Suspension Points and Stack Unwinding
    • co_await Implications: Every co_await is a point at which the coroutine may be destroyed instead of resumed.
    • Error Handling: Because the coroutine's locals may be unwound at any co_await, error handling and resource management need to be considered carefully (for example, by holding resources in RAII objects).

How to return from coroutine

Overview of Coroutines and co_return in C++

  • Purpose of co_return:
    • Utilized in coroutines to return values.
    • Replaces traditional return by managing asynchronous operations and states efficiently.
std::future<int> compute_value() {
  int result = co_await std::async([] { return 30; });
  co_return result; // why do we return an int, while std::future<int> can't
                    // be initialized from int directly?
}
  • There are two interesting things about this example:
    • First, why is it co_return instead of return?
    • Second (and this partly explains the first), we are returning an int while the return type is std::future<int>, and a std::future<int> cannot be constructed directly from an int.

We're going to see how this works with another example:

resumable_thing get_value() {
  cout << "get_value: called\n";
  co_await suspend_always{};
  cout << "get_value: resumed\n";
  co_return 30;
}

int main() {
  cout << "main: calling get_value\n";
  resumable_thing value = get_value();
  cout << "main: resuming get_value\n";
  value.resume();
  cout << "main: value was " << value.get() << '\n';
}
  • Here, the compiler is going to generate that same boilerplate at the beginning and end that it did before.
  • What we really want to look at is, what does this co_return do?
// co_return 30 is actually transformed to:
//
// __context->_promise.return_value(30);
// goto final_suspend_label;

// So the full get_value from compiler should look like:
resumable_thing get_value() {
  __get_value_context* __context = new __get_value_context{};  // Create new coroutine context
  auto __return = __context->_promise.get_return_object(); // Get the return object from the promise
  co_await __context->_promise.initial_suspend();         // Potentially suspend before execution starts

  cout << "get_value: called\n";
  co_await suspend_always{};
  cout << "get_value: resumed\n";

  __context->_promise.return_value(30);
  goto final_suspend_label;

final_suspend_label:
  co_await __context->_promise.final_suspend();           // Potentially suspend before destruction
  delete __context;                                       // Clean up coroutine context
}
  • Because the promise implements return_value, the compiler transforms co_return 30 into a call to return_value(30), which stores the value, and then jumps to final_suspend_label. There the coroutine asks final_suspend() whether it should suspend; if it does not suspend, the context is deleted.

  • The promise type already has a few required members. A few small changes to resumable_thing are needed to support returning a value.

  • A coroutine that returns a value to its caller must implement return_value instead of return_void.

  • The value has to be stored inside the promise, so return_value simply sets a member.

  • To let main call get() and read the value back, get() obtains the promise from the coroutine handle and returns its _value member.

struct resumable_thing {
  struct promise_type {
    int _value;  // critical for co_return

    resumable_thing get_return_object() {
      return resumable_thing(
          coroutine_handle<promise_type>::from_promise(*this));
    }

    auto initial_suspend() { return suspend_never{}; }
    auto final_suspend() { return suspend_always{}; } // critical for co_return
    // NOTE: with suspend_never in final_suspend, the generated
    // co_await ...final_suspend() does not suspend, so the context can be
    // deleted before the caller reads __context->_promise to get the
    // co_return result. Returning suspend_always keeps the frame alive.
    // When does `delete __context` run, then? It never does. Instead,
    // ~resumable_thing() contains
    // if (_coroutine) { _coroutine.destroy(); }, which cleans the frame up.

    void return_value(int value) { _value = value; }  // critical for co_return
  };

  int get() { return _coroutine.promise()._value; }  // critical for co_return
};
  • We have to look at the coroutine lifetime.
    • A coroutine comes into existence when it is called; this is when the compiler creates the coroutine context, as we've seen. It is destroyed either when execution continues past final_suspend, or when you call destroy on the coroutine handle, whichever happens first.
    • In our get_value coroutine, final_suspend_label awaits final_suspend, and once execution continues past it the context is deleted.
    • The final_suspend of our original coroutine type returns suspend_never, so it does not suspend and simply continues on. The context is deleted, main then reads freed memory, and we do not get the 30 that we stored in the promise.
    • One small change makes this work: final_suspend returns suspend_always. Now, when get_value reaches final_suspend_label and does co_await on final_suspend, it suspends, returns control to the caller, and does not delete the context. main then prints the correct result.
    • So when does the coroutine actually get destroyed? Have we leaked it? No: the destructor we wrote earlier checks whether the coroutine handle is valid and, if so, explicitly calls destroy on it.
    • How does coroutine destruction work? Destroying a suspended coroutine behaves roughly as if it had returned at its last suspension point: local variables still in scope are destroyed and the frame is freed.

Make previous example work

std::future<int> compute_value() { // Need to make future a coroutine type
  int result = co_await std::async([] { return 30; }); // need to make future awaitable
  co_return result;
}

Two things needed

  1. Making std::future a Coroutine-Compatible Type
  2. Making std::future Awaitable

1. Making std::future a Coroutine-Compatible Type

To integrate std::future with coroutines, we define a specialization of coroutine_traits for std::future<T>.

  • This allows us to specify how the coroutine machinery interacts with futures, particularly how values and exceptions are handled.
// Make future coroutine type!!!
template <typename T, typename... Arguments>
struct coroutine_traits<future<T>, Arguments...> {
  struct promise_type {
    promise<T> _promise; // Promise object to interact with std::future

    future<T> get_return_object() { return _promise.get_future(); }

    auto initial_suspend() { return suspend_never{}; }

    auto final_suspend() { return suspend_never{}; }

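    template <typename U>
    void return_value(U&& value) {
      _promise.set_value(std::forward<U>(value));
    }

    // Pre-standard hook used in the talk; standard C++20 calls
    // unhandled_exception() on the promise instead.
    void set_exception(std::exception_ptr ex) {
      _promise.set_exception(std::move(ex));
    }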
  };
};

Key Points:

  • Promise Type: Encapsulates a std::promise which pairs with the std::future that the coroutine returns.
  • Initial and Final Suspend: Both return suspend_never, so the coroutine starts running as soon as it is called and its frame is destroyed as soon as it finishes. That is safe here because the result lives in the shared state of the std::promise/std::future pair, not in the coroutine frame.
  • Handling Returns and Exceptions: Provides mechanisms to set values or exceptions on the promise, which are then accessible through the associated future.

2. Making std::future Awaitable

// Make future awaitable!!!
template <typename T>
struct future_awaiter {
  // The awaiter just stores a reference to the future<T>; operator co_await
  // constructs a future_awaiter from that future.
  future<T>& _f;

  // Note: std::future doesn't actually have `is_ready` (check the other talk)
  bool await_ready() { return _f.is_ready(); }

  void await_suspend(coroutine_handle<> ch) {
    // Note: the continuation may run on a different thread or on the same one.
    // Note: std::future doesn't actually have `.then` (check the other talk)
    _f.then([ch]() { ch.resume(); });
  }

  auto await_resume() { return _f.get(); }
};

// specialized to use the future_awaiter!!!
template <typename T>
future_awaiter<T> operator co_await(future<T>& value) {
  return future_awaiter<T>{value};
}

Yields

generator<int> integers(int first, int last) {
  for (int i = first; i <= last; ++i) {
    co_yield i; // Suspends the coroutine and returns i
  }
}
  • Iterates from first to last.

  • co_yield outputs the current value of i and pauses the function's execution until it is resumed.

  • Compiler Transformation for Coroutine:

// compiler actually generates:

generator<int> integers(int first, int last) {
  for (int i = first; i <= last; ++i) {
    co_await _promise.yield_value(i);
  }
}
  • The compiler rewrites co_yield i as co_await _promise.yield_value(i), where _promise is the coroutine's promise object.

  • yield_value returns an awaitable (suspend_always in this generator), so the co_await suspends the coroutine until the consumer resumes it to ask for the next value.

  • Components of corresponding promise_type:

    struct promise_type {
      int const* _current;
    
      int_generator get_return_object() {
        return int_generator(coroutine_handle<promise_type>::from_promise(*this));
      }
    
      auto initial_suspend() { return suspend_always{}; }
      auto final_suspend() { return suspend_always{}; }
    
      auto yield_value(int const& value) {
        _current = &value;
        return suspend_always{};
      }
    };
    • _current: Pointer to the most recently yielded integer; the iterator reads it through _coroutine.promise()._current.
    • get_return_object(): Returns the generator object linked with the coroutine.
    • initial_suspend() and final_suspend(): Both return suspend_always, so the coroutine does not run until the consumer first resumes it, and its frame stays alive after the body finishes.
    • yield_value(const int& value): Updates _current to point at value and suspends the coroutine.
  • Iterator for int_generator:

    • Allows iteration over the values produced by the int_generator.
    • Comprises begin() and end() methods to manage the iteration process.
  • Implementation of Iterator Functions:

    struct iterator;
    
    iterator begin() {
      if (_coroutine) {
        _coroutine.resume(); // Resumes coroutine to fetch the first integer
        if (_coroutine.done()) {
          return end(); // If coroutine is finished, return end iterator
        }
      }
      return iterator(_coroutine);
    }
    
    iterator end() { return iterator{}; }
    • begin(): Resumes the coroutine to retrieve the first integer; returns end() if the coroutine has no more integers.
    • end(): Represents the end of the sequence for the iteration.
struct int_generator {
  struct iterator : std::iterator<input_iterator_tag, int> {
    coroutine_handle<promise_type> _coroutine;

    iterator& operator++() {
      _coroutine.resume();
      if (_coroutine.done()) {
        _coroutine = nullptr;
      }
      return *this;
    }

    int const& operator*() const { return *_coroutine.promise()._current; }
  };
};
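  • Inherits from std::iterator to mark it as an input iterator that yields integers (std::iterator is deprecated since C++17; newer code declares the iterator member types directly).

  • Holds a coroutine_handle<promise_type> that refers to the associated coroutine and gives access to its promise.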

  • Key Functionalities of the Iterator:

    • Increment Operator (operator++):

      • Resumes the coroutine to advance to the next integer.
      • Checks if the coroutine has finished (_coroutine.done()). If so, invalidates the handle by setting _coroutine to nullptr.
      • Ensures the iterator only progresses forward and cannot go backwards or repeat values.
    • Dereference Operator (operator*):

      • Accesses the current integer produced by the coroutine.
      • Dereferences the pointer to the current integer stored in the coroutine's promise type via _coroutine.promise()._current.
  • Behavior and Lifecycle Management:

    • Initialization and Advancement:

      • begin() resumes the coroutine once to produce the first value, and each advancement (via operator++) resumes it again.
      • Continues until the coroutine has no more values to yield (signaled by _coroutine.done()), at which point the iterator becomes equivalent to the end() iterator.
    • End of Sequence:

      • Represented by an iterator where _coroutine is nullptr, making it distinct from active iterators.

Summary

Design principles

  • Scalability:

    • Objective: Design coroutines to be scalable to billions of concurrent executions.
    • Implementation: Use of stackless coroutines, which are more memory efficient than stackful coroutines.
    • Memory Efficiency: Each coroutine only allocates memory for its frame, not for an entire stack, significantly reducing the total memory footprint.
  • Types of Coroutines:

    • Stackless Coroutines:
      • Can only suspend from within the coroutine itself; suspending from a callee function is not possible.
      • Ideal for high-volume, lightweight coroutine creation and management.
      • Example Use: Managing large numbers of concurrent tasks without exhausting memory resources.
    • Stackful Coroutines:
      • Allow suspension from within callee functions, offering greater flexibility at the cost of higher memory use per coroutine.
  • Advantages of Stackless Coroutines:

    • Reduced Memory Demand: Minimal allocation per coroutine makes it feasible to create millions of coroutines without running out of memory.
    • Efficiency:
      • Suspend and resume operations have costs comparable to function calls, making them efficient in terms of computational overhead.
      • Compiler optimizations are possible such as inlining coroutines and avoiding heap allocations for coroutine frames, thanks to increased visibility into coroutine behaviors.
    • Integration and Flexibility:
      • Easily integrates with existing C++ facilities like std::future.
      • Can be adapted to work with various library types, enhancing usability across different programming scenarios.
    • Exception Safety:
      • Usable in environments where exceptions are disabled (e.g., kernel development), providing a robust alternative for error handling.

QA

1. Granularity of await in Asynchronous Code:

  • Question: How finely should you place await in your code to manage blocking operations effectively?
  • Explanation:
    • In asynchronous code, blocking operations should be minimized by using await on potentially blocking calls.
    • Granularity Consideration: Depending on the expected delay (e.g., nanoseconds vs. seconds), the cost of setting up a continuation with await might not always justify itself for extremely short durations.

2. Behavior of return in Coroutines:

  • Question: What happens if you issue an ordinary return statement from within a coroutine?
  • Answer: A plain return statement inside a coroutine is ill-formed and results in a compiler error; a coroutine has to use co_return so that the result is routed through the promise (return_value or return_void).
  • Reason: A standard return would bypass the promise and the final suspend point, so it could not manage the coroutine's state and lifecycle.

3. Overloading Operator new for Coroutines:

  • Question: How would one overload operator new for coroutine usage?
  • Answer: The process isn't covered in the current session but is detailed in the specification. It involves using custom allocators to manage memory for coroutine frames efficiently, often in specialized scenarios like using coroutines in low-memory environments or systems programming.

4. Return-Type Deduction in Coroutines:

  • Question: How does return-type deduction work with coroutines?
  • Answer: Return-type deduction does not apply to coroutines: the compiler needs the declared return type to look up the promise type (through coroutine_traits), so a coroutine cannot declare a deduced return type such as auto.

5. Use of Placement new with Coroutines:

  • Question: Is there any support for a kind of placement new?
  • Answer: Generally, no. The coroutine's frame construction is managed by the compiler, which uses standard memory allocation strategies unless explicitly overridden (e.g., by specializing coroutine traits with custom allocators).

6. Optimization of Coroutine Allocations:

  • Question: How common is the optimization where the compiler elides the construction of the coroutine frame?
  • Explanation:
    • Optimizations depend on the coroutine's usage context. For instance:
      • If a coroutine is used within a limited scope and does not escape (e.g., confined within a single function), the compiler might allocate it on the stack.
      • In contrast, asynchronous operations with uncertain lifetimes typically require heap allocation.
  • Potential for Optimization: If the coroutine implements RAII and the object does not escape the calling scope, then the compiler might optimize the frame allocation.

7. Coroutine Chaining and Interaction:

  • Question: Can a coroutine call another coroutine and how does interaction occur?
  • Answer: Yes, a coroutine can call another coroutine. Each coroutine can suspend itself, but cannot suspend its caller directly (i.e., no nested suspensions across coroutine boundaries).

8. Tooling and Error Prevention:

  • Question: Upon implementation, will there be tools to help developers avoid common errors, like resuming a coroutine twice?
  • Answer: The expectation is that most developers will use library-provided coroutine adapters rather than writing raw coroutine management code. Libraries are expected to encapsulate safe practices and robust error handling to prevent common pitfalls such as double-resuming a coroutine.