Introduction
In modern C++, asynchronous programming is an essential technique that can significantly improve program performance. std::async, the standard library's function template for launching asynchronous operations, gives developers a concise way to run asynchronous tasks. This article looks at what std::async does, how to use it, and how its behavior differs across compiler implementations.
Basic Usage of std::async
std::async is defined in the <future> header. It runs a function asynchronously and returns a std::future object that will hold the result of the call.
The declaration of std::async:
```cpp
template <class Fn, class... ArgTypes>
future<typename result_of<Fn(ArgTypes...)>::type>
    async(Fn&& fn, ArgTypes&&... args);

template <class Fn, class... ArgTypes>
future<typename result_of<Fn(ArgTypes...)>::type>
    async(launch policy, Fn&& fn, ArgTypes&&... args);
```
The second overload takes a launch policy. std::launch is a scoped enumeration used as a bitmask, with the following values:
launch::deferred: The call is deferred until wait() or get() is called on the returned future, and it then runs on the thread that made that call.
launch::async: The function runs as if on a new thread of execution. (Whether that thread is newly created or taken from a thread pool depends on the implementation.)
launch::deferred | launch::async: The default policy for std::async. The implementation decides whether to run the function asynchronously (on another thread) or to defer it (no new thread).
Basic usage example:
```cpp
#include <future>
#include <print>  // std::println requires C++23

int foo(int a) {
    return a;
}

int main() {
    // Default policy
    std::future<int> f = std::async(&foo, 10);
    // Launch with new thread
    std::future<int> f1 = std::async(std::launch::async, []() { return 0; });
    // Deferred call
    std::future<int> f2 = std::async(std::launch::deferred, []() { return 0; });
    std::println("result is: {}", f.get());
    std::println("result is: {}", f1.get());
    std::println("result is: {}", f2.get());
    return 0;
}
```
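To make the difference between the two explicit policies concrete, here is a minimal sketch that compares thread IDs. The helper name runs_on_calling_thread is hypothetical and exists only for this illustration.

```cpp
#include <future>
#include <thread>

// Returns true if a task launched with `policy` ran on the same thread
// that later called get() on its future.
bool runs_on_calling_thread(std::launch policy) {
    std::future<std::thread::id> task =
        std::async(policy, [] { return std::this_thread::get_id(); });
    const std::thread::id worker = task.get();
    return worker == std::this_thread::get_id();
}
```

With std::launch::deferred the helper should return true, because the task only runs inside get(). With std::launch::async it should return false, because the task runs on a different thread.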
I won't elaborate further on the basic usage. Let's focus on the finer details of std::async.
In-depth Analysis of std::async
Policies
The C++ standard specifies that the overload without a policy argument behaves as if it were called with std::launch::async | std::launch::deferred, but it leaves the choice between the two to the implementation. GCC, LLVM, and MSVC all forward the default overload to the policy overload with this combined value. So which policy actually runs on each platform?
GCC
In GCC, the default overload passes launch::async|launch::deferred:
```cpp
/// async, potential overload
template<typename _Fn, typename... _Args>
  _GLIBCXX_NODISCARD inline future<__async_result_of<_Fn, _Args...>>
  async(_Fn&& __fn, _Args&&... __args)
  {
    return std::async(launch::async|launch::deferred,
                      std::forward<_Fn>(__fn),
                      std::forward<_Args>(__args)...);
  }
```
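In practice, GCC tries launch::async first. It falls back to a deferred state only if thread creation fails with resource_unavailable_try_again and the policy also allows launch::deferred: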
```cpp
/// async
template<typename _Fn, typename... _Args>
  _GLIBCXX_NODISCARD future<__async_result_of<_Fn, _Args...>>
  async(launch __policy, _Fn&& __fn, _Args&&... __args)
  {
    std::shared_ptr<__future_base::_State_base> __state;
    if ((__policy & launch::async) == launch::async)
      {
        __try
          {
            __state = __future_base::_S_make_async_state(
              std::thread::__make_invoker(std::forward<_Fn>(__fn),
                                          std::forward<_Args>(__args)...)
            );
          }
#if __cpp_exceptions
        catch(const system_error& __e)
          {
            if (__e.code() != errc::resource_unavailable_try_again
                || (__policy & launch::deferred) != launch::deferred)
              throw;
          }
#endif
      }
    if (!__state)
      {
        __state = __future_base::_S_make_deferred_state(
          std::thread::__make_invoker(std::forward<_Fn>(__fn),
                                      std::forward<_Args>(__args)...));
      }
    return future<__async_result_of<_Fn, _Args...>>(__state);
  }
```
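The same "try a thread, fall back to deferred" idea can be written in user code with only the public API. The sketch below is an illustration, not GCC's implementation: launch_with_fallback is a hypothetical helper, and it assumes C++17 for std::invoke_result_t.

```cpp
#include <future>
#include <system_error>
#include <type_traits>
#include <utility>

// Hypothetical helper: prefer a real thread, and fall back to deferred
// execution if the system cannot create one right now.
template <class F>
std::future<std::invoke_result_t<std::decay_t<F>>> launch_with_fallback(F f) {
    try {
        // Passing f as an lvalue copies it, so f stays usable in the fallback.
        return std::async(std::launch::async, f);
    } catch (const std::system_error& e) {
        if (e.code() != std::errc::resource_unavailable_try_again)
            throw;  // unrelated failure: propagate it
    }
    // Thread creation failed: run lazily on whichever thread calls get() or wait().
    return std::async(std::launch::deferred, std::move(f));
}
```

A call such as launch_with_fallback([] { return 7; }).get() yields the task's result regardless of which branch was taken.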
LLVM
LLVM's libc++ uses a non-standard enumerator, launch::any, as the default policy:
```cpp
template <class _Fp, class... _Args>
_LIBCPP_NODISCARD_AFTER_CXX17 inline _LIBCPP_INLINE_VISIBILITY
future<typename __invoke_of<typename decay<_Fp>::type, typename decay<_Args>::type...>::type>
async(_Fp&& __f, _Args&&... __args)
{
    return _VSTD::async(launch::any, _VSTD::forward<_Fp>(__f),
                        _VSTD::forward<_Args>(__args)...);
}
```
In essence, it is a combination of launch::async and launch::deferred:
```cpp
enum class launch
{
    async = 1,
    deferred = 2,
    any = async | deferred
};
```
LLVM likewise tries launch::async first. It falls back to launch::deferred only if launching the task throws and the policy is not exactly launch::async:
```cpp
template <class _Fp, class... _Args>
_LIBCPP_NODISCARD_AFTER_CXX17
future<typename __invoke_of<typename decay<_Fp>::type, typename decay<_Args>::type...>::type>
async(launch __policy, _Fp&& __f, _Args&&... __args)
{
    typedef __async_func<typename decay<_Fp>::type, typename decay<_Args>::type...> _BF;
    typedef typename _BF::_Rp _Rp;

#ifndef _LIBCPP_NO_EXCEPTIONS
    try
    {
#endif
        if (__does_policy_contain(__policy, launch::async))
            return _VSTD::__make_async_assoc_state<_Rp>(_BF(__decay_copy(_VSTD::forward<_Fp>(__f)),
                                                            __decay_copy(_VSTD::forward<_Args>(__args))...));
#ifndef _LIBCPP_NO_EXCEPTIONS
    }
    catch ( ... ) { if (__policy == launch::async) throw ; }
#endif

    if (__does_policy_contain(__policy, launch::deferred))
        return _VSTD::__make_deferred_assoc_state<_Rp>(_BF(__decay_copy(_VSTD::forward<_Fp>(__f)),
                                                           __decay_copy(_VSTD::forward<_Args>(__args))...));
    return future<_Rp>{};
}
```
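The policy test used above is an ordinary bitmask check. A minimal sketch follows, with policy_contains as a hypothetical stand-in for libc++'s internal __does_policy_contain; it relies only on the standard operator& for std::launch.

```cpp
#include <future>

// Hypothetical stand-in for libc++'s internal __does_policy_contain:
// true if every bit of `flag` is set in `policy`.
inline bool policy_contains(std::launch policy, std::launch flag) {
    return (policy & flag) == flag;
}
```

For the combined default policy, policy_contains(std::launch::async | std::launch::deferred, std::launch::async) is true, which is why the async branch is tried first.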
MSVC
For MSVC, the default option is also launch::async | launch::deferred:
```cpp
_EXPORT_STD template <class _Fty, class... _ArgTypes>
_NODISCARD_ASYNC future<_Invoke_result_t<decay_t<_Fty>, decay_t<_ArgTypes>...>> async(
    _Fty&& _Fnarg, _ArgTypes&&... _Args) {
    // manages a callable object launched with default policy
    return _STD async(launch::async | launch::deferred, _STD forward<_Fty>(_Fnarg), _STD forward<_ArgTypes>(_Args)...);
}
```
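Because the switch treats every value other than exactly launch::deferred the same way, the combined default policy falls through to the launch::async branch: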
```cpp
template <class _Ret, class _Fty>
_Associated_state<typename _P_arg_type<_Ret>::type>* _Get_associated_state(launch _Psync, _Fty&& _Fnarg) {
    // construct associated asynchronous state object for the launch type
    switch (_Psync) { // select launch type
    case launch::deferred:
        return new _Deferred_async_state<_Ret>(_STD forward<_Fty>(_Fnarg));
    case launch::async: // TRANSITION, fixed in vMajorNext, should create a new thread here
    default:
        return new _Task_async_state<_Ret>(_STD forward<_Fty>(_Fnarg));
    }
}
```
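To check what the default policy does on a particular toolchain, one common approach is a zero-length wait_for, which reports std::future_status::deferred for a deferred task without running it. The names DefaultPolicyProbe and probe_default_policy below are hypothetical.

```cpp
#include <chrono>
#include <future>

struct DefaultPolicyProbe {
    bool deferred;  // true if the implementation chose launch::deferred
    int value;      // result of the task
};

DefaultPolicyProbe probe_default_policy() {
    std::future<int> task = std::async([] { return 42; });
    // A zero-length wait never runs a deferred task; it only reports its status.
    const bool deferred =
        task.wait_for(std::chrono::seconds(0)) == std::future_status::deferred;
    return {deferred, task.get()};
}
```

Based on the sources shown above, the deferred flag would normally be false on all three implementations, unless thread creation fails.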
In-depth Analysis of std::launch::async
We know that std::launch::async means the function runs as if on a new thread of execution. Implementations differ, however, in whether they actually create a new thread or reuse one from a thread pool.
GCC
GCC calls __future_base::_S_make_async_state, which creates an instance of _Async_state_impl. Its constructor starts a new std::thread:
```cpp
// Shared state created by std::async().
// Starts a new thread that runs a function and makes the shared state ready.
template<typename _BoundFn, typename _Res>
  class __future_base::_Async_state_impl final
  : public __future_base::_Async_state_commonV2
  {
  public:
    explicit
    _Async_state_impl(_BoundFn&& __fn)
    : _M_result(new _Result<_Res>()), _M_fn(std::move(__fn))
    {
      _M_thread = std::thread{ [this] {
          __try
            {
              _M_set_result(_S_task_setter(_M_result, _M_fn));
            }
          __catch (const __cxxabiv1::__forced_unwind&)
            {
              // make the shared state ready on thread cancellation
              if (static_cast<bool>(_M_result))
                this->_M_break_promise(std::move(_M_result));
              __throw_exception_again;
            }
        } };
    }
    // ...
```
LLVM
LLVM calls _VSTD::__make_async_assoc_state, which also starts a new std::thread:
```cpp
template <class _Rp, class _Fp>
future<_Rp>
#ifndef _LIBCPP_HAS_NO_RVALUE_REFERENCES
__make_async_assoc_state(_Fp&& __f)
#else
__make_async_assoc_state(_Fp __f)
#endif
{
    unique_ptr<__async_assoc_state<_Rp, _Fp>, __release_shared_count>
        __h(new __async_assoc_state<_Rp, _Fp>(_VSTD::forward<_Fp>(__f)));
    _VSTD::thread(&__async_assoc_state<_Rp, _Fp>::__execute, __h.get()).detach();
    return future<_Rp>(__h.get());
}
```
MSVC
Here is where it gets interesting. MSVC creates an instance of _Task_async_state, which creates a concurrency task and hands it the callable:
```cpp
// CLASS TEMPLATE _Task_async_state
template <class _Rx>
class _Task_async_state : public _Packaged_state<_Rx()> {
    // class for managing associated synchronous state for asynchronous execution from async
public:
    using _Mybase     = _Packaged_state<_Rx()>;
    using _State_type = typename _Mybase::_State_type;

    template <class _Fty2>
    _Task_async_state(_Fty2&& _Fnarg) : _Mybase(_STD forward<_Fty2>(_Fnarg)) {
        _Task = ::Concurrency::create_task([this]() { // do it now
            this->_Call_immediate();
        });

        this->_Running = true;
    }
    // ...
```
::Concurrency::create_task is part of Microsoft's Parallel Patterns Library. According to Microsoft's documentation, the task class takes its threads from the Windows thread pool instead of creating a new thread.
This matters because a thread-pool-based implementation cannot guarantee that thread_local variables are destroyed when a task completes: threads borrowed from a pool are not destroyed afterwards. As a result, the thread used by std::async is neither destroyed nor released once the task finishes. It is borrowed from the system thread pool and counts toward the process's thread count, so the more std::async is used, the more threads can accumulate.
The number of threads that std::async can run concurrently is limited by the Windows thread pool default, which is 500 threads.
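The thread_local concern can be observed with a small probe. This is a sketch under stated assumptions: Tracker, the two counters, and destroyed_after_get are hypothetical names, and the result depends on the implementation (and on timing where the worker thread is detached), so no particular output is claimed.

```cpp
#include <atomic>
#include <future>

std::atomic<int> constructed{0};
std::atomic<int> destroyed{0};

// Counts how many thread_local instances were created and destroyed.
struct Tracker {
    Tracker() { ++constructed; }
    ~Tracker() { ++destroyed; }
    void touch() {}
};

// Runs one task that touches a thread_local Tracker and returns how many
// Tracker destructors have run by the time get() has returned.
int destroyed_after_get() {
    std::future<void> task = std::async(std::launch::async, [] {
        thread_local Tracker tracker;  // constructed on first use in this thread
        tracker.touch();
    });
    task.get();
    return destroyed.load();
}
```

If the worker thread has already exited, the function returns 1. If the task ran on a pooled thread that stays alive, it returns 0, meaning the thread_local object is still holding its resources.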
In-depth Analysis of std::future Returned by std::async
According to cppreference:
If a std::future obtained from std::async is not moved from or bound to a reference, the destructor of the std::future will block at the end of the full expression until the asynchronous computation completes, essentially making the following code synchronous:
```cpp
std::async(std::launch::async, []{ f(); }); // destructor of the temporary waits for f()
std::async(std::launch::async, []{ g(); }); // does not start until f() completes
```
Note: The destructor of a std::future obtained by means other than a call to std::async does not block.
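The effect of the blocking destructor can be measured. In the sketch below, run_two_tasks_ms is a hypothetical helper that launches two 50 ms tasks and either discards each future immediately or keeps both alive.

```cpp
#include <chrono>
#include <future>
#include <thread>

using Clock = std::chrono::steady_clock;

// Launches two 50 ms tasks and returns the elapsed wall time in milliseconds.
// keep_futures == false: each future is a temporary, so its destructor blocks.
// keep_futures == true: both futures live until both tasks have been started.
long long run_two_tasks_ms(bool keep_futures) {
    const auto nap = [] { std::this_thread::sleep_for(std::chrono::milliseconds(50)); };
    const auto start = Clock::now();
    if (keep_futures) {
        std::future<void> a = std::async(std::launch::async, nap);
        std::future<void> b = std::async(std::launch::async, nap);
        a.get();
        b.get();
    } else {
        static_cast<void>(std::async(std::launch::async, nap));  // blocks here
        static_cast<void>(std::async(std::launch::async, nap));  // starts after the first ends
    }
    return std::chrono::duration_cast<std::chrono::milliseconds>(Clock::now() - start).count();
}
```

With discarded futures the two naps run back to back, so the total is at least about 100 ms. With kept futures the naps can overlap, so the total is typically close to 50 ms.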
That is, the destructor of a std::future returned by std::async behaves differently from that of a std::future obtained from a std::promise. When a std::future returned by std::async is destroyed while it is the last reference to the shared state, its destructor waits for the task to finish, in effect joining the worker thread with the thread that destroys the future.
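For contrast, a future obtained from a std::promise can be destroyed while its shared state is still not ready. The function name below is hypothetical and exists only for this sketch.

```cpp
#include <future>

// Destroys a promise-backed future whose value has not been set yet.
// Returns true once the destructor has returned, which it does without waiting.
bool promise_future_destructor_returns() {
    std::promise<int> promise;
    {
        std::future<int> pending = promise.get_future();
        // `pending` goes out of scope here while the shared state is not ready.
    }
    promise.set_value(1);  // the promise itself is still usable afterwards
    return true;
}
```

If this destructor blocked the way the one from std::async does, the function would never reach set_value.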
Using MSVC code as an example:
```cpp
~_Task_async_state() noexcept override {
    _Wait();
}

void _Wait() override { // wait for completion
    _Task.wait();
}

void WaitUntilStateChangedTo(_TaskCollectionState _State)
{
    ::std::unique_lock<::std::mutex> _Lock(_M_Cs);
    while(_M_State < _State)
    {
        _M_StateChanged.wait(_Lock);
    }
}
```
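When _Task_async_state is destroyed, it calls _Wait(), which ultimately reaches _M_StateChanged.wait(_Lock), the wait() function of a condition variable.
The implementation varies across platforms. LLVM detaches its worker thread, so it cannot join it; its shared state instead waits for the result to become ready before it is released. GCC joins the thread in the destructor of the shared state: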
```cpp
~_Async_state_impl()
{
    if (_M_thread.joinable())
        _M_thread.join();
}
```
During destruction, it joins the thread and so waits for it to finish executing.
Conclusion
std::async is a high-level threading abstraction in the C++ standard library that simplifies asynchronous operations and keeps code concise. However, because implementations differ between compilers, developers need to account for those differences to avoid surprises. Pay special attention to thread_local variables and to the destructor of the returned std::future.
References
Implementations of std::async and how they might Affect Applications | Dmitry Danilov
functions | Microsoft Learn
std::async - cppreference.com
"Asynchronous Programming with C++"