Question for people more familiar with async/await semantics: how do you typically control which thread the async procedure runs on? Having spent a lot of time now in RxJava and really getting into the power of stream processing and composition, async/await feels almost too simplistic.
async/await produces a Future. In order for that Future to execute, you place it on an executor. It won't start executing until you do so. Which thread it runs on, and how, is all the executor's job. So the answer is basically "pick the executor which has the semantics you want".
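That lazy, executor-decides model can be sketched with Java's FutureTask, which likewise does nothing until something runs it (this is an illustrative Java analogue, not the Rust machinery itself):

```java
import java.util.concurrent.*;

public class ExecutorChoice {
    public static void main(String[] args) throws Exception {
        // Constructing the task does not run it; it only runs once handed
        // to an executor, and the executor decides which thread it gets.
        FutureTask<String> task = new FutureTask<>(
                () -> "ran on " + Thread.currentThread().getName());

        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(task);              // the executor picks the thread
        System.out.println(task.get());  // e.g. "ran on pool-1-thread-1"
        pool.shutdown();
    }
}
```

Swapping `Executors.newFixedThreadPool(2)` for a single-threaded, work-stealing, or virtual-thread executor changes where and how the task runs without touching the task itself.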
I never liked this behavior in Java. Or rather, I wasn't sure whether it was right or wrong until I found code where people returned a future without starting it, causing the caller to wait forever in some situations.
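A minimal reconstruction of that bug, using FutureTask (the method name `compute` is made up for illustration):

```java
import java.util.concurrent.*;

public class LazyBug {
    // BUG: returns a future that nobody ever submits to an executor.
    static Future<Integer> compute() {
        return new FutureTask<>(() -> 42);
    }

    public static void main(String[] args) throws Exception {
        Future<Integer> f = compute();
        // A plain f.get() would block forever; a timeout exposes the bug.
        try {
            f.get(100, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.out.println("future never started");
        }
    }
}
```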
Also if you use multiple modules with their own executors, they don't compose well. You can easily get oversubscription of CPUs.
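The oversubscription failure mode is easy to see: each module independently sizes its pool to "all the cores", so together they can have far more runnable CPU-bound threads than CPUs (module names here are hypothetical):

```java
import java.util.concurrent.*;

public class Oversubscription {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Module A and module B each reasonably size their own pool to
        // the machine's core count -- but combined they can keep 2x cores
        // of CPU-bound threads runnable, thrashing the scheduler.
        ExecutorService moduleA = Executors.newFixedThreadPool(cores);
        ExecutorService moduleB = Executors.newFixedThreadPool(cores);
        System.out.println("possible runnable threads: " + 2 * cores);
        moduleA.shutdown();
        moduleB.shutdown();
    }
}
```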
JavaScript's promises are self-starting, which I would call an improvement, but the total lack of back pressure is anywhere from galling to highly problematic depending on the problem domain.
A Goldilocks solution might be to have default executor behavior with an obvious and straightforward way to override it.
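Java's CompletableFuture is arguably already close to that shape: the `*Async` methods default to `ForkJoinPool.commonPool()`, and every one has an overload taking an explicit `Executor`. A small sketch:

```java
import java.util.concurrent.*;

public class DefaultWithOverride {
    public static void main(String[] args) throws Exception {
        // Default: runs on ForkJoinPool.commonPool().
        CompletableFuture<String> a = CompletableFuture.supplyAsync(
                () -> Thread.currentThread().getName());

        // Override: same call, explicit executor.
        ExecutorService mine = Executors.newSingleThreadExecutor(
                r -> new Thread(r, "my-pool"));
        CompletableFuture<String> b = CompletableFuture.supplyAsync(
                () -> Thread.currentThread().getName(), mine);

        System.out.println(a.get()); // e.g. ForkJoinPool.commonPool-worker-1
        System.out.println(b.get()); // my-pool
        mine.shutdown();
    }
}
```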
In .NET it defaults to scheduling continuations on the "main" thread as much as possible, with the easiest escape valve being Task<T>.ConfigureAwait(false), which tells the runtime it can finish continuations wherever they land in the ThreadPool. This is why you can often find a lot of (partly mistaken) advice in .NET to always use ConfigureAwait(false) for "performance", which as often as not leads to people rediscovering the hard way why the easy default is continuations on the Main/UI thread. There actually is a more complex TaskScheduler mechanism underlying that big ugly valve, but TaskScheduler is not quite as featureful or easy to control as the Rx-equivalent Scheduler.
(I heard some rumblings that the .NET team was considering in the long run to better unify async/await TaskScheduler scheduling with the Scheduler model of RX.NET, but I don't know if anything has progressed yet along those lines.)
Python mostly requires manual management of event loops for async/await coroutine scheduling. Most Python code, even (or especially) code using async/await, is still generally single-threaded (due to artifacts like the GIL [Global Interpreter Lock], which is getting better with each passing version of Python), so more complicated scheduling is generally unnecessary and otherwise left as an exercise for the user.
ECMAScript 2017 (JavaScript) async/await is also mostly running in single threaded event loops (though not manually managed like Python; generally browser UI event loop managed).
In C#, your tasks are submitted to the ambient task scheduler. In ASP.NET, this is (I believe) the main thread pool that handles requests. The only time I've ever had to care where something was running was when adding response headers, because that's exposed through a thread-local API and can't happen on another thread, but it has generally ended up as a "it hurts when I do this" kind of situation.
In C#, this is defined by the current thread's synchronization context.
If you're writing a console application, the main thread doesn't have one. When an async function wants to resume, it will usually resume on some thread-pool thread, potentially a different one each time. It can be any other thread too; the runtime just resumes the function on whichever thread completed the await.
But if you're writing a GUI app and launching an async function from its GUI thread, that thread does have a current synchronization context. When the function wants to resume or fails, the runtime will resume it (or raise the exception) on the same thread where it started. More precisely, the runtime delegates the decision to the synchronization context, and the contexts set by GUI frameworks choose to resume on the GUI thread.
This may cause some funny deadlock bugs, but in most cases works surprisingly well in practice.
Also it's easy to implement custom synchronization contexts if needed.
Even there, you don't care which specific thread you run on, only that you're NOT on one particular thread. You still pass the work to an executor that has its own thread pool and shouldn't be terribly concerned with the particular thread.
Java parallel streams use a global thread pool (ForkJoinPool.commonPool) by default. The problem with this is that if you mix IO-bound code and CPU-bound code in the same thread pool, the IO-bound code will block the CPU-bound code. If you don't explicitly choose the thread pool your code runs on, you will suffer nearly impossible-to-debug performance problems.
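The widely cited workaround is to start the parallel stream from inside a task submitted to your own ForkJoinPool, which keeps it off the shared common pool. Note this relies on observed (not officially specified) behavior of the stream implementation:

```java
import java.util.concurrent.*;
import java.util.stream.*;

public class IsolatedParallelStream {
    public static void main(String[] args) throws Exception {
        // A parallel stream started inside a ForkJoinPool task runs its
        // work in THAT pool rather than ForkJoinPool.commonPool(), so
        // CPU-bound stream work can't be starved by IO elsewhere.
        ForkJoinPool cpuPool = new ForkJoinPool(4);
        int sum = cpuPool.submit(
                () -> IntStream.rangeClosed(1, 1000).parallel().sum()
        ).get();
        System.out.println(sum); // 500500
        cpuPool.shutdown();
    }
}
```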