Pull requests. A way to gain insight into a new feature, discuss design decisions, and ensure codebase quality. With the right attitude and the right people, they are also a way to broaden your technical horizons and grow as a professional.
Recently, while reviewing a pull request on a project that used Entity Framework Core as its ORM, I noticed something I had not paid attention to, or even seen, up to that point. (It is a relatively new feature in EF.)
In one of the repository classes, there was an AddAsync() method call on a DbContext instance.
At first, it confused me a bit - why would something that is essentially a change to the in-memory data structures representing the state of the database be asynchronous? That definitely did not sound like an I/O-bound operation.
Then I immediately thought that it might be querying the database as part of tracking the change - perhaps for the last generated ID or something like that.
In order to see what was going on, I went to the GitHub project of Entity Framework Core, opened up the DbContext class, and searched for the AddAsync() method. It turned out I was right to a certain degree. The documentation explained why the method was designed as async:
This method is async only to allow special value generators, such as the one used by 'Microsoft.EntityFrameworkCore.Metadata.SqlServerValueGenerationStrategy.SequenceHiLo' to access the database asynchronously. For all other cases, the non-async method should be used.
Even though we were not using that particular key generation strategy, it perhaps made sense to use the AddAsync() method there, as opposed to Add(), for flexibility reasons. If we were to change the key generation strategy in the future to be SequenceHiLo, there would be no need for a lot of code changes.
Good software design accommodates change relatively easily, and we should generally aim for it.
I started discussing this with one of my colleagues. The key generation strategy was, and is, almost never SequenceHiLo. That meant we would be calling an asynchronous method that would basically always execute synchronously.
When we write an async method, the compiler turns it into a state machine. The runtime needs memory for the state machine and the task method builder (in release builds the state machine only moves to the heap if the method actually suspends), and on top of that a Task, or a Task of a generic type T, has to be allocated for the return value. Apart from a few common results that the runtime caches, that Task cannot be reused across invocations - each call pays the allocation overhead.
With that in mind, we asked ourselves whether we would be introducing unnecessary Task allocations and state machine overhead, because the method, as I have said, would basically always execute synchronously.
A naive answer is - yes.
A better one is - it depends. In most cases, we can probably get away with it. But if our code is performance-critical, and we can prove that performance suffers, then we should probably think about optimizing it.
There is a thing that can help us in these types of scenarios. Take a look at the signature of the AddAsync() method:
public virtual async ValueTask<EntityEntry<TEntity>> AddAsync<TEntity>(...)
The return type is not Task; it is ValueTask.
ValueTask is a value-type counterpart of Task - it is a struct. In functional programming languages, there is a type commonly called Either. ValueTask is similar in the sense that it represents either a result of type T or a Task of type T.
The reason the return type of AddAsync() was changed from Task to ValueTask is given in one of the EF Core release notes:
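The "either a result or a Task" idea can be sketched with the two public ValueTask<T> constructors. This is only an illustration: the class and method names (ValueTaskShapes, FromResult, FromTask, DelayedAsync) are made up for the example, and Task.Delay stands in for real I/O.

```csharp
using System;
using System.Threading.Tasks;

public static class ValueTaskShapes
{
    // Wraps an already-known result; no Task object is needed for this path.
    public static ValueTask<int> FromResult() => new ValueTask<int>(42);

    // Wraps a real Task<int> that completes later.
    public static ValueTask<int> FromTask() => new ValueTask<int>(DelayedAsync());

    // Hypothetical asynchronous operation; the delay simulates I/O.
    private static async Task<int> DelayedAsync()
    {
        await Task.Delay(50);
        return 42;
    }

    public static async Task Main()
    {
        ValueTask<int> sync = FromResult();
        Console.WriteLine(sync.IsCompletedSuccessfully); // True: the result is already there
        Console.WriteLine(await sync);                   // 42

        ValueTask<int> pending = FromTask();
        Console.WriteLine(await pending);                // 42, after the delay
    }
}
```

In both cases the caller simply awaits the ValueTask<int>; which of the two shapes it holds is an implementation detail of the method that returned it.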
This change reduces the number of heap allocations incurred when invoking these methods, improving general performance.
The documentation gives a slightly better explanation of when to use ValueTask than the release notes above:
A method may return an instance of this value type when it's likely that the result of its operation will be available synchronously, and when it's expected to be invoked so frequently that the cost of allocating a new Task for each call will be prohibitive.
The words that echoed through my mind were "likely" and "expected".
I associated these words with probability/possibility, and it helped me understand the context of ValueTask usage.
What am I talking about? Let me try to explain.
There is one common scenario where ValueTask might come in handy: wrapping our repository with an in-memory cache proxy/decorator when the number of cache hits is relatively high - meaning the data lives in the cache for a relatively long time and the method is invoked relatively frequently.
Consider the following improvised code.
public class CachedEntityRepository
{
    private readonly InMemoryCache _inMemoryCache;
    private readonly EntityRepository _entityRepository;

    // ctor

    public async ValueTask<Entity> FindAsync(Guid entityId)
    {
        // Cache hit: the method completes synchronously.
        if (_inMemoryCache[entityId] is not null)
            return _inMemoryCache[entityId];

        // Cache miss: go to the underlying repository, then cache the result.
        var entity = await _entityRepository.FindAsync(entityId);
        _inMemoryCache[entityId] = entity;
        return entity;
    }
}
For one asynchronous entity retrieval, we would be getting a large number of synchronous ones, considering the things we mentioned before.
So the probability of the code taking the synchronous path (a cache hit) would generally be higher than the probability of taking the asynchronous path. Therefore, we would expect synchronous completion to be the more common case.
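To make the cache-hit path concrete, here is a minimal runnable sketch of the same decorator idea. It is an assumption-laden variant, not the code from the project: a plain Dictionary plays the cache (not thread-safe), FakeEntityRepository is a hypothetical stand-in that simulates database I/O with Task.Delay, and the method is split into a synchronous fast path and an asynchronous slow path instead of being marked async.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class Entity
{
    public Guid Id { get; }
    public Entity(Guid id) { Id = id; }
}

// Hypothetical stand-in for a real repository; Task.Delay simulates database I/O.
public sealed class FakeEntityRepository
{
    public int Calls { get; private set; }

    public async Task<Entity> FindAsync(Guid id)
    {
        Calls++;
        await Task.Delay(20);
        return new Entity(id);
    }
}

public sealed class DictionaryCachedRepository
{
    private readonly Dictionary<Guid, Entity> _cache = new Dictionary<Guid, Entity>();
    private readonly FakeEntityRepository _inner;

    public DictionaryCachedRepository(FakeEntityRepository inner) { _inner = inner; }

    public ValueTask<Entity> FindAsync(Guid id)
    {
        // Cache hit: wrap the value directly, with no await and no Task allocation.
        if (_cache.TryGetValue(id, out var cached))
            return new ValueTask<Entity>(cached);

        // Cache miss: fall back to the asynchronous path.
        return new ValueTask<Entity>(LoadAndCacheAsync(id));
    }

    private async Task<Entity> LoadAndCacheAsync(Guid id)
    {
        var entity = await _inner.FindAsync(id);
        _cache[id] = entity;
        return entity;
    }
}

public static class CacheDemo
{
    public static async Task Main()
    {
        var inner = new FakeEntityRepository();
        var repository = new DictionaryCachedRepository(inner);
        var id = Guid.NewGuid();

        await repository.FindAsync(id);                       // miss: hits the fake database
        ValueTask<Entity> second = repository.FindAsync(id);  // hit: served from the cache
        Console.WriteLine(second.IsCompletedSuccessfully);    // True
        Console.WriteLine(inner.Calls);                       // 1
        await second;
    }
}
```

Only the first lookup reaches the inner repository; every later lookup for the same id returns an already-completed ValueTask.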
If the return type were Task, we would pay the allocation overhead every time the method completes synchronously as well.
Using ValueTask as a return type makes sense when designing asynchronous APIs whose implementations will be, or are highly likely to be, synchronous.
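One rough way to check the allocation claim on your own machine is to compare the bytes allocated by two otherwise identical methods that always complete synchronously. This is a sketch, not a benchmark: it assumes a runtime where GC.GetAllocatedBytesForCurrentThread() is available (.NET Core 3.0 or later), the names AllocationProbe, ViaTaskAsync and ViaValueTaskAsync are invented for the example, and the exact numbers will vary by runtime and build configuration. For real measurements a dedicated benchmarking tool is a better fit.

```csharp
using System;
using System.Threading.Tasks;

public static class AllocationProbe
{
    // Both methods complete synchronously: awaiting Task.CompletedTask never suspends.
    private static async Task<long> ViaTaskAsync(long x)
    {
        await Task.CompletedTask;
        return x;
    }

    private static async ValueTask<long> ViaValueTaskAsync(long x)
    {
        await Task.CompletedTask;
        return x;
    }

    public static long MeasureTask(int iterations)
    {
        ViaTaskAsync(1000).GetAwaiter().GetResult(); // warm-up call
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
            ViaTaskAsync(1000 + i).GetAwaiter().GetResult(); // already completed, does not block
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static long MeasureValueTask(int iterations)
    {
        ViaValueTaskAsync(1000).GetAwaiter().GetResult(); // warm-up call
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
            ViaValueTaskAsync(1000 + i).GetAwaiter().GetResult(); // already completed, does not block
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        const int iterations = 100_000;
        Console.WriteLine("Task<long>:      " + MeasureTask(iterations) + " bytes");
        Console.WriteLine("ValueTask<long>: " + MeasureValueTask(iterations) + " bytes");
    }
}
```

The values passed in are deliberately large and different on each call, so the Task version cannot benefit from the handful of completed tasks the runtime caches for common results.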
I believe that there is more to it and probably more lower-level details to consider, but this is how I view it.
There are also certain restrictions on ValueTask usage that we need to pay attention to. Stephen Cleary explained those in this article, so give it a look.
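As a small taste of those restrictions: the .NET documentation says a ValueTask should be awaited only once, and that it should be converted with AsTask() if the result needs to be observed more than once. The sketch below shows that pattern; GetValueAsync is a hypothetical operation invented for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class ValueTaskUsage
{
    // Hypothetical operation that happens to complete synchronously.
    public static ValueTask<int> GetValueAsync() => new ValueTask<int>(7);

    public static async Task<int> SumThreeReadsAsync()
    {
        // The default, safe consumption: await the ValueTask once, directly.
        int first = await GetValueAsync();

        // If the result must be observed more than once, convert to a Task first.
        Task<int> shared = GetValueAsync().AsTask();
        int second = await shared;
        int third = await shared; // awaiting a Task repeatedly is fine

        return first + second + third;
    }

    public static async Task Main()
    {
        Console.WriteLine(await SumThreeReadsAsync()); // 21
    }
}
```

Storing a ValueTask in a field and awaiting it from several places, or calling AsTask() on the same instance more than once, are the kinds of usage the documentation warns against.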
*The cover picture is a picture of wild horses in Livno, Bosnia & Herzegovina, taken from here.