On Indexers In C#

C# is a programming language that provides certain syntactical features that we can use when designing members of our types. One of the probably well-known ones is properties, but in this article, I will provide my thoughts on indexers.

What are indexers?

Indexers are basically a form of properties that allow us to access instance members of types as if they were items of an array - by using array notation (square brackets) on an instance of the type.

Let us say that we have a collection of software engineer entities.

A common thing we are doing when programming is passing an explicit collection/sequence of objects throughout the system. We might be passing a collection of software engineers to a method, and then the method would operate and perform logic on the collection.

This is something that we might be doing in multiple places. The data and functions (methods) that operate on that data might not be well encapsulated/bundled together into a single and cohesive unit, which sometimes can be viewed as a bad design.

A better design, in which we leverage the power of abstraction and OOP, would be the one where we view the collection of software engineers as a type on its own.

In these kinds of scenarios, indexers can come pretty handy.

public interface ISoftwareEngineer
{
    string WorkEmail { get; }
}
public class SoftwareEngineer : ISoftwareEngineer
{
    public string WorkEmail { get; }

    public SoftwareEngineer(string workEmail) => WorkEmail = workEmail;
}
public class NullSoftwareEngineer : ISoftwareEngineer
{
    public string WorkEmail { get; }

    public NullSoftwareEngineer() => WorkEmail = string.Empty;
}
public class SoftwareEngineers
{
    private ISoftwareEngineer[] softwareEngineers = new SoftwareEngineer[] 
                                                    { 
                                                        new SoftwareEngineer("mark@itcompany.com"),
                                                        new SoftwareEngineer("john@itcompany.com"),
                                                        new SoftwareEngineer("rose@itcompany.com")
                                                    };

    public ISoftwareEngineer this[int index]
    {
        get => TryGetItem(index);

        set => TrySetItem(index, value);
    }

    private ISoftwareEngineer TryGetItem(int index)
    {
        try
        {
            return softwareEngineers[index];
        }
        catch(IndexOutOfRangeException)
        {
            return new NullSoftwareEngineer();
        }
    }

    private void TrySetItem(int index, ISoftwareEngineer softwareEngineer)
    {
        try
        {
            softwareEngineers[index] = softwareEngineer;
        }
        catch(IndexOutOfRangeException)
        {
            return;
        }
    }
}

Since it might be hard to distinguish between SoftwareEngineer and SoftwareEngineers, we could also rename the latter to SoftwareEngineerCollection.

The client code would then use array notation to access software engineers from the underlying array.

    var softwareEngineers = new SoftwareEngineers();
    var rose = softwareEngineers[2];

However, nothing magical happens under the hood.

The compiler will do basically the same thing it does when we define properties. It creates a backing field and corresponding accessor methods (commonly known as getters and setters).

With that said, the previous example would be equivalent to the following one, which we could see if we were to inspect the generated MSIL.

public class SoftwareEngineers
{
    private ISoftwareEngineer[] softwareEngineers = new SoftwareEngineer[] 
                                                    { 
                                                        new SoftwareEngineer("mark@itcompany.com"),
                                                        new SoftwareEngineer("john@itcompany.com"),
                                                        new SoftwareEngineer("rose@itcompany.com")
                                                    };

    private ISoftwareEngineer Item;

    public ISoftwareEngineer get_Item(int i) => TryGetItem(i);

    public ISoftwareEngineer set_Item(int i, ISoftwareEngineer value) => TrySetItem(i, value);

    private ISoftwareEngineer TryGetItem(int index)
    {
        try
        {
            return softwareEngineers[index];
        }
        catch(IndexOutOfRangeException)
        {
            return new NullSoftwareEngineer();
        }
     }

    private void TrySetItem(int index, ISoftwareEngineer softwareEngineer)
    {
         try
         {
             softwareEngineers[index] = softwareEngineer;
         }
         catch(IndexOutOfRangeException)
         {
             return;
         }
     }
 }

The backing field is named Item by default, although we can change that by using the IndexerName attribute from the System.Runtime.CompilerServices namespace, if it makes sense.

When using indexers we are actually implicitly calling one of the two accessor methods.

So indexers are a relatively neat way to access items of an underlying array.

Imagine now that, for whatever the reason might be, we decided to change the underlying data structure to be a linked list instead of an array. A linked list is a linear data structure as well, so why not?

What kind of issues would that create in the parts of the system that use indexer on SoftwareEngineers instances?

This is where we get to the potential issue (if we can call it an issue) I see with indexers, which comes out of a very fundamental computer science knowledge.

When we are using the array notation and when there exists the sense of an index, we are commonly expecting a constant-time lookup on an indexable data structure, such as an array or a data structure based on an array, such as a dynamic array (List).

This is the most important characteristic of arrays, due to the way they are stored in the memory - the items of an array are adjacent in a contiguous/uninterrupted chunk of the memory.

If we think this way, then not paying attention to indexer design, together with the underlying data structure choice, might lead to an interface that is, in a way, misleading.

The client code might assume constant time lookup which, as a consequence, might result in performance issues.

Imagine that the client code was using the indexer in a very dumb algorithm that runs the loop N times (N being the count of software engineers), and prints the last software engineer each time.

Algorithm()
    N = softwareEngineers.Count;
    for i in range 0 to N-1
        print softwareEngineers[N-1];

We would expect the worst-case time complexity to be linear - in each of the N cycles of the loop, we are using the indexer that provides constant-time lookup. This results in linear overall worst-case time complexity - O(N). This would be the case with an array as the underlying data structure.

How would the linked list change affect this part of the system?

Linked lists are not indexable - they do not provide constant-time lookups based on an index. A way to mimic this operation is to traverse the list from the beginning to the end, keeping the count of the nodes and returning the value when the counter matches the index. The worst-case time complexity of this operation would be linear.

In each of the N cycles of the loop, the indexer usage would result in the traversal of the underlying linked list from the beginning to the end, resulting in quadratic overall time complexity - O(N^2).

This could represent an issue if we are programming in an environment where performance plays an important role. The client code might be using indexers easily, without being aware of the performance implications. Everything because it assumed that the cost of the index-based access is constant (which should be).

Most of us will probably never run into issues of this type in our careers, but it is still worth having it in the mind.

In the previous example, the indexer accepted an index argument which was an integer.

The client code might want to access software engineers in a more convenient manner, perhaps by their work emails, instead of a true index.

We can redesign our indexer so that it accepts the work email string instead.

But, if we want to have the constant-time lookup based on the email value, then we should change the underlying data structure to be a hash table (Dictionary), for example, that uses work email as the key.

I do not think indexers should be part of public interfaces where the underlying data structure does not support constant-time lookups - whether the lookup is based on an array index, or on some other index-like value (key). To me personally, it somehow does not feel right.