Mar '18

14

Refactoring, Now With Generics!

Refactoring is one of the most satisfying programming tasks. It can be difficult, especially in a large or unfamiliar codebase, but I believe thinking critically about your code is beneficial. There is almost always a low-hanging fruit -- fix formatting, naming convention, remove duplication, etc. If you aren't careful however, you can easily refactor in circles and never actually improve anything.

"Quality" is completely subjective, so you should also spend time pondering where your opinions lay and why you think that way. Lively banter with coworkers about design choices and their reasons can be very informative, but I've found that there is no singular Correct Way, especially in coding. Each choice comes with trade-offs that may not be completely evident until much later. And what do you do when you realize six-months-ago You made a mistake? Refactor!

The Beast

Around the middle of last year, my team inherited a massive monolithic Java product and was tasked with stabilizing and improving the platform. My goal was to not only improve the perceived functionality, but also to clean up the code in as many ways as possible.

Note: All examples in this post are super contrived and are only vaguely similar to the real code.

This project is made up of several layers of co-dependent libraries, and the data model (entities, whatever you like calling them) consists of several related types. One major eyesore is that each layer defines a specialized subclass for each of these types. Imagine a Website, which might have multiple Campaigns, and both are defined in the base data access library. A tracking library then builds on top of both of those types, introducing its own subclass of each. Then a messaging library builds on top of those, adding another pair of subclasses. Repeat about 7 times and welcome to my reality!

The immediate issue I have with these kinds of "shared" inheritance hierarchies, especially in a language like Java, is that you end up casting things everywhere. For example, given a core.Campaign with a public core.Website getWebsite() method, every time you use that method in a subclass of core.Campaign, you get a core.Website. Concretely, if you call getWebsite() inside the tracking.Campaign subclass, you will receive a core.Website. Because of the shared hierarchy, it is expected that the actual instance you get will be an instance of tracking.Campaign. You just have to cast it:

package com.acme.tracking;

class Campaign extends com.acme.core.Campaign {
    public void contrivedExample() {
        // We are calling the getWebsite method defined in the core.Campaign superclass. 
        // It returns a core.Website! But we "know" its really a tracking Website. We hope.
        Website website = (Website) getWebsite();

        // do something worthwhile...
    }
}

Right off the bat your olfactories are assaulted with the pungent aroma of wonky code. See, Java is a statically- and strongly-typed language, so this means that the compiler can generally do a great job alerting you to incompatible types. But the moment you're forced to cast, you remove any compile-time guarantees. You have introduced a possible runtime fault. Neat, huh?

The Beauty

Now, as you can imagine, this platform has a ton of baggage. We can't just go around changing APIs, we have upstream and downstream dependencies that rely on our code. That is, we can't just fix the issue at hand by removing the crazy cross-project inheritance hierarchy. That would break an uncountable (literally, we can't be sure) number of other projects. This design decision really is baked-in to the platform at this point, and trying to fight it too much is probably a waste of time.

But that doesn't mean you can't improve things!

A Short Aside

Back in prehistoric times, the folks working on Java decided to implement non-reified generics. Reification, in this context, describes the compiler's and runtime's ability to determine, track, and enforce type information. Java's implementation of generics is non-reified because the compiler actually strips type information from the generics-- the runtime cannot know what these types were originally!

In a nutshell, when you write List<String>, the compiler can use the type parameter (String in this case) to ensure that no tomfoolery occurs with types at compile-time. But the runtime itself has no knowledge of this information and therefore cannot make any assurances about what a generic type contains when the code is actually executed. You may yourself have had the pleasure of experiencing an Integer in your List<String> in not-very-unusual circumstances.

You might imagine generics as sort of the "evil" twin of a type-cast operation. Instead of checking types at runtime with a cast, we do it at compile time and skip the runtime check. But, if you're like me, you vastly prefer knowing about incorrect types at compile-time, not at run-time (sometimes known as "production"). So in my opinion, this trade-off is worth it.

Back to the Beauty

Is there a way we can utilize generics to help ease the situation and provide stronger compile-time guarantees? In this next example, we are looking at another type that represents normalized information about an incoming HTTP request. This CampaignContext is passed around to various services that need to be able to get the current request's core.Campaign (or a subclass thereof).

package com.acme.app.context;

public abstract class CampaignContext implements com.acme.core.CampaignContext {
    private com.acme.core.Campaign campaign;

    public com.acme.core.Campaign getCampaign() {
        return campaign;
    }

    public void setCampaign(com.acme.core.Campaign campaign) {
        this.campaign = campaign;
    }
}

Notice that this class exists in an application-- very obviously denoted by the com.acme.app package name-- nothing else is using it beyond this app. Lots of things need that com.acme.core.CampaignContext interface, but we have a little wiggle room because we are using our own implementation of the interface. What does this mean? It means we can change it with no (okay, very little) impact!

package com.acme.app.context;

public abstract class CampaignContext<T extends com.acme.core.Campaign>
        implements com.acme.core.CampaignContext {

    private T campaign;

    @Override
    public T getCampaign() {
        return this.campaign;
    }

    public void setCampaign(T campaign) {
        this.campaign = campaign;
    }
}

Note that all references to com.acme.core.Campaign have vanished except for in the class-level type parameter constraint. This effectively forces any subclass of this CampaignContext to provide this type parameter, and the type specified must be an instance of core.Campaign.

Now, when we use this CampaignContext, we will leverage the compiler's ability to ensure that the type parameter constraint is enforced, and therefore calling code will always get the expected type (and not a core.Campaign)!

package com.acme.app.servlet;

import com.acme.tracking.Campaign;
import com.acme.app.context.CampaignContext;

public class TrackingContext extends CampaignContext<Campaign> {
    public void contrivedAf() {
        // Look, ma, no casts!
        Campaign campaign = getCampaign(); // Notice this is a tracking.Campaign!

        // do some tracking specific thing with the tracking.Campaign
    }
}

We have introduced a mostly* backward-compatible change to the abstract CampaignContext class. All we need to do is update every CampaignContext subclass to specify which subclass of core.Campaign it needs. In our application, we had a limited number of CampaignContext subclasses so we felt comfortable updating each. Code that called those contexts didn't need to change, except we could now remove the runtime casts!

Finito

This was a long post, but hopefully I've been able to convey my ideas in a digestible way. Let me know why you think our decisions were right or wrong, I'd love to hear counter-opinions.

 

* For what it's worth, you can avoid the BC break entirely by leaving the old CampaignContext implementation in place and using the new one on an opt-in basis as you make changes to other parts of the application.

javaProgramming

Comments

No comments yet! Say something.