Skip to content
Go back

The Perils of Premature Abstraction: When Code Duplication Isn't a Problem

Posted on:December 17, 2024 at 10:52 PM (5 min read)

Table of contents

Open Table of contents

Introduction

In the world of software development, we’re often taught to avoid code duplication. The DRY (Don’t Repeat Yourself) principle is drilled into us. While this principle has its merits, blindly applying it can lead to a common pitfall: premature abstraction. This post explores why abstracting and centralizing duplicated code too early, especially before understanding future requirements, can be more harmful than helpful.

The Temptation of DRY

The urge to eliminate duplicated code is strong. It feels cleaner, more organized, and more “professional.” When we see the same snippet of code repeated in multiple places, the immediate reaction is often to refactor it into a common function or class. This feels like a step towards better code.

However, this approach assumes that the duplicated code will always need to be the same across all instances. This assumption is dangerous because it can lead to brittle and tightly coupled systems.

Why Premature Abstraction is Risky

The issue with abstracting too early is that it often locks you into a particular implementation before you have a clear understanding of future needs. Here’s why it can be problematic:

  1. Reduced Flexibility: Once you’ve created a central abstraction, changing its behavior affects every place that uses it. If requirements diverge in the future (and they almost always do), you might have to either undo the abstraction or add complex conditional logic to it, which adds to code complexity.
  2. Tight Coupling: By centralizing code, you tightly couple different parts of your application, making it harder to make changes in one part without impacting others.
  3. Code Bloat: Attempting to make a single abstraction handle multiple slightly different use cases can lead to bloated code with complex parameters and conditional execution paths, which is far more complex than having a little bit of duplication.

Example: A Simple Logging Function

Imagine you have a simple logging function in two different modules:

// Module A
public void logEventA(String message) {
    System.out.println("Module A: " + message);
}

// Module B
public void logEventB(String message) {
    System.out.println("Module B: " + message);
}

Initially, they both just print to the console. The DRY principle might suggest extracting the System.out.println into a central logger function.

// Central logger
public void log(String module, String message) {
    System.out.println(module + ": " + message);
}

// Module A
public void logEventA(String message) {
    log("Module A", message);
}

// Module B
public void logEventB(String message) {
    log("Module B", message);
}

Now what happens when we get the requirement that Module A should log to a file while Module B continues to log to the console? We now need to either undo our abstraction or add more complexity to the central logger. You would have been better off keeping this duplicated.

The Right Time to Abstract

The key is to only abstract when you know that the different places using the code are likely to stay in sync, or if it’s so central to your application that you know that it’s going to be reused in a uniform way throughout the whole application. Here are a few guidelines for when it’s safe to abstract duplicated code:

Refactoring vs. Early Abstraction

Instead of immediately abstracting duplicated code, focus on refactoring. Refactoring means improving the internal structure of your code without changing its external behavior. If you find yourself duplicating code, ask yourself the following questions:

If the answer to any of these is no, hold off on abstracting. Refactor to improve code clarity first, and only abstract when you’re confident.

When Is Code Duplication Acceptable?

Code duplication is not always evil. It can be a pragmatic way to keep your code flexible and adaptable when you don’t know what the future holds. It’s okay to have duplicated code if:

Conclusion

The key is to be thoughtful about when you choose to abstract code. The DRY principle should guide you, not dictate you. Premature abstraction can be more harmful than good and introduce rigidity into your system. Only abstract when you are sure that the duplication you see is not going to diverge in the future. Sometimes, it is okay to keep the code duplicated for simplicity and maintainability. Remember, a little bit of duplicated code is much better than an over-engineered, tightly coupled, and complex abstraction that does not fit future needs.