I presume that pretty much everybody who has been in contact with the Java programming language, whether in studies or professionally and even for a short time, knows that duplication is a code smell.
We all know some of the consequences of duplication: code that is harder to maintain, harder to read and more likely to contain bugs, performance issues... and many more.
There should be no need to explain what duplication is, but just in case, here is a one-line definition to make sure we all understand it:
"Duplication is the existence of multiple copies of a single concept"
Any doubts so far?...
If you are still not sure about it, please go back and read the definition once more.
So, how do we fix duplication?
But if duplication is so easy to understand and so simple to refactor, why do we find it all over the place, no matter what software we look at?
Well... there are many reasons why duplication is all over the place:
- sometimes we just rush to get things done and we don't care about it.
- maybe we think that refactoring is boring or is not profitable, so we have no patience to do it.
- other times we think there are always more important things to do and duplication is not really affecting us.
- also sometimes when we want to do something about it, we just can't see it even if it is in front of us.
So, what do you think?... Could any of these be the reason why you are not fixing duplication?
If your reason is any of the first three, I am sorry, but this blog post will not be of any use to you, and the best you can do is close this browser tab or surf somewhere else.
But if the fourth is the reason why you are not fixing duplication, then you might find the following lines very interesting...
This is the secret of why many programmers sometimes find it so difficult to deal with duplication...
There are different kinds of duplication. These are some of the most important:
- Literal
- Structural
- Semantic
Literal duplication
This kind of duplication is very easy to spot. It is basically literal values that are repeated in our code.
For example:
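// Assuming JUnit 4 and Hamcrest, which match the @Test / assertThat / is usage here.
import org.junit.Test;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.CoreMatchers.is;

public class RepeaterSpecification {
    // repeat(...) is the method under test; its definition is not shown in this post.
    @Test
    public void repeatsInput() {
        assertThat("value", is(repeat("value")));
    }

    @Test
    public void repeatsEmptyInput() {
        assertThat("", is(repeat("")));
    }
    //...
}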
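The most common way to fix this type of duplication is to create a local variable. Let's have a look at a possible refactor:
public class RepeaterSpecification {
    @Test
    public void repeatsInput() {
        // The literal now appears only once per test.
        String input = "value";
        assertThat(input, is(repeat(input)));
    }

    @Test
    public void repeatsEmptyInput() {
        String input = "";
        assertThat(input, is(repeat(input)));
    }
    //...
}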
It looks like we no longer have literal duplication. Unfortunately, as you have probably noticed, fixing one type of duplication sometimes generates another type of duplication. Keep reading to find out more...
Structural duplication
We can recognize this situation when the logic is duplicated but operates on different data.
See it in the following example:
public class RepeaterSpecification {
    @Test
    public void repeatsInput() {
        String input = "value";
        assertThat(input, is(repeat(input)));
    }

    @Test
    public void repeatsEmptyInput() {
        String input = "";
        assertThat(input, is(repeat(input)));
    }
    //...
}
As you can see, I took this code from the previous example. The assertThat() operation is identical; the only difference is the data that it uses. Let's have a look at one way to fix structural duplication:
public class RepeaterSpecification {
    @Test
    public void repeatsInput() {
        assertThatInputIsRepeated("value");
    }

    @Test
    public void repeatsEmptyInput() {
        assertThatInputIsRepeated("");
    }

    // The shared assertion logic lives in one place; only the data varies.
    private void assertThatInputIsRepeated(String input) {
        assertThat(input, is(repeat(input)));
    }
    //...
}
Now that we have extracted the method that does the assertion, we no longer have duplication.
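The same extraction works outside test code. Here is a minimal sketch, assuming a hypothetical PriceLabels class that is not part of the original example: two methods that format different data in exactly the same way can share one parameterized helper.

```java
import java.util.Locale;

// Hypothetical example class, used only to illustrate structural duplication.
final class PriceLabels {
    private PriceLabels() {
    }

    // Both public methods used to repeat the same formatting call;
    // only the data (the label name) differed.
    public static String netLabel(double amount) {
        return label("Net", amount);
    }

    public static String grossLabel(double amount) {
        return label("Gross", amount);
    }

    // The duplicated logic now lives in one place and the data is a parameter.
    private static String label(String name, double amount) {
        return String.format(Locale.ROOT, "%s: %.2f EUR", name, amount);
    }
}
```

If the format ever changes, there is now a single line to touch instead of one per method.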
Semantic duplication
This is the situation where different code implementations represent the same functionality or concept. The definition of semantic duplication teaches us that duplication can be invisible from the point of view of the code (the definition from the start of this post makes way more sense now, huh? :) ). This kind of duplication is probably one of the most difficult to spot.
For example:
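// DateTime is assumed to be Joda-Time's org.joda.time.DateTime.
import java.util.Iterator;
import java.util.List;
import org.joda.time.DateTime;

public class TeamValidator {
    public boolean isThereALeader(List<Member> team) {
        Iterator<Member> iterator = team.iterator();
        while (iterator.hasNext()) {
            Member member = iterator.next();
            String role = member.getRole();
            if (role.equals("Leader"))
                return true;
        }
        return false;
    }

    public boolean areThereAtLeast2NewJoiners(List<Member> team) {
        DateTime aMonthAgo = DateTime.now().minusMonths(1);
        int newJoiners = 0;
        for (Member member : team) {
            // Count the new joiners so that the result matches the method name.
            if (member.startingDate().isAfter(aMonthAgo) && ++newJoiners == 2)
                return true;
        }
        return false;
    }
    //...
}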
At first glance, when we look at the two methods in the code above, we cannot really see much duplication. No matter how long we look at it, we will not see it: semantic duplication is invisible from the implementation point of view. To spot it, we need to spot the behavioral anti-pattern that is hidden in the code. If we think about it, the repetition is that both methods iterate over a list of the same type and then apply certain criteria to its elements (completely different implementations, but one and the same concept).
Semantic duplication is more difficult to fix than other types of duplication and often requires a bigger refactoring effort. Here is one possible solution that helps us get rid of it using Java generics:
public class TeamValidator {
    public boolean isThereALeader(List<Member> team) {
        return new LeaderVerifier<Member>().evaluate(team);
    }

    public boolean areThereAtLeast2NewJoiners(List<Member> team) {
        return new NewJoinersVerifier<Member>().evaluate(team);
    }
}

// The shared concept: walk the list and stop as soon as the criteria are met.
public abstract class LoopEvaluator<T> {
    public boolean evaluate(List<T> list) {
        for (T element : list) {
            if (evaluateElement(element)) {
                return true;
            }
        }
        return false;
    }

    public abstract boolean evaluateElement(T element);
}

public class LeaderVerifier<T extends Member> extends LoopEvaluator<T> {
    @Override
    public boolean evaluateElement(T element) {
        // Same role literal as in the original isThereALeader method.
        return element.getRole().equals("Leader");
    }
}

public class NewJoinersVerifier<T extends Member> extends LoopEvaluator<T> {
    private int newJoiners;

    @Override
    public boolean evaluate(List<T> list) {
        // Reset the counter so that the verifier can be reused.
        this.newJoiners = 0;
        return super.evaluate(list);
    }

    @Override
    public boolean evaluateElement(T element) {
        DateTime aMonthAgo = DateTime.now().minusMonths(1);
        if (element.startingDate().isAfter(aMonthAgo))
            this.newJoiners++;
        return newJoiners == 2;
    }
}
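If you are on Java 8 or later, the same shared concept can also be expressed with a predicate instead of a class hierarchy. This is only a sketch of an alternative, not the refactoring shown above: the TeamChecks class, its nested Member stand-in, the atLeast helper and the explicit today parameter are all assumptions made so that the example is self-contained, and java.time.LocalDate is swapped in for DateTime.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical class, used only to illustrate the idea.
final class TeamChecks {
    private TeamChecks() {
    }

    // Hypothetical stand-in for the Member class used in the post.
    public static final class Member {
        private final String role;
        private final LocalDate startingDate;

        public Member(String role, LocalDate startingDate) {
            this.role = role;
            this.startingDate = startingDate;
        }

        public String getRole() {
            return role;
        }

        public LocalDate startingDate() {
            return startingDate;
        }
    }

    // The shared concept, written once: does the list contain at least
    // `minimum` elements that satisfy the given criterion?
    static <T> boolean atLeast(int minimum, List<T> list, Predicate<? super T> criterion) {
        // limit() lets the stream stop as soon as enough matches are found.
        return list.stream().filter(criterion).limit(minimum).count() >= minimum;
    }

    public static boolean isThereALeader(List<Member> team) {
        return atLeast(1, team, member -> "Leader".equals(member.getRole()));
    }

    // `today` is passed in (an assumption of this sketch) to keep the check deterministic.
    public static boolean areThereAtLeast2NewJoiners(List<Member> team, LocalDate today) {
        LocalDate aMonthAgo = today.minusMonths(1);
        return atLeast(2, team, member -> member.startingDate().isAfter(aMonthAgo));
    }
}
```

Each check now states only its own criterion and threshold, while the iteration lives in a single helper.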
Duplication is not a small topic at all. The three types of duplication I mentioned in this post are, from my point of view, some of the most common, but there are many more. Here is a link I found that mentions other kinds of duplication at the bottom of the page: http://blogs.agilefaqs.com/tag/code-smells/
I am the spaghetti monster, you can't get rid of me!!!! Hahahaha......
To conclude this post, I want to say that I have the impression that duplication is often an underestimated code smell that, undetected, grows and grows and ends up transforming systems into spaghetti monsters.
Let's get rid of it while it is just a bit between your teeth! ;)