Let me ask you something. You are reviewing a PR that creates a simple calculator, which needs to know how to add numbers. Just that. There is no reason to think that the calculator will need more functionality.
Despite that, the PR includes a calculator that can add, subtract, and multiply. What do you do as a reviewer?
I want to believe that you will respectfully discuss removing the extra functionality. Otherwise, the calculator will violate the YAGNI principle and add (a) more code for the developers to maintain and (b) more ways to couple the project to the calculator. And all that with no immediate benefit.
Do we need all that functionality?
The same goes for data classes. There is no reason to have a class that can be uniquely identified by all of its properties if we don’t use its instances this way. There is no reason to have an extra componentN function for every property if we don’t use destructuring declarations extensively. There is no reason to have a copy mechanism if we never copy instances!
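To make that list concrete, here is a minimal sketch of what the data keyword brings along. Price and describeGeneratedMembers are hypothetical names used only for illustration; the generated members themselves (equals, hashCode, toString, componentN, copy) are standard Kotlin behavior.

```kotlin
// Hypothetical value type, used only for illustration.
data class Price(val amount: Long, val currency: String)

// Exercises the members the compiler generates for a data class.
fun describeGeneratedMembers(): String {
    val original = Price(100, "EUR")

    // equals()/hashCode() compare every primary-constructor property.
    val sameValue = original == Price(100, "EUR")

    // component1()/component2() back the destructuring declaration.
    val (amount, currency) = original

    // copy() creates a new instance with selected properties changed.
    val doubled = original.copy(amount = amount * 2)

    // toString() is generated as well.
    return "$original -> $doubled in $currency, equal: $sameValue"
}
```

If none of these members are called anywhere in the project, each one is still public API that somebody can start depending on.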
If it is there, it is going to be used
I recently removed the data keyword from one of our oldest classes and noticed that many of our newest tests no longer compiled. The compiler could not find the copy method, which the tests used to create dummy values from other dummy values by changing one property per test.
When to create a data class?
Here is my thought process when trying to decide the type of class I’ll use:
Q: Is this class anything but a domain entity or value object?
A: Then a simple class is just fine.
Q: Is this class a domain entity? Meaning that it can be uniquely identified by a subset of its properties (e.g., an id)?
A: Then a simple class whose equals/hashCode implementation uses only those properties will be enough.
Q: Is this a value object? Meaning that it can be uniquely identified by all of its properties?
A: Yes.
Q: How many properties?
A: One. Then a value class is a must.
Q: Are you sure it’s just one?
A: Turns out it’s more! We’ll use a data class.
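The four outcomes of this thought process can be sketched as follows. All class names here are hypothetical, and the example assumes Kotlin 1.5 or later on the JVM, where a value class needs the @JvmInline annotation.

```kotlin
// Neither an entity nor a value object: a simple class is fine.
class GreetingService {
    fun greet(name: String): String = "Hello, $name"
}

// Domain entity: identity comes from a subset of its properties (only id here).
class Customer(val id: Long, var name: String) {
    override fun equals(other: Any?): Boolean =
        this === other || (other is Customer && other.id == id)

    override fun hashCode(): Int = id.hashCode()
}

// Value object with exactly one property: a value class.
@JvmInline
value class Email(val value: String)

// Value object with more than one property: a data class.
data class Address(val street: String, val city: String)
```

Two Customer instances with the same id are equal even when their names differ, while Email and Address are equal only when every property matches.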
Don’t do it for the test
Our test code is the first consumer of our production code. Changing the production code (in this case, turning a class into a data class or creating it as one) just to write a couple of tests more quickly results in tests that break easily whenever the production code changes, because they know too much about the code’s internals rather than its behavior.
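One possible way to keep "change one property per test" convenient without relying on copy is a factory function with default arguments that lives in the test sources. This is only a sketch of an alternative, not something the classes above require; User and aUser are hypothetical names.

```kotlin
// Production code: a plain class, with no data keyword added for the tests' sake.
class User(val id: Long, val name: String, val email: String)

// Test code: a hypothetical factory whose defaults describe a valid dummy user.
// Each test overrides only the property it cares about.
fun aUser(
    id: Long = 1L,
    name: String = "Jane Doe",
    email: String = "jane@example.com"
): User = User(id, name, email)
```

A test that cares only about the name can call aUser(name = "John"), and the production class stays free of members it does not otherwise need.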