To Karine, who supports my dysfunctionalities on a daily basis.
front matter
forewords
Every programming principle, every design method, every architecture style, and even most language features are about organizing complexity while allowing adaptation. Two characteristics, immutable data and turning parts of the program into data inside the program itself, drew me to Clojure in 2009 and, more recently, to Yehonathan Sharvit's Data-Oriented Programming.
In 2005, I worked on one of my favorite projects with some of my favorite people. It was a Java project, but we did two things that were not common practice in the Java world at that time. First, we made our core data values immutable. It wasn't easy, but it worked extraordinarily well. We hand-rolled clone and deepClone methods in many classes. The payoff was huge. Just as one example, suppose you need template documents for users to instantiate. When you can make copies of entire object trees, the objects themselves don't need to know whether they are template data or instance data. That decision is up to whatever object holds the reference. Another big benefit came from comparison: when values are immutable, equality of identity indicates equality of value. This can make for very fast equality checks.
Our second technique was to take advantage of generic data, though not to the extent Yehonathan will show you in this book. Where one layer had a hierarchy of classes, its adjoining layer would represent those as instances of a more general class. What would be a member variable in one layer would be described by a field in a map in another layer. I am certain this style was influenced by the several Smalltalkers on our team. It also paid off immediately, as we were able to compose and recompose objects in different configurations.
Data-oriented programming, as you will see, promises to reduce accidental complexity and raise the level of abstraction you work at. You will start to see repeated behavior in your programs as artificial, a result of carving generic functions into classes, which act like little namespaces that operate only on a subset of your program's values (their instances). We can fold together almost all of those values into maps and lists. We can turn member names (data available only with difficulty via reflective APIs) into map keys. As we do that, code simply melts away. This is the first level of enlightenment.
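To make the idea of folding values into maps concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not code from the book: the records, field names, and the helpers pick and rename_key are all hypothetical. The point is only that once member names become map keys, one small function can serve every kind of record.

```python
# Hypothetical records: plain dicts stand in for what would be
# separate Author and Book classes in a class-based design.
author = {"first_name": "Ada", "last_name": "Example", "books": 3}
book = {"title": "An Example Title", "year": 2000}


def pick(record, keys):
    """Return a new dict holding only the requested keys that are present."""
    return {k: record[k] for k in keys if k in record}


def rename_key(record, old, new):
    """Return a copy of record with one key renamed; the input is not modified."""
    return {(new if k == old else k): v for k, v in record.items()}


# The same two functions work on both kinds of record,
# because field names are ordinary data (dict keys).
author_summary = pick(author, ["first_name", "last_name"])
book_summary = rename_key(book, "year", "published")
```

In a class-based design, each of these operations would typically be written once per class, or reached through reflection. Here the member names are just keys, so nothing has to be repeated.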
At this point, you might object that the compiler uses those member names at compile time for correctness checking. Indeed it does. But have faith, for Yehonathan will guide you to the next level of enlightenment: that those compile-time checks are a small subset of possible correctness checks on values. We can make the correctness checks themselves into data, too! We can make schemas into values inside our programs. What's more, we can enforce criteria that researchers on the forefront of type systems are still trying to figure out. This is the second level of enlightenment.
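As a rough sketch of what a schema that is itself a value might look like, consider the following Python fragment. It is hand-rolled for illustration and is not the schema library or notation the book uses: member_schema, validate, and the field names are all assumptions. Each check is an ordinary predicate, so it can express constraints on values (such as a numeric range) that a typical static type declaration does not.

```python
# A hypothetical schema expressed as ordinary data:
# field name -> list of (description, predicate) pairs.
member_schema = {
    "name": [
        ("is a non-empty string", lambda v: isinstance(v, str) and v != ""),
    ],
    "age": [
        ("is an integer", lambda v: isinstance(v, int) and not isinstance(v, bool)),
        ("is between 0 and 150", lambda v: isinstance(v, int) and 0 <= v <= 150),
    ],
}


def validate(schema, record):
    """Return a list of error strings; an empty list means the record conforms."""
    errors = []
    for field, checks in schema.items():
        if field not in record:
            errors.append(f"{field}: missing")
            continue
        for description, predicate in checks:
            if not predicate(record[field]):
                errors.append(f"{field}: expected a value that {description}")
    return errors


ok_errors = validate(member_schema, {"name": "Kay", "age": 30})
bad_errors = validate(member_schema, {"name": "", "age": 200})
```

Because the schema is a plain dict, it can be inspected, merged with other schemas, or built at run time, just like any other data in the program.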
Data-oriented programming especially shines when working with web APIs. There is no type system on the wire, so attempting to map a request payload directly into a domain class guarantees a brittle, complex implementation. If we let data be data, we get simpler code and far fewer dependencies on hundred-megabyte framework libraries.
So, whatever happened to the OOP virtues of encapsulation, inheritance, and polymorphism? It turns out we can decomplect these and get each of them à la carte. (In my opinion, inheritance of implementations is the least important of these, even though it is often the first one taught. I now prefer inheritance of interfaces via protocols and shared function signatures.) Data-oriented programming offers polymorphism of the traditional kind: dispatch to one of many functions based on the type of the first argument (in an OO language, "this" is a disguise for the method's first argument; it just happens that it goes before the "."). However, as with schema checking, DOP allows more dynamism. Imagine dispatching based on the types of the first two arguments. Or based on whether the argument has a birthday field with today's date in it! This is the third level of enlightenment.
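The birthday example can be sketched in a few lines of Python. This is a minimal, hypothetical illustration of dispatching on an arbitrary function of the arguments, not the mechanism the book presents: make_multi, has_birthday_today, greet, and the record fields are all invented names, and "today's date" is interpreted here as a matching month and day.

```python
import datetime


def make_multi(dispatch_fn):
    """Build a function that routes each call to a handler chosen by dispatch_fn."""
    handlers = {}

    def call(*args):
        key = dispatch_fn(*args)
        handler = handlers.get(key, handlers.get("default"))
        if handler is None:
            raise LookupError(f"no handler for {key!r}")
        return handler(*args)

    def register(key):
        def decorator(fn):
            handlers[key] = fn
            return fn
        return decorator

    call.register = register
    return call


def has_birthday_today(member, today):
    """True when the member dict has a birthday whose month and day match today."""
    born = member.get("birthday")
    return born is not None and (born.month, born.day) == (today.month, today.day)


# Dispatch depends on the content of the data, not on the type of the first argument.
greet = make_multi(
    lambda member, today: "birthday" if has_birthday_today(member, today) else "default"
)


@greet.register("birthday")
def _(member, today):
    return f"Happy birthday, {member['name']}!"


@greet.register("default")
def _(member, today):
    return f"Hello, {member['name']}."
```

The dispatch function could just as well return a tuple of the types of the first two arguments, which would give dispatch on both of them through the same mechanism.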
And as for encapsulation, we must still apply it to the organizing logic of our program. We encapsulate subsystems, not values. This encapsulation embodies the decision-hiding of David Parnas. Inside a subsystem, we can stop walling off our data into the disjointed namespaces that classes impose. In the words of Alan Perlis, "It is better to have one hundred functions operate on one data structure than ten functions on ten data structures."