Book Chat: Growing Object-Oriented Software Guided By Tests

Growing Object-Oriented Software Guided By Tests is an early text on TDD. Since it was published in 2010, the code samples are fairly dated, but the essence of TDD comes through clearly. You need to look past some of the specific listings, since the choice of libraries (JUnit, jMock, and something called WindowLicker that I had never heard of) seems to have fallen out of favor. Instead, focus on the listings where they show all of the steps and how the code evolved while building out each individual item. It’s as if you are pair programming with the book: you see the thought process and the intermediate steps that would never show up in a commit history, much like this old post on refactoring but with the code intermixed.

This would have been mind-blowing stuff to me in 2010; however, the march of time seems to have moved three of the five parts of the book into ‘correct but commonly known’ territory. The last two parts cover what people are still having trouble with when doing TDD.

Part 4 of the book really spoke to me. It is an anti-pattern listing, describing ways the authors had seen TDD go off the rails and options for dealing with each issue. Some of the anti-patterns were architectural, like singletons; some were specific technical ideas, like patterns for creating test data; and some were more social, like how to write tests so they are more readable or produce better failure messages.

Part 5 covers some advanced topics, like how to write tests for threads or asynchronous code. I haven’t had a chance to try the strategies they show, but they do look better than the ways I had coped with these problems in the past. There is also an awesome appendix on how to write a Hamcrest matcher, which, the times I’ve had to do it, was more difficult the first time than it looks.
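For reference, the shape of a custom matcher is small. The book’s appendix does this in Java; here is the same shape sketched in Scala (the IsEven example is mine, not the book’s): extend TypeSafeMatcher, implement the match, and describe both the expectation and the mismatch.

import org.hamcrest.{Description, TypeSafeMatcher}

// A matcher for even integers that produces a descriptive failure message.
class IsEven extends TypeSafeMatcher[Integer] {
  override def matchesSafely(item: Integer): Boolean = item % 2 == 0
  override def describeTo(description: Description): Unit =
    description.appendText("an even number")
  override def describeMismatchSafely(item: Integer, mismatch: Description): Unit =
    mismatch.appendText(s"$item was odd")
}

// In a test, assertThat(3, new IsEven) fails with
// "Expected: an even number but: 3 was odd"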

Overall, if you are doing TDD and running into issues, checking out part 4 of this book could help you immediately. Parts 1 through 3 are still a great introduction to the topic if you aren’t already familiar with it. I didn’t have a good book to recommend on TDD before, and while this one isn’t amazing in every respect, I would recommend it to someone looking to get started with the ideas.

Continuation Passing Style

I have been doing some work with a library that creates guards around various web endpoints. The guards enforce different kinds of authentication and authorization rules, but all are written in a continuation-passing style. The idea of continuation-passing style is that you give some construct a function to ‘continue’ execution with once it has done its part. If you’ve ever written an event handler, that was a continuation. The usages all look somewhat like

secureActionAsync(parseAs[ModelType]) { (userInfo, model) => user code goes here }

There was some discussion around whether we wanted to do it like that or with a more traditional control flow like

secureActionAsync(parseAs[ModelType]) match {
    case Allowed(userInfo, model) => user code goes here then common post action code
    case Unallowed => common error handling code
}

The obvious issue with the second sample is the need to call the error handling code and post-action code by hand, which creates an opportunity to forget one or get it wrong. The extra calls also distract from the specific user code that is the point of the method.

There were some additional concerns about the testability and debuggability of code in the continuation-passing style. The debugging side does have some complexity, but it isn’t any harder to work through than a normal map call, which is already common in the codebase. The testability aspect is more involved. The method can be overridden with a version that always calls the action, but it still needs to run the common post-action code. That post-action code may or may not need to be mocked: if it doesn’t, this solution works great; if it does, putting together a helper that sets up the mocks simplifies the issue.
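To make the shape concrete, here is a minimal sketch of such a guard. All of the names and types here (UserInfo, authenticate, postAction) are invented for illustration; the real library’s API differs.

import scala.concurrent.{ExecutionContext, Future}

case class UserInfo(id: String)

object Guard {
  // The guard authenticates and parses, then 'continues' into the caller's block.
  // The common post-action and error handling code lives in exactly one place.
  def secureActionAsync[A](parse: String => Option[A])
                          (block: (UserInfo, A) => Future[String])
                          (implicit ec: ExecutionContext): String => Future[String] = { request =>
    authenticate(request) match {
      case Some(user) =>
        parse(request) match {
          case Some(model) => block(user, model).map(postAction) // common post-action code
          case None        => Future.successful(errorResponse)   // common error handling
        }
      case None => Future.successful(errorResponse)
    }
  }

  private def authenticate(request: String): Option[UserInfo] =
    if (request.startsWith("token:")) Some(UserInfo("u1")) else None
  private def postAction(body: String): String = body // audit logging, headers, etc. would go here
  private val errorResponse = "403 Forbidden"
}

In a test, secureActionAsync can be swapped for a version whose authenticate always succeeds while postAction still runs, which is the override discussed above.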

This usage of a continuation collects the cross-cutting concern and keeps it all in one place. You could wrap up this concern in other ways, notably with something like AspectJ, but that is much less accessible than the continuation-passing style. AspectJ for this problem is like using a bazooka on a fly: it can solve it, but the level of complexity introduced isn’t worth it.

Future[Unit] Harmful

I’m not the first person to have seen this behavior, but I saw it bite another engineer recently, so it’s clearly not well enough known. The Unit type in Scala is special: there is effectively an implicit conversion from every type to Unit (the compiler calls this value discarding). If you are interested in the specifics, check out this post; I’m not going to get into how it works or the edge cases. This means that you can write something like

val doubleWrapped: Future[Unit] = Future.successful(Future.successful(true))

and that compiles, breaking the type safety that we have come to expect. More intriguingly, if you were to run

val innerFailure: Future[Unit] = Future.successful(Future.failed(new RuntimeException))
innerFailure.value

what would you expect the result to be?

If you said

Some(Success(()))

you would be right, but that is probably not what you were looking for. If you used a map where you needed a flatMap, you would end up with compiling code that silently swallows all error conditions. An appropriate set of unit tests will catch this, but code like this can look simple enough that it seems not to need any.
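Here is a minimal sketch of how the bug usually arrives (fetchUser and audit are made-up stand-ins):

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

def fetchUser(id: Int): Future[String] = Future.successful(s"user-$id")
def audit(name: String): Future[Unit]  = Future.failed(new RuntimeException("audit failed"))

// Bug: map produces a Future[Future[Unit]], which value discarding adapts to
// Future[Unit]; the inner failure is lost and this future succeeds.
val swallowed: Future[Unit] = fetchUser(1).map(name => audit(name))

// Fix: flatMap keeps the inner future in the chain, so the failure propagates.
val propagated: Future[Unit] = fetchUser(1).flatMap(name => audit(name))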

The compiler flag -Ywarn-value-discard gives you some protection, but you need to turn it on explicitly, and it may produce a fair bit of noise in an existing large codebase. So keep an eye out for this issue, and be forewarned.
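If you build with sbt, enabling the flag is a one-line scalacOptions setting in build.sbt:

scalacOptions += "-Ywarn-value-discard"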

Write the Code You Want

“Write the code you want and then make it compile” was a thought on library design expressed while I was at the NE Scala Symposium. It is a different way to describe the TDD maxim of letting the usage in tests guide the design, and it is very much influenced by Scala’s extremely flexible syntax rules and DSL-creation abilities. One of the talks, Can a DSL be Human? by Katrin Shechtman, took a song’s lyrics and produced a DSL that would compile them.

Since you can make almost any arbitrary semantics compile, there is no reason you can’t have the code you want for your application. There may be an underlying library layer that isn’t the prettiest code, or that is significantly verbose, but you can always make it work. Segregating the complexity to one portion of the code base means that most of the business logic reads cleanly and that the related errors can be handled in a structured, centralized fashion.
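As a small illustration of the approach (this DSL is hypothetical, not from the talk): write the call site you want first, then add just enough machinery to make it compile.

object DeployDsl {
  case class Deployment(service: String, environment: String)

  class DeployWord {
    def service(name: String) = new ServiceWord(name)
  }
  class ServiceWord(name: String) {
    def to(env: String): Deployment = Deployment(name, env)
  }

  val deploy = new DeployWord

  def main(args: Array[String]): Unit = {
    // The sentence we wanted to write, compiling via plain method chaining:
    val d = deploy service "billing" to "prod"
    println(d) // Deployment(billing,prod)
  }
}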

Taking the time to do all of this for a little utility probably isn’t worth it, but the more widely a library is used, the more valuable this becomes. If you’ve got a library that will be used by hundreds of people, really refining the interface so it matches how users think is worth the effort.

Building software that works is the easy part; building an intuitive interface and the comprehensive documentation so others can understand what a library can do for them is the hard part. I’m going to take this to heart with some changes coming up in a library at work.

This still doesn’t even cover the question of deciding what you want. There are different ways to express the same idea: a plain function, a symbolic operator, or a full DSL can all provide the same functionality. You can model the domain with case classes, enums, or a sealed trait. You can declare a trait, a free function, or an implicit class. Deciding on the right way to express all of this is the dividing line between a working library and a good library.
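For instance, here are two of those modeling choices side by side (a sketch; the right one depends on whether the cases need to carry data):

// A sealed trait allows per-case data and exhaustiveness checking in match expressions.
sealed trait Status
case object Active extends Status
case class Suspended(reason: String) extends Status

// A plain Enumeration is lighter weight but cannot attach data to individual cases.
object StatusEnum extends Enumeration {
  val Active, Suspended = Value
}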

Category Theory Intro

While I was at the NE Scala Symposium there was a lot of discussion of the finer points of functional programming, particularly category theory and monads. While I’d like to say I understood everything that was going on, I won’t claim I did. I had seen these topics discussed before but never really got them, or why they were valuable. For the first time I came away understanding the basic concepts, and I wanted to write them out with the beginner in mind, since the biggest obstacle is the terminology, and Wikipedia assumes you’ve got a significant education in abstract math. As someone who isn’t an expert on this, I’m going to simplify and probably end up misusing some terms, but hopefully the basic ideas will come through well enough that you can read more advanced works, or understand why you don’t need to.

Starting from the theoretical side, a category is made up of three things: objects, arrows, and a means of composing arrows that is associative and has an identity. The arrows represent transitions between objects. That’s it.

There is a special class of categories called monoids: categories with only one object. At first this seems kind of pointless, but depending on how you define that object they become interesting. For example, if you define the object as the set of integers and the arrows as individual numbers, then the composition of arrows becomes an operator like addition. Addition is associative, and adding 0 is an identity, so the laws hold. It is a somewhat odd way to define addition, but it guarantees a couple of useful properties of the system.

But what good is this monoid structure? It is the mathematical underpinning of why you can fold over a collection and why MapReduce is provably correct.
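Concretely: because integer addition is associative with identity 0, it doesn’t matter how you split the work, which is exactly the freedom MapReduce relies on to partition a computation.

val xs = List(1, 2, 3, 4, 5, 6)

// Fold the whole list at once...
val total = xs.foldLeft(0)(_ + _) // 21

// ...or split it, fold the pieces, and combine the partials; associativity and
// the identity element guarantee the same answer.
val (left, right) = xs.splitAt(3)
val combined = left.foldLeft(0)(_ + _) + right.foldLeft(0)(_ + _) // also 21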

A monad follows the same pattern one level up: it is a way of transforming values within a structure, where the transformations compose associatively and an identity is available. Where have we seen those properties before? A monoid! The integer example is straightforward if dubiously valuable; with something more complex it makes more sense. Imagine you have some structure with the ability to transform into another, similar structure, where the transformations have an identity and are associative. What is that? It’s flatMap.
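Those monoid-like laws show up directly in flatMap. Here is a sketch of them for Option, with Some(_) playing the identity role:

def pure[A](a: A): Option[A] = Some(a)
val f: Int => Option[Int] = x => Some(x + 3)
val g: Int => Option[Int] = x => if (x != 0) Some(100 / x) else None

// Left identity: pure(a).flatMap(f) == f(a)
assert(pure(2).flatMap(f) == f(2))
// Right identity: m.flatMap(pure) == m
assert(Some(2).flatMap(pure) == Some(2))
// Associativity: (m.flatMap(f)).flatMap(g) == m.flatMap(a => f(a).flatMap(g))
assert(Some(2).flatMap(f).flatMap(g) == Some(2).flatMap(a => f(a).flatMap(g)))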

I’m going to switch ends and come at this from the programming side for a minute. We’ve got structures that hold values and can be converted into other, similar structures. That sounds like a program to me. If you’ve got a List[String] and you convert it into another List[String], those strings could have been URLs whose contents you fetched, or file names whose contents you read. This tells us a couple of things: string-typed things interoperate with all sorts of stuff, and the act of the transformation can be abstracted from the data being transformed. The first part underlies the idea that if a functional program compiles, it’s probably right, since functional programs tend to define more specific data types that ensure you are putting the pieces together correctly. The second represents the separation of data from the transformations of data, i.e., from the actual business logic of the application. This is the opposite of object-oriented programming, where code and data are coupled together in an object. So we’ve separated the code and the data, and without objects you’ve essentially got free functions that get composed into larger and larger computational pieces. There is no reason those transformations can’t be set up to be associative and have identities. So now you have a program represented as a series of monadic transformations. Easy.
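A tiny sketch of that separation (fetchContents and readFile are stand-ins for real I/O): the structure and the composition are identical; only the data and the transformation function change.

def fetchContents(url: String): String = s"<body of $url>"  // stand-in for an HTTP get
def readFile(name: String): String     = s"<text of $name>" // stand-in for file I/O

val urls  = List("http://a.example", "http://b.example")
val files = List("a.txt", "b.txt")

val pages = urls.map(fetchContents)
val texts = files.map(readFile)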

There is some space between the programming perspective and the math perspective, but a lot of it is just constructing categories in specific ways so the transformations and compositions are what you want, so I’ll leave that alone. I would, however, like to discuss why this matters. The separation of the transformation from the data also lets you separate the definition of a transformation from its execution. That makes your code inherently testable and opens it up to large classes of optimizations. Those optimizations are part of what makes functional programming parallel-processing friendly and highly performant.

So while you probably don’t care that your code has these underlying mathematical properties, you do care about what those properties give your program: they make it easy to test and easy to make fast. You don’t need to understand monads and category theory at a theoretical level to take advantage of any of this. Libraries like Cats are full of category theory terminology, which makes it harder to find what you want without learning the underlying vocabulary. Calling something a Combinable rather than a Semigroup would make it more accessible, but it would also obscure why it works.

Service Creation Overhead Followup

I previously mentioned the new service we were spinning up and the discussion of the overhead involved. Having gotten the initial version of the service into production, I feel like I have some answers now. The overhead wasn’t that bad, but it could have been lower.

The repo was easy, as expected. The tool for setting up the CI jobs was quite helpful, although we didn’t know about a lot of the configuration options available to us. We initially configured it with the options we were familiar with, but found ourselves going back to make a couple of tweaks. The code generators worked out great and saved a ton of time getting started.

The environment configuration didn’t work out as well. The idea was that the new service would pick up defaults for essentially all of its configuration, reducing the time we would need to spend figuring it out ourselves. This worked reasonably well in the development environment. In the integration environment we ran into problems because the default configuration was missing some required elements; as a result, no port mappings were set up and nothing could talk to our container. We burned a couple of hours sorting that out. Then, in the preproduction environment, we found its port mapping settings were different from the lower environments and needed to be set up differently again. Here we burned even more time, since the service isn’t exposed externally and we needed to figure out how to troubleshoot the problem differently.

In the end I still think spinning up the new services on this short timeframe was the right thing to do – we would have had to learn this stuff eventually when building a new service. Doing it all on the tight timeline was unfortunate, but getting the services factored correctly was worth it.

Productivity and Maintainability

As I alluded to previously, the project I’m currently on has some very aggressive, fixed deadlines. The project got scoped down as much as possible, but there have still been some trade-offs of maintainability for productivity. Some of it happened as a team decision – we discussed ways to compress the schedule and found some shortcuts to produce the needed functionality faster. But as new requirements came up, those shortcuts ended up causing much more work overall.

We knew the shortcuts would cause more total work, but we thought the additional work could be deferred until after the initial deadline. The change in requirements meant that some of the automated testing we had deferred became an issue: the initial manual testing plus the retest after the changes probably cost not much less than doing the automated testing up front would have. This specific shortcut wasn’t responsible for much of the time we shaved off the schedule, but when you’re doing similar things all over, the total cut was significant.

The key part of this tradeoff is that it assumed we would go back after the initial deadline and fill in the deferred bits as intended, instead of moving on to something else (which is looking more and more likely). Product management has requested an additional feature in another existing application. This request has a similarly tight deadline, but is much smaller in scope (15 points versus ~120 points for the scoped-down version of the initial request). Picking it up would move us further from the decision to defer, and we’d be less likely to deal in a timely manner with the technical debt we accrued to deliver the first request.

Since I’m still new at this job, I’m not sure whether this behavior is the norm, or whether it’s a one-time thing because both of these items are related to the same event. If it’s a one-off problem, we can survive more or less regardless of how well we deal with it. If it’s going to be a regular occurrence with no opportunity to pay down the debt, I feel like I’d need to take a different tack in the future on how to handle short-term needs.

This feels like the overall challenge of engineering: there are problems that need to be solved in a timeboxed way, and we need to make good-enough solutions that fit all of the parameters, be they time or cost. We can be productive, succeed at business objectives, and leave a disaster in our wake, or we can build an amazing throne to software architecture and have the project fail for business reasons. The balance between the two is where this gets really hard. If you’ve already been cutting corners day to day and a request to move the needle comes in, you have no resources left. The lack of any systematic way to quantify the technical debt of an application makes it hard to show where we are on that continuum to people who aren’t working in the system daily.

Without this information, program-level decisions are hard to make, and you end up with awkward mandates from the top that aren’t based on the realistic situation on the ground. Schedule pressure causes technical debt; test coverage mandates produce brittle tests written just to ratchet up the numbers; mandated technology and architectural decisions result in applications that just don’t work right.

Cloud Computing Patterns

A colleague recommended this list of application design patterns for cloud computing. A common dictionary of terminology is always valuable, and this is a good collection of the patterns used in cloud applications. Using the abstract pattern names rather than specific implementation names helps lift the discussion from a single implementation to the more general architectural question. A specific implementation has specific strengths, weaknesses, and biases, while a generic pattern is the purer concept, which lets the initial discussion focus on goals rather than implementation specifics.

I read through the patterns and found a couple I hadn’t been aware of, like the timeout-based message processor, the community cloud, and the data abstractor. The timeout-based message processor is a name for a concept I was familiar with but never had a name for: a message-handling process where you ack the message after processing, as opposed to on receipt. The community cloud is a pattern for sharing resources across multiple applications and organizations. The data abstractor is a way to handle multiple services having eventually consistent data while showing a single consistent view, by abstracting the data to hide the inconsistencies. None of these three is a mind-blowing solution to a problem I’ve been having, but they’re all interesting tools to have in my toolbox.
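A minimal sketch of the timeout-based message processor idea (the Queue trait and its receive/ack operations are assumptions for illustration, not any specific client’s API):

trait Queue[A] {
  def receive(): Option[A] // message becomes invisible for a visibility timeout
  def ack(msg: A): Unit    // permanently removes the message
}

def processOne[A](queue: Queue[A])(handle: A => Unit): Unit =
  queue.receive().foreach { msg =>
    handle(msg)    // if this throws, we never ack...
    queue.ack(msg) // ...so the timeout expires and the message is redelivered
  }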

The ideas range from base-level cloud computing knowledge to solutions for quite specific problems. The definitions of the various patterns are concise and reference related patterns, so even if you don’t know exactly what you are looking for, you should find your way to it fairly quickly. Overall, worth checking out if you are working on a distributed application.

Reactive Manifesto

I ran across the Reactive Manifesto while doing some reading on Scala. I touched on this idea back in the Virtual Conference post with the presentation What does it mean to be Reactive? by Erik Meijer, whose 45-minute talk tried to explain the nuances of the concept. The manifesto takes the basic concept of reactive programming and makes the big ideas accessible.

The four aspects of reactive applications – responsive, resilient, elastic, and message-driven – all reinforce one another. Responsive is straightforward: the application should return results quickly. Resilient means the system is designed to resist faults and remain responsive. Elastic means the system can react to a changing input rate, adding or removing resources to keep up with demand. Message-driven means the system operates asynchronously. Together they build a cohesive style of application.

The message-driven nature of a reactive application allows the system to use back pressure. Back pressure lets the system control the flow of requests so that it remains responsive while the elastic aspect kicks in and applies additional resources. The message-driven nature also gives resiliency additional options, since the services are further decoupled.
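A minimal sketch of back pressure using a bounded queue (plain java.util.concurrent standing in for a real message-driven transport): when the consumer falls behind, the producer finds out instead of work piling up without limit.

import java.util.concurrent.{ArrayBlockingQueue, TimeUnit}

val inbox = new ArrayBlockingQueue[String](100) // bounded capacity is the pressure valve

// Returns false when the queue stays full past the timeout, signaling the
// producer to slow down, retry later, or shed load.
def submit(msg: String): Boolean =
  inbox.offer(msg, 50, TimeUnit.MILLISECONDS)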

The reactive name may be new, but the concepts are old; HTTP could be described as reactive. It is responsive in that timeout settings bound how long a caller waits. An HTTP server is designed to be resilient and to fail fast, with descriptive error codes that indicate how to react – a 404 is different from a 503, and the caller can respond differently to the two. The system is elastic since there are established ways to scale horizontally. And HTTP is clearly message-driven, with its request and response paradigm.

This ties into the Scala ecosystem via Scala.react and other libraries that enable reactive programming on the platform. Scala is a good fit because the functional paradigm makes it easier to work in a reactive manner. Martin Odersky, the designer of Scala, was also an early proponent of reactive programming, which created a fertile breeding ground for working on reactive concepts.

Using this paradigm at a smaller scale seems like a great way to create modular systems, and the pattern also complements the microservices architecture. I’m looking forward to working with these concepts as I get more experience with Scala.

Book Chat: Don’t Make Me Think Revisited

Don’t Make Me Think Revisited by Steve Krug is a usability and user-experience book: a series of well-illustrated rules for getting the basics of usability into a web site. There is also an excellent chapter on how to run your own usability test on the cheap. The revisited edition adds material on mobile apps and mobile web sites. The rules for creating a good navigation system were a great way to codify things you know but may have struggled to express.

Krug’s guidance on running your own cheap usability study comes down to using fewer users and a simpler reporting plan. Instead of recruiting 20 users and generating a giant report, you get two or three users and select a couple of important things that can actually be fixed. You don’t need a report with hundreds of issues, since once you fix a few, the rest may no longer apply. Run one study, fix those issues, then iterate and try again. It’s almost like A/B testing, except it works at small scale.

The information on mobile testing goes beyond the usual advice of “make the buttons bigger” and “put less on each page.” Krug’s advice on deep linking on mobile is obvious – navigating to a full-site URL on a mobile device needs to work – but nobody seems to get it right. And, even though this is minor, the number one thing I got from the mobile chapter was a way to combine a webcam and a clip-on light into a mount that lets you see what the user’s hands are doing while they’re using the app. That’s far more useful than conventional video recording setups where you mount the phone and the user no longer holds it normally. Plus it only takes an hour to put together and costs about $30.

Krug’s rules for a navigation system basically boil down to the obvious: keep the navigation consistent page to page, make it clear where you are in the application, and put the things most people want in easy-to-find places. Also ensure that every page has a name in a consistent place, which maybe isn’t as immediately intuitive as the other rules; when I went and checked some sites, though, most do this. Putting all of these rules together reads as an explanation of the tabbed design and why it became so common.

Overall, design is not one of my strong interests, but this is a good primer on the topic that will keep you from doing anything too silly. I certainly feel like I can put Krug’s advice into practice. There are also lots of references to other books if you want to dig deeper.