Organizing Code

I’ve been arguing with myself about the proper way to split up some code that has related concerns. The code in question relates to fetching secrets and doing encryption. The domains are clearly related, but the libraries aren’t necessarily coupled. The encryption library needs secrets, but secrets are simple enough to pass across in an unstructured fashion.

As I mentioned before, we are integrating Vault into our stack. We are planning on using Vault to store secrets. We are also going to be using their Transit Encryption Engine to do Envelope Encryption. The work to set up the Envelope Encryption requires a real relationship between the encryption code and Vault.

There are a couple of options for how to structure all of this. There are also questions of binary compatibility with the existing artifacts, but that’s bigger than this post. The obvious components are configuring and authenticating the connection to Vault, the code to fetch and manage secrets, the API for consuming secrets, and the code to do encryption. I’m going to end up with three or four binaries: encryption, secrets, the secret API, and maybe a separate Vault client.

That would be the obvious solution, but the question of what the Vault client exposes is complex, given that the APIs used by the encryption and secrets libraries are very different. It could expose a fairly general API that is essentially for making REST calls and leave parsing the responses to the two libraries, which isn’t ideal. Alternatively, the Vault client could be a toolkit for building a client instead of a full client. That would allow the security concerns to be encapsulated in the toolkit while letting each library build its own query components.

Since the authentication portion of the toolkit would get exposed through the public APIs of the encryption and secret libraries, that feels like a messy API to me, and I’d like to do better. It seems like there should be an API where the authentication concerns are entirely wrapped up in the client toolkit. I could use configuration options to avoid exposing any actual types, but that’s just hiding the problem behind a bunch of strings and makes the options less self-documenting.
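
As a minimal sketch of the shape I have in mind, with authentication owned entirely by the toolkit; all of the names here are hypothetical, not a real Vault client API:

import scala.concurrent.Future

case class VaultConfig(address: String, roleId: String, secretId: String)

// the toolkit authenticates and renews tokens internally; callers only
// get the ability to make already-authenticated requests, never a token
trait VaultToolkit {
  def request(path: String, body: Option[String]): Future[String]
}

object VaultToolkit {
  def apply(config: VaultConfig): VaultToolkit = ??? // wiring elided in this sketch
}

// each library builds its own query layer on top of the toolkit and
// parses responses however it needs to
class SecretsClient(toolkit: VaultToolkit) {
  def secret(name: String): Future[String] =
    toolkit.request(s"/v1/secret/data/$name", None)
}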

Like most design concerns, there isn’t a real right answer; there are multiple concerns at odds with each other. In this case it’s code duplication vs. encapsulation vs. discoverable APIs. Code duplication and encapsulation are going to win out over discoverable APIs, since the configuration should be set once and then never really changed, while the other two concerns contain the long-term maintenance costs of a library that will likely be used for a good while to come.

Book Chat: The Architecture of Open Source Applications Volume 2

The Architecture of Open Source Applications Volume 2 has writeups describing the internal structure and evolution of nearly two dozen different open source projects, ranging from tools to web servers to web services. This is different from volume one, which didn’t include any web service-like software, the kind of thing I build day to day. It is interesting to see the differences between what I’m doing and how something like MediaWiki powers Wikipedia.

Since each section has a different author, the book doesn’t have a consistent feel, or even a consistent organization to the sections on each application. It does, however, give space for some sections to spend a lot of time discussing the past of the project to explain how it evolved to its current situation. Looked at from the perspective of a finished product, some choices don’t make sense, but the space to explore the history shows that each individual choice was a reasonable response to the challenges being engaged with at the time. The history of MediaWiki is very important to its current architecture, whereas something like SQLAlchemy (a Python ORM) has evolved more around how it adds new modules to enable different databases and their specific idiosyncrasies.

I found the lessons learned provided with some of the projects to be the best part of the book. They describe the experience of working with codebases over the truly long term. Most codebases I work on are a couple of years old, while most of these were over 10 years old as of the writing of the book and are more than 15 years old now. Seeing an application evolve over longer time periods can truly help validate architectural decisions.

Overall I found it an interesting read, but it treads a fine line between giving you enough context on the application to understand the architecture and giving you so much context that the majority of the section is about the “what” of the application. I felt that a lot of the chapters dealt too much with the “what”. Some of the systems are also very niche, where it’s not clear how the architecture choices would apply to designing other things in the future, because nobody would really start a new application in that style. If you have an interest in any of the applications listed, check out the site and see the section there, and buy a copy to support their endeavours if you find it interesting.

Type Aliases in Scala

I had an interesting conversation recently with one of the junior engineers on my team about when to use type aliases. Normally when I get asked for advice, I’ve thought about the topic or at least have a rule of thumb I use for myself. Here, all I could manage to express was that I don’t use type aliases, but not for any particular reason. I felt I should do better than that and promised to get some better advice and see what we could do with it.

Having thought it through a little, here’s the guidance I gave. You can use type aliases to do type refinement, constraining an existing type. So, you could constrain an integer to only positive integers. Instead of assuming that some arbitrary integer is positive, or checking it in multiple places, you can push that check to the edge of your logic. This gives you better compile-time checks that your logic is correct and that error conditions have been handled.
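
A minimal sketch of the idea, assuming the invariant is checked once at the edge; the names are made up, and a bare alias is still interchangeable with Int as far as the compiler is concerned (a library like refined gives stronger guarantees):

object Types {
  // the alias names the intent; the smart constructor below is the
  // single place the invariant actually gets checked
  type PositiveInt = Int

  def positiveInt(i: Int): Either[String, PositiveInt] =
    if (i > 0) Right(i) else Left(s"$i is not positive")
}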

They can also be used to attach a name to a complex type. So instead of having an

Either[List[Error], Validated[BusinessObject]]

being repeated through the codebase, you can name it something more descriptive for the use case. This also allows hiding some of the complexities of a given type. So if, for example, you had a multiply nested function type like

String => (Int, Boolean) => Foo[T] => Boolean

a type alias can wrap all of that up in a meaningful name.
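
For instance, a sketch with stand-in definitions (the real types would come from the codebase, and the alias names are hypothetical):

object Aliases {
  // stand-ins so the sketch compiles; the real types come from the codebase
  case class Error(message: String)
  case class Validated[A](value: A)
  case class BusinessObject(id: String)
  case class Foo[T](value: T)

  // one name, declared once, instead of the full type repeated everywhere
  type ValidationResult = Either[List[Error], Validated[BusinessObject]]
  type FooPredicate[T]  = String => (Int, Boolean) => Foo[T] => Boolean
}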

None of this is really a good rule for a beginner, but it feels like it wraps up the two major use cases I was able to find. I ended up going back to the engineer who prompted the question with “use type aliases when it makes things clearer and is used consistently.” Neither of us was really happy with that. There are clearly more use cases that make sense, but we weren’t able to articulate them. We’re both going to try it in some code and come back around to the discussion later to see where that gets us.

Scala Varargs and Partial Functions

I ran into a piece of code recently that looked like

foo(bar,
  { case item: AType => … },
  { case item: AnotherType => … },
  { case item: YetAnotherType => … },
  // 10 more cases removed for simplicity
)

I was immediately intrigued, because it was a very odd construction and I was confused as to why someone would write a function accepting this many different partial functions and what they were up to. I went to look at the signature and found the below.

def foo(bar: ADTRoot, conditions: PartialFunction[ADTRoot, Map[String, Any]]*): Map[String, Any]

It was using the partial functions to pick items from the algebraic data type (ADT) and merge them into the map. More interestingly, it used the ability of a partial function to report whether it can operate on the specific subtype that bar happened to be. Overall it was an interesting combination of language features to create a unique solution.
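
A minimal sketch of how foo might work, assuming it folds the output of every matching condition into a single map; the ADT here is a stand-in:

sealed trait ADTRoot
case class AType(name: String) extends ADTRoot

def foo(bar: ADTRoot, conditions: PartialFunction[ADTRoot, Map[String, Any]]*): Map[String, Any] =
  conditions
    .filter(_.isDefinedAt(bar))               // each partial function reports whether it handles bar
    .map(_.apply(bar))
    .foldLeft(Map.empty[String, Any])(_ ++ _) // merge the picked items into one map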

Part of it is that the ADT was missing some abstractions that should have been there to make this sort of work easier, but even then we would have had three cases, not a dozen. I’m not sure if this pattern is a generalizable solution, or even desirable if it is, but it got me thinking about creative ways to combine the language features Scala provides.

Book Chat: Refactoring

Refactoring sets out to describe what refactoring is, why you should refactor code, and to catalog the different refactorings that can be done to an object-oriented codebase. This isn’t the first instance of the idea of refactoring, but it was the big coming-out party for the idea in 1999. It is an audacious goal, in that the effort to catalog all of anything can be daunting. While I’m not an authority on refactoring by any means, it certainly captured all of the basic refactorings I’ve used over the years. It even includes refactoring to the template method design pattern, though it doesn’t cover refactoring to, say, the decorator pattern. It seems odd to have included refactoring to one design pattern but not to several others.

The descriptions of the “what” and “why” of refactoring are excellent and concise. The catalog is ~250 pages of examples and UML diagrams covering each refactoring technique; showing every single refactoring feels like overkill. The author generally shows both directions of a refactoring, e.g., extract method and inline method, which can be rather overwhelming. A newer volume like Working Effectively With Legacy Code seems more useful in its presentation of actual refactoring techniques, in that it prioritizes where we wish to go rather than exhaustively describing each individual modification. Honestly, I think that since Refactoring predates automated refactoring tools, and the internet in 1999 wasn’t as full of help on these sorts of topics, the book needed to be this specific because it was the only source of help.

It’s an interesting historical piece, but not an actively useful one for improving your craft.

Serverless Architecture

The serverless architecture is an architectural pattern where you create a function to be run when an event occurs. The event can be a message being placed on a queue, an email being received, or a call to a web endpoint. The idea is that you don’t need to think about how the code is being hosted, which allows scale-out to happen without you being involved. This is great for high availability, cost control, and reducing operational burdens.

The serverless architecture seems to result in setting up a number of integration databases. Imagine the route /foos/:id with both a GET and a PUT available. Hosting these in a serverless fashion means that you’ve got two independent pieces of code interacting with a single database. In my mind this isn’t significantly different from having two services interacting with the same database.
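
As a minimal sketch of that situation (the connection details and names are placeholders, not any particular serverless platform’s API), picture the two handlers deployed independently:

import java.sql.DriverManager

// deployed and versioned independently of PutFoo
object GetFoo {
  def handle(id: Long): Option[String] = {
    val conn = DriverManager.getConnection("jdbc:postgresql://db/foos")
    try {
      val stmt = conn.prepareStatement("SELECT body FROM foos WHERE id = ?")
      stmt.setLong(1, id)
      val rs = stmt.executeQuery()
      if (rs.next()) Some(rs.getString("body")) else None
    } finally conn.close()
  }
}

// deployed and versioned independently of GetFoo, yet both share one
// schema; this is the integration database in miniature
object PutFoo {
  def handle(id: Long, body: String): Unit = {
    val conn = DriverManager.getConnection("jdbc:postgresql://db/foos")
    try {
      val stmt = conn.prepareStatement("UPDATE foos SET body = ? WHERE id = ?")
      stmt.setString(1, body)
      stmt.setLong(2, id)
      stmt.executeUpdate()
      ()
    } finally conn.close()
  }
}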

I went looking around for anyone else discussing this seemingly obvious problem, and found this aside from Martin Fowler comparing them to stored procedures. The stored procedure comparison seems apt, since most endpoints seem to me to be wrappers around database calls, your basic CRUD stuff. These endpoints, just like most stored procedures, are a fairly simple query. Stored procedures started out as a good way to isolate your SQL in a particular layer and enable you to change the underlying design of the database. If you’ve never worked in a large codebase with stored procedures, they can evolve in lots of negative ways: stored procedures with business logic in the database, stored procedures that evolved to take 12 arguments (some of which are ignored and only there for backwards compatibility), or six variations of a stored procedure with no clear indication of how they differ. These are all downsides I have seen in real codebases.

I can imagine the serverless equivalents of these problems: you would have an endpoint querying some database, unaware that the database is shared, until eventually the shared database becomes an impediment to development progress. Serverless architecture may still be interesting for a truly stateless endpoint. But even then, you could end up with the PUT and GET endpoints not being in sync as to what the resource looks like. There isn’t a good way yet to package serverless applications, and the idea of orchestrating an entire application’s worth of endpoints seems a daunting task. Using something like GraphQL, which radically limits the number of endpoints being exposed, would simplify the deployment orchestration. But while GraphQL has seen significant adoption, it isn’t the solution for every problem.

Given these issues, I don’t see the appeal of the serverless architecture for general application development and backing REST endpoints. For pulling from a queue, processing emails, or processing other event sources and interacting with web services, it seems a good solution, since there is a single input source that just produces additional events or other web service calls.

The serverless architecture for standard web apps seems like a choice that could result in a mountain of technical debt going forward. This is especially likely because the serverless architecture is most attractive to the smallest organizations, due to the potential hosting savings, and those organizations are the least capable of dealing with the complexity. They are the same organizations that are prone to monolithic architectures or integration databases, since they have a short-term view of events and limited resources to deal with the issues.

I don’t know what the eventual fallout from applications built with this architecture will be, but I do suspect it will employ a lot of software engineers for many years, digging applications out of the pain inflicted.

Snipe Hunt

Recently I got pulled into a project to help shepherd a mostly finished feature through a “final QA round” before release. I felt this wouldn’t require much of my time, but as you can imagine, things didn’t quite go as expected. The QA round found about a dozen errors in the new feature, which I eventually divided into two classifications: requirements SNAFUs and code quality issues.

The requirements SNAFUs were the sorts of problems where the original programmer built what was explicitly asked for, but QA took the “one of everything” approach and tried all sorts of cases that weren’t specified at all. These sorts of problems consume time but aren’t that difficult to fix. The code quality issues are much more pernicious.

Digging into the code itself, I quickly found an interesting fact. There were two fields, currentPlanId and activePlan, that were being mutated in various portions of the application, generally together. There wasn’t any clear distinction between the active plan and the current plan in the code, and at one point currentPlanId was being set to the id from the active plan, sort of implying they were the same thing but with poor naming. There were other places where one or both of them would mutate, and I went about tracing what caused the two to diverge or converge.

On initial page load the two would be different, with the active plan being blank; then, when an item was selected in the drop down, the two could converge, depending on what was selected. I went looking for the tests covering this, hoping for some clarification of the intended scenarios, and turned up none. At this point I let others know of my findings: while the problem seemed minor, there was a bigger quality problem under the hood of the system.

The first code change I made was a relatively minor one affecting when a particular button should show up; I added a special case, and another test case started behaving. So far so good. Then I started tweaking the functions that were setting currentPlanId and activePlan. By this point I had managed to figure out that “current” was a chronological state and “active” was a UI state, but it still wasn’t clear how the system was deciding which plan was current. This obscured information seemed to be intertwined with the cause of a lot of the remaining broken cases.

I followed the web service calls back through various layers of microservices to where I knew the information had to be coming from, and made an intriguing discovery. The frontend was deciding which plan was current based, incorrectly, on the timing between two different web service calls. I started digging around for the right way to set all of this information, and it became clear that the initial architecture was missing a layer to coordinate the requests at initial page load.

That got everything trending in the right direction. I still want to find some time to work through some additional unit tests and leave the code in a good state rather than just a better state.

Book Chat: Growing Object-Oriented Software Guided By Tests

Growing Object-Oriented Software Guided By Tests is an early text on TDD. Since it was published in 2010, the code samples are fairly dated, but the essence of TDD is all there. You need to look past some of the specific listings, since the choice of libraries (JUnit, jMock, and something called Window Licker that I had never heard of) seems to have fallen out of favor. Instead, focus on the listings where they show all of the steps and how their code evolved while building out each individual item. It’s sort of as if you are pair programming with the book, in that you see the thought process and the intermediate steps that would never show up in a commit history, sort of like this old post on refactoring but with the code intermixed.

This would have been mind-blowing stuff to me in 2010; however, the march of time seems to have moved three of the five parts of the book into “correct but commonly known” territory. The last two parts cover what people are still having trouble with when doing TDD.

Part 4 of the book really spoke to me. It is an anti-pattern listing, describing ways they had seen TDD go off the rails and options for dealing with each of those issues. Some of the anti-patterns were architectural, like singletons; some were specific technical ideas, like patterns for making test data; and some were more social, in terms of how to write the tests to make them more readable or create better failure messages.

Part 5 covers some advanced topics, like how to write tests for threaded or asynchronous code. I haven’t had a chance to try the strategies they show, but they do look better than the ways I had coped with these problems in the past. There is also an awesome appendix on how to write a Hamcrest matcher, which, the first time I had to do it, proved more difficult than it looks.

Overall, if you are doing TDD and running into issues, checking out part 4 of this book could easily help you immediately. Parts 1 through 3 are still a great introduction to the topic if you aren’t already familiar. I didn’t have a good book to recommend on TDD before, and while this one isn’t amazing in all respects, I would recommend it to someone looking to get started with the ideas.

Continuation Passing Style

I have been doing some work with a library that creates guards around various web endpoints. The guards have different kinds of authentication and authorization rules, but they are all written in a continuation passing style. The idea of continuation passing style is that you give some construct a function to ‘continue’ execution with once it does something. If you’ve ever written an event handler, that was a continuation. The usages all look somewhat like

secureActionAsync(parseAs[ModelType]) { (userInfo, model) => /* user code goes here */ }

There was some discussion around whether we wanted to do it like that or with a more traditional control flow like

secureActionAsync(parseAs[ModelType]) match {
  case Allowed(userInfo, model) => /* user code, then common post action code */
  case Unallowed                => /* common error handling code */
}

The obvious issue with the second code sample is the need to call the error handling code and post action code by hand. This creates an opportunity to forget to do so, or to do so incorrectly. The extra routine calls also distract from the specific user code that is the point of the method.
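
To make the comparison concrete, here is a hedged sketch of how a guard like secureActionAsync might be built; Request, Result, and the helpers are stand-ins, not the real library’s API:

import scala.concurrent.{ExecutionContext, Future}

// stand-in types; the real ones would come from the web framework
case class Request(headers: Map[String, String], body: String)
case class Result(status: Int, body: String)
case class UserInfo(id: String)

def authenticate(request: Request): Option[UserInfo] =
  request.headers.get("Authorization").map(UserInfo(_))

def postAction(result: Result): Result = result // common post action code

def secureActionAsync[A](parse: Request => Either[Result, A])(
    block: (UserInfo, A) => Future[Result])(
    implicit ec: ExecutionContext): Request => Future[Result] = { request =>
  authenticate(request) match {
    case None => Future.successful(Result(401, "unauthorized")) // common error handling, in one place
    case Some(user) =>
      parse(request) match {
        case Left(bad)    => Future.successful(bad)
        case Right(model) => block(user, model).map(postAction) // common post action code, in one place
      }
  }
}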

There were some additional concerns about the testability and debuggability of code in the continuation passing style. The debugging side does have some complexity, but it isn’t any more difficult to work through than a normal map call, which is already common in the codebase. The testability aspect is somewhat more complex. The method can be overridden with a version that always calls the action, but that version still needs to run the common post action code, which may or may not need to be mocked. If it doesn’t need to be mocked, this solution works great. If it does, putting together a method that sets up the mocks can simplify the issue.

This usage of a continuation helps collect the cross-cutting concern and keeps it all in one place. You could wrap up this concern other ways, notably with something like AspectJ. The issue with AspectJ is that it is much less accessible than the continuation passing style. AspectJ for this problem is like using a bazooka on a fly: it can solve it, but the level of complexity introduced isn’t worth it.

Future[Unit] Harmful

I’m not the first person to have seen this behavior, but I saw it bite another engineer recently, so it’s clearly not well enough known. The Unit type in Scala is a special type: there is effectively an implicit conversion from every type to Unit. If you are interested in the specifics, check out this post; I’m not going to get into how it works or the edge cases. This means that you can write something like

val doubleWrapped: Future[Unit] = Future.successful(Future.successful(true))

and that compiles. It breaks the type safety that we have come to expect. More intriguingly, if you were to do

val innerFailure: Future[Unit] = Future.successful(Future.failed(new RuntimeException))
innerFailure.value

what would you expect the result to be?

If you said

Some(Success(()))

you would be right, but that is probably not what you were looking for. If you used a map where you needed a flatMap, you would end up with compiling code that silently swallows all error conditions. An appropriate set of unit tests will catch this, but it’s easy to have code that looks simple enough that it doesn’t seem to need any.
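
A small illustration of that failure mode; the step names here are made up:

import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

def loadUser(): Future[Int] = Future.successful(42)
def updateUser(id: Int): Future[Int] = Future.failed(new RuntimeException("boom"))

// map where flatMap was intended: the inner Future[Int] is discarded
// as Unit, so pipeline completes successfully even though updateUser failed
val pipeline: Future[Unit] = loadUser().map(id => updateUser(id))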

The compiler flag -Ywarn-value-discard should give you some protection, but you need to turn it on explicitly, and it may produce a fair bit of noise in an existing large codebase. So keep an eye out for this issue, and be forewarned.