Both Project Management and Coaching?

This is somewhat outside of the normal material I write about, but this idea came to mind and I wanted to take an opportunity to explore it further. The context of this thought was that a mentoring program is being launched at work. Why is a manager both responsible for guiding the day-to-day execution of a project and for development of staff beneath them?
The goals of the project and the goals of personnel development are often at odds. Any specific experience a person needs or wants to get is likely something they don’t have much (if any) experience with yet, which means they won’t be as efficient or as good at that particular skill. While taking care of your people gains management flexibility and goodwill in the long term, it has short-term costs.
The product owner is separated from the scrum master, separating the “what” from the “how” in order to prevent a conflict of interest. Similarly, splitting the staff development aspect from the technical management of the project seems like it could prevent a similar conflict of interest about an individual. Maybe a stronger explicit mentorship program would mitigate this conflict of interest, but unless you assign a mentor who is both capable of and interested in working on staff development it wouldn’t help much. I have seen an explicit mentorship program where serving as a mentor was informally required for promotion to a higher level, which resulted in people becoming mentors explicitly to check the box for their own personal benefit.
By setting up part of management to be incentivized to do staff development and achieve technical excellence, rather than completing projects or shipping features, you can create an environment that allows those closest to the system under development to do their best without the pressure to achieve explicit results. This reminds me of the idea of slack in queueing theory, where putting less work into the system means more work will come out. Once you build up staff appropriately and get everyone cross-trained, the overall outcome becomes better. Think of it as an optimization problem where you may have achieved a local maximum; getting to a higher peak would be better, but the cost of crossing the valley between the peaks needs to be paid.
You could theoretically have the manager of the developers on a team be involved with the work of a different team, but it would be hard to see what those developers needed in order to help develop them in the longer term. Spending all day with a group of other developers working on the technical problems you face doesn’t really give you the insights necessary to see what a completely separate group of developers is struggling with. If you look at a problem with a new perspective, you can easily see different solutions, but you could also miss important details about the problem itself and backtrack over already trodden territory.
Maybe this is just my perspective based on the places I’ve worked, where the scrum master has been more of a process leader and impediment resolver than a technical coach or project manager. In my experience, development managers have spent a large portion of time working with the product owner to help factor stories in a more completable fashion and to derive the technical requirements from the business requirements. It always felt like the development manager spent more of their time working on those urgent but not really important things, and ignoring the important but delayable things because they were hard.
To answer the original question “Why is a manager both responsible for guiding the day-to-day execution of a project and for development of staff beneath them?” it seems to be because splitting up the responsibility differently won’t give management the day-to-day visibility needed to provide useful developmental guidance. The ability to coach people on how to perform their role better requires that you spend enough time seeing them perform the role, so you can’t really be engaged in other technical activities on a day-to-day basis. Even if a separate manager had the insight to do so, it is exceedingly hard to measure staff development, which would make it hard to create goals and metrics around the activity. If you’ve had a different experience with how these roles have been broken down, post a comment.

Future[Unit] Harmful

I’m not the first person to have seen this behavior, but I saw it bite another engineer recently so it’s clearly not well enough known. The Unit type in Scala is a special type: there is effectively an implicit conversion from every type to Unit. If you are interested in the specifics check out this post; I’m not going to get into how it works or the edge cases. This means that you can write something like

import scala.concurrent.Future
val doubleWrapped: Future[Unit] = Future.successful(Future.successful(true))

and that compiles. It breaks the type safety that we have come to expect. More intriguingly if you were to do

val innerFailure: Future[Unit] = Future.successful(Future.failed(new RuntimeException))
innerFailure.value

what would you expect the result to be?

If you said

Some(Success(()))

you would be right, but that would probably not be what you were looking for. If you had a map where you needed a flatMap you would end up with compiling code that silently swallows all error conditions. If you have an appropriate set of unit tests you will notice this problem, but you can easily have code that looks simple enough that it doesn’t seem to need any unit tests.

The compiler flag -Ywarn-value-discard should give you some protection, but you need to turn it on explicitly and it may produce a fair bit of noise in an existing large codebase. So keep an eye out for this issue, and be forewarned.
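To make the map versus flatMap case concrete, here is a minimal sketch. The save function and the object name are hypothetical, invented only for illustration; the point is just that both definitions compile against Future[Unit], and only one of them reports the failure.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object SwallowedFailure {
  // Hypothetical side-effecting step that always fails.
  def save(id: Int): Future[Unit] =
    Future.failed(new RuntimeException(s"could not save $id"))

  // map: the inner Future is discarded to fit Future[Unit], so its failure is lost.
  def swallowed(): Future[Unit] = Future.successful(1).map(id => save(id))

  // flatMap: the inner Future becomes the result, so its failure propagates.
  def propagated(): Future[Unit] = Future.successful(1).flatMap(id => save(id))

  def main(args: Array[String]): Unit = {
    println(Try(Await.result(swallowed(), 1.second)))  // a Success
    println(Try(Await.result(propagated(), 1.second))) // a Failure
  }
}
```

With -Ywarn-value-discard turned on, the map version should be the one that draws a warning.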

Generators for Infrastructure

I listened to a recent episode of .NET Rocks about CI/CD pipelines and the guest had built a yeoman generator to build out the pipeline. It is specific to the Microsoft offerings of Visual Studio Team Services and Team Foundation Services, but it is an interesting idea. There isn’t a reason you couldn’t build something similar on top of Travis, Circle, or Semaphore. I had touched on some of the other generators at work previously, but those were bootstrapping service and application code, not infrastructure. But that’s the advantage of infrastructure as code – you get all of the toolchain available to work with it.

At work we have code to build out the CI/CD pipelines but it is more of a desired state thing than a generator. It’s all built on top of a local Jenkins install, and regularly recreates all of the job definitions to prevent any changes made by hand from living very long. At previous jobs the automation for building different services and libraries was all a little different or built in a brittle template/multipart style that didn’t allow for any changes in a safe way. Our tool has been great for keeping all of the microservice jobs the same, templated in a way that makes upgrading them in groups easy enough, and even fairly testable.

The testing aspect is part of what’s most interesting. We boot up a Jenkins instance inside a container and generate the jobs into it, with some different configuration for notifications and for where to put generated artifacts, so that all of it goes to a black hole. The deployment jobs deploy the same version that is already running to the dev cluster, so it should all be idempotent. The biggest issue I’ve seen so far is that it runs a ‘representative sample’ of jobs as part of testing; that hasn’t always been as representative as we would want it to be, especially as we add new jobs. We’ve got roughly 300 repos but only maybe 150-200 use managed builds like this. The good news is they break much less often than the unmanaged ones, but the bad news is when they break it’s usually a large swath of them that go down all at once.
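The desired state idea can be sketched in a few lines. This is not our actual tool: Repo, Job, render, and reconcile are all hypothetical names, and keeping unmanaged jobs untouched is an assumption made for the sketch. It only illustrates why hand edits to managed jobs don’t survive a run.

```scala
object JobReconciler {
  // Hypothetical model: a repo to build and the job definition generated for it.
  final case class Repo(name: String, kind: String)
  final case class Job(name: String, definition: String)

  // Render the desired job for a repo from a shared template.
  def render(repo: Repo): Job =
    Job(s"${repo.name}-build", s"template=${repo.kind};repo=${repo.name}")

  // Desired state wins: every managed job is regenerated from the template,
  // overwriting whatever is there. Jobs with no managed repo are left alone.
  def reconcile(repos: Seq[Repo], existing: Map[String, Job]): Map[String, Job] = {
    val desired = repos.map(render).map(job => job.name -> job).toMap
    existing ++ desired
  }
}
```

Because render is a pure function of the repo, upgrading a whole group of jobs is a change to the template followed by another reconcile.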

AWS CodeStar was recently announced as a hosted version of this sort of tool. It creates repos, scaffolds a new application, can deal with permissions, and carries through to the provisioning of hosting as well. It has options for multiple languages and hosting options. So it seems like it is simplifying a lot of the initial setup of a service.

I’m not sure where the tipping point on the value of setting up your own infrastructure would be, but it seems like at least 100 repos depending on the complexity of each build. This sort of automation will become more important as microservices deployment continues to get bigger. If you haven’t done any sort of automation like this it’s worth looking into depending on the scope of the project.

Tagging in Macwire and Type Currying

We’ve been using the Macwire dependency injection framework and I recently got to use the tagging functionality for the first time. This was a great feature that enabled me to instantiate multiple instances of the same type and have the framework differentiate between the instances when choosing which to inject.

First, you define a trait to be the marker; it doesn’t need anything in it. Then, at the injection site it becomes

BaseType @@ Marker

When setting up the initial site, instead of being

wire[BaseType]

it becomes

wire[BaseType].taggedWith[Marker]

That’s all there is to the base case. You can chain multiple tags in a couple of different ways.

wire[BaseType].taggedWith[Marker].andTaggedWith[OtherMarker]

or

wire[BaseType].taggedWith[Marker with OtherMarker]

I combined all of this with another neat construction of the type system.

class Foo[Marker](dependency: Bar @@ Marker)

Once I set up the Foo type like this it lets me take a marked dependency in a generic way.
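To show what the marker buys you without pulling in the framework, here is a self-contained sketch. The Tag trait, the @@ alias, and the tagged helper are hand-rolled stand-ins written for this example, not Macwire’s own definitions, and Primary and Replica are hypothetical markers.

```scala
object TaggingSketch {
  // Hand-rolled stand-ins for the tagging types; NOT Macwire's definitions.
  trait Tag[U]
  type @@[T, U] = T with Tag[U]

  // Attach a compile-time-only marker; the runtime value is unchanged.
  def tagged[T, U](value: T): T @@ U = value.asInstanceOf[T @@ U]

  // Hypothetical markers; they need no members.
  trait Primary
  trait Replica

  class Bar(val name: String)

  // Generic in the marker, so one class can accept either flavor of Bar.
  class Foo[Marker](val dependency: Bar @@ Marker)

  val primaryBar: Bar @@ Primary = tagged[Bar, Primary](new Bar("primary"))
  val replicaBar: Bar @@ Replica = tagged[Bar, Replica](new Bar("replica"))

  val primaryFoo = new Foo[Primary](primaryBar)
  val replicaFoo = new Foo[Replica](replicaBar)
  // new Foo[Primary](replicaBar) would be rejected by the compiler.
}
```

Both values are plain Bar instances at runtime; only the compiler tells them apart.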

I would like to see some sort of syntax like

preference[Marker].wire[BaseType]

so that I don’t need to make changes to the type being instantiated in order to control the preference of what instance gets supplied. There is also the module syntax, where you can create a module for each set and then nest those modules to get the injections as desired, but I don’t want to have to organize my code in a specific way in order to have it work correctly.

All of this had me thinking about currying/partial application and the application of generic types. You can take a type with multiple generic parameters and apply some of them and leave the rest for later. It’s not something that happens often but it came to mind with the way I was using generic types there and it was an interesting insight into the symmetry between types and data.
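The symmetry is easier to see side by side. In this sketch the names add, addThree, Result, and parseInt are made up for illustration; it uses only the standard library.

```scala
import scala.util.Try

object PartialApplication {
  // Value level: supply the first argument now and the second one later.
  def add(a: Int)(b: Int): Int = a + b
  val addThree: Int => Int = add(3)

  // Type level: Either takes two type parameters; fix the left one
  // and leave the right one open for later.
  type Result[A] = Either[String, A]

  def parseInt(raw: String): Result[Int] =
    Try(raw.toInt).toOption.toRight(s"not a number: $raw")
}
```

addThree is to add what Result is to Either: the same thing with one slot already filled in.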

Write the Code You Want

“Write the code you want and then make it compile” was a thought expressed on library design while I was at the NE Scala Symposium. It is a different way to describe the TDD maxim of letting the usage in tests guide the design. It is very much influenced by the extremely flexible syntax rules and DSL creation abilities in Scala. One of the talks, Can a DSL be Human? by Katrin Shechtman, took a song’s lyrics and produced a DSL that would compile them.

Since you can make any set of arbitrary semantics compile, there is no reason you can’t have the code you want for your application. There is an underlying library layer that may not be the prettiest code, or may be significantly verbose but you can always make it work. Segregating the complexity to one portion of the code base means that most of the business logic is set up in a clean fashion and that the related errors can be handled in a structured and centralized fashion.

Taking the time to do all of this for a little utility probably isn’t worth it, but the more often a library is used the more valuable this becomes. If you’ve got a library that will be used by hundreds of people, refining the interface until it matches how they think makes it far more user friendly.

Building software that works is the easy part; building an intuitive interface and all of the comprehensive documentation so others can understand what a library can do for them is the hard part. I’m going to take this to heart with some changes coming up with a library at work.

This still doesn’t even cover the aspect of deciding what you want. There are different ways you can express the same idea. A function, a symbolic operator, or a DSL can all express the same functionality. You can express the domain in multiple ways: case classes, enums, or a sealed trait. You can declare a trait, a free function, or an implicit class. Deciding on the right way to express all of this is the dividing line between a working library and a good library.
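As a small illustration of those choices, here is the same operation expressed three ways. Money, add, the |+| operator, and the dollars syntax are all hypothetical names picked for this sketch.

```scala
object SameIdeaThreeWays {
  final case class Money(cents: Long)

  // 1. A free function.
  def add(a: Money, b: Money): Money = Money(a.cents + b.cents)

  // 2. A symbolic operator, added through an implicit class.
  implicit class MoneyOps(self: Money) {
    def |+|(other: Money): Money = add(self, other)
  }

  // 3. A tiny DSL so call sites read closer to the domain.
  implicit class MoneySyntax(amount: Int) {
    def dollars: Money = Money(amount * 100L)
  }
}
```

With the object’s members imported, add(Money(100), Money(250)) and 1.dollars |+| 2.dollars go through the same underlying function; which one you expose is the design decision.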

Three Monad Laws

I’m continuing to dissect and share my experiences at the NE Scala Symposium. Several presenters referenced the “three monad laws,” which I had never heard of before. A monad must adhere to three laws:

  1. Left Identity
  2. Right Identity
  3. Associativity

The expression of left identity and right identity were new to me. I understood the concept of identity but wasn’t familiar with the distinction between right and left. Apparently the identity I was familiar with assumed a commutative relationship, which means that the left and right identity are the same thing. From Wikipedia: “Then an element e of S is called a left identity if ea = a for all a in S, and a right identity if ae = a for all a in S. If e is both a left identity and a right identity, then it is called a two-sided identity, or simply an identity.” Trying to think of an example, division has a right identity of 1, since a / 1 = a, but it has no left identity: e / a = a would require e to be a squared, which is a different value for every a.

Associativity is the final law: this is the normal associativity that we’re all used to from basic math, which is expressed as X * (Y * Z) = (X * Y) * Z.

That’s the total of the laws; it’s nothing hard to accomplish. Like a lot of the pieces of functional programming, each individual piece isn’t anything magical, but the way they interact creates a special sort of synergy.
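For a monad the operation being combined is flatMap rather than multiplication, so the three laws are usually written in terms of flatMap and a function that wraps a plain value. Here is a sketch that checks them for Option; half, positive, and unit are arbitrary names chosen for the example, and checking a few inputs is a spot check, not a proof.

```scala
object MonadLawsForOption {
  // Two arbitrary functions of the shape flatMap expects.
  val half: Int => Option[Int] = n => if (n % 2 == 0) Some(n / 2) else None
  val positive: Int => Option[Int] = n => if (n > 0) Some(n) else None

  // unit wraps a plain value; for Option that is Some.
  def unit(n: Int): Option[Int] = Some(n)

  // Left identity: wrapping a value and then flatMapping is just applying the function.
  def leftIdentity(a: Int): Boolean = unit(a).flatMap(half) == half(a)

  // Right identity: flatMapping with unit changes nothing.
  def rightIdentity(m: Option[Int]): Boolean = m.flatMap(unit) == m

  // Associativity: how the flatMaps are nested does not matter.
  def associativity(m: Option[Int]): Boolean =
    m.flatMap(half).flatMap(positive) == m.flatMap(n => half(n).flatMap(positive))
}
```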

Category Theory Intro

While I was at the NE Scala Symposium there was a lot of discussion of the finer points of functional programming. There was a lot of discussion of Category Theory and Monads; while I’d like to say I understood everything that was going on I’m not going to claim I did. I had seen discussion of the topics before but never really got it, or why it was valuable. I got an understanding of the basic concepts for the first time and wanted to try and write it out with the beginner in mind, since the biggest issue is the terminology and wikipedia assumes you’ve got a significant education in abstract math. As someone who isn’t an expert on this I’m going to simplify and probably end up misusing some terms, but hopefully it will get the basic ideas through in a way that you understand enough of the terminology to read more advanced works, or to understand why you don’t need to.

Starting from the theoretical aspect, you’ve got a category that is made up of three different aspects: objects, arrows, and the means to compose multiple arrows in a way that is associative and has an identity. The arrows represent transitions between objects. That’s it.

There is a special class of categories called monoids, which are categories with only one object. At first this seems kind of pointless, but depending on how you define that object they become interesting. For example, if you define the object as the set of integers, and the arrows as individual numbers, then the composition of numbers becomes an operator like addition. Addition is associative so that’s not an issue, and adding 0 is an identity. It is a somewhat odd way to define addition, but it ensures you get a couple of different properties out of the system.

But what good is this monoid structure? It is the mathematical underpinning of why you can fold over a collection and why MapReduce is provably correct.
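A minimal sketch of that claim, with a hand-written Monoid trait (not the one from Cats) and hypothetical names throughout: because the operation is associative and has an identity, folding the whole collection and folding it chunk by chunk give the same answer, which is the property a MapReduce-style split relies on.

```scala
object MonoidFold {
  // Hand-written for this sketch; libraries such as Cats define their own version.
  trait Monoid[A] {
    def empty: A
    def combine(x: A, y: A): A
  }

  val intAddition: Monoid[Int] = new Monoid[Int] {
    val empty = 0
    def combine(x: Int, y: Int): Int = x + y
  }

  // A plain fold: start from the identity and combine left to right.
  def combineAll[A](xs: Seq[A])(m: Monoid[A]): A =
    xs.foldLeft(m.empty)(m.combine)

  // MapReduce style: reduce each chunk independently, then reduce the partial results.
  def combineChunked[A](xs: Seq[A], chunkSize: Int)(m: Monoid[A]): A =
    combineAll(xs.grouped(chunkSize).map(chunk => combineAll(chunk)(m)).toSeq)(m)
}
```

The identity element is what makes an empty chunk, or an empty collection, safe to fold.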

A monad is a transformation within a category (i.e., an arrow) that defines two additional properties: there is an identity available and it is associative. Where have we seen that before? It’s a monoid! So, very loosely, adding 3 to an integer behaves like a monad. What’s the big deal? The integer example is straightforward if dubiously valuable; with something more complex it might make more sense. So imagine you have some structure, with the ability to transform into another structure that is similar, that has an identity and is associative. So what is that? It’s flatMap.

I’m going to switch to the other end and come at this from the programming side now for a minute. We’ve got these structures that have values and can be converted into other similar structures. That sounds kind of like a program to me. If you’ve got a List[String] with some strings in it and you convert that into another List[String], you could have taken a list of URLs and returned the contents of each URL, or you could have taken a list of file names and read the contents of the files. This tells us a couple of things: string typed things interoperate with all sorts of stuff, and the act of the transformation can be abstracted from the data of the transformation. The first part embodies the idea that if a functional program compiles, it’s probably right, since functional programs will define more specific data types that ensure you are putting the pieces together correctly. The second represents the separation of data from the transformations of data or from the actual business logic of the application. This is the opposite of object oriented programming, where code and data are coupled together in an object. So we’ve separated the code and the data, and without objects you’ve essentially got free functions that eventually get composed together into larger and larger computational pieces. There is no reason those transformations can’t be set up to be associative and have identities. So now you have a program represented as a series of monad transformations, easy.

There is some space between the programming perspective and the math perspective. A lot of it is just constructing categories in specific ways so that the transformations and compositions are what you want, so I’m going to leave that alone, but I would like to discuss some of why this matters. The separation of the transformation and the data also allows you to separate the definition of the transformation from the execution of the transformation. That makes your code inherently testable, as well as imbuing it with the ability to apply large classes of optimizations. Those optimizations are part of what makes functional programming parallel processing friendly and highly performant.
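Here is a small sketch of defining a transformation separately from running it, using plain function composition on List[String]. The step names are made up for the example.

```scala
object DefinedThenExecuted {
  type Step = List[String] => List[String]

  val trim: Step = _.map(_.trim)
  val dropEmpty: Step = _.filter(_.nonEmpty)
  val upper: Step = _.map(_.toUpperCase)

  // Definition only: nothing runs until the pipeline is applied to data.
  val pipeline: Step = trim andThen dropEmpty andThen upper

  // Composition is associative, so the grouping of the steps does not matter.
  val regrouped: Step = trim andThen (dropEmpty andThen upper)

  // The identity step composes with anything without changing it.
  val id: Step = identity
  val withIdentity: Step = id andThen pipeline
}
```

Each step can be tested on its own, and the composed pipeline is just another value to hand to whatever ends up executing it.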

So while you probably don’t care that the code has these underlying mathematical properties, you do care what those properties can imbue upon your program. They make it easy to test and easy to be fast. You don’t really need to understand monads and category theory at a theoretical level to take advantage of all of this and what it means for your program. Libraries like Cats are full of the category theory terminology, making it harder to find what you want without understanding the underlying terminology. Having a combinable rather than a semigroup would make it more accessible, but harder to understand why it works.

Traveling Stories

Traveling has always made me introspective. When I was going to the West coast regularly for work, my wife knew she would get a rambling email from me that I wrote while in the plane. Some of them were sappy, some were crazy, some were a disjointed mess. But they were all those thoughts that came out when I was stuck there with nothing but those thoughts to keep me company.

Sure I’d read, listen to a podcast, or watch something but eventually the mind would wander to that place it wanted to go. Like a very slow form of meditation. Once my mind emptied of thoughts of the day or about where I was going, I achieved this zenlike state where answers to questions would just unfold like an origami crane.

The worst part is that the answers were always fleeting. There for a moment, gone the next without the chance to fully understand the epiphany. A deep insight into the universe that you know was there, but never get to appreciate.

This post started with one of those epiphanies. I was on the train on the way to New York for the NE Scala Symposium and there was this moment of clarity about a communication struggle I had been having at work. For a moment, I had a vision of the creative action plan I had been looking for, then a PA announcement came on and the thought was gone. I hadn’t recorded the thought in any way but I know it was there right outside of Trenton.

On the ride home I managed to rekindle some of the thought, but it wasn’t the same deep insight that I had originally had. Initially, my work team had been proposing to adjust an existing framework to enable a new usage. But once we rephrased it as “replacing” the entire framework and building a brand new system, everyone was immediately on board. I think the reasoning is twofold: it would mean we can roll out the change in smaller increments, and we can go back without as much effort. It’s the same work and the same expected end state, but the reception was significantly different.

Scoring Intern Puzzles

I recently got to score coding puzzle solutions submitted by a crop of students who wanted to be interns for my company. Overall, I scored six puzzles and gave two passing grades. These were applications generally submitted by rising Juniors and Seniors, so I had fairly modest expectations about the overall quality of the solutions they would be presenting. I wanted to see some basic object oriented design, usage of the proper data structures, and correctness in a couple of edge cases. The issues with the failures varied, but the most egregious and common issue was a failure in understanding object orientation. The typical solution was a single Java file with one class, generally containing nothing but static methods, and with everything public.

This led me to two immediate thoughts: are these students horribly unprepared or am I expecting too much? I went and asked one of the engineers who had scored another batch of solutions about what he got. His batch was distinctly better, one even had unit tests, but there were still some significant duds. The conversation attracted two recent grads who took a look at some of the solutions, and also thought they weren’t that good; since they were closer experience-wise, I thought that their expectations would be more reasonable. I know that they are getting a computer science education not a software engineering education, and I had talked about this difference before, but they need some ability in computer programming to demonstrate their computer science skills.

I tried to think back to when I was in school and what I was like at that point. I know I didn’t have the best practices then, but I feel like I was ahead of where they were. I wish I still had some of the code I had written then to look back at and compare. I don’t think I was great at that point in my programming journey. But I feel like I would have at least put together a class or two as part of the solution, even if they weren’t really needed, just to show I could decompose the problem.

I suspect that, since internships have become a significant need, students are pushing out applications, and this puzzle is just one more hurdle on the way to yet another application, so they don’t recognize how to optimize their solution to show off their skills. Each solution had some positive aspects, but was missing most of the aspects I wanted to see. To me this meant that the skills were being taught, since they had all picked up some of them, just not enough of them yet.

Agile Architecture

I’ve been involved in a set of architecture discussions recently and it had me thinking about the role of architecture in an agile development team. The specific discussion mostly was around which services owned what data and which services were confederates using that data. My team owned all of the services in question and felt that the data should be centralized and the edge pieces should query the central store. There were some other interested parties that felt that the data should be distributed across the edge pieces which would coordinate amongst themselves and had a central proxy for outside services to query through. Both options had some pros and had some cons.

I had proposed some changes to the initial centralization plan to account for some issues raised by the other parties. Then something unexpected happened: they said that’s great but you need to do it our way anyway. This had me stunned at first. This didn’t change the way that they would use the resulting system, so why was it their decision to make? Sure, they had higher titles, but that isn’t a license to make architectural decisions universally. I hadn’t been in a situation like this before and wasn’t quite sure what to do to convince everyone involved that my proposal was the best option for the organization as a whole.

Fortunately the exchange was over email so I took a while to regroup and collect some opinions about what to do. I asked a few different people and got a couple of specific pieces of advice. First, that I should set a meeting with all of the sets of stakeholders, so they would remember that there were more people involved. Second, that I should prep a full written description of the various options in advance. The goal of setting a meeting was to get a concentrated block of attention from the other parties rather than a series of ad hoc email exchanges. The written description was to make it clear what the full state was, rather than just a series of deltas from the original plan. That would make it clearer what was going on. Bringing together all of the stakeholders meant that those opposed to it for reasonable but team-specific reasons would see the other stakeholders and hopefully see their perspective.

All of this went down over two sprints where we had set aside time to sort out this architecture for our next big project. Defining these tasks was tricky; the first sprint had tickets to handle figuring out all of the requirements and speccing several different options. The requirements weren’t put together by product since this was not an end user-facing feature. The second sprint was to gain consensus on the various options, which brings us back to the anecdote above.

The two points came together and once we had their concentrated attention and the complete description of the plan they came around to our centralization plan, with modifications.