Book Chat: How To Solve It

How To Solve It isn’t a programming book. It’s not exactly a math book either, but you will find yourself doing geometry while reading it. It isn’t a book on logic, but it is all about structured thought processes. I would describe it as a manual for teaching others a systematic approach to problem solving, using geometry and a series of examples. It tries to lay out all of the thoughts that whiz through your head when you see a problem and understand how to solve it without really contemplating how you knew it. It’s a fast read, assuming that you know the geometry the author uses in the examples.

The problem solving process is broken into four basic steps: understanding the problem, devising a plan, carrying out the plan, and looking back. At first it seems obvious, but that’s the thing about a structured approach: you need to cover everything and be exhaustive about it. For example, to understand the problem you identify the unknown, identify the data, identify what you want to accomplish, try to draw a picture, introduce suitable notation, and figure out how to determine success. If you wanted to know whether you should buy milk at the store, this sort of formal process is overkill, but if you are struggling with a more complex problem, like figuring out what’s causing a memory leak or setting up a cache invalidation strategy, it might be valuable to structure your thoughts.

I haven’t had a chance to apply it to a real problem yet. I did use some of the teaching suggestions – how to guide the pupil to solve their own problems – with one of the junior engineers I mentor and it seemed productive. I got him to answer his own question, but not enough time has passed to see whether it improves his problem solving abilities in the future.

Overall the book was an interesting read and seems practically applicable to the real world.

Monads for the Working Programmer

The monad is a backbone of functional programming. However, the mathematical definition of a monad is highly inaccessible. I want to try to describe a monad for a programmer who is new to functional programming. This isn’t intended to be a precise definition; it is intended to be enough to help you understand what the monad is, how to use it, and why it’s a helpful construct.

Consider the monad as a box. The box can be empty and the box doesn’t really care what’s in it. You can go get a box and put something in it. Someone can open up the box, take out what’s in there, work with it, then put their output back in the box. That’s really all there is to it.

Sometimes the box is less tangible than other boxes. For instance, Future in Scala or IO in Haskell don’t hold an actual value yet, only the promise of a value later. Think of it like a package in the mail: the box may not be here yet, but you can make plans for when it gets to you. You can plan to open the box, combine its contents with another value you have, and put the result back in the box. In programming terms it’s a continuation: you take the first box and attach some code to it so that when the value in the box does show up you run the continuation with it.
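To make the package-in-the-mail idea concrete, here is a minimal sketch in Python rather than Scala or Haskell, using the standard library's `concurrent.futures.Future` as the box. The `then` helper is a name I made up for illustration; it is not part of the library.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def then(future, continuation):
    """Hypothetical helper: return a new Future that will hold
    continuation(value) once `future` has a value."""
    result = Future()

    def run(done):
        try:
            # Open the box, work with the value, put the output in a new box.
            result.set_result(continuation(done.result()))
        except Exception as exc:
            result.set_exception(exc)

    future.add_done_callback(run)
    return result


with ThreadPoolExecutor(max_workers=1) as pool:
    package = pool.submit(lambda: 21)          # the box that has not arrived yet
    doubled = then(package, lambda v: v * 2)   # the plan, made before it arrives
    answer = doubled.result(timeout=5)
```

The plan is attached before the value exists, and the caller only ever handles boxes until the final `result()` call.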

Sometimes the box is a bit bigger than others, like the List monad. The List monad has more than one thing in the box, like an egg carton, but they’re all of the same type. You can still open up the box and take out an egg and do something with it, or you can process each of the eggs in turn.

When you want to use a monad it’s as easy as that – your functions all start wrapping their return types in that monad. Their callers then open the box and do their work and return the new value to the box. So instead of pseudo code that looks like


values = [3, 4, 5]

for (int i = 0; i < values.length; i++) {
	values[i] = values[i] * 2;
}

You get pseudo code that looks like


values = [3, 4, 5]

doubled = values.map(value => value * 2)

But what’s the point of this programming style, what does it get you? First, it helps you work with immutable data. In the first example above you are mutating whatever values is. That’s fine in object oriented or procedural programming, but functional programming sacrificed mutable data for thread safe composability. Second, it produces code that is easily composable. If I wanted to subtract one before doubling the values in the above monad example, I could add another map call, keeping the two concerns separate. In the initial example the two concerns would be mixed into the same control flow. Third, the monad itself can contain a fair bit of logic to enrich the programming experience. The ability to interact with a Future, an Option, a List, or any other monad in the same way simplifies a lot of refactorings. If you want to replace an Option with a List, the application logic doesn’t need to change much. Hopefully all of this helps demystify the monad and helps you to write better code.
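As a rough illustration of the last two points, here is a sketch in Python with two toy boxes. `Maybe` and `Many` are hypothetical stand-ins I wrote for an Option and a List, not real library types, and they only implement `map`, which is all the subtract-then-double example needs.

```python
class Maybe:
    """Option-like box (hypothetical): empty, or holding exactly one value."""
    _EMPTY = object()

    def __init__(self, value=_EMPTY):
        self._value = value

    def map(self, fn):
        # An empty box stays empty; otherwise apply fn and re-box the output.
        if self._value is Maybe._EMPTY:
            return self
        return Maybe(fn(self._value))

    def to_list(self):
        return [] if self._value is Maybe._EMPTY else [self._value]


class Many:
    """List-like box (hypothetical): the egg carton."""

    def __init__(self, items):
        self._items = list(items)

    def map(self, fn):
        # Build a new box instead of mutating the old one.
        return Many(fn(item) for item in self._items)

    def to_list(self):
        return list(self._items)


def subtract_one_then_double(box):
    # Two separate concerns, composed as two separate map calls.
    return box.map(lambda v: v - 1).map(lambda v: v * 2)


many_result = subtract_one_then_double(Many([3, 4, 5])).to_list()
some_result = subtract_one_then_double(Maybe(3)).to_list()
none_result = subtract_one_then_double(Maybe()).to_list()
```

The application logic in `subtract_one_then_double` is the same whether it is handed a `Maybe` or a `Many`, which is the refactoring benefit described above.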

Media Diet

Last week I wrote about how I really tackled my imposter syndrome by reaching out into the wider community. It helped me feel like I was making progress outside of whatever was going on at work. I wanted to share the resources I use to find new ideas and keep up my continuous learning.

Blogs

Podcasts

 

This may seem like a lot of stuff, but most of the podcasts publish once a week, and blogs are generally less frequent than that. I generally try to get to a meetup or two a week on top of this. The whole diet helps me feel more informed and in touch with a software community outside work.

Imposter Syndrome Meetup

I was at a local meetup about imposter syndrome this week and it made me remember how far I have come in my own career. The speaker talked about his journey and the times he felt like an imposter, even though he had the sorts of experiences that would make most engineers jealous. I want to talk through my own background and about how I managed to come to grips with my own professional insecurities. Hopefully this will inspire others to have more confidence in themselves.

I remember at my first job how little I felt like I was learning and how little it seemed my coworkers around me knew. When I went to interview for my next job a couple of years later I was highly nervous that I was behind the curve, and I wasn’t sure that I entirely understood how real-world software engineering was meant to work. When I got the job, I told myself that I got lucky that the interview was devised by a bunch of alumni from my college, so it covered the kinds of questions I had seen in school.

At that job the imposter syndrome kicked in immediately. I was afraid that they had certain expectations of me based on my experience level, where I felt like I hadn’t progressed past what I had learned in school. I thought that I was behind the curve on version control practices, and I hadn’t gained any exposure to any sort of real domain modeling or object oriented programming. I knew these skills were going to be important at this job and had assumed that they were things they could expect me to know when I walked in the door. There was all of the domain information that goes with a new job as well, and at this company it was literal rocket science so I couldn’t really slouch on that aspect either. The first few months were definitely rough; I had a couple of days where I spent all day fighting with basic ideas and couldn’t get anything to work, which made me feel like I didn’t deserve to be there. It eventually got better as I gained experience with the domain and the technology. I gained confidence after running down a couple of very gnarly bugs and getting praised for a creative solution to an awkward problem. Ultimately though, my anxiety was misplaced. It turned out that my managers had never expected me to walk in the door with these skills; they had picked me because they were happy with what I knew already.

Sadly the company ran into hard times and I got laid off, but this paradoxically resulted in a big confidence boost. When I got back to my desk after hearing the bad news from my boss, my phone was ringing: a former coworker was calling to schedule an interview with his new company. All the time I had thought that I was barely getting by, he’d thought I was doing fantastic work. In the two weeks between then and my last day at that job I got another offer as well, which helped boost my confidence.

I took my former colleague’s offer; I’m sorry to say it ended up not being a great culture fit for me. But by now, the increase in confidence meant I was more willing to take a chance and make a move, looking for something that would be more of a challenge. I took a position in the same domain as an expert to help salvage a failing software project. This job was a good fit on paper for me, since I had experience with both the domain and the technical stack. I was confident going in and was initially given a lot of latitude to do what needed to be done, which was great technical experience. But it had me doing a lot more management-style activities than I wanted to do, an area where I also felt I didn’t really have the right skills and experience. Then, after getting the software stabilized and finding an order of magnitude performance improvement, the reward was to bog down the entire project in a mountain of process. So, while I’d gained confidence in my own abilities and standing when it came to technical issues, I was circling back to feeling like an imposter in this new process/management role.

My discomfort resulted in me moving to a small startup to help anchor their development team. The environment was very unstructured and goals changed week to week. I was immediately being asked to give expert opinions on technologies I had never worked with before. The situation was stressful because I felt like I wasn’t qualified to give these opinions, but it wasn’t clear whether they had anyone more qualified. On one hand I was faking it in that I didn’t know a lot of what I was talking about, but on the other hand I took the initiative to learn a lot about these new technologies. Really, I learned how to learn about technologies. The few technology decisions I made during my time there all seemed to work out fine, but I don’t know how that compares to having made other choices and I wasn’t there long enough to see the long term outcomes. Even now, I still find myself downplaying the difficulty of the work I did, still feeling like I was just a pretender.

My next job was at a large tech company, and it was an eye opening experience. This was the first ‘normal’ web application I had worked on since my first job and I was worried that I was out of practice. Since I had so many more years of experience than the last time I worked on web applications, I assumed the expectations for me would be higher than I could meet. I was worried that I would show up and not know how to do anything and would be summarily fired. This turned out not to be the case, but the impression I had going in impacted my ability to leverage myself to accomplish anything. My assumption that I wouldn’t be able to contribute right away meant I stayed quiet about areas where I could have made improvements to benefit the company; I let mediocre practices I witnessed linger way too long before trying to change them.

Despite the good work I did there, my inability to change the culture and other improvable development practices really hurt my confidence about what I could achieve in this environment. This, combined with the lack of knowledge around building web applications, pushed me to do anything and everything I could do to try and grow more. I put a concerted effort into getting out into the local development community to try and find a broader sense of inspiration. This was the time period when I started writing this blog as well. I started attending a number of local meetups and listening to various podcasts. Talking to so many new people who shared my struggles helped me understand that others don’t know some magical trick that I don’t. It also made me realize that learning how to learn was one of the most important things I had achieved. For me, moving on from imposter syndrome has been about accepting that I don’t know everything I wish I did on a topic, but neither does anyone else. It’s all about our willingness and ability to learn and improve.

This all culminates with my current position where I changed tech stacks to stuff I had never used at all before. My specific experiences weren’t immediately relevant to this new technology stack, but I did bring a lot of thoughts on doing unit testing, domain modeling and other good technical practices. Since this was my fourth stack in 12 years as a professional I had a fair idea about how to pick up a new stack and leverage what I did know to learn new things. There are still lots of things I don’t know, but I managed to get enough together to know how to ask reasonable questions and to apply the concepts from other stacks. I am still at points concerned that I don’t know enough about certain topics, but I have become fearless about asking questions and unafraid of looking uninformed. This question asking seems to have helped one of the junior engineers on my team to have the confidence to ask questions in pull requests when he doesn’t understand what’s going on. That sort of safe space amongst the team is the sort of environment that I want to be in, and having accepted my own lack of knowledge on some fronts has empowered those around me to find a better way for themselves.

Strike Teams

At work there has been a new practice of starting up strike teams for different projects. The idea is that for projects that require expertise not found on any individual team, you pull in a person or two from multiple different teams to get all of the correct skills on a single team. That’s the pitch of the strike team model, but the hidden downside is that it breaks up the cohesion of the teams that people are pulled from and the created team may not be together long enough to create new cohesion. This post is mostly going to be a chronicle of the issues that I encountered starting up a strike team and what we did to try to resolve the problems.

The first problem was coordinating the various different teams to figure out who was going to be involved. In the case of the strike team I was forming, the two teams contributing resources both wanted to know who the other team was going to contribute before making their own decisions. They also wanted to have a fixed end date to the project before deciding who to contribute. We ended up resolving this issue by setting a fixed end date for whoever they contributed to the team and getting the two of them to discuss the situation amongst themselves.

The second problem was aligning the start date of the strike team. The three teams contributing resources all have different schedules for their individual sprints, which makes coordinating a ‘start’ date for the strike team difficult. Those who were on teams about to start another sprint wanted the strike team to start now, whereas those who had already made other commitments weren’t available. We ended up doing a rolling start: as each team finished its sprint, its people rolled onto the strike team, and the team ramped up as people became available. We did some preparatory work to get everyone up to speed on the goals and challenges of the team so they were aware of what was going on whenever they were able to join.

The third big problem is more specific to our particular organization, not to the strike team model. As part of setting up the strike team we needed to schedule things like standups and retros for the team. Since the team is split across both coasts, the hours for scheduling these meetings are limited. The conference rooms are also pretty much all booked because every other team beat us to scheduling. We ended up asking IT to rearrange some other non-recurring meetings and managed to get a consistent slot for the standups. The retro slot was more complicated but we managed to get meetings roughly spaced; by not being a stickler for strict week delineations we managed to find times that worked.

So with all the overhead sorted out, we finally get to move on to the real work, which will be a nice change of pace from the administrative side.

Seven More Languages in Seven Weeks

Seven More Languages in Seven Weeks is a continuation of the idea started in Seven Languages in Seven Weeks that by looking at other languages you can expand your understanding of concepts in software engineering. While you may never write production code in any of these languages, looking at the ideas that are available may influence the way you think about problems and provide better idioms for solving them.

This installment brings chapters on Lua, Factor, Elixir, Elm, Julia, MiniKanren, and Idris. Each of these languages is out on the forefront of some part of software engineering. Lua is a scripting language with excellent syntax for expressing data as code. Factor is a stack-based programming language with interesting function composition capabilities. Elixir is Ruby-like syntax on the Erlang VM. Elm is reactive functional programming targeting JavaScript as an output language. Julia is technical computing with a more user friendly atmosphere, and good parallelism primitives. MiniKanren is a logic programming language and constraint solver built on top of Clojure. Idris is a Haskell descendant bringing in the power of dependent types to provide provably correct functional code.

Overall it was an interesting survey of the variety of programming languages. Some I had done a bit with before (Lua, Elixir), some I had heard of before (Elm, Julia, and Idris), and some I hadn’t even heard of (Factor and MiniKanren). Each chapter was broken into three ‘days’ indicating a logical chunk of the book to tackle at once. Each day ends with a series of exercises to help make sure you understand what’s being presented.

Since these languages are out on the edge of the world in programming terms, they are evolving fairly quickly. This ended up biting the Elm example code particularly hard, since large portions of it have been deprecated in the releases since then and don’t work on the current runtime. Compared to the lineup from the original book (Clojure, Haskell, Io, Prolog, Scala, Erlang, and Ruby) you’ve got a much broader variety of languages in the sequel, but nothing with the popularity of Ruby or the legacy install base of Erlang. In the time since the book was written in 2014, none of these have had a massive breakout in terms of popularity and adoption; however, they do seem to do well in terms of languages people want to work with.

Overall it’s an interesting take on where things could be going. I don’t think most of the languages covered have significant mainstream appeal right now. Two of these languages seem to be more ready for primetime than the others. Julia definitely has a niche where it could be successful. I feel like the environment is ripe for something like Elm to surge in popularity since frontend technology seems to be going through constant revisions.

Serverless Architecture

The serverless architecture is an architectural pattern where you create a function to be run when an event occurs. This event can be a message being placed in a queue, or an email being received, or a call to a web endpoint. The idea is that you don’t need to think about how the code is being hosted which allows scale out to happen without you being involved. This is great for high availability, cost control, and reducing operational burdens.

The serverless architecture seems to result in setting up a number of integration databases. Imagine the route /foos/:id – there is a get and a put available. Hosting these in a serverless fashion means that you’ve got two independent pieces of code interacting with a single database. In my mind this isn’t significantly different from having two services interacting with the same database.

I went looking around for anyone else discussing this seemingly obvious problem, and found this aside from Martin Fowler comparing them to stored procedures. The stored procedure comparison seems apt since most endpoints seem to me to be wrappers around database calls, your basic CRUD stuff. These endpoints, just like most stored procedures, are a fairly simple query. Stored procedures started out as a good way to isolate your SQL to a particular layer, and enable you to change the underlying design of the database. If you’ve never worked in a large codebase with stored procedures, know that they can evolve in lots of negative ways. You can end up with stored procedures that put business logic in the database, stored procedures that have grown to 12 arguments (some of which are ignored and only there for backwards compatibility), or six variations of a stored procedure with no clear indication of how they differ. These are all downsides to stored procedures I have seen in real code bases.

I can imagine the serverless architecture equivalent of these problems: you would have an endpoint querying some database without being aware that the database is shared, and eventually the shared database becomes an impediment to development progress. Serverless architecture may still be interesting for a truly stateless endpoint. But even then you could end up with the put and get endpoints not being in sync as to what the resource looks like. There isn’t a good way yet to package serverless applications. The idea of orchestrating an entire application’s worth of endpoints seems a daunting task. Using something like GraphQL that radically limits the number of endpoints being exposed would simplify the deployment orchestration. While GraphQL has had significant adoption, it isn’t the solution for every problem.
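Here is a small Python sketch of how the get and put for /foos/:id could drift apart. Everything in it is hypothetical: the handler names, the event shape, and the in-memory `FOO_TABLE` dictionary standing in for the shared database are my assumptions, not any provider's API.

```python
# Stand-in for the shared integration database. In a real deployment this
# would be an external store that each function connects to on its own.
FOO_TABLE = {}


def put_foo(event):
    """Handler for PUT /foos/:id, deployed as its own function."""
    FOO_TABLE[event["id"]] = {"name": event["body"]["name"]}
    return {"status": 204}


def get_foo(event):
    """Handler for GET /foos/:id, deployed separately from put_foo."""
    row = FOO_TABLE.get(event["id"])
    if row is None:
        return {"status": 404}
    return {"status": 200, "body": {"name": row["name"]}}


def put_foo_v2(event):
    """A later redeploy of the PUT handler that renames the stored field."""
    FOO_TABLE[event["id"]] = {"display_name": event["body"]["name"]}
    return {"status": 204}


put_foo({"id": "1", "body": {"name": "first"}})
before = get_foo({"id": "1"})

# Only the PUT function is redeployed; the GET function still expects "name".
put_foo_v2({"id": "2", "body": {"name": "second"}})
try:
    after = get_foo({"id": "2"})
except KeyError as missing:
    after = {"status": 500, "error": f"missing field {missing}"}
```

Nothing in either function declares the dependency on the other; the only contract between them is the shape of the rows in the shared store.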

Given these issues I don’t see the appeal of the serverless architecture for general application development and using it to back REST endpoints. For pulling from a queue, processing emails, or processing other event sources and interacting with web services it seems a good solution since there would be a single input source and it just produces additional events or other web service calls.
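The shape that does seem like a good fit can be sketched in a few lines of Python. The message fields, the event names, and the `handle_message` function are made up for illustration, and a `deque` stands in for the real queue.

```python
from collections import deque


def handle_message(message):
    """Hypothetical queue handler: one input event in, follow-up events out.
    It reads no shared database, so nothing else depends on its internals."""
    order_id = message["order_id"]
    return [
        {"type": "invoice_requested", "order_id": order_id},
        {"type": "email_requested", "order_id": order_id},
    ]


inbox = deque([{"order_id": 7}, {"order_id": 8}])  # stand-in for the queue
outbox = []                                        # stand-in for emitted events
while inbox:
    outbox.extend(handle_message(inbox.popleft()))
```

With a single input source and only outgoing events, the function can be redeployed without coordinating with any sibling function.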

The serverless architecture for standard web apps seems like a choice that could result in a mountain of technical debt going forward. This is especially likely because the serverless architecture is most attractive to the smallest organizations due to the potential hosting savings. Those organizations are the least capable of dealing with the complexity. These are the same organizations that are prone to monolithic architectures or to integration databases since they have a short-term view of events and limited resources to deal with the issues.

I don’t know what the eventual fallout from applications built with this architecture will be, but I do suspect it will employ a lot of software engineers for many years to dig applications out of the pain inflicted.

Bug Bash

We had a big all-hands off-site meeting. In the runup to it, there were two days that most teams had left out of their planning process, so we ended up running a company wide bug bash to close out all of the existing bugs, even those pesky minor bugs that are never really a priority. It felt like over those two days my team closed out an entire three week sprint’s worth of bugs by points. After the fact I went back and counted and found that the feeling was right. Some of the bugs had previously been pointed and some were quick estimates I put on them after the fact, but it was just a slew of 1-3 point tickets that all got closed out.

Dealing with ~30 tickets in two days was a furious endeavour for the team. It was really satisfying to deal with all of those tickets that had built up. I know they had been weighing on my mind in the sense that the list just seemed to keep growing and I wasn’t sure what to do about it.

This brought me to another interesting question: why were we so much more effective in this than in a normal sprint? Are we underestimating the stories relative to the bugs? That seems the most obvious answer. Is the low end of the point spectrum too compressed so the difference between a two point task and a three point task isn’t sufficiently granular? I spoke with some of the other members of my team about it and there was some speculation that since we each just took tickets related to what we already knew we just got more done, but that shouldn’t account for all of the difference. There was also a possibility that since all of the pieces were strongly independent we had less communication lag.

Maybe I should just be happy that all of those bugs got dealt with. But, I’d really love to find a way to bring that efficiency to our normal processes. If anyone has any ideas, please share in the comments.

Book Chat: Pair Programming Illuminated

My team has been doing more pair programming recently so I picked up a copy of Pair Programming Illuminated. I had never done a significant amount of pair programming before and while I felt I understood the basics, I was hoping to ramp up on some of the nuances of the practice.

It covers why you should be pair programming, convincing management that you should be able to pair program, the physical environment for local pairing, and common social constructs around different kinds of pairs. All of this is useful information, to varying degrees. Since the book was written in 2003, some of the specifics of the physical environment section didn’t age well – advising the use of 17” monitors most obviously. Both of the evangelizing sections seemed to cover the same ground, and neither seemed to be written to convince someone who isn’t already in favor of pair programming. There were lots of references to studies, and some personal anecdotes, but none of it stuck in a way that felt like it would change someone’s mind.

The social aspects were interesting, but most of the section was stuff that felt obvious. If you have two introverts working together then they need to work differently than if you have two extroverts working together. A lot of the time the tips were common sense, and it didn’t seem necessary to write them down in the book. I would have liked to see more discussion of getting someone to vocalize more, and more clearly, what they’re thinking about.

I feel like I’m better equipped to do pair programming because of having read this, but I also feel like a long blog post would have been just as good a resource and much more focused. I don’t know what else I would have wanted to fill out the rest of the book.

Java Containers on Mesos

I recently ran into an interesting issue with an application running in a container. It would fire off a bunch of parallel web requests (~50) and sometimes would receive the results but not process them in a timely manner. This was despite the application performance monitoring we were using saying the CPU usage during the request stayed very low. After a ton of investigation, I found out a few very important facts that contradicted some assumptions I had made about how containers and the JVM interact.

  1. We had been running the containers in Marathon with a very low CPU allocation (0.5) since they didn’t regularly do much computation. This isn’t a hard cap on the container’s resource usage. Instead it is used by Mesos to decide which physical host should run the container and it influences the scheduler of the host machine. More information available on this in this blog post.
  2. The number of processors the runtime reports is the number of processors the host node has. It doesn’t have anything to do with a CPU allocation made to the container. This impacts all sorts of under the hood optimizations the runtime makes including thread pool sizes and JIT resources allocated. Check out this presentation for more information on this topic.
  3. Mesos can be configured with different isolation modes that control how the system behaves when containers begin to contest for resources. In my case this was configured to let me pull against future CPU allocation up to a certain point.
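The mismatch in the second point is not unique to the JVM. As an analogy only, Python's `os.cpu_count()` also reports the machine's logical CPUs rather than a container's share. A minimal sketch of sizing a worker pool from the allocation instead is below; the `pool_size` function and the idea of passing the allocation in explicitly are my assumptions, not how our service is configured.

```python
import math
import os


def pool_size(cpu_allocation, detected_cpus=None):
    """Hypothetical helper: size a worker pool from the container's CPU
    allocation rather than from the CPU count the runtime detects."""
    if detected_cpus is None:
        # Reports the host's logical CPUs, not the container's share.
        detected_cpus = os.cpu_count() or 1
    # Never exceed what the host has, never drop below one worker.
    return max(1, min(detected_cpus, math.ceil(cpu_allocation)))


size_on_big_host = pool_size(0.5, detected_cpus=32)
size_with_more_cpu = pool_size(4, detected_cpus=32)
```

With a 0.5 CPU allocation on a 32-core host, a pool sized from the detected count would be far larger than the allocation can keep busy.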

This all resulted in the service firing off all of the web requests on independent threads, which burned through the CPU allocation for the current time period and the next. So when the results came back, they weren’t processed right away. As an immediate fix, we changed the code to only fire off a maximum number of requests at a time. In the longer term we’re going to change how we are defining the number of threads, but since that has a larger impact it got deferred until later when we could measure the impact more carefully.
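The immediate fix can be sketched like this, in Python rather than the JVM language the service is written in. The cap of 8 and the `fake_request` stand-in are assumptions for illustration; the counters exist only to show that the cap holds.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 8  # assumed cap; pick a limit that suits the CPU allocation

in_flight = 0
peak_in_flight = 0
lock = threading.Lock()


def fake_request(n):
    """Stand-in for a web request; records how many run at the same time."""
    global in_flight, peak_in_flight
    with lock:
        in_flight += 1
        peak_in_flight = max(peak_in_flight, in_flight)
    time.sleep(0.01)  # pretend to wait on the network
    with lock:
        in_flight -= 1
    return n * 2


# The pool size is the cap: the ~50 requests are queued, but at most
# MAX_IN_FLIGHT of them are running at any moment.
with ThreadPoolExecutor(max_workers=MAX_IN_FLIGHT) as pool:
    results = list(pool.map(fake_request, range(50)))
```

All 50 results still come back in order; the work is just spread out so that processing the responses is not starved of CPU.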