Strike Teams Part 2

This is a follow up to the original strike team post I did a while back. I’m writing this as the strike team is wrapping up. It’s been an interesting experience. We hit most of our broad goals but did it in a significantly different way than anticipated.

The first big change came early on when we were looking into the proposed changes to an existing piece of software that was going to be a consumer of what we were building. We realized that accommodating it was probably weeks of work by itself. The initial assumption of how much work was needed had been minimal, but apparently that particular solution was untenable to the people who supported the consuming software in the long term. This ended up with them pulling support from the strike team and becoming uninvolved. At the time, the scope and resourcing change was challenging since it threw everything into an uncertain state. I think it ended up being a good thing; we were able to focus on building out the functionality desired without spending a lot of time making changes to one consumer of the service. It does make the follow up more complex since we do need to make sure that work gets done otherwise the whole system will end up in a more complex state.

There was one portion of what we were going to deliver that had a complex deployment step. There were other changes that had to go either before or after this particular piece so we tried to move that deployment as far forward on the project plan as possible. The intention was to deliver it at the end of the first sprint. We encountered some significant scope changes that all unfortunately all had to go before that step. This ended up pushing the actual deployment out until the middle of the third sprint. At this point we had about ten tickets that we did the work for all sitting in this feature branch just waiting for the eventual opportunity to merge the whole thing back to master with nearly 5000 lines of changes. We ended up performing a series of messy merges bringing changes from master back to the feature branch. The final merge from the feature branch back to master was ugly since we wanted to preserve some but not all of the history of that branch. In retrospect we should have seen the merge issues coming and been more proactive about doing more small merges, but lesson learned for later.

Every team member lost some time to dealing with issues brought to them from the team they were on loan from. We had anticipated this during the first sprint but didn’t expect it to continue through the remainder of the strike team. I know I personally got sucked into some production oddities that were occurring and that after the first sprint without me my team missed my presence in code review to the point that they got me back to reviewing basically everything. Both of the other members of the team got pulled back to the their original team for issues that happened. We didn’t fight any of these needs figuring that since they seemed like critical issues for those teams, having their people available was the best overall usage of time, even if it hurt our overall ability to get things done on the strike team.

The team as a whole never really jelled to any significant degree. Everyone kept working the way their team did, which created some minor conflicts. One engineer’s team has a very lax overall structure and he would sort of disappear for days then show up with large amounts of excellent code, but it was frustrating for him to essentially go off the grid, without forewarning. The other came from a team with experience in a different technical stack and vacillated between asking not enough questions and spending too much time stuck and asking too many questions. The three of us were also in three different locations which made it difficult to really all get on the same page and get working together.

Overall the strike team model worked for this, since having representatives from some of the teams that were going to consume this service working on it made sure that we didn’t run into any issues with a misunderstanding of the domain. There were problems setting up a new team, that we should have attacked proactively since setting up any new team needs to initially get organized and form their own norms. The transient nature of the strike team prohibits a lot of identity building which, in my opinion is key to building a good team. Overall based on this experience I think that the strike team model can deal with software projects crossing multiple software team boundaries, but there may be other better ways out there to be found still.

Advertisements

Book Chat: The Psychology of Computer Programming

The Psychology of Computer Programming by Gerald Weinberg is describing the how and why of computer programming in the abstract. It covers topics like when and where to leave comments, how the choice of programming language influences the eventual program written, or how to go about hiring programmers. I read the silver anniversary edition which added some annotations about how events had changed between the original 1971 release and 1997 when the silver anniversary edition was written.

The book starts out with talking about reading programs with some example code written in PL/1. The example reads fine even if you know nothing about PL/1, it goes through several variants of the same program, dissecting the pros and cons of each implementation. While modern programming has mostly eschewed limits on memory usage and program size, similar pros and cons could be applied to things like GC pressure or context switching.

Each chapter closes with a series of introspective questions about the topic for programmers and  managers , mainly about how it could be applied to your day to day activities. After a chapter on programming as a social activity it asks the manager, “In setting your own working goals, what part is set by what is passed down from above and what part is set by what comes up from below? Are you satisfied with this arrangement, or would you like to alter it in some ways?” Whereas it asks the programmer “What part do you play in setting the goals of your team? What part would you like to play? What part would you like others to play?” These two sets of questions sort of suggest a  conversation between perspectives and helped me to understand the perspective of management better.

The section on time sharing systems vs batch systems did not age well since neither system is used anymore. It was still interesting to see a breakdown of the pros and cons of the two systems and how it impacts the culture of the workplace. It provided a case study of a company where they switched from a batch system to a time sharing system, which resulted in a breakdown of the informal communication system between the developers. Under the batch system they would congregate around the result return since when the results would be back wasn’t certain. Once they switched to a time sharing system everyone spent time in their office and there was little communication and teamwork among the programmers.

There was no single takeaway from this book where I would recommend that you should read this to achieve a particular end. Overall it was an interesting read from a conceptual perspective, but I don’t think it’s applicable to the average programmer. There is more value I think on the management side, but since that isn’t what I do it’s harder for me to judge.

Imposter Syndrome Meetup

I was at a local meetup about imposter syndrome this week and it made me remember how far I have come in my own career. The speaker talked about his journey and the times he felt like an imposter, even though he had the sorts of experiences that would make most engineers jealous. I want to talk through my own background and about how I managed to come to grips with my own professional insecurities. Hopefully this will inspire others to have more confidence in themselves.

I remember at my first job how little I felt like I was learning and how little it seemed my coworkers around me knew. When I went to interview for my next job a couple of years later I was highly nervous that I was behind the curve, and I wasn’t sure that I entirely understood how real-world software engineering was meant to work. When I got the job, I told myself that I got lucky that the interview was devised by a bunch of alumni from my college, so it covered the kinds of questions I had seen in school.

At that job the imposter syndrome kicked in immediately. I was afraid that they had certain expectations of me based on my experience level, where I felt like I hadn’t progressed past what I had learned in school. I thought that I was behind the curve on version control practices, and I hadn’t gained any exposure to any sort of real domain modeling or object oriented programming. I knew these skills were going to be important at this job and had assumed that they were things they could expect me to know when I walked in the door. There was all of the domain information that goes with a new job as well, and at this company it was literal rocket science so I couldn’t really slouch on that aspect either. The first few months were definitely rough, I had a couple of days where I spent all day fighting with basic ideas and couldn’t get anything to work, which made me feel like I didn’t deserve to be there. It eventually got better as I gained experience with the domain and the technology. I gained confidence after running down a couple of very gnarly bugs and getting praised for a creative solution to an awkward problem. Ultimately though, my anxiety was misplaced. It turned out that my managers never had expected me to walk in the door with these skills, they had picked me because they were happy with what I knew already.

Sadly the company ran into hard times and I got laid off, but this paradoxically resulted in big confidence boost. When I got back to my desk from hearing the bad news from my boss I had a ringing phone from a former coworker to schedule an interview with his new company. All that time I had thought that I was barely getting by, he’d thought I was doing fantastic work. In the two weeks between then and my last day at that job I got another offer as well which helped boost my confidence.

I took my former colleague’s offer; I’m sorry to say it ended up not being a great culture fit for me. But by now, the increase in confidence meant I was more willing to take a chance and make a move, looking for something that would be more of a challenge. I took a position in the same domain as an expert to help salvage a failing software project. This job was a good fit on paper for me, since I had experience on both the domain and the technical stack. I was confident going in and was initially given a lot of latitude to do what needed to be done, which was great technical experience. But, it had me doing a lot of more management style activities than I wanted to do, which was an area that I also felt I didn’t really have the right skills and experience. Then after getting the software stabilized and finding a order of magnitude performance improvement, the reward was to bog down the entire project in a mountain of process. So, while I’d gained confidence in my own abilities and standing when it came to technical issues, I was circling back to feeling like an imposter in this new process/management role.

My discomfort resulted in me moving to a small startup to help anchor their development team. The environment was very unstructured and goals changed week to week. I was immediately being asked to give expert opinions on technologies I had never worked with before. The situation was stressful because I felt like I wasn’t qualified to give these opinions, but it wasn’t clear whether they had anyone more qualified. On one hand I was faking it in that I didn’t know a lot of what I was talking about, but on the other hand I took the initiative to learn a lot about these new technologies. Really, I learned how to learn about technologies. The few technology decisions I made during my time there all seemed to work out fine, but I don’t know how that compares to having made other choices and I wasn’t there long enough to see the long term outcomes. Even now, I still find myself downplaying the difficulty of the work I did, still feeling like I was just a pretender.

My next job was at a large tech company, and it was an eye opening experience. This was the first ‘normal’ web application I had worked on since my first job and I was worried that I was out of practice. Since I had so many more years of experience than the last time I worked on web applications, I assumed the expectations for me would be higher than I could meet. I was worried that I would show up and not know how to do anything and would be summarily fired. This turned out not to be the case, but the impression I had going in impacted my ability to leverage myself to accomplish anything. My assumption that I wouldn’t be able to contribute right away meant I stayed quiet about areas where I could have made improvements to benefit the company; I let mediocre practices I witnessed linger way too long before trying to change them.

Despite the good work I did there, my inability to change the culture and other improvable development practices really hurt my confidence about what I could achieve in this environment. This, combined with the lack of knowledge around building web applications, pushed me to do anything and everything I could do to try and grow more. I put a concerted effort into getting out into the local development community to try and find a broader sense of inspiration. This was the time period when I started writing this blog as well. I started attending a number of local meetups and listening to various podcasts. Talking to so many new people who shared my struggles helped me understand that others don’t know some magical trick that I don’t. And, it made me realize that learning how to learn was one of the most important things I had achieved. For me, moving on from imposter syndrome has been about accepting that I don’t know everything I wish I did on a topic, but neither does anyone else, it’s all about our willingness and ability to learn and improve.

This all culminates with my current position where I changed tech stacks to stuff I had never used at all before. My specific experiences weren’t immediately relevant to this new technology stack, but I did bring a lot of thoughts on doing unit testing, domain modeling and other good technical practices. Since this was my fourth stack in 12 years as a professional I had a fair idea about how to pick up a new stack and leverage what I did know to learn new things. There are still lots of things I don’t know, but I managed to get enough together to know how to ask reasonable questions and to apply the concepts from other stacks. I am still at points concerned that I don’t know enough about certain topics but I have become become fearless about asking questions and unafraid of looking uninformed. This question asking seems to have helped one of the junior engineers on my team to have the confidence to ask questions in pull requests when he doesn’t understand what’s going on. That sort of safe space amongst the team is the sort of environment that I want to be in and having accepted my own lack of knowledge on some fronts has empowered those around me to find a better way for themselves.

Strike Teams

At work there has been a new practice of starting up strike teams for different projects. The idea is that for projects that require expertise not found on any individual team, you pull in a person or two from multiple different teams to get all of the correct skills on a single team. That’s the pitch of the strike team model, but the hidden downside is that it breaks up the cohesion of the teams that people are pulled from and the created team may not be together long enough to create new cohesion. This post is mostly going to be a chronicle of the issues that I encountered starting up a strike team and what we did to try to resolve the problems.

The first problem was coordinating the various different teams to figure out who was going to be involved. In the case of the strike team I was forming, the two teams contributing resources both wanted to know who the other team was going to contribute before making their own decisions. They also wanted to have a fixed end date to the project before deciding who to contribute. We ended up resolving this issue with a fixed date for whoever they contributed to the team and getting the two of them to discuss the situation amongst themselves.

The second problem was aligning the start date of the strike team. The three teams contributing resources all have different schedules to their individual sprints so it makes coordinating a ‘start’ date for the strike team difficult.  Those who were on teams about to start another sprint wanted the strike team to start now, whereas those with other commitments made weren’t available. We ended up doing a rolling start where as each team finished their sprint they rolled onto the team and the team ramps up as people become available. We did some preparatory work to get everyone up to speed on the goals and challenges of the team so they were aware of what’s going on whenever they were able to join.

The third big problem is more specific to our particular organization, not to the strike team model. As part of setting up the strike team we needed to schedule things like standups and retros for the team. Since the team is split across both coasts, the hours for scheduling these meetings are limited. The conference rooms are also pretty much all booked because every other team beat us to scheduling. We ended up asking IT to rearrange some other non-recurring meetings and managed to get a consistent slot for the standups. The retro slot was more complicated but we managed to get meetings roughly spaced; by not being a stickler for strict week deliminations we managed to find times that worked.

So with all the overhead sorted out we finally get to move on to the real work of the situation which will be a nice change of pace from the administrative aspect.

Book Chat: Pair Programming Illuminated

My team has been doing more pair programming recently so I picked up a copy of Pair Programming Illuminated. I had never done a significant amount of pair programming before and while I felt I understood the basics, I was hoping to ramp up on some of the nuances of the practice.

It covers why you should be pair programming, convincing management that you should be able to pair program, the physical environment for local pairing, and common social constructs around different kinds of pairs. All of this is useful information, to varying degrees. Since the book was written in 2003, some of the specifics of the physical environment section didn’t age well – advising the use of 17” monitors most obviously. Both of the evangelizing sections seemed to cover the same ground, and did not seem to be written in a way to try and convince someone who is not already open to the concept. Neither section seemed to be written to the person who isn’t already in favor of doing pair programming. There were lots of references to studies, and some personal anecdotes, but none of it stuck in a way that felt like it would change someone’s mind.

The social aspects were interesting, however most of the section was stuff that felt obvious. If you have two introverts working together then they need to work differently than if you have two extroverts working together. A lot of the time the tips were common sense, and didn’t seem like it was necessary to write it down in the book. I would have liked to see more discussion of getting someone to vocalize more and clearly what they’re thinking about.

I feel like I’m better equipped to do pair programming because of having read this, but I also feel like a long blog post would have been just as good a resource and much more focused. I don’t know what else I would have wanted to fill out the rest of the book.

Changing Stacks

I had an odd realization recently – the jobs that I changed stacks for seemed to be better than the jobs where I already knew the stack. I’m not sure if it’s a common experience or something that’s unique to my experiences.

I’ve got a few theories about why this might be the case for me. First, the jobs where I changed stacks appeared more engaging to me because there was more to learn. Second, the jobs where I changed stacks felt better because they used different hiring practices, where they looked for underlying talent and skills as opposed to specific stack-related experiences. Third, relating to the second point, the diverse points of view brought together because so many colleagues were also changing stacks creates a better workplace. Fourth, the people that are willing to change stacks are the kind of people who are more open to learning going forward. Lastly, the jobs where I changed stacks happened to be better by pure chance due to small sample sizes. None of these theories is particularly provable. Several of them could be true and working together. They could all be false and it could be something completely different.

I know that from talking to some of my coworkers that they all came into their current jobs without particular experience in the Scala/Play Framework/MongoDB stack as well, although most of them came from a much more similar Java stack rather than the C# stack I came from most recently, or any of the other stacks I worked with in the past. They mentioned that we have issues recruiting in our DC office because of the pay differentials for cleared work in the area;  there are lots of places you can go and get a comfortable government contracting job and not really stretch yourself and grow. There has also been some discussion about how the stack effects our ability to recruit since it’s not as common as other stacks, and some candidates have expressed reservations since the stack isn’t really growing in popularity. There was a lot of information in the most recent Stack Overflow Developer Survey that corroborates the idea that the Scala is shrinking, but it says it is shrinking in relative of share of all votes. In absolute terms the change is less clear.

I guess that I’ve enjoyed the stack changes I’ve made. Only once did I set out with an intention of changing stacks as part of a job change. Then I wasn’t looking to go to anywhere particular, but I was looking to move away from ColdFusion since it didn’t really mesh with how I like to do development. I don’t have any intention to make another change any time soon, but I’ll definitely consider another stack change when I do just because it opens up a wider variety of options and seems to have worked out for me in the past.

Scoring Intern Puzzles

I recently got to score coding puzzle solutions submitted by a crop of students who wanted to be interns for my company. Overall, I scored six puzzles and gave two passing grades. These were applications generally submitted by rising Juniors and Seniors, so I had fairly modest expectations about the overall quality of the solutions they would be presenting. I wanted to see some basic object oriented design, usage of the proper data structures, and correctness in a couple edge cases. The issues with the failures varied, but the most egregious and common issue was a failure in understanding object orientation. The typical solution was a java file, with one class generally nothing but static methods and everything public.

This led me to two immediate thoughts: are these students horribly unprepared or am I expecting too much? I went and asked one of the engineers who had scored another batch of solutions about what he got. His batch was distinctly better, one even had unit tests, but there were still some significant duds. The conversation attracted two recent grads who took a look at some of the solutions, and also thought they weren’t that good; since they were closer experience-wise, I thought that their expectations would be more reasonable. I know that they are getting a computer science education not a software engineering education, and I had talked about this difference before, but they need some ability in computer programming to demonstrate their computer science skills.

I tried to think back to when I was in school and what I was like at that point. I know I didn’t have the best practices then, but I feel like I was ahead of where they were. I wish I still had some of the code I had written then to look back at and compare. I don’t think I was great at that point in my programming journey. But, I feel like I would have at least put together a class or two as part of the solution, even if they weren’t really needed,  just to show I could decompose the problem.

I suspect that since internships have become a significant need that students are pushing out applications and doing this puzzle is a hurdle to them doing yet another application but they don’t recognize how to optimize their solution to show off their skills. Each solution had some positive aspects, but were missing most of the aspects I wanted to see. To me this meant that the skills were being taught, since they had all picked up some of them, just not enough of them yet.

Productivity and Maintainability

As I had alluded to previously, the project I’m currently on has some very aggressive and fixed deadlines. The project got scoped down as much as possible but there have still been some trade-offs for productivity over maintainability. Some of it happened as a team decision – we discussed ways to compress the schedule and found some shortcuts to produce the needed functionality faster. But as new requirements came up in the process those shortcuts ended up causing much more work overall.

We knew that the shortcuts would cause more total work but we thought that the additional work could be deferred until after the initial deadline. The change of requirements meant that some of the automated testing that had been deferred was an issue. The initial manual testing plus the retest after the changes probably wasn’t much less than the effort to do the automated testing the first time. This specific shortcut wasn’t responsible for much of the time we shaved off the schedule, but when you’re doing similar things all over, the total cut was significant.

The key part of this tradeoff is that it assumed we would go back after the initial deadline and fill in the bits like we intended to instead of moving to something else (which is appearing more and more likely). Product management has a request for an additional feature in another existing application. This request has a similarly tight deadline, but is much smaller in scope (15 points) than the initial request(~120 points for the scoped down version). Picking up this request would mean that we end up further from the decision to defer portions of this and we’d be less likely to be able to deal with the technical debt in a timely manner that we accrued to the do the first request.

Since I’m still new at this job I’m not sure if this behavior is regular or since both of these items are related to the same event if it’s just a one-time thing. If this is a one-off problem then we can survive more or less regardless of how well we deal with it. If this is going to be a regular occurrence without an opportunity to resolve the problem, I feel like I’d need to take a different tactic in the future about how to work with the problem of short term needs.

This feels like the overall challenge of engineering: there are problems that need to be solved in a timeboxed way. We need to make good enough solutions that fit all of the parameters, be they time or cost. We can be productive and leave behind a disaster in our wake but succeed at business objectives, or we can build an amazing throne to software architecture but have the project fail for business reasons. The balance between the two is the where this gets really hard. If you’ve already been cutting corners day to day and the request to push the needle gets made, you have no resources left. The lack of an ability to really quantify the technical debt of an application in a systematic way makes it hard to project where we are on the continuum to others who aren’t working on the system daily.

Without this information, program-level decisions are hard to make and you end up with awkward mandates from the top that aren’t based on the realistic situation on the ground. Schedule pressure causes technical debt, testing coverage mandates cause tests that are brittle to just ratchet up the coverage, mandating technology and architectural decisions results in applications that just don’t work right.

Automation and the Definition of Done

I recently listened to this episode of Hanselminutes about including test automation in the definition of done. This reminded me of acceptance test driven development (ATDD) where you define the acceptance criteria as a test which you build software to fix. These are both ways to try to roll the creation of high level automated tests into the normal software creation workflow. I’ve always been a fan of doing more further up the test pyramid but never had significant success with it.

The issue I ran into the time I attempted to do ATDD was that the people who were writing the stories weren’t able to define the tests. We tried using gherkin syntax but kept getting into ambiguous states where terms were being used inconsistently or the intention was missing in the tests. I think if we had continued to do it past the ~3 months we tried it, we would have gotten our terminology consistent and gotten more experience at it.

At a different job we used gherkin syntax written by the test team; they enjoyed it but produced some significantly detailed tests that were difficult to keep up-to-date with changing requirements. The suite was eventually scrapped as they were making it hard to change the system due to the number of test cases that needed to be updated and the way that they were (in my opinion) overly specific.

I don’t think either of these experience say that the idea isn’t the right thing to do, just that the execution is difficult. At my current job we’ve been trying to backfill some API and UI level tests. The intention is to eventually roll this into the normal workflow;  I’m not sure how that transition will go, but gaining experience with the related technologies ahead of time seems like a good place to get started. We’ve been writing the tests using the Robot Framework for the API tests and Selenium for the UI ones. The two efforts are being led by different portions of the organization, and so far the UI tests seem to be having more success but neither has provided much value yet as they have been filling in over portions of the application that are more stable. Neither effort is far enough along to help regress the application more efficiently yet, but the gains will hopefully be significant.

Like a lot of changes in a software engineering the change is more social than technical. We need to integrate the change into the workflow, learn to use new tools and frameworks, or unify terminology. The question of which to do keeps coming up but the execution has been lacking in several attempts I’ve seen. I don’t think it’s harder than other social changes I’ve seen adopted, such as an agile transition, but I do think it’s easier to give up on since it isn’t necessary to have high level automated testing.

Including automation in the definition of done is a different way to describe the problem of keeping the automation up to date. I think saying it’s a whole team responsibility to ensure the automation gets created and is kept up to date makes it harder to back out later. The issue of getting over the hump of the missing automation is still there, but backfilling as you make changes to an application rather than all upfront may work well.

Theories of Technical Debt

There are a couple of different major causes of technical debt even on the best run projects.

  1. Schedule pressure
  2. Changed requirements
  3. Lack of understanding of the domain

You can choose to take on debt strategically to accommodate an aggressive schedule, you can accumulate debt from having things change, or you can collect debt from doing what appeared to be the right thing but turned out not to be once you learned more about the underlying situation. There are plenty of other reasons that technical debt can be acquired, but most of those can be avoided. Changing requirements can’t really be avoided; things can change, that’s the nature of life. Understanding of the domain is a tricky issue, since you can spend more time upfront to better understand the domain but you will still uncover new aspects as you go.

Schedule pressure is the most demanding of the three. You can consciously say that technical debt should be taken on by doing something the expedient way. You can also have implicit schedule pressure that pervades the decision making process. This sort of pervasive pressure causes people to value different things. If leadership discusses the schedule day in, day out, but doesn’t mention quality, it ends up lacking.

Technical debt is fundamentally a lack of quality; not in the defect sense but in the lack of craftsmanship sense. All of those 5,000 line classes got written by other engineers who were doing the best they could within the constraints of the environment at the time. But some engineers look at that and don’t want to touch it afraid of what it is and how hard it is to change. Some other engineers look at it and see a mountain to be climbed, or a wilderness to be civilized. The problem code is something to be taken and broken to your will. Each kind of engineer has a place in the development lifecycle.

If a company needs to hit a product window and is only full of the kind of engineers who see debt as a challenge to be dealt with they might not be able to make that tradeoff. If you only have the engineers who are concerned with maximum velocity but leave behind chaos in their wake you will have trouble as the codebase matures. Every engineer seems to have a range on the spectrum where they are most comfortable. Where in that range they land day to day seems to be controlled by the environment around them.

If you are surrounded by one side or the other you might lean towards that side of the range. The messages management sends out about the quality of software relative to the messages they send out about schedule of the project is another important factor. If they keep hammering home to get more done, people will find corners they think can get cut. What those corners are will differ between people, but they will take on debt in various corners of the codebase. This sort of debt is really insidious since it was never consciously decided on. If you decide that you will defer good error messages, or avoid building out an abstraction now, but you do it explicitly because of the schedule is the right business choice, then since the team discussed and decided to do it, they know as a whole that’s not the best technical solution but is the best business solution, whereas if someone just does it everyone else may not even be aware of the other options.