Scoring Intern Puzzles

I recently got to score coding puzzle solutions submitted by a crop of students who wanted to be interns for my company. Overall, I scored six puzzles and gave two passing grades. These applications were generally submitted by rising juniors and seniors, so I had fairly modest expectations about the overall quality of the solutions they would be presenting. I wanted to see some basic object-oriented design, use of appropriate data structures, and correctness in a couple of edge cases. The issues with the failures varied, but the most egregious and common one was a failure to understand object orientation. The typical solution was a single Java file containing one class, generally nothing but public static methods.

This led me to two immediate questions: are these students horribly unprepared, or am I expecting too much? I went and asked one of the engineers who had scored another batch of solutions what he got. His batch was distinctly better (one even had unit tests), but there were still some significant duds. The conversation attracted two recent grads, who took a look at some of the solutions and also thought they weren’t that good; since they were closer experience-wise, I figured their expectations would be more reasonable. I know these students are getting a computer science education, not a software engineering education, and I have talked about this difference before, but they need some ability in computer programming to demonstrate their computer science skills.

I tried to think back to when I was in school and what I was like at that point. I know I didn’t have the best practices then, but I feel like I was ahead of where these students are. I wish I still had some of the code I wrote then to look back at and compare. I don’t think I was great at that point in my programming journey, but I feel like I would have at least put together a class or two as part of the solution, even if they weren’t really needed, just to show I could decompose the problem.

I suspect that since internships have become such a significant need, students are pushing out applications in volume, and this puzzle is just one more hurdle between them and yet another application; they don’t recognize that they should optimize their solution to show off their skills. Each solution had some positive aspects but was missing most of the ones I wanted to see. To me this meant the skills were being taught, since they had all picked up some of them, just not enough of them yet.

Productivity and Maintainability

As I alluded to previously, the project I’m currently on has some very aggressive and fixed deadlines. The project got scoped down as much as possible, but there have still been some trade-offs of maintainability for productivity. Some of it happened as a team decision – we discussed ways to compress the schedule and found some shortcuts that would produce the needed functionality faster. But as new requirements came up in the process, those shortcuts ended up causing much more work overall.

We knew the shortcuts would cause more total work, but we thought the additional work could be deferred until after the initial deadline. The change in requirements meant that some of the automated testing we had deferred became a problem. The initial manual testing plus the retest after the changes probably wasn’t much less effort than doing the automated testing the first time. This specific shortcut wasn’t responsible for much of the time we shaved off the schedule, but when you’re doing similar things all over, the total cut was significant.

The key part of this tradeoff is that it assumed we would go back after the initial deadline and fill in the missing bits like we intended, instead of moving on to something else (which is appearing more and more likely). Product management has a request for an additional feature in another existing application. This request has a similarly tight deadline but is much smaller in scope (15 points) than the initial request (~120 points for the scoped-down version). Picking up this request would put us even further from the decision to defer portions of the first one, and we’d be less likely to deal in a timely manner with the technical debt we accrued to deliver it.

Since I’m still new at this job, I’m not sure whether this behavior is the norm or, since both of these items stem from the same event, just a one-time thing. If it is a one-off problem, then we can survive more or less regardless of how well we deal with it. If it is going to be a regular occurrence without an opportunity to resolve the debt, I feel like I’d need to take a different tack in the future on how to work with the problem of short-term needs.

This feels like the overall challenge of engineering: there are problems that need to be solved in a timeboxed way. We need to make good-enough solutions that fit all of the parameters, be they time or cost. We can be productive, leave a disaster in our wake, and succeed at business objectives; or we can build an amazing monument to software architecture and have the project fail for business reasons. The balance between the two is where this gets really hard. If you’ve already been cutting corners day to day and the request to move the needle gets made, you have no resources left. The lack of any systematic way to quantify the technical debt of an application makes it hard to convey where we are on that continuum to others who aren’t working on the system daily.

Without this information, program-level decisions are hard to make, and you end up with awkward mandates from the top that aren’t based on the realistic situation on the ground. Schedule pressure causes technical debt; test coverage mandates produce brittle tests that exist just to ratchet the coverage number up; mandated technology and architecture decisions result in applications that just don’t work right.

Automation and the Definition of Done

I recently listened to this episode of Hanselminutes about including test automation in the definition of done. It reminded me of acceptance test driven development (ATDD), where you define the acceptance criteria as a test that you then build the software to pass. These are both ways to roll the creation of high-level automated tests into the normal software creation workflow. I’ve always been a fan of doing more further up the test pyramid, but I’ve never had significant success with it.

The issue I ran into the time I attempted ATDD was that the people writing the stories weren’t able to define the tests. We tried using Gherkin syntax but kept getting into ambiguous states where terms were used inconsistently or the intent was missing from the tests. I think if we had continued past the ~3 months we tried it, we would have gotten our terminology consistent and built up more experience.

At a different job we used Gherkin syntax written by the test team; they enjoyed it, but they produced some extremely detailed tests that were difficult to keep up to date with changing requirements. The suite was eventually scrapped because it made the system hard to change: too many test cases needed updating, and the tests were (in my opinion) overly specific.

I don’t think either of these experiences says the idea is wrong, just that the execution is difficult. At my current job we’ve been trying to backfill some API- and UI-level tests. The intention is to eventually roll this into the normal workflow; I’m not sure how that transition will go, but gaining experience with the related technologies ahead of time seems like a good place to start. We’ve been writing the API tests using the Robot Framework and the UI tests with Selenium. The two efforts are being led by different parts of the organization; so far the UI tests seem to be having more success, but neither has provided much value yet, since they’ve been filling in coverage over the more stable portions of the application. Neither effort is far enough along to help regress the application more efficiently, but the gains will hopefully be significant.

Like a lot of changes in software engineering, this one is more social than technical. We need to integrate the change into the workflow, learn new tools and frameworks, and unify terminology. The question of whether to do it keeps coming up, but the execution has been lacking in the several attempts I’ve seen. I don’t think it’s harder than other social changes I’ve seen adopted, such as an agile transition, but I do think it’s easier to give up on, since high-level automated testing isn’t strictly necessary.

Including automation in the definition of done is a different way to describe the problem of keeping the automation up to date. I think saying it’s a whole-team responsibility to ensure the automation gets created and kept up to date makes it harder to back out later. The hump of the missing automation is still there, but backfilling as you make changes to an application, rather than all upfront, may work well.

Theories of Technical Debt

There are a few major causes of technical debt, even on the best-run projects:

  1. Schedule pressure
  2. Changed requirements
  3. Lack of understanding of the domain

You can choose to take on debt strategically to accommodate an aggressive schedule; you can accumulate debt from having things change; or you can collect debt from doing what appeared to be the right thing, which turned out not to be once you learned more about the underlying situation. There are plenty of other ways technical debt can be acquired, but most of those can be avoided. Changing requirements can’t really be avoided; things change, that’s the nature of life. Understanding of the domain is trickier, since you can spend more time upfront to better understand the domain, but you will still uncover new aspects as you go.

Schedule pressure is the most demanding of the three. You can consciously decide that technical debt should be taken on by doing something the expedient way. You can also have implicit schedule pressure that pervades the decision-making process. This sort of pervasive pressure causes people to value different things. If leadership discusses the schedule day in and day out but doesn’t mention quality, quality ends up lacking.

Technical debt is fundamentally a lack of quality; not in the defect sense, but in the lack-of-craftsmanship sense. All of those 5,000-line classes were written by engineers doing the best they could within the constraints of the environment at the time. Some engineers look at that code and don’t want to touch it, afraid of what it is and how hard it is to change. Other engineers look at it and see a mountain to be climbed, or a wilderness to be civilized: problem code to be taken and bent to your will. Each kind of engineer has a place in the development lifecycle.

If a company needs to hit a product window and is full only of engineers who see debt as a challenge to be conquered, it might not be able to make that tradeoff. If you only have engineers who chase maximum velocity and leave chaos in their wake, you will have trouble as the codebase matures. Every engineer seems to have a range on the spectrum where they are most comfortable; where in that range they land day to day seems to be controlled by the environment around them.

If you are surrounded by one side or the other, you might lean towards that side of your range. The messaging management sends about software quality, relative to the messaging about the project schedule, is another important factor. If they keep hammering home the need to get more done, people will find corners they think can be cut. Which corners will differ between people, but they will take on debt in various parts of the codebase. This sort of debt is really insidious because it was never consciously decided on. If you defer good error messages, or skip building out an abstraction now, but you do it explicitly because the schedule makes it the right business choice, then the team discussed it and decided together: everyone knows it’s not the best technical solution but understands it is the best business solution. If someone just does it on their own, everyone else may not even be aware there were other options.

Quality Software

What makes a piece of quality software? A low current defect count is an obvious part. An intuitive and powerful interface is another: the interface should be easy for a beginner to start with, yet rich and expressive for an experienced user. Both of these matter from a product perspective, but quality on the engineering side is more complex.

On the engineering side, the most important part of quality software is that those who need to make changes can do so confidently. This is a multifaceted goal, with lots of components.

(Figure: quality-software – the subgoals of confidently changeable software, such as well-tested, readable, modular, easy to build, and easy to start with)

You can accomplish each of these subgoals through multiple paths. You can get a well-tested application by doing TDD, ATDD, BDD, ad-hoc unit testing, or, with enough time, plain old manual testing. Organizations pick specific tools or practices to accomplish a particular intention and enshrine that practice as the right way to do things. The techniques give you practices, but it is hard to say whether you are achieving what they are designed to achieve without quantifying it somehow.

Most of these aspects are difficult to quantify if you’re trying to measure “quality.” Code coverage, while quantifiable, isn’t the important part: if you cover 10,000 lines of models but skip 150 lines of serious business logic, the metric will say you’re doing great even though you missed what matters. The important part of code coverage is knowing that you covered the critical parts of the codebase. There is some interesting research into quantifying the readability of source code, but it acknowledges that the humans used to score the samples did not regularly agree on what was readable. You can use cyclomatic complexity as a proxy for modularity; it can tell you that you broke the problem into smaller pieces, but nothing about your ability to replace one piece without impacting the others. Easy-to-build and easy-to-start-with can be quantified to a degree: give a new developer the codebase and related documentation, and measure how long it takes them to get it running on their local machine. The problem with that metric is that you don’t often bring new developers onto a project, and new developers aren’t all created equal. I don’t think I’ve ever seen someone actually collect that metric in minutes and hours.

The more time I spend in this industry, the more it seems most activity is designed to ensure that the system is functionally correct, since that is the first level of success. The second level, creating truly quality software, seems underappreciated because it is hard to quantify, and unless you spend your days with the codebase you can’t appreciate the difference. I think this difference is what can make some tech companies unstoppable juggernauts in the face of competition. I suspect that a lot of the companies putting out new and innovative ideas about building higher-quality systems have found ways to quantify the presence of quality, not just the lack of quality signified by downtime and defect counts.

Propinquity

Propinquity can be described as the rather obvious idea that you interact more with people near you. It’s not just physical nearness; it also encompasses mental nearness, like shared beliefs, common experiences, or a shared ubiquitous language. This comes up in software development in a number of ways. Propinquity is a team that is more efficient because its members intimately understand each other and the domain at a deeper level. It has negative expressions too: software ecosystems that become insular and don’t integrate new ideas, and even sexism and racism. It is an unconscious bias that influences the way you think about things.

Recognizing propinquity can help you overcome your negative biases, build stronger teams, and find outside-the-box solutions to your hardest problems. Team culture is propinquity – it can be good, forming bonds between coworkers, and it can be bad, as when the team looks for people like themselves and ends up rejecting the best candidate because they didn’t bear a strong enough resemblance. This can also be seen as a kind of implicit bias; when you start to think about workplace diversity in terms of propinquity, the discussion starts to mean something different.

I recently heard an interesting discussion with Leslie Miley of Twitter on a podcast (starting about 12 minutes in) about how the lack of diversity at Twitter could be hurting its ability to solve problems. His thesis is that if you hire people with very similar backgrounds, they will all look at the same problem the same way and come up with the same solutions. They will miss the same things and get stumped by the same problems. This is why you often get great solutions from people outside the problem: they are looking at it from a different perspective.

I don’t have much insight into how other professions interview, but in software it seems like the interview is more about determining whether you can do the job than whether the applicant is someone you actually want to be working with. And the way we determine whether you can do the job is to ask questions that you don’t truly need to be able to answer to do the job, but that implicitly reflect schools of thought on how the job should be done. If you ask about red-black trees, you are looking for someone with a computer science education; you aren’t expecting anyone to build a red-black tree, but you are using it as a proxy for exposure to certain ideas. If instead you ask about TDD or design patterns, it says you are the kind of people who care about those ideas and have less interest in computer science fundamentals and optimizations.

I once went on an interview where they asked lots of low-level questions. The job had nothing in particular to do with those skills, but they felt those skills were what differentiated good programmers from bad ones: to them, a good programmer can twiddle bits and will micro-optimize solutions. The interviews we gave at a place I used to work focused on design patterns, which was the language everyone there shared. At that job we didn’t have the low-level optimization skills to find a particular optimization that would have been a big win for an ongoing performance problem.

This ties back into the post I wrote on the evolution of the JavaScript ecosystem. The ecosystem has propinquity in the JavaScript language itself, but it has diversity in that most of its developers came from different stacks with different backgrounds. That level of creativity is what can be achieved if you balance propinquity and diversity against each other well.

Propinquity superficially looks like culture, but it really is a double-edged sword: a team that intuitively understands each other may also block out other perspectives that could help solve problems better. The values espoused by your organization are all about creating a culture and a shared experience to build the positive aspects of propinquity, but the negative aspects can appear too if you don’t keep them in mind. Your picture of what an engineer looks like can’t be too rigid; otherwise you reject great candidates for lacking a degree, or you interview for markers that imply competence but aren’t competence themselves. Conversely, diversity is more than ethnic diversity – it’s diversity of thought. You need people who think in design patterns, people who think functionally, and people who buck the trends of whatever you are doing right now.

Ideally you could get people who can do all of those things, but that’s obviously not a realistic way to recruit. You want people who fill in your weak areas, ideally creating something like this.

(Figure: overlapping – each individual’s competencies overlapping across the organization)

It represents the whole organization: each individual’s competencies overlap, but each is valuable. Each individual is close enough to the others to have some propinquity, but the whole group embodies the diversity you want for getting different viewpoints on problems. With smaller units you can also look at this as a radar chart: if you map out where your team is currently strong and where it is weak, you can see where you need help and where you don’t. But it’s still hard to find the right person when they look different from the people you have and may not be in your normal talent pipelines.

Even just considering these sorts of issues points you in the right direction. If you accept that the person you are looking for is different from those you already have, you might be willing to ask different questions or judge the responses differently.

Fix vs Replace

I was thinking about when to fix a piece of software versus replace it as part of our normal software lifecycle, prompted by a discussion at work about a particular piece of software. The software in question works great, but nobody is confident of success when changes do need to be made. Part of the lack of confidence is that changes are only made once or twice a year, so getting it set up, running locally, and tested is a chore, with many chances for failure. The other part is that it runs on top of a library that isn’t used anywhere else in our stack, so nobody ever develops any familiarity with it. This particular piece of software is also critical to expanding our business, as it’s the primary way new client data is loaded into the system.

If we wanted to fix it, we could take the existing solid software and enhance it to make setup and testing easier, or we could clean up the usage of the underlying library so it’s more intuitive. However, the decision was made to replace it with something totally new. I wasn’t involved in the decision, but I became involved with the original software after the fact, making a change while the replacement is still being developed. At first glance it seems like this software could be rehabilitated, but clearly someone else thought otherwise.

I know in general I’m biased towards fixing existing software. I’ve spent most of my career working on brownfield applications, building oddly shaped pegs to fit back into the oddly shaped holes of those applications. I think I’ve done this because I enjoy it; building everything from scratch is almost too easy, since there are so many fewer constraints involved. I get a different sort of satisfaction from it. I know I’m not the only one with this tendency; the folks at Corgibytes specialize in this sort of work. I’ve even been nostalgic for a codebase I worked on – not the application, but the codebase itself.

I feel like most organizations are biased towards replacing software, because it lets you just say the entire thing is bad and try again, instead of having to pick particular things to change. You don’t have to agree on what’s wrong with it or get into the details. This flexibility of scope leads to the quid pro quo rewrite, where a piece of software is replaced with a new version that also contains a major new feature; this concept was introduced to me by Re-Engineering Legacy Software, which describes it as a bargaining tool to gain acceptance of a plan to do the rewrite. I’ve always seen it the other way around: the business brings up the rewrite because they’re unhappy with the team’s ability to change a piece of software and hope it will clean up the underlying causes of the problems.

That’s the big problem: the current software has problems due to something, and unless you deal with whatever that “something” is, the new software probably is not going to be significantly better than the old. It may not have had the chance to become crufty yet, so it seems better when it’s new, but give it a few years and it goes back to the same sorts of problems you had the last time. You need normal software processes that enable you to create and maintain quality software even as requirements change.

You need feedback into your initial processes about what caused the cruft to accumulate. This can be seen as a form of double-loop learning, where the feedback about what happened changes how you see the world and thereby influences your decision-making process, not just the decisions themselves. If you are accumulating cruft because schedule pressure on the initial development produced a less modular design, the feedback to the decision-making process is different than if you are accumulating cruft because the requirements changed radically. To make true long-term improvements, that’s the step you need to take, and sometimes it will lead you to fix, and sometimes to replace.

Book Chat: The Senior Software Engineer

The Senior Software Engineer by David Bryant Copeland is a manual for being a Senior Engineer, covering all of the things nobody bothered to tell you about. For me, it’s all of the things I pieced together from reading dozens of blog posts and watching those senior to me do the job. Most of the book describes what the role is, in enough detail to help an engineer who has not yet reached it target the experience needed to show they can do the job. It is the sort of description of the role I would have liked to get from a manager early in my career.

Since I’ve been doing the job of a Senior Engineer for years now, most of it wasn’t particularly novel to me, but I would have liked to read it before I made the transition. It differentiates between building a new feature, fixing bugs, and solving problems, which I would describe as the three main technical components of the role. It also covers quality technical writing, working with others, making technical decisions, interviewing job applicants, and leading a team.

Interviewing is probably one of the most critical areas you don’t get much exposure to before becoming a Senior Engineer, and it’s pretty well covered here. Copeland lists four components of the technical interview (in order):

  1. Informal get-to-know-you conversation
  2. Homework assignment
  3. Technical phone screen
  4. Pair programming session

It seems like the phone screen should come before the homework assignment, especially if the assignment won’t be the focus of the phone screen. I actually interviewed some years ago at a place where the author was working, and the interview was significantly different from what he describes: no informal conversation, no homework assignment, and no pair programming, just a phone screen and a whiteboarding session. I suppose he may not have had total control of the process there, but it is interesting to see the difference between the ideal and the actual.

If you are looking to become a Senior Engineer and want a grasp of all the aspects of the job before trying it, this is a good read. If you are already a Senior Engineer, I’d pass and try something else. It might also be a good read if you are a manager looking for a way to articulate some of these ideas to your junior team members.

Looking Back Again

I have reflected on my last job before, but a thought today put me in a reflective mood again: I miss being involved in any and all problems that came up. I was in a position where I would get tapped whenever anything went wrong and it wasn’t clear whose problem it was. I got to take it apart, find the root cause, and either fix it or pass it off to the team responsible for that area. The rush of having a regular stream of critical problems to deal with was intense. It had stratospheric highs, and it had some amazing frustrations that went with it; all in all, it was intensely involving.

I miss the puzzle of it all. I used to go to the office, read my email, pick the most immediately pressing problem, and go look into it. The problem was that I spent all day putting out fires and no time on fire prevention. The knowledge to help prevent the fires was only really learnable by dealing with them, and most of the developer team wasn’t exposed to the problems in a way that would teach them how to avoid them. The problems were varied enough that you couldn’t just put together a brown bag session to teach them.

It’s exhilarating to be the one who gets to work on the problems, but it also becomes massively stressful, and the situation is bad for the organization as well: problems get solved, but the organization doesn’t really learn, so it never gets better. One particular incident sticks out in my mind. Another developer sent me an error message asking if I knew what to do with it, and I responded immediately with an explanation of what was needed to fix the problem. The developer later asked how I knew what to do so quickly; the answer was that I had spent six hours figuring it out the day before, when someone else ran into the same problem.

I don’t miss the stress of having problems six deep while someone asks if you can help them and the answer is no. It’s difficult when the person whose problem is way down your queue wants to know what is going on with it and you need to tell them it isn’t your priority right now. I didn’t fully understand how much stress I was under there until I had given it all up and moved on.

I managed to avoid some of the negative aspects of being a hero in software development. I kept normal hours. I didn’t create situations that needed to be fixed by my personal efforts. I didn’t try to be involved in everything. In fact, I actively avoided getting involved in some places that looked like they were headed for trouble, because I felt that while the problem would keep them from making progress, it didn’t endanger the existing system and therefore might be a learning experience.

By understanding and articulating what I did and did not enjoy there, I hope to optimize my involvement at my new job. Learning from your past experiences is the only real way to do better.

ChatOps and ChatZero

Inbox Zero was a concept I never fully understood the need for before I moved to a big company. You get some email every day and you read it; what’s the problem? Now that I’ve been in a big organization for a while, I understand the problem better: you eventually end up on email lists about things that are only tangential to what you do. I’m on one particular list about communications from partners we integrate with; I’m not involved in those integrations at all, but I can’t unsubscribe. Inbox Zero both helps you stay focused on real work and makes sure you don’t miss anything important.

Chat and ChatOps have been replacing email across the industry, since chat is an easier way to collaborate in real time and the chat histories make it easier to share information asynchronously. As ChatOps has picked up in my office, I get in each morning to hundreds of new messages spread across the roughly 20 channels I’m in. I used to read all of them every morning, but some of the channels have been getting chatty, which makes it harder to keep up. I first dealt with this by muting or leaving channels I was less interested in, though apparently leaving channels in Slack is considered bad etiquette in some circles. That hasn’t been enough, since I keep getting involved in more and more channels, no matter how many I drop.

I had started skimming some of the channels rather than really reading everything. That helped me keep up with the increased flow, but deciding which channels to skim was hard. I recently listened to an Arrested DevOps episode about ChatOps, which was interesting because the hosts came from different backgrounds and chat usage styles. They made a couple of suggestions, one of which was counterintuitive but has been working great for me since I implemented it: make more channels, especially short-term channels for intense conversations. More channels keeps the content of each channel focused, so instead of having four semi-related conversations in one channel, you have four channels each with one conversation. Since each breakout conversation is linked as it moves out, you can either go read it or skip it freely. Another great, if more obvious, suggestion was to use dedicated update channels for notifications from automated systems.

The switch from an email-based system to chat has been a great experience overall, but it will take a while for all of the productive patterns of chat usage to get established. The great part is that the chat system is open, pluggable, and not centrally controlled. That ability to innovate on how the platform works brings about Hubot and all of the other bots and integrations that make ChatOps an amazing tool.