Dependency Scanning and Repeatable Builds

Following up on the discussion of the OWASP dependency check,  my team now has the means to scan a dependency manifest against a list of known vulnerable ones. The question now in front of us is whether this should this be a step in the build pipeline that breaks the build?

There are some immediate pros and cons that come to mind. On the pro side, you will find out if you are about to ship something that has a known vulnerability. On the con side, there is the problem of dealing with false positives, and builds breaking that used to work if the world around you changes. Evaluating the impact of the pro is relatively obvious: you gain information about the system to act proactively before something becomes a problem. The cons get a little more complex. You need the ability to feed information back to the tool telling it that is has made a mistake to reduce future false positives. This is easy enough with something like a dot file but it makes configuring and using the tool more complex.

For the other con imagine the situation where you have a pull request that has passed CI and is ready to merge. Once merged it kicks off a series of processes to build and deploy the code, and then the build of the merged code fails because a new vulnerability became known to the system between when the pull request was built and when it was merged. This should be a rare occurrence, but when it does happen it will be even more unfamiliar to those who see the outcome.

We’ve been toying with an in-between response where we run it in a pull request and break the build for critical security vulnerabilities that were strongly predicted to match the dependency and otherwise just push back informative status to the pull request. This feels like a the best of both worlds sort of system where we get some build breakage on the worst of the worst and some freedom with not having to set up a set of false positives on all of our repos. Once we’ve got it all hooked up and had time to bake I’ll report back and see if it worked the way we hope it will.

Advertisements

Book Chat: Debugging Teams

Debugging Teams is the second edition of Team Geek, rewritten to apply to a more general audience but still fundamentally the same material. It has a variety of fun and interesting anecdotes from the authors’ time running teams at Apple and Google as well as organizing a major open source project. There is some pragmatic advice about how to work within an organization as well as lead a team. Most of the book is directed towards a team leader, not a leader of a department, as would be expected by the title.

I feel as though the rewrite doesn’t make it apply to a general audience sufficiently enough. I felt that the rewrite does get away from the idea of running a software team, but doesn’t fully bring it to the any team perspective. The rewrite does get to the level of a team that works like a software team where you can iteratively deliver value. You couldn’t use much of the advice  to run a lot of other professional work where a report/result of knowledge work is delivered then you move onto another project. Some of the specific advice would still apply in these scenarios but it doesn’t work in the general management sense.

Overall it’s a fun read, but didn’t hit the target they were aiming for. It is valuable advice for someone who is running a software-like team. It still focuses on the basic practices of what to do as a leader, but doesn’t go enough into the how for me. I know I don’t personally run a team because I don’t enjoy some of the required aspects of project management, but I do really appreciate the skills and abilities required to do it well.

Who is the Customer for Security?

When looking at an application from a security perspective I can assess the controls involved and try to understand the threats it faces. But then comes the question, is this application secure? You can look at an application and declare it insecure easily enough. Finding a flaw in the design of the controls or the implementation makes an application insecure. Essentially all applications appear to have vulnerabilities according to this 2017 report. When you have the keys to the kingdom to protect you can go to significant lengths to protect them, like AWS does with its provable security measures.

For those of us with more limited means, we need to find a way to determine where is the line for what we should do. There isn’t a clear line for what’s good enough, a risk based approach makes too many guess, and an absolute approach doesn’t scale with organizational size. You build a system that looks secure to the best of your knowledge and monitor it to make sure it stays that way. You counter new known vulnerabilities as they come up. You can try to be proactive with scanning; you can try to put yourself in a good place to be reactive to issues that come up. You can build defenses in depth and apply any sort of new emerging technologies to the problem, but if the only restriction to how much work you put into security is how much money you have, how do you decide when you have enough?

Some searching turned up this formula

ROI=(Average loss expected / cost of countermeasures)

Conceptually it makes sense, but what is the average loss expected? The article defined it as (Number of Incidents per Year) X (Potential Loss per Incident). That’s still a guess at best, if you experienced some number of incidents in the past you can project forward. On the loss side you have reputational damage from the incident, the cost of response, any regulatory fines. So at this point you have a guess times a guess and you estimate how any action you take will impact those two guesses to determine if it is worth doing or not. That doesn’t seem like a good way to measure if you are making an impact.

I’ve been trying to think through other ways to analyze this sort of position. I keep coming around to something similar to downtime analysis. How confident are you in understanding your security and how long does it take you to react to something that needs to be remediated once you’ve identified it? This formulation gives you a different way to scope your security activity. It brings a number of different things as security related that wouldn’t make sense under the earlier formulation. Things like the scalability of your build and deployment pipeline are a security concern, because if you need to build and redeploy hundreds of services after finding out about a vulnerability in a commonly used library it would be good not to bottleneck the system.

I think this metric makes sense from a more technical perspective of the security of the system compared to the incidents and losses perspective which makes more sense from a business risk perspective. Having this additional perspective in a

Book Chat: Manage It!

Manage It! is an overview of modern project management techniques. Most of it was accessible to me as a person who has never formally run a project, but has been involved in many. The less accessible material was concentrated towards the end which I dutifully worked through thinking there might be more immediately relevant portions after. It heartily embraced three different practices: strong meeting facilitation, rolling wave planning, and avoiding schedule games.

The meeting facilitation advice started out fairly straightforward – have an agenda, stick to the topic at hand, and hold one-on-ones with the team. It then goes on to discuss some more radical advice, like don’t go to meetings that aren’t about solving problems, question why you had the meeting if it ends and nobody has any action items, and avoid serial status meetings. If your project has a problem, getting the relevant people into a room and coming out with a solution is a great way to break the impasse. Other sorts of meetings can impact the progress of a project, but to me that doesn’t make them immediately a bad idea. From the perspective of the project manager I can see that other sorts of long-term work or out of band activity can impact the potential of the project, but it seems necessary for the functioning of a healthy engineering organization. I agree that the lack of action items coming out of a meeting seems like a warning sign it wasn’t a good usage of time. If a decision was made generally one or more people would leave the room and do something because of it. Serial status meetings are more complex; if you are holding a meeting where people tell you about progress being made on initiatives where the others in the room aren’t involved or impacted, it may be a good use of your time but it’s a bad use of everyone else’s time. If you are being invited to meetings to provide status, the book advises to send status via email and skip the meeting, the idea being that if the organization doesn’t accept that behavior then it isn’t the place you should be. Daily standups are not impacted from this practice because they’re about the impediments, not just the status. Overall it seems like a good package of advice as to how to interact with meetings.

Rolling wave planning is an implementation of the idea that your plan will fail, but that the exercise of planning is valuable regardless. Your short term plan should be pretty solid but the further out, the more vague the plan gets. So, you don’t worry as much about the long term, and as you acquire more information you update the plan. This works both for changes from outside the project and things you learn from executing the project itself. The one experience I have had with explicit rolling wave planning did not go well, but I feel that was because we were trying to keep the near-term solid plan too far into the future and engaging in some devotion to the schedule.

The schedule games section was the part that felt the most real to me, it listed out 16 ways a project plan can go awry and ways to cope with each of them. I felt like I had seen the vast majority of the pitfalls out in the wild. This previous visibility involved me more in the rest of the material since I felt the authenticity of this compared to a lot of books which don’t give practical guidance on  how to get from poor practices to good ones. The section on “schedule chicken” felt particularly familiar to me after having been involved in a high stakes version many years ago. We even made progress in getting out of the situation using one of the techniques described.

I would recommend this book to someone who has already led a team or project for a little bit and is interested in doing more of it. If you’ve been doing that sort of work as your primary role it probably still has some bits of interest. If you haven’t run a team or project yet you may get something from it but I feel that for a lot of the information you need to have seen the problems in action some before you can appreciate the importance of  avoiding it.

OWASP Dependency Check

The Open Web Application Security Project (OWASP) provides a number of tools and resources to help create more secure web applications. Their dependency check tool is designed to integrate into the build step of your application and tell you if any of your downstream dependencies have known security issues. The idea is a simple but powerful one – when your application is being built it pulls in a number of libraries and those libraries pull in even more transitive dependencies. You need to look at all of them and cross-reference the National Vulnerability Database to see if any of them have known issues. While you may know to keep an eye out for announcements of security findings from libraries you use directly, but you may not easily know the transitive dependencies your system expects. It’s like a more traditional vulnerability scanner, but rather than pointing it at existing servers to find those that are vulnerable, you put it into the build pipeline to find those you are about to make vulnerable.

The tool has integrations for a number of different popular build systems for the JVM and .NET. It also has experimental support for Ruby, Python, C++, and Node. This predates the github vulnerability scanner but works in a similar way. Since I’m working with Scala day to day using SBT as my build system it isn’t supported by github. I’ve been using an SBT plugin to start integrating it into build systems in various services and libraries. The report generated provides both the severity of the vulnerability and the likelihood that it matched the vulnerability to the binary. So far the likelihood has always been correct in determining whether the binary matches the one indicated to be vulnerable, but I’m still not confident in that since most of my repos were using similar dependency sets and thus generating similar reports.

Taking an afternoon to setup the scanner, run it against a variety of repos you work with, and look through the findings seems highly worthwhile. If you’ve been keeping up with your library upgrades you likely won’t find anything earth shattering. But if you have a dependency that isn’t keeping up to date then you might be in for an interesting surprise.

Book Chat: Securing DevOps

Securing DevOps is an introduction to modern software security practices. It both suffers and succeeds from being technology- and tool- agnostic. By not picking any particular technology stack it will remain relevant for a long time, however it is not a complete solution for anyone since it gives you classes of tools to find but not a complete package for software security. If you need to start a software security program from zero this lays out a framework to get started with.

While I’ve only been doing software security full time for a few months now, I feel like the identification of the practices to engage isn’t the hard part, it’s the specifics of the implementation where I feel I want additional guidance. I know I should be doing static analysis of the code as part of my CI pipeline, but I don’t know how to handle false positives in the pipeline or what is worth failing a build because of. I don’t know what sort of custom rules I should be implementing in the scanner for my technology stack.

The book did go further into detail on the subject of setting up a logging pipeline. It describes how to set up rules to look for logins from abnormal geographic locations and how to look for abnormal query strings. The described logging platform is nothing abnormal for a midsized web application, however, I don’t know if you could have a small organization and have this level of infrastructure setup. Hooking up the ELK stack, while open source, is not easy, and the kibana portion requires a fair bit of customization and time to get everything together and working.

It feels as though we are missing a higher level of abstraction for dealing with these concerns. Perhaps, the idea that most software applications should have to go through this level of effort to get ‘standard’ security setup for a web application is reasonable. Even on the commercial tools side there seems to be a lack of complete solutions. Security information and event management (SIEM) tools try to provide this, but they each still require significant setup to get your logs in and teach the program how to interpret them. It feels like some of this could be accomplished by building more value in a web application firewall (WAF). WAFs were not fully endorsed by the book due to the author having had a bad experience with a bad configuration problem. Personally, I think a WAF seems necessary to protect against distributed denial of service style attacks.

Overall the book is an introductory to intermediate text, not the advanced practices I was looking for. If you’re bootstrapping an application security program this seems like a reasonable place to get started. If you’re trying to find new tactics for your established program, then you’ll probably be disappointed.

Back in the Cadence

After fighting a bout of burnout earlier this year I got back into writing the blog from May to September. Then I just ran out of steam again after I had transitioned into a role involved in application security. I found this making it harder to write about what I was doing day-to-day, due to a need to maintain some confidentiality around the sorts of problems and challenges we were facing which could manifest as security vulnerabilities. So I wrote some posts about the work I was doing, keeping to the high level and abstract parts of the situation. None of which turned out particularly engaging.

I don’t think it was burnout again, I was still engaged through the majority of the time off from writing. I still did plenty of reading in the gap, as opposed to the prior block of burnout where I just didn’t have the interest in doing anything software-related. There was a two-fold problem: I was more busy with work-adjacent things trying to bootstrap my own knowledge of the field, and the kinds of things that might inspire a blog post all had to go through that internal filter. I tried to write about some of the stuff I was learning, but for a lot of it, I found it hard to identify where the interesting portions are and where the rote everyday parts were.

While I’ve managed to write a fair bit about the nuts and bolts of learning functional programming, a lot of that is me trying to describe difficult concepts in a different way. The security knowledge of the type I’ve been developing lately it’s more about understanding the terminology rather than the details of any particular application of the knowledge. Getting to the point where I am conversant with someone and able to keep up with the terminology of the basic parts of the conversation wasn’t easy.

I started listening to the Security Now podcast just to get a consistent outside perspective. Despite having a more generic information security focus, it was still good to get into the right mindset to look at a vulnerability turned up by a static analysis tool. It also made sure I didn’t fall into only learning internal lingo for this, and brought concepts and terms to me without me having to go to them first. Listening to the first few episodes I probably spent more time doing background research to understand what I was hearing than actually listening to the content itself.

I want to get back into the habit of writing posts again. I don’t know if I’ll have the topics for the once a week cadence I was doing. I’ve got ~12 topics I generated over the 11 weeks I missed so it seems like I should. Some of the topics required the hindsight of having integrated into this new community to be able to see what’s worth talking about. I also had three topics from before that I’m not sure what to do with. While I may not be able to convert all of these ideas into posts, I’m going to post once a week again and try not to force too many topics that don’t fit.

Organizing Code

I’ve been arguing with myself about the proper way to split up some code that has related concerns. The code in question relates to fetching secrets and doing encryption. The domains are clearly related, but the libraries aren’t necessarily coupled. The encryption library needs secrets, but secrets are simple enough to pass across in an unstructured fashion.

As I mentioned before, we are integrating Vault into our stack. We are planning on using Vault to store secrets. We are also going to be using their Transit Encryption Engine to do Envelope Encryption. The work to set up the Envelope Encryption requires a real relationship between the encryption code and Vault.

There are a couple of options for how to structure all of this. There are also questions of binary compatibility with the existing artifacts, but that’s bigger than this post. The obvious components are configuring and authenticating the connection to Vault, the code to fetch and manage secrets, the API for consuming secrets, and the code to do encryption. I’m going to end up with three or four binaries, encryption, secrets, secret API, and maybe a separate Vault client.

organizingCode

 

That would be the obvious solution, but the question of what the Vault client exposes is complex, given that the APIs being used by the encryption and secrets are very different. It could expose a fairly general API that is essentially for making REST calls and leaves parsing the responses to the two libraries, which isn’t ideal. The Vault client could be a toolkit for building a client instead of a full client. That would allow the security concerns to be encapsulated in the toolkit, but allow each library to build their own query components.

Since the authentication portion of the toolkit would get exposed through the public APIs of the encryption and secret libraries, that feels like a messy API to me and I’d like to do better. There seems like there should be an API where the authentication concerns are entirely wrapped up into the client toolkit. I could use configuration options to avoid exposing any actual types, but that’s just hiding the problem behind a bunch of strings and makes the options less self-documenting.

Like most design concerns there isn’t a real right answer. There are multiple different concerns at odds with each other. In this case you have code duplication vs encapsulation vs discoverable APIs. In this case code duplication and encapsulation are going to win out over discoverable APIs since the configuration should be set once and then never really changed, as opposed to the other concerns which can contain the long term maintenance costs of the library since it will likely be used for a good while to come.

Book Chat: Extreme Programming Explained

Extreme programming (XP) is an alternative software development methodology that would be described as an agile methodology. It’s a competitor to scrum, but more focused on the developer experience, less prescriptive of specific organizational practices, and more prescriptive of technical practices. I was familiar with the concepts of XP and recently picked up the second edition of Extreme Programming Explained. This new edition refined some of the technical practices about deployment since tools now exist for even more rapid deployment than what was initially conceived.

The build time practice is interesting, the idea being that a continuous integration build/test cycle should take ten minutes. While you could make the build faster than 10 minutes, keeping it a bit longer generates a decent mental break to allow someone to get a cup of coffee or get up and stretch. Whereas, if it’s slower than that, there is a tendency to move onto a different task and you can lose context on the old task and the new task. It matches with my experience; although I hadn’t been able to articulate the solution, I had seen the problem.

The overall methodology seems solid, however it doesn’t market itself to the whole business the way scrum does which seems to have impacted the adoption of the methodology as a whole. The practices suggested are all pretty straight forward:

  • colocate the team,
  • construct a team with all necessary skills on the team,
  • have visible progress locators,
  • work when you can really concentrate on it,
  • pair program,
  • user stories,
  • a weekly cycle,
  • a larger quarterly cycle,
  • slack,
  • the above build time practice,
  • continuous integration,
  • test first programming, and
  • incremental design.

Most modern software teams would be in favor of most, if not all, of these practices. Some of the practices are outside the control of the team and would need significant management support, but most are things the team can control.

I don’t think that the differences between this and other agile project management methodologies are that significant. The biggest difference with scrum I can see would be that scrum has fixed reflection periods whereas XP has continuous reflection with impromptu kaizen events. I think that this difference between XP and scrum would allow you to differentiate yourself from all of the scrum implementations that are out there but never finished. I don’t think that the book adds much to my understanding of software engineering, however it’s an excellent selection of software engineering practices. If you’re looking for a different perspective on agile methodologies this would be an interesting read.

BadSSL.com

I ran across badssl.com recently, and needed to share. The basic idea of the site is that it hosts a number of subdomains with all sorts of variants of SSL certificates. The example certificates cover the whole range of things that can go wrong with a certificate, including expiration, self signed certs, revoked certificates, and certificates for the wrong host. It also checks the strength of cryptography being used and has certificates specifying multiple different kinds of encryption to be tested against. This is all so you can see that your browser is securing you properly.

There is a more interesting use case however. When you go over to the associated github repo there are instructions for booting up the site locally inside a docker container so you can test your code against it as part of your automated test suite to test all sorts of other networking code outside of a browser. The container hosting a separate copy of the site avoids putting your integration tests in a path where they reach out to the public internet for resources. Having your integration tests work with public resources on the internet isn’t a good practice for a number of reasons, such as the time it takes to round trip, the dependency on someone else’s infrastructure for your processes, and just being inconsiderate of someone else’s resources. But, this container lets you avoid all of the work associated with defining what certificates are needed, generating the various certificates, and installing all of certificates.

The test case we used the certificates for didn’t turn up any bugs, but it did make us confident in the implementation. This confidence helped us move along more quickly and be sure we were appropriately securing the connections.