Following up on the discussion of the OWASP dependency check, my team now has the means to scan a dependency manifest against a list of known vulnerable dependencies. The question now in front of us is whether this should be a step in the build pipeline that breaks the build.
There are some immediate pros and cons that come to mind. On the pro side, you will find out if you are about to ship something that has a known vulnerability. On the con side, there is the problem of dealing with false positives, and of builds that used to work breaking when the world around you changes. Evaluating the impact of the pro is relatively obvious: you gain information about the system and can act proactively before something becomes a problem. The cons get a little more complex. To reduce future false positives, you need the ability to feed information back to the tool telling it that it has made a mistake. This is easy enough with something like a dot file, but it makes configuring and using the tool more complex.
For the other con, imagine the situation where you have a pull request that has passed CI and is ready to merge. Once merged, it kicks off a series of processes to build and deploy the code, and then the build of the merged code fails because a new vulnerability became known to the system between when the pull request was built and when it was merged. This should be a rare occurrence, but that rarity means it will be all the more confusing to those who see the outcome when it does happen.
We’ve been toying with an in-between response where we run it in a pull request, break the build for critical security vulnerabilities that were strongly predicted to match the dependency, and otherwise just push informative status back to the pull request. This feels like a best-of-both-worlds sort of system where we get some build breakage on the worst of the worst and some freedom from having to set up a list of false positives on all of our repos. Once we’ve got it all hooked up and it has had time to bake, I’ll report back on whether it worked the way we hope it will.
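The in-between policy can be sketched as a small decision function. This is a minimal sketch in Python, not the dependency check plugin's API: the `Finding` fields, the severity and confidence labels, and the `gate` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    dependency: str
    severity: str    # assumed labels: "LOW", "MEDIUM", "HIGH", "CRITICAL"
    confidence: str  # assumed labels: "LOW", "MEDIUM", "HIGH", "HIGHEST"


# Assumption: these confidence labels count as "strongly predicted to match".
STRONG_MATCH = {"HIGH", "HIGHEST"}


def gate(findings):
    """Split findings into build-breaking and informational ones.

    Returns (should_fail, blocking, informational).
    """
    blocking = [
        f for f in findings
        if f.severity == "CRITICAL" and f.confidence in STRONG_MATCH
    ]
    informational = [f for f in findings if f not in blocking]
    return bool(blocking), blocking, informational
```

In a pull request build, the `blocking` list would fail the check while the `informational` list would only be posted back as status, which is where the freedom from maintaining a false positive list on every repo comes from.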
When looking at an application from a security perspective I can assess the controls involved and try to understand the threats it faces. But then comes the question, is this application secure? You can look at an application and declare it insecure easily enough. Finding a flaw in the design of the controls or the implementation makes an application insecure. Essentially all applications appear to have vulnerabilities according to this 2017 report. When you have the keys to the kingdom to protect you can go to significant lengths to protect them, like AWS does with its provable security measures.
For those of us with more limited means, we need to find a way to determine where the line is for what we should do. There isn’t a clear line for what’s good enough: a risk based approach makes too many guesses, and an absolute approach doesn’t scale with organizational size. You build a system that looks secure to the best of your knowledge and monitor it to make sure it stays that way. You counter new known vulnerabilities as they come up. You can try to be proactive with scanning; you can try to put yourself in a good place to be reactive to issues that come up. You can build defenses in depth and apply any sort of new emerging technologies to the problem, but if the only restriction on how much work you put into security is how much money you have, how do you decide when you have enough?
Some searching turned up this formula:
ROI = (Average loss expected) / (Cost of countermeasures)
Conceptually it makes sense, but what is the average loss expected? The article defined it as (Number of Incidents per Year) x (Potential Loss per Incident). That’s still a guess at best; if you experienced some number of incidents in the past, you can project forward. On the loss side you have reputational damage from the incident, the cost of response, and any regulatory fines. So at this point you have a guess times a guess, and you estimate how any action you take will impact those two guesses to determine whether it is worth doing. That doesn’t seem like a good way to measure whether you are making an impact.
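To make the "guess times a guess" problem concrete, here is a small worked sketch in Python. The function names and every number are made up for illustration; only the two formulas come from the article described above.

```python
def average_loss_expected(incidents_per_year, loss_per_incident):
    # (Number of Incidents per Year) x (Potential Loss per Incident)
    return incidents_per_year * loss_per_incident


def roi(loss_expected, cost_of_countermeasures):
    # ROI = average loss expected / cost of countermeasures
    return loss_expected / cost_of_countermeasures


# Two equally defensible sets of guesses for the same countermeasure,
# which is assumed to cost 40,000 per year.
optimistic = roi(average_loss_expected(0.5, 100_000), 40_000)
pessimistic = roi(average_loss_expected(2, 400_000), 40_000)

print(optimistic)   # 1.25
print(pessimistic)  # 20.0
```

With these invented inputs the same countermeasure looks either marginal or overwhelmingly worthwhile, depending only on which guesses went in.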
I’ve been trying to think through other ways to analyze this sort of position. I keep coming around to something similar to downtime analysis. How confident are you in understanding your security, and how long does it take you to react to something that needs to be remediated once you’ve identified it? This formulation gives you a different way to scope your security activity. It brings in a number of things as security related that wouldn’t make sense under the earlier formulation. The scalability of your build and deployment pipeline becomes a security concern, because if you need to build and redeploy hundreds of services after finding out about a vulnerability in a commonly used library, it would be good not to bottleneck the system.
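As a rough illustration of why pipeline capacity shows up in this metric, here is a back-of-the-envelope sketch in Python. The function name, the assumption that every build takes the same time, and all of the numbers are hypothetical.

```python
import math


def fleet_rebuild_minutes(services, parallel_builds, minutes_per_build):
    """Estimate wall-clock time to rebuild and redeploy every service.

    Assumes builds run in equal-sized waves and each takes the same time.
    """
    waves = math.ceil(services / parallel_builds)
    return waves * minutes_per_build


# 300 services at 15 minutes each: 10 build slots versus 50.
print(fleet_rebuild_minutes(300, 10, 15))  # 450
print(fleet_rebuild_minutes(300, 50, 15))  # 90
```

Under these assumptions, the time to remediate a vulnerability in a widely used library is set by pipeline capacity rather than by how quickly the fix itself is written.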
I think this metric makes sense from a more technical perspective of the security of the system compared to the incidents and losses perspective which makes more sense from a business risk perspective. Having this additional perspective in a
The Open Web Application Security Project (OWASP) provides a number of tools and resources to help create more secure web applications. Their dependency check tool is designed to integrate into the build step of your application and tell you if any of your downstream dependencies have known security issues. The idea is a simple but powerful one: when your application is being built it pulls in a number of libraries, and those libraries pull in even more transitive dependencies. You need to look at all of them and cross-reference the National Vulnerability Database to see if any of them have known issues. While you may know to keep an eye out for announcements of security findings from libraries you use directly, you may not easily know the transitive dependencies your system expects. It’s like a more traditional vulnerability scanner, but rather than pointing it at existing servers to find those that are vulnerable, you put it into the build pipeline to find those you are about to make vulnerable.
The tool has integrations for a number of different popular build systems for the JVM and .NET. It also has experimental support for Ruby, Python, C++, and Node. This predates the GitHub vulnerability scanner but works in a similar way. I’m working with Scala day to day, using SBT as my build system, which isn’t supported by GitHub, so I’ve been using an SBT plugin to start integrating the check into the builds of various services and libraries. The report generated provides both the severity of the vulnerability and the likelihood that the tool correctly matched the vulnerability to the binary. So far the likelihood has always been correct in determining whether the binary matches the one indicated to be vulnerable, but I’m still not confident in that since most of my repos were using similar dependency sets and thus generating similar reports.
Taking an afternoon to set up the scanner, run it against a variety of repos you work with, and look through the findings seems highly worthwhile. If you’ve been keeping up with your library upgrades you likely won’t find anything earth shattering. But if you have a dependency that isn’t keeping up to date then you might be in for an interesting surprise.
Securing DevOps is an introduction to modern software security practices. It both suffers and succeeds from being technology- and tool-agnostic. By not picking any particular technology stack it will remain relevant for a long time. However, it is not a complete solution for anyone, since it gives you classes of tools to find but not a complete package for software security. If you need to start a software security program from zero, this lays out a framework to get started with.
While I’ve only been doing software security full time for a few months now, I feel like the identification of the practices to engage isn’t the hard part, it’s the specifics of the implementation where I feel I want additional guidance. I know I should be doing static analysis of the code as part of my CI pipeline, but I don’t know how to handle false positives in the pipeline or what is worth failing a build because of. I don’t know what sort of custom rules I should be implementing in the scanner for my technology stack.
The book did go further into detail on the subject of setting up a logging pipeline. It describes how to set up rules to look for logins from abnormal geographic locations and how to look for abnormal query strings. The described logging platform is nothing abnormal for a midsized web application; however, I don’t know if a small organization could have this level of infrastructure set up. Hooking up the ELK stack, while open source, is not easy, and the Kibana portion requires a fair bit of customization and time to get everything together and working.
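To give a sense of what a rule for logins from abnormal geographic locations might look like, here is a minimal sketch in Python. It is not the rule from the book: the event shape, a `(user, country)` pair already extracted from the logs, and the function name are assumptions for illustration.

```python
from collections import defaultdict


def abnormal_logins(events):
    """Flag logins from a country not previously seen for that user.

    `events` is an iterable of (user, country) pairs in time order.
    A user's first login is never flagged, since there is no baseline yet.
    """
    seen = defaultdict(set)
    flagged = []
    for user, country in events:
        if seen[user] and country not in seen[user]:
            flagged.append((user, country))
        seen[user].add(country)
    return flagged
```

Even a rule this simple depends on having every login event centralized and geolocated, which is where most of the infrastructure effort goes.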
It feels as though we are missing a higher level of abstraction for dealing with these concerns. Perhaps, the idea that most software applications should have to go through this level of effort to get ‘standard’ security setup for a web application is reasonable. Even on the commercial tools side there seems to be a lack of complete solutions. Security information and event management (SIEM) tools try to provide this, but they each still require significant setup to get your logs in and teach the program how to interpret them. It feels like some of this could be accomplished by building more value in a web application firewall (WAF). WAFs were not fully endorsed by the book due to the author having had a bad experience with a bad configuration problem. Personally, I think a WAF seems necessary to protect against distributed denial of service style attacks.
Overall the book is an introductory to intermediate text, not the advanced practices I was looking for. If you’re bootstrapping an application security program this seems like a reasonable place to get started. If you’re trying to find new tactics for your established program, then you’ll probably be disappointed.
After fighting a bout of burnout earlier this year I got back into writing the blog from May to September. Then I just ran out of steam again after I had transitioned into a role involved in application security. I found that the new role made it harder to write about what I was doing day-to-day, due to a need to maintain some confidentiality around the sorts of problems and challenges we were facing which could manifest as security vulnerabilities. So I wrote some posts about the work I was doing, keeping to the high level and abstract parts of the situation. None of them turned out particularly engaging.
I don’t think it was burnout again; I was still engaged through the majority of the time off from writing. I still did plenty of reading in the gap, as opposed to the prior block of burnout where I just didn’t have the interest in doing anything software-related. There was a two-fold problem: I was busier with work-adjacent things, trying to bootstrap my own knowledge of the field, and the kinds of things that might inspire a blog post all had to go through that internal filter. I tried to write about some of the stuff I was learning, but for a lot of it, I found it hard to identify where the interesting portions were and where the rote everyday parts were.
While I’ve managed to write a fair bit about the nuts and bolts of learning functional programming, a lot of that is me trying to describe difficult concepts in a different way. The type of security knowledge I’ve been developing lately is more about understanding the terminology than the details of any particular application of the knowledge. Getting to the point where I am conversant with someone and able to keep up with the terminology of the basic parts of the conversation wasn’t easy.
I started listening to the Security Now podcast just to get a consistent outside perspective. Despite having a more generic information security focus, it was still good to get into the right mindset to look at a vulnerability turned up by a static analysis tool. It also made sure I didn’t fall into only learning internal lingo for this, and brought concepts and terms to me without me having to go to them first. Listening to the first few episodes I probably spent more time doing background research to understand what I was hearing than actually listening to the content itself.
I want to get back into the habit of writing posts again. I don’t know if I’ll have the topics for the once a week cadence I was doing. I’ve got ~12 topics I generated over the 11 weeks I missed so it seems like I should. Some of the topics required the hindsight of having integrated into this new community to be able to see what’s worth talking about. I also had three topics from before that I’m not sure what to do with. While I may not be able to convert all of these ideas into posts, I’m going to post once a week again and try not to force too many topics that don’t fit.
I ran across badssl.com recently, and needed to share. The basic idea of the site is that it hosts a number of subdomains with all sorts of variants of SSL certificates. The example certificates cover the whole range of things that can go wrong with a certificate, including expiration, self signed certs, revoked certificates, and certificates for the wrong host. It also checks the strength of cryptography being used and has certificates specifying multiple different kinds of encryption to be tested against. This is all so you can see that your browser is securing you properly.
There is a more interesting use case, however. When you go over to the associated GitHub repo there are instructions for booting up the site locally inside a Docker container, so you can test your code against it as part of your automated test suite and exercise all sorts of networking code outside of a browser. The container hosting a separate copy of the site avoids putting your integration tests in a path where they reach out to the public internet for resources. Having your integration tests work with public resources on the internet isn’t a good practice for a number of reasons, such as the time it takes to round trip, the dependency on someone else’s infrastructure for your processes, and just being inconsiderate of someone else’s resources. This container also lets you avoid all of the work associated with defining what certificates are needed, generating the various certificates, and installing all of the certificates.
The test case we used the certificates for didn’t turn up any bugs, but it did make us confident in the implementation. This confidence helped us move along more quickly and be sure we were appropriately securing the connections.
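A test along these lines might look like the following minimal sketch in Python, using only the standard library. The `tls_outcome` function and its return labels are hypothetical, and this is not the test we actually ran.

```python
import socket
import ssl


def tls_outcome(host, port=443, timeout=5.0):
    """Attempt a verified TLS handshake and classify the result."""
    # The default context verifies the certificate chain and the hostname.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return "accepted"
    except ssl.SSLCertVerificationError:
        # Expired, self-signed, wrong-host, and untrusted certificates land here.
        return "certificate_rejected"
    except ssl.SSLError:
        # Other handshake failures, such as no shared protocol or cipher.
        return "tls_error"
    except OSError:
        # The connection could not be established or timed out.
        return "unreachable"
```

Against the public site, a host such as `expired.badssl.com` or `self-signed.badssl.com` would be expected to come back as `"certificate_rejected"`. Pointing the same check at a local container would need the host, port, and trust configuration from the repo's instructions, which are not reproduced here.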
As a working programmer, encryption doesn’t seem like it changes much. AES and RSA public key cryptography have been fairly consistent in the world for a while. Key size recommendations have held up to the projections on computing power, so the overall landscape of implementation hasn’t had much movement. There has been a big emphasis on deciding to encrypt web traffic and lots of other things, but no real changes in the underlying technology.
The unveiling of a 72 qubit quantum computer and some of the work I’ve been doing on encryption at my job have had me thinking about the future of encryption. The jump from 17 qubits in 2017 to 72 already this year makes me think we’re getting close to an inflection point where quantum computing goes from a toy to a realistic threat to existing crypto systems.
Lattice-based cryptography is the leading contender for quantum resistant cryptography. The math behind it is based on the same math that describes the arrangement of atoms in a crystal, but instead of happening in a three dimensional space it happens in an arbitrarily high dimension. I don’t understand the math behind this in three dimensions, let alone higher dimensions. However, I do appreciate that the hard problem to be solved is based on a normal concept, like elliptic curves or factoring integers. Understanding the idea helps me trust that the underlying math makes sense, even if I don’t understand the math itself.
Looking into this I stumbled into a different idea that was much more radical. Homomorphic encryption is the idea that you can do work over two different encrypted values such that the encryption is distributed over other arbitrary operations. So essentially
Encrypted(a) + Encrypted(b) = Encrypted(a+b)
However this works for all operations not just addition. Practically, this is overkill for any normal application; however, if the party with the data and the party with an algorithm are unwilling to trust each other you could use this to send the data to the algorithm securely and process it. While this seems like an amazing technology from a security and privacy perspective, there is a downside – it currently takes ~13 ms per logical gate to process. So, even something simple like adding two integers would take seconds to complete. You won’t be able to encrypt your data and give it to a foreign neural network anytime soon.
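The addition case can be demonstrated with a toy example. The sketch below is in Python and uses the Paillier scheme, which is only additively homomorphic and is swapped in here for illustration; it is not the fully homomorphic scheme discussed above. In Paillier the operation on ciphertexts is multiplication, which corresponds to addition of the plaintexts. The primes are tiny and insecure, and the function names are my own.

```python
import math
import secrets


def keygen(p, q):
    """Build a toy Paillier key from two primes (far too small for real use)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, requires Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt m (0 <= m < n) using generator g = n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # (n + 1)^m mod n^2 simplifies to 1 + m * n.
    return ((1 + m * n) * pow(r, n, n2)) % n2


def add_encrypted(n, c1, c2):
    """Combine two ciphertexts so the result decrypts to the sum."""
    return (c1 * c2) % (n * n)


def decrypt(n, private, c):
    lam, mu = private
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


n, private = keygen(61, 53)
total = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
print(decrypt(n, private, total))  # 42
```

The party computing `add_encrypted` never sees 20, 22, or 42; only the holder of the private key can decrypt the result.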
Realistically, nobody is going to implement this themselves. There will be academic applications for now, and eventually something will emerge from NIST’s post-quantum cryptography program that everyone agrees seems right. Once there is agreement on a secure standard, existing cryptography providers will start to add whatever that is to their packages, and application developers will just need to update, make new keys, and reencrypt the world.
Recently I’ve been working on rolling out a Vault implementation at work and migrating all of our existing secrets over. It is a tool designed to secure secret data and control access to it. It also offers a variety of ways to handle dynamic secrets for things like database credentials. The dynamic database credentials are an interesting security feature; any particular set of database credentials can be shut off at any point if compromised, and they are effectively rotated each time a new instance starts up. It can also act as a certificate authority. This is all built on top of a configurable set of backends and HA clustering setups.
One of the most interesting things is the unsealing process. The system starts sealed, where all of the secrets are inaccessible. The unseal process requires a quorum of key fragments (a configurable threshold) to be provided to unseal the vault. This is an implementation of Shamir’s Secret Sharing, which is a cool concept. In the enterprise version, it also provides an auto-unsealing mechanism built on top of AWS Key Management Service.
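The idea behind Shamir’s Secret Sharing can be sketched in a few lines of Python. This is a toy illustration of the technique over a prime field, not Vault’s implementation; the function names, the choice of prime, and the 3-of-5 split are assumptions.

```python
import secrets

# A Mersenne prime; secrets and shares are numbers below it.
PRIME = 2**127 - 1


def split(secret, shares, threshold):
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def combine(points):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse, requires Python 3.8+.
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


fragments = split(123456789, shares=5, threshold=3)
print(combine(fragments[:3]))  # 123456789
```

Any three of the five fragments recover the secret, while fewer than three reveal nothing about it, which is the property that lets the unseal step be split across several people.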
The REST API is pretty good and most major languages have a third party client available already. The third party clients have different levels of compatibility with all of the features of the system; since it is a plugin based system they don’t necessarily support everything. Sadly, the UI also doesn’t support all of the features, which makes doing some basic testing about how the system works more painful.
Vault seems like a very good tool chest for dealing with secrets, but I would like a more opinionated system about how to do this. I can build my own system on top of it but would like to have integrated support for creating a key of some type and storing it securely. Similarly, its scheme to provide transit encryption requires a lot of work on my side if I wanted to use it. Despite these areas for improvement I’m still excited to get it integrated into our systems.