Credential stuffing is an attack where you take previously breached username/password combinations and reuse them against other sites with the hope that the user had an account there and that they used the same credentials. This is why users are recommended not to reuse the same password for multiple sites. On the application side, defensive options are generally varieties of MFA, but that is a significant burden to the user of your average web application. You can react to the attack by rate-limiting logins from the same IP and blacklisting IPs that attempt too many logins. This is a reactive approach to someone attempting to break into the system. Reactive defensive positions are great, but proactive ones are even better.
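The reactive defense described above can be sketched as a small sliding-window limiter. This is a minimal illustration, not a production design: the class name `LoginRateLimiter`, its thresholds, and the in-memory storage are all assumptions made for the example, and a real deployment would likely need shared state across servers.

```python
import time
from collections import defaultdict, deque


class LoginRateLimiter:
    """Track failed logins per IP in a sliding window and block noisy IPs."""

    def __init__(self, max_attempts=5, window_seconds=60.0, block_seconds=900.0):
        # All three thresholds are illustrative defaults, not recommendations.
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.block_seconds = block_seconds
        self._attempts = defaultdict(deque)  # ip -> timestamps of recent failures
        self._blocked_until = {}  # ip -> time at which the block expires

    def is_blocked(self, ip, now=None):
        now = time.monotonic() if now is None else now
        expires = self._blocked_until.get(ip)
        if expires is None:
            return False
        if now >= expires:
            # The block has run out; forget it.
            del self._blocked_until[ip]
            return False
        return True

    def record_failure(self, ip, now=None):
        """Record a failed login and return True if the IP is now blocked."""
        now = time.monotonic() if now is None else now
        if self.is_blocked(ip, now):
            return True
        attempts = self._attempts[ip]
        attempts.append(now)
        # Drop failures that have aged out of the window.
        while attempts and now - attempts[0] > self.window_seconds:
            attempts.popleft()
        if len(attempts) > self.max_attempts:
            self._blocked_until[ip] = now + self.block_seconds
            attempts.clear()
            return True
        return False
```

A login handler would call `is_blocked(ip)` before checking credentials and `record_failure(ip)` after each failed attempt. The optional `now` argument exists only to make the behavior easy to exercise without waiting on a real clock.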
Discussing this with some coworkers over lunch, we came to the idea of getting the credentials and doing it to ourselves: find the users who had credentials breached and proactively prompt them to change their passwords. This was an interesting theory over lunch, but turning it into a practice ran into some difficulties. The first was the legal status of the breached credentials. Could we pull these off the internet and use them to do something? Have I been Pwned? supplies lists of only passwords, hashed in different ways, which are still in a dubious legal state. Even if you get past the legal barrier, how would you react to an email telling you that your password has been compromised, even if it wasn’t the people sending the email who lost your password? It doesn’t seem like it would be a positive reaction. Forcing a password reset on the next login could work, assuming the user was going to log in again soon, but for our particular use case we can’t make this assumption.
This seems like a weakness of the entire username/password scheme of protection. Things like password managers exist to help cover part of the problem, but a password manager requires the user to take a more active part in their own security. There have been ongoing discussions about replacing passwords with biometric identifiers or hardware devices like Yubikey. Yubikey and the like might be ready to augment security but don’t have general adoption yet. The opt-in nature of these measures means that they’re more likely to be adopted by those who are already not sharing passwords among sites. The self credential stuffing could be used to provide additional knowledge and security measures to those who are less aware of the problems, or even just to figure out how exposed your user base is to the problem. However, it doesn’t really seem to secure the application better, even if it might help the ecosystem as a whole.
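One variation that needs only a password-only hash list is to check the password a user submits against the list at login or password-change time. The sketch below assumes the list is a text file of uppercase SHA-1 digests, one per line in a `HASH:COUNT` layout; that layout and the function names are assumptions for illustration, and the legal questions above still apply to obtaining the list in the first place.

```python
import hashlib


def load_breached_hashes(lines):
    """Build a set of uppercase SHA-1 hex digests from 'HASH:COUNT' lines."""
    hashes = set()
    for line in lines:
        line = line.strip()
        if line:
            # Keep only the digest; the count after the colon is ignored here.
            hashes.add(line.split(":", 1)[0].upper())
    return hashes


def is_breached(password, breached_hashes):
    """Return True if the password's SHA-1 digest appears in the breached set."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest in breached_hashes
```

Because the check runs while the user is already interacting with the application, a match can be turned into an in-session prompt to pick a new password instead of an unexpected email.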
I start most of these book chats by describing the book and end by describing the reader who it seems like it would benefit. This time I’m going to describe the reader first.
If you write software that interacts with another computer, you should read this. To me, the title is somewhat misleading since it focuses on the data aspect; however, the intensity comes from the size of the data and implies a distributed system. You also get an amazing survey of different database implementations as a side benefit. The only complaint that I have is that the first part of the book starts very much at the beginning of data-intensive systems (e.g., data locality and how it is organized on disk), which is important if you are building your own database system but isn’t as applicable to the average reader.
There are sections on a vast number of different topics. It covers both SQL-style ACID databases and BASE-style NoSQL databases. It even gets into things like graph databases that don’t fit neatly in either box. The authors cover topics as varied as locking and commit strategies, the levels of consistency available in a database and what they really mean, distributed consensus, replication, and streaming.
The majority of the text is written in a technology-agnostic way but it will reference specific implementations that demonstrate a concept. There is also a deep academic rooting with a well-referenced selection of footnotes to satisfy any further curiosities you may have on the topics. It seems like it should be fairly accessible as a whole text to a relative beginner since it introduces concepts in a way that doesn’t require much prior knowledge. I don’t think a beginner could jump into a chapter in the middle and be able to follow along, but given the complexity of the topic I don’t think that’s an unreasonable thing.
Even if you aren’t building a database, deeply understanding the tradeoffs of the database you are using will make your application more correct. The difficulty of testing into a lot of the concurrent failure scenarios makes understanding the system at a logical level the only way to attempt to handle all failure cases. I do think all software engineers working on the web would benefit from the material here. It won’t make your day-to-day much better but it will help keep you out of the really bad places where the system is intermittently failing.
Following up on the discussion of the OWASP dependency check, my team now has the means to scan a dependency manifest against a list of known vulnerable ones. The question now in front of us is whether this should be a step in the build pipeline that breaks the build.
There are some immediate pros and cons that come to mind. On the pro side, you will find out if you are about to ship something that has a known vulnerability. On the con side, there is the problem of dealing with false positives, and of builds that used to work breaking because the world around you changed. Evaluating the impact of the pro is relatively obvious: you gain information about the system and can act proactively before something becomes a problem. The cons get a little more complex. You need the ability to feed information back to the tool, telling it that it has made a mistake, to reduce future false positives. This is easy enough with something like a dot file, but it makes configuring and using the tool more complex.
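The dot-file feedback loop can be sketched as a filter applied to the scanner's findings. Everything here is hypothetical: the file format (one `dependency CVE-ID` pair per line, `#` for comments), the finding dictionaries, and the placeholder identifiers are invented for the example and are not the real tool's suppression format.

```python
def parse_suppressions(text):
    """Parse 'dependency CVE-ID' pairs, ignoring blank lines and # comments."""
    suppressed = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        dependency, cve = line.split()
        suppressed.add((dependency, cve))
    return suppressed


def unsuppressed(findings, suppressed):
    """Keep only the findings that nobody has marked as a false positive."""
    return [f for f in findings if (f["dependency"], f["cve"]) not in suppressed]
```

The extra complexity mentioned above shows up immediately: the file has to live in every repository, someone has to review each entry, and stale entries can hide a finding that later becomes real.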
For the other con imagine the situation where you have a pull request that has passed CI and is ready to merge. Once merged it kicks off a series of processes to build and deploy the code, and then the build of the merged code fails because a new vulnerability became known to the system between when the pull request was built and when it was merged. This should be a rare occurrence, but when it does happen it will be even more unfamiliar to those who see the outcome.
We’ve been toying with an in-between response where we run it in a pull request and break the build only for critical security vulnerabilities that were strongly predicted to match the dependency, and otherwise just push an informative status back to the pull request. This feels like a best-of-both-worlds sort of system, where we get some build breakage on the worst of the worst and some freedom from having to set up a set of false positives on all of our repos. Once we’ve got it all hooked up and had time to bake, I’ll report back on whether it worked the way we hope it will.
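As a rough sketch of that in-between policy, the triage step could look like the following. The `Finding` type and the severity and confidence labels are assumptions for illustration; the actual labels would come from whatever report the scanner produces.

```python
from dataclasses import dataclass

# Hypothetical labels; substitute the values your scanner actually reports.
BLOCKING_SEVERITY = "CRITICAL"
BLOCKING_CONFIDENCE = {"HIGH", "HIGHEST"}


@dataclass(frozen=True)
class Finding:
    dependency: str
    severity: str    # e.g. "LOW", "MEDIUM", "HIGH", "CRITICAL"
    confidence: str  # how strongly the tool believes the match is correct


def triage(findings):
    """Split findings into build-breaking ones and informational ones."""
    blocking, informational = [], []
    for finding in findings:
        is_blocking = (
            finding.severity == BLOCKING_SEVERITY
            and finding.confidence in BLOCKING_CONFIDENCE
        )
        (blocking if is_blocking else informational).append(finding)
    return blocking, informational


def exit_code(findings):
    """Non-zero only when at least one finding should break the build."""
    blocking, _ = triage(findings)
    return 1 if blocking else 0
```

The informational list is what would be posted back to the pull request as a status, so a reviewer still sees it without the build going red.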
Debugging Teams is the second edition of Team Geek, rewritten to apply to a more general audience but still fundamentally the same material. It has a variety of fun and interesting anecdotes from the authors’ time running teams at Apple and Google as well as organizing a major open source project. There is some pragmatic advice about how to work within an organization as well as lead a team. Most of the book is directed towards a team leader, not a leader of a department, as would be expected by the title.
I feel as though the rewrite doesn’t make it apply sufficiently to a general audience. It does get away from the idea of running a software team, but doesn’t fully bring it to the perspective of any team. It gets as far as a team that works like a software team, where you can iteratively deliver value. You couldn’t use much of the advice to run a lot of other professional work, where a report or result of knowledge work is delivered and then you move on to another project. Some of the specific advice would still apply in these scenarios, but it doesn’t work in the general management sense.
Overall it’s a fun read, but didn’t hit the target they were aiming for. It is valuable advice for someone who is running a software-like team. It still focuses on the basic practices of what to do as a leader, but doesn’t go enough into the how for me. I know I don’t personally run a team because I don’t enjoy some of the required aspects of project management, but I do really appreciate the skills and abilities required to do it well.
When looking at an application from a security perspective I can assess the controls involved and try to understand the threats it faces. But then comes the question, is this application secure? You can look at an application and declare it insecure easily enough. Finding a flaw in the design of the controls or the implementation makes an application insecure. Essentially all applications appear to have vulnerabilities according to this 2017 report. When you have the keys to the kingdom to protect you can go to significant lengths to protect them, like AWS does with its provable security measures.
For those of us with more limited means, we need to find a way to determine where the line is for what we should do. There isn’t a clear line for what’s good enough: a risk-based approach makes too many guesses, and an absolute approach doesn’t scale with organizational size. You build a system that looks secure to the best of your knowledge and monitor it to make sure it stays that way. You counter new known vulnerabilities as they come up. You can try to be proactive with scanning; you can try to put yourself in a good place to be reactive to issues that come up. You can build defenses in depth and apply any sort of new emerging technologies to the problem, but if the only restriction on how much work you put into security is how much money you have, how do you decide when you have enough?
Some searching turned up this formula
ROI=(Average loss expected / cost of countermeasures)
Conceptually it makes sense, but what is the average loss expected? The article defined it as (Number of Incidents per Year) × (Potential Loss per Incident). That’s still a guess at best; if you experienced some number of incidents in the past, you can project forward. On the loss side you have reputational damage from the incident, the cost of response, and any regulatory fines. So at this point you have a guess times a guess, and you estimate how any action you take will impact those two guesses to determine if it is worth doing or not. That doesn’t seem like a good way to measure if you are making an impact.
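To make the "guess times a guess" problem concrete, here is the formula as quoted, evaluated at the low and high ends of two estimates. The function names and every number are invented purely for illustration.

```python
def average_loss_expected(incidents_per_year, loss_per_incident):
    """(Number of Incidents per Year) x (Potential Loss per Incident)."""
    return incidents_per_year * loss_per_incident


def roi(incidents_per_year, loss_per_incident, countermeasure_cost):
    """ROI = average loss expected / cost of countermeasures."""
    return average_loss_expected(incidents_per_year, loss_per_incident) / countermeasure_cost


# Made-up estimates: the same countermeasure, judged with pessimistic and
# optimistic guesses for both inputs.
cost = 40_000
low = roi(incidents_per_year=0.5, loss_per_incident=50_000, countermeasure_cost=cost)
high = roi(incidents_per_year=2, loss_per_incident=200_000, countermeasure_cost=cost)
print(f"ROI ranges from {low} to {high}")
```

With these inputs the result spans a factor of sixteen, from well below break-even to clearly worthwhile, which is the core of the objection: the answer is dominated by which guesses you picked.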
I’ve been trying to think through other ways to analyze this sort of position. I keep coming around to something similar to downtime analysis. How confident are you in understanding your security and how long does it take you to react to something that needs to be remediated once you’ve identified it? This formulation gives you a different way to scope your security activity. It brings a number of different things as security related that wouldn’t make sense under the earlier formulation. Things like the scalability of your build and deployment pipeline are a security concern, because if you need to build and redeploy hundreds of services after finding out about a vulnerability in a commonly used library it would be good not to bottleneck the system.
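A back-of-the-envelope version of the pipeline concern might look like this. It assumes every service takes about the same time to build and deploy and that the pipeline runs a fixed number of them in parallel; both are simplifications, and the function name and numbers are hypothetical.

```python
import math


def redeploy_hours(services, minutes_per_service, parallel_builds):
    """Estimate wall-clock hours to rebuild and redeploy every service."""
    # Services go through in waves of `parallel_builds` at a time.
    waves = math.ceil(services / parallel_builds)
    return waves * minutes_per_service / 60


# Hypothetical fleet: 300 services at roughly 12 minutes each.
print(redeploy_hours(300, 12, parallel_builds=10))
print(redeploy_hours(300, 12, parallel_builds=50))
```

Under this framing, pipeline capacity becomes a direct input into how long a known vulnerability stays deployed.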
I think this metric makes sense from a more technical perspective of the security of the system compared to the incidents and losses perspective which makes more sense from a business risk perspective. Having this additional perspective in a
Manage It! is an overview of modern project management techniques. Most of it was accessible to me as a person who has never formally run a project, but has been involved in many. The less accessible material was concentrated towards the end which I dutifully worked through thinking there might be more immediately relevant portions after. It heartily embraced three different practices: strong meeting facilitation, rolling wave planning, and avoiding schedule games.
The meeting facilitation advice started out fairly straightforward: have an agenda, stick to the topic at hand, and hold one-on-ones with the team. It then goes on to discuss some more radical advice, like don’t go to meetings that aren’t about solving problems, question why you had the meeting if it ends and nobody has any action items, and avoid serial status meetings. If your project has a problem, getting the relevant people into a room and coming out with a solution is a great way to break the impasse. Other sorts of meetings can impact the progress of a project, but to me that doesn’t make them immediately a bad idea. From the perspective of the project manager I can see that other sorts of long-term work or out-of-band activity can impact the potential of the project, but it seems necessary for the functioning of a healthy engineering organization. I agree that the lack of action items coming out of a meeting seems like a warning sign that it wasn’t a good use of time. If a decision was made, generally one or more people would leave the room and do something because of it. Serial status meetings are more complex; if you are holding a meeting where people tell you about progress being made on initiatives where the others in the room aren’t involved or impacted, it may be a good use of your time but it’s a bad use of everyone else’s time. If you are being invited to meetings to provide status, the book advises sending status via email and skipping the meeting, the idea being that if the organization doesn’t accept that behavior then it isn’t the place you should be. Daily standups are not affected by this practice because they’re about the impediments, not just the status. Overall it seems like a good package of advice as to how to interact with meetings.
Rolling wave planning is an implementation of the idea that your plan will fail, but that the exercise of planning is valuable regardless. Your short term plan should be pretty solid but the further out, the more vague the plan gets. So, you don’t worry as much about the long term, and as you acquire more information you update the plan. This works both for changes from outside the project and things you learn from executing the project itself. The one experience I have had with explicit rolling wave planning did not go well, but I feel that was because we were trying to keep the near-term solid plan too far into the future and engaging in some devotion to the schedule.
The schedule games section was the part that felt the most real to me. It listed out 16 ways a project plan can go awry and ways to cope with each of them. I felt like I had seen the vast majority of the pitfalls out in the wild. Having seen them before drew me further into the rest of the material, since this felt authentic compared to a lot of books which don’t give practical guidance on how to get from poor practices to good ones. The section on “schedule chicken” felt particularly familiar to me after having been involved in a high-stakes version many years ago. We even made progress in getting out of the situation using one of the techniques described.
I would recommend this book to someone who has already led a team or project for a little bit and is interested in doing more of it. If you’ve been doing that sort of work as your primary role it probably still has some bits of interest. If you haven’t run a team or project yet you may get something from it but I feel that for a lot of the information you need to have seen the problems in action some before you can appreciate the importance of avoiding it.
The Open Web Application Security Project (OWASP) provides a number of tools and resources to help create more secure web applications. Their dependency check tool is designed to integrate into the build step of your application and tell you if any of your downstream dependencies have known security issues. The idea is a simple but powerful one: when your application is being built it pulls in a number of libraries, and those libraries pull in even more transitive dependencies. You need to look at all of them and cross-reference the National Vulnerability Database to see if any of them have known issues. While you may know to keep an eye out for announcements of security findings from libraries you use directly, you may not easily know the transitive dependencies your system expects. It’s like a more traditional vulnerability scanner, but rather than pointing it at existing servers to find those that are vulnerable, you put it into the build pipeline to find those you are about to make vulnerable.
The tool has integrations for a number of different popular build systems for the JVM and .NET. It also has experimental support for Ruby, Python, C++, and Node. This predates the GitHub vulnerability scanner but works in a similar way. Since I work in Scala day to day with SBT as my build system, my setup isn’t supported by GitHub. I’ve been using an SBT plugin to start integrating it into build systems in various services and libraries. The report generated provides both the severity of the vulnerability and the likelihood that it matched the vulnerability to the binary. So far the likelihood has always been correct in determining whether the binary matches the one indicated to be vulnerable, but I’m still not confident in that, since most of my repos were using similar dependency sets and thus generating similar reports.
Taking an afternoon to set up the scanner, run it against a variety of repos you work with, and look through the findings seems highly worthwhile. If you’ve been keeping up with your library upgrades you likely won’t find anything earth-shattering. But if you have a dependency that isn’t keeping up to date, then you might be in for an interesting surprise.
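The core idea, walking the full dependency graph and cross-referencing it against a list of known-vulnerable libraries, can be sketched in a few lines. This is not how the OWASP tool is implemented; the graph shape, the function names, and the library names in the usage note are all made up for illustration.

```python
from collections import deque


def transitive_dependencies(direct, graph):
    """Return every dependency reachable from the direct ones, breadth-first."""
    seen = set()
    queue = deque(direct)
    while queue:
        dependency = queue.popleft()
        if dependency in seen:
            continue  # already visited, which also guards against cycles
        seen.add(dependency)
        queue.extend(graph.get(dependency, ()))
    return seen


def vulnerable_dependencies(direct, graph, known_vulnerable):
    """Cross-reference the full dependency set against a known-vulnerable set."""
    return sorted(transitive_dependencies(direct, graph) & known_vulnerable)
```

For example, with a graph like `{"web-framework": ["json-parser"], "json-parser": ["string-utils"]}` and only `web-framework` declared directly, a vulnerability in `string-utils` is still reported even though it never appears in the project's own manifest.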
Securing DevOps is an introduction to modern software security practices. It both suffers and succeeds from being technology- and tool-agnostic. By not picking any particular technology stack it will remain relevant for a long time; however, it is not a complete solution for anyone, since it gives you classes of tools to find but not a complete package for software security. If you need to start a software security program from zero, this lays out a framework to get started with.
While I’ve only been doing software security full time for a few months now, I feel like the identification of the practices to engage isn’t the hard part, it’s the specifics of the implementation where I feel I want additional guidance. I know I should be doing static analysis of the code as part of my CI pipeline, but I don’t know how to handle false positives in the pipeline or what is worth failing a build because of. I don’t know what sort of custom rules I should be implementing in the scanner for my technology stack.
The book did go into further detail on the subject of setting up a logging pipeline. It describes how to set up rules to look for logins from abnormal geographic locations and how to look for abnormal query strings. The described logging platform is nothing abnormal for a midsized web application; however, I don’t know if a small organization could have this level of infrastructure set up. Hooking up the ELK stack, while open source, is not easy, and the Kibana portion requires a fair bit of customization and time to get everything together and working.
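A heavily simplified version of the geographic rule might look like the sketch below. It does not reproduce the book's rules: it assumes login events have already been reduced to `(user, country)` pairs in chronological order, and it flags any login from a country that user has not logged in from before.

```python
from collections import defaultdict


def flag_unusual_logins(events):
    """Flag logins from a country the user has not been seen in before.

    `events` is an iterable of (user, country) pairs in chronological order.
    A user's very first login is treated as the baseline and never flagged.
    """
    seen = defaultdict(set)  # user -> countries observed so far
    flagged = []
    for user, country in events:
        if seen[user] and country not in seen[user]:
            flagged.append((user, country))
        seen[user].add(country)
    return flagged
```

Even this toy version hints at the setup cost: it depends on reliable geolocation of source addresses and on per-user history being retained somewhere the rule can query.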
It feels as though we are missing a higher level of abstraction for dealing with these concerns. Perhaps, the idea that most software applications should have to go through this level of effort to get ‘standard’ security setup for a web application is reasonable. Even on the commercial tools side there seems to be a lack of complete solutions. Security information and event management (SIEM) tools try to provide this, but they each still require significant setup to get your logs in and teach the program how to interpret them. It feels like some of this could be accomplished by building more value in a web application firewall (WAF). WAFs were not fully endorsed by the book due to the author having had a bad experience with a bad configuration problem. Personally, I think a WAF seems necessary to protect against distributed denial of service style attacks.
Overall the book is an introductory to intermediate text, not the advanced practices I was looking for. If you’re bootstrapping an application security program this seems like a reasonable place to get started. If you’re trying to find new tactics for your established program, then you’ll probably be disappointed.
After fighting a bout of burnout earlier this year I got back into writing the blog from May to September. Then I just ran out of steam again after I had transitioned into a role involved in application security. I found this made it harder to write about what I was doing day-to-day, due to a need to maintain some confidentiality around the sorts of problems and challenges we were facing which could manifest as security vulnerabilities. So I wrote some posts about the work I was doing, keeping to the high-level and abstract parts of the situation, none of which turned out particularly engaging.
I don’t think it was burnout again; I was still engaged through the majority of the time off from writing. I still did plenty of reading in the gap, as opposed to the prior block of burnout where I just didn’t have the interest in doing anything software-related. There was a two-fold problem: I was busier with work-adjacent things trying to bootstrap my own knowledge of the field, and the kinds of things that might inspire a blog post all had to go through that internal filter. I tried to write about some of the stuff I was learning, but for a lot of it, I found it hard to identify where the interesting portions were and where the rote everyday parts were.
While I’ve managed to write a fair bit about the nuts and bolts of learning functional programming, a lot of that is me trying to describe difficult concepts in a different way. The security knowledge I’ve been developing lately is more about understanding the terminology than the details of any particular application of the knowledge. Getting to the point where I could converse with someone and keep up with the terminology of the basic parts of the conversation wasn’t easy.
I started listening to the Security Now podcast just to get a consistent outside perspective. Despite having a more generic information security focus, it was still good to get into the right mindset to look at a vulnerability turned up by a static analysis tool. It also made sure I didn’t fall into only learning internal lingo for this, and brought concepts and terms to me without me having to go to them first. Listening to the first few episodes I probably spent more time doing background research to understand what I was hearing than actually listening to the content itself.
I want to get back into the habit of writing posts again. I don’t know if I’ll have the topics for the once a week cadence I was doing. I’ve got ~12 topics I generated over the 11 weeks I missed so it seems like I should. Some of the topics required the hindsight of having integrated into this new community to be able to see what’s worth talking about. I also had three topics from before that I’m not sure what to do with. While I may not be able to convert all of these ideas into posts, I’m going to post once a week again and try not to force too many topics that don’t fit.
I’ve been arguing with myself about the proper way to split up some code that has related concerns. The code in question relates to fetching secrets and doing encryption. The domains are clearly related, but the libraries aren’t necessarily coupled. The encryption library needs secrets, but secrets are simple enough to pass across in an unstructured fashion.
As I mentioned before, we are integrating Vault into our stack. We are planning on using Vault to store secrets. We are also going to be using their Transit Encryption Engine to do Envelope Encryption. The work to set up the Envelope Encryption requires a real relationship between the encryption code and Vault.
There are a couple of options for how to structure all of this. There are also questions of binary compatibility with the existing artifacts, but that’s bigger than this post. The obvious components are configuring and authenticating the connection to Vault, the code to fetch and manage secrets, the API for consuming secrets, and the code to do encryption. I’m going to end up with three or four binaries: encryption, secrets, secret API, and maybe a separate Vault client.
That would be the obvious solution, but the question of what the Vault client exposes is complex, given that the APIs being used by the encryption and secrets are very different. It could expose a fairly general API that is essentially for making REST calls and leaves parsing the responses to the two libraries, which isn’t ideal. The Vault client could be a toolkit for building a client instead of a full client. That would allow the security concerns to be encapsulated in the toolkit, but allow each library to build their own query components.
Since the authentication portion of the toolkit would get exposed through the public APIs of the encryption and secret libraries, that feels like a messy API to me and I’d like to do better. It seems like there should be an API where the authentication concerns are entirely wrapped up in the client toolkit. I could use configuration options to avoid exposing any actual types, but that’s just hiding the problem behind a bunch of strings and makes the options less self-documenting.
Like most design concerns there isn’t a real right answer. There are multiple different concerns at odds with each other: here, code duplication vs encapsulation vs discoverable APIs. In this case code duplication and encapsulation are going to win out over discoverable APIs, since the configuration should be set once and then never really changed, as opposed to the other concerns, which can contain the long-term maintenance costs of the library since it will likely be used for a good while to come.
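One possible shape for the toolkit idea is sketched below, with authentication confined to a single class and each library building its own requests on top of it. This is an illustration of the trade-off and not the design that was chosen: every class name, the `Transport` callable, the request paths, and the response fields are hypothetical placeholders and not Vault's real routes or payloads.

```python
from typing import Callable, Dict, Optional

# (method, path, headers, body) -> parsed response. A real transport would
# wrap an HTTP library; here it is just a callable so the sketch stays small.
Transport = Callable[[str, str, Dict[str, str], Optional[dict]], dict]


class VaultToolkit:
    """Owns connection and authentication details; knows nothing about endpoints."""

    def __init__(self, transport: Transport, token_provider: Callable[[], str]):
        self._transport = transport
        self._token_provider = token_provider

    def call(self, method: str, path: str, body: Optional[dict] = None) -> dict:
        # The token is attached here, so the libraries never see or handle it.
        headers = {"X-Vault-Token": self._token_provider()}
        return self._transport(method, path, headers, body)


class SecretsClient:
    """Secrets library: builds its own queries and parses its own responses."""

    def __init__(self, toolkit: VaultToolkit):
        self._toolkit = toolkit

    def fetch(self, name: str) -> str:
        # Placeholder path and response field, not Vault's actual API.
        return self._toolkit.call("GET", f"/secrets/{name}")["value"]


class EncryptionClient:
    """Encryption library: a different set of calls over the same toolkit."""

    def __init__(self, toolkit: VaultToolkit, key_name: str):
        self._toolkit = toolkit
        self._key_name = key_name

    def wrap_data_key(self, data_key_b64: str) -> str:
        # Placeholder path and fields for wrapping a data key.
        body = {"plaintext": data_key_b64}
        return self._toolkit.call("POST", f"/encrypt/{self._key_name}", body)["ciphertext"]
```

Note that the toolkit type still appears in both constructors, so the leak described above is reduced to one opaque type; it does not disappear entirely.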