Serverless Architecture

The serverless architecture is an architectural pattern where you create a function to be run when an event occurs. This event can be a message being placed in a queue, or an email being received, or a call to a web endpoint. The idea is that you don’t need to think about how the code is being hosted which allows scale out to happen without you being involved. This is great for high availability, cost control, and reducing operational burdens.

The serverless architecture seems to result in setting up a number of integration databases. Imagine the route /foos/:id – there is a get and a put available. Hosting these in a serverless fashion means that you’ve got two independent pieces of code interacting with a single database. In my mind this isn’t significantly different from having two services interacting with the same database.

I went looking around for anyone else discussing this seemingly obvious problem, and found this aside from Martin Fowler comparing them to stored procedures. The stored procedure comparison seems apt since most endpoints seem to me to be wrappers around database calls, your basic CRUD stuff. These endpoints, just like most stored procedures, are a fairly simple query. Stored procedures started out as a good way to isolate your SQL to a particular layer, and enable you to change the underlying design of the database. If you’ve never worked in a large codebase with stored procedures they can evolve in lots of negative ways. You can end up with stored procedures that have business logic in the database, stored procedures that evolved and have 12 arguments (some of which were just ignored and only there for backwards compatibility), or six variations of stored procedure but it being unclear what the differences are. These are all downsides to stored procedures I have seen in real code bases.

I can imagine the serverless architecture equivalent of these problems, in that you would have an endpoint querying some database and be unaware of that shared database eventually the shared database becomes an impediment to development progress. Serverless architecture may still be interesting for a truly stateless endpoint. But, even then you could end up with the put and get endpoints not being in sync as to what the resource looks like. There isn’t a good way yet to package serverless applications. The idea of orchestrating an entire applications worth of endpoints seems a daunting task. Using something like GraphQL that radically limits the number of endpoints being exposed would simplify the deployment orchestration. While GraphQL has had a significant adoption it isn’t the solution for every problem.

Given these issues I don’t see the appeal of the serverless architecture for general application development and using it to back rest endpoints. For pulling from a queue, processing emails, or processing other event sources and interacting with web services it seems a good solution since there would be a single input source and it just produces additional events or other web service calls.

The serverless architecture for standard web apps seems like a choice that could result in a mountain of technical debt going forward. This is especially likely because the serverless architecture is most attractive to the smallest organizations due to the potential hosting savings. Those organizations are the least capable of dealing with the complexity. These are the same organizations that are prone to monolithic architectures or to integration databases since they have a short-term view of events and limited resources to deal with the issues.

I don’t know what the eventual fallout from applications built with this architecture will be but, I do suspect it will employ a lot of software engineers for many years to dig applications out of the pain inflicted.

Advertisements

Book Chat: The Pragmatic Programmer

For a long time this had been on my list of books to buy and read with a “note to self” saying to check if there was a copy of it somewhere on my bookshelf before buying one. It felt like a book I had read at some point years ago, but that I didn’t really remember anymore. Even the woodworking plane on the cover felt familiar. It felt like it was full of ideas about creating software that you love when you encounter them but are disappointingly sparse in practice. Despite being from the year 2000 it still contains a wealth of great advice on the craft of creating software.

Since it is about the craft of software, not any specific technologies or tools or styles, it aged much better than other books. That timeless quality makes the book like a great piece of hardwood furniture, it may wear a little but it develops that patina that says these are the ideas that really matter. There is an entire chapter devoted to mastering the basic tools of the trade: your editors and debuggers, as well as the suite of command line tools available to help deal with basic automation tasks. While we’ve developed a number of specialized tools to do a lot of these tasks it is valuable to remember than you don’t need to break out a really big tool to accomplish a small but valuable task.

It’s all about the fundamentals, and mastering these sorts of skills will transfer across domains and technical stacks. It was popular enough that is spawned an entire series of books – The Pragmatic Bookshelf – and while I have only written about one of them I have read a few more and they’ve all been informative.

About two-thirds of the way through the book I realized that I had indeed read it before – I had borrowed a copy of it from a coworker at my second job. He had recommended it to me as a source he had learned a lot from. I remember having enjoyed it a lot but not really appreciating the timeless quality. Probably since that would have been around 2007, it wouldn’t have seemed as old, especially since things seemed to be moving less quickly then. Maybe I just feel that way since I didn’t know enough of the old stuff to see it changing.

If you haven’t read it, go do it.

tumblr_inline_o2aushqfpx1slrvm0_1280

Session Fun

I’ve been working on getting a new session infrastructure set up for the web application I’m working on. We ended up going with a stateful session stored in mongoDB along with some endpoints to query the session with. This design has a couple of nice aspects – all of the logic about if a session is active or not can live inside of one specific service and a session can be terminated if needed.

Building a session infrastructure is a fairly common activity, but we are building a session that is used by multiple services and we can’t roll all of them out simultaneously. So we’ve been building a setup that can process both the new session and the old session as a way to have a backwards compatible intermediate step. This is creating some interesting flow issues. There are two specific issues I wanted to discuss: (1) how to maintain the activity of the new session during backwards compatibility and (2) processing of identity federation.

Maintaining the new session from applications that have not been updated is nuanced. Since, by rule, you haven’t made changes to the applications that haven’t been updated, you can’t add any calls. In our case there was a call made from all of the application frontends to a specific service, so we are using that to piggyback keeping the session alive. We looked into a couple of other options but didn’t find anything easy. We considered rolling out a heartbeat to the various application frontends but that would require an extra round of updates, testing, and deploys for code that was likely to all be ripped out when we were done.

The federation flow is extra complex because a federation in the new scheme is not that different from some of the session passing semantics under the old session scheme. This ends up mixing together the case where there is just a federation occurring and the case where the new session has timed out and the old session is still valid. This creates an awkward compromise; you’d like to be able to say that if the new session has timed out the entire session is expired, but if you can’t tell the difference between the two cases that’s not possible. This means that the new session expiration can’t be any longer than the old session expiration. But it also solved the problem with maintaining the new session while in applications that haven’t been updated yet.

The one problem solved the other problem which was a nice little win.

Book Chat: AWS in Action

Amazon Web Services in Action is a great introduction to the basics of AWS. It mostly discusses IaaS services, but touches on some of the PaaS services too. It also covers a good portion of what’s in the Architecting on AWS course if you were considering that. I was hoping for coverage of AWS Lambda, API Gateway and EC2 Container Service but they weren’t included. There were references to when to use AWS Cloudfront but it was never introduced like the other services were.

It’s a very quick read, even though it weighs in about 400 pages. There are lots of detailed examples including specific information about if the example was covered under the free tier of services and how to be sure to roll back everything. Lots of screenshots of consoles and the CLI interface and plenty of code samples of using the various SDKs, mostly in node.

If you are already familiar with AWS this probably isn’t the right book for you. It doesn’t cover the architecture aspect in depth, just simple examples of how to combine the various services. For my taste it doesn’t sufficiently cover how to decide when to use a PaaS solution vs rolling your own at the IaaS level. There is one little chart in the portion covering Elastic Beanstalk about the benefits of it vs other less managed options.

Overall it wasn’t the book I was looking for. But it is the sort of thing that can be helpful to lots of people who want to try and understand how to apply their existing architectural knowledge to the AWS platform.

Book Chat: Zero Bugs and Program Faster

Zero Bugs and Program Faster by Kate Thompson is a book that’s hard to describe. None of it is a really novel way of looking at creating software but it’s all of those things that you would expect to describe when you think about how to do programming well. It’s a breezy and fun read that is divided into enough small sections that you can read it in however much time you have available.

The book is structured in two parts. The first part is a series of short vignettes about programming. Some of which are more direct, like the chapter on ACID; some are more abstract, like the chapter entitled “The Many Sides of the Elephant.” I appreciated the dual chapters of “Do It Now” and “Do It Later” that are about how you can’t always do it now but you shouldn’t always defer it either. None of it was a mind shattering revelation but it was all solid advice about programming.

The second part is extracts from various programs to demonstrate a lot of different ideas. The code samples in the second part were generally significantly older, mostly in assembly or C. The low level nature of the examples made it more difficult for me to appreciate. Seeing Altair assembly from the 70s that’s notable for being clever and concise won’t help me build a better web service today.

If you are the sort of person who is reading lots of programming books, you will appreciate the book, however you may not get much from it. If you aren’t the kind of person who reads lots of programming books some of the more oblique points may be obscured. I don’t have anything bad to say about it but don’t know who I would recommend the book to.

Tagging in Macwire and Type Currying

We’ve been using the Macwire dependency injection framework and I recently got to use the tagging functionality for the first time. This was a great feature that enabled me to instantiate multiple instances of the same type and have the framework differentiate between the instances when choosing which to inject.

First, you define a trait to be the marker; it doesn’t need anything in it. Then, at the injection site it becomes

BaseType @@ Marker

When setting up the initial site, instead of being

wire[BaseType]

it becomes

wire[BaseType].taggedWith[Marker]

That’s all there is to the base case. You can chain multiple tags in a couple of different ways.

wire[BaseType].taggedWith[Marker].andTaggedWith[OtherMarker]

or

wire[BaseType].taggedWith[Marker with OtherMarker]

 

I combined all of this with another neat construction of the type system.

trait Foo[Marker](dependency:Bar @@ Marker)

Once I set up the Foo type like this it lets me take a marked dependency in a generic way.

I would like to see some sort of syntax like

preference[Marker].wire[BaseType]

So I don’t need to make changes to the type being instantiated in order to control the preference of what instance gets supplied. There is also the module syntax, where you can create a module for each set and then nest those modules to get the injections as desired; but I don’t want to have to organize my code in a specific way in order to have it work correctly.

All of this had me thinking about currying/partial application and the application of generic types. You can take a type with multiple generic parameters and apply some of them and leave the rest for later. It’s not something that happens often but it came to mind with the way I was using generic types there and it was an interesting insight into the symmetry between types and data.

Agile Architecture

I’ve been involved in a set of architecture discussions recently and it had me thinking about the role of architecture in an agile development team. The specific discussion mostly was around which services owned what data and which services were confederates using that data. My team owned all of the services in question and felt that the data should be centralized and the edge pieces should query the central store. There were some other interested parties that felt that the data should be distributed across the edge pieces which would coordinate amongst themselves and had a central proxy for outside services to query through. Both options had some pros and had some cons.

I had proposed some changes to the initial centralization plan to account for some issues raised by the other parties. Then something unexpected happened – they said that’s great but you need to do it our way anyway. This had me stunned at first. This didn’t change the way that they would use the resulting system so why was it their decision to make? Sure, they had higher titles but that isn’t a license to make architectural decisions universally. I hadn’t been in a situation like this before and wasn’t quite sure what to do with it to convince everyone involved that my proposal was the best option for the organization as a whole..

Fortunately the exchange was over email so I took a while to regroup and collect some opinions about what to do. I asked a few different people and got a couple of specific pieces of advice. First, that I should set a meeting with the all of the sets of stakeholders, so they would remember that there were more people involved. Second, that I should prep a full written description of the various options in advance. The goal of setting a meeting was to get a concentrated block of attention from the other parties rather than a series of ad hoc email exchanges. The written description  was to make it clear what the full state was, rather than just a series of deltas from the original plan. That would make it clearer what was going on. Bringing together all of the stakeholders meant that those opposed to it for reasonable but team-specific reasons would see the other stakeholders and hopefully see their perspective.

All of this went down over two sprints where we had set aside time to sort out this architecture for our next big project. Defining these tasks was tricky; the first sprint had tickets to handle figuring out all of the requirements and speccing several different options. The requirements weren’t put together by product since this was not an end user-facing feature. The second sprint was to gain consensus on the various options, which brings us back to the anecdote above.

The two points came together and once we had their concentrated attention and the complete description of the plan they came around to our centralization plan, with modifications.