Facets of Logging

I was debugging a ‘fun’ issue recently – a test that failed when run through the CI test runner but passed when run through Visual Studio, even on the same machine. Solving it meant giving up on the interactive debugger and falling back to old-school print statements. As I added yet more print statements to narrow in on the problem, I started thinking about the kind of information I was logging, and realized there were two basic kinds of information being logged, as well as a third kind of information created by the logs as a whole.

The first kind of information is about where execution is in the program. Did it execute the if or the else? Did execution reach line 68? The simplest, most rudimentary logging statements fall into this category: they are essentially a fixed string output at some point in the program. A stack trace is a special sort of log data, since it not only tells you where you are, it also tells you something about how you got there. These types of logs still have lots of value if you have no other insight into the activity of the program, but they are generally a smaller component of more valuable sorts of logs.

The second kind of information is program state: x=3, etc. This is fairly straightforward, since each variable has some value at the point it is logged. Under most circumstances this subsumes the first kind of logging, but if the program has multiple places that could log the same output, then it doesn’t give you full information on “where.”

The third kind of information is the programmatic flow: A, then B, then C. It isn’t a single log statement but the combination of them in order. At first “ordered” seems like a given for a log – A happened, then B happened, then C happened – but in a multithreaded world A may have chronologically happened after B yet been logged first. Knowing which thread or request a log line came from is valuable, as it helps untangle the mess you are getting into. If the operations were truly working in parallel and not communicating with each other, each one can be read as its own independent log.

In my case I managed to track the behavior of the application via the logs, using thread IDs. They showed that most of the threads were processing correctly, but one of them was failing to find a configuration file it needed. Without the thread IDs, the interleaving of the messages in the log would have made it much harder to figure out which thread was failing and trace back the missing resource.
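A minimal sketch of the idea, in JavaScript purely for illustration (the original test ran under .NET, and the worker IDs and messages here are made up): tag every log line with the ID of the thread or request that wrote it, and the interleaved output can be untangled back into independent stories afterwards.

```javascript
// Collect log lines, each tagged with the ID of the worker that wrote it.
const lines = [];
const log = (id, msg) => lines.push(`[worker-${id}] ${msg}`);

// Two workers whose output interleaves chronologically.
log(1, "loading configuration");
log(2, "loading configuration");
log(2, "configuration found");
log(1, "configuration NOT found"); // the failing worker stands out

// With the tags, each worker's independent story can be reconstructed.
const story = id => lines.filter(l => l.startsWith(`[worker-${id}]`));
console.log(story(1).join("\n"));
```

Filtering on the tag turns one tangled log into a separate, ordered log per worker, which is exactly what made the missing-configuration thread visible.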


The virtuous cycle of software change

In the past, programmers built software that we considered good. Today programmers build software that we consider good. Under the hood, though, the two are built differently, even when the requirements are identical. If someone built a new system with the technologies, tools, and designs of 10 years ago, it would be looked at as odd, if not wrong, even if it worked perfectly for the needs of the system. We’ve adopted a dogmatic love of the new, and it drives the obsolescence of software professionals.

We build new tools to deal with the flaws of the old, but just throw away the old stuff without considering the reasons the old tools were considered good to start with. Maybe it’s because there are so many new tools that by the time someone has built something big enough to expose the problems with an old tool, there is already a set of new tools waiting to fix that problem – like the move from SOA to microservices. There is no obligation to even contemplate whether the old tool continues to have value. There are also plenty of examples of technologies that ran into problems in public and developed a reputation (deserved or not) because of it, e.g., the usage of Rails at Twitter. Obviously people are still using older tools, but the tools’ reputations haven’t really recovered from the public hit, and adoption therefore slows.

The focus on newer tools and technologies – and more specifically, tools and technologies you aren’t already using at work – feeds on itself, causing people to want to get experience with new technologies to keep up to date. Some businesses have reacted to this technological churn by trying to slow this sort of resume-driven development, locking down new technology introduction and upgrades behind a change management process. This means that if you, as a programmer, aren’t taking care to maintain your own continuing professional education, you can get left behind. This probably won’t impact your ability to do or keep your current job, but it could impact your ability to get the next job you really want. If you are only getting experience on one stack or set of tools and not bringing in outside ideas, then you will silo yourself. This is the same issue that Michael O. Church wrote about – that most organizations wouldn’t want to hire the people who had 4 years of experience of the calibre most people in their organization get; they want something new and shiny.

If the experience you get at work isn’t good enough to get you the same quality of job, that means you need to go get that experience other ways. The obvious answer is some sort of open source project. There are a couple of other ways – making an app, moonlighting, a toy program, or just some sort of hobby project. This personal drive by passionate programmers intent on expanding their skills has caused an explosion of these sorts of projects, which has had great consequences for the world. As software developers, we have a plethora of tools and libraries to help us accomplish what we want to do (feeding back into the cycle). Projects like the Humanitarian Toolbox (community-created software for disaster response), which nobody would have paid for, help those in need and are a great place to expand your technical horizons.

Graph Databases @ DC .Net

I was at the DC .Net User Group recently and heard a talk by David Makogon on Graph databases, specifically neo4j. I had been interested in graph databases for a while but never had both the time to dig into them and a problem that seemed suited to their use.

For those unfamiliar with graph databases, they are a set of nodes and relationships between pairs of them. This allows certain classes of queries that are difficult to write against a SQL or document database to be written almost trivially. The default package has instructions to set up some demo databases, including a movie database. You could query the demo movie database with:

MATCH (martin:Actor { name:"Martin Sheen" }),
      (michael:Actor { name:"Michael Douglas" }),
      p = shortestPath((martin)-[*..15]-(michael))
RETURN p;

That would find the shortest path through a database of movie data between Martin Sheen and Michael Douglas in the 6 Degrees of Kevin Bacon sense. It is that easy in Cypher, neo4j’s query language.

Learning things like this is why I love participating in local user groups for technologies and topics that I’m interested in. I use Meetup to find my groups, but there are other options. Find some local user groups for things you are working with or interested in and go hear a talk. You don’t need to be a regular – just go see what’s going on; nobody will bite.

Why is the JavaScript ecosystem evolving so rapidly?

The JavaScript ecosystem has progressed significantly faster than most other programming ecosystems over the last decade. The number of frameworks and tools that rose in popularity and then lost mindshare is remarkable. I want to explore a couple of reasons why JavaScript has evolved faster than other languages or platforms.

There were four different factors that I think came together to start the JavaScript library revolution. First, there was significant churn on the browser side. Second, broadband penetration increased significantly. Third, AJAX became a common way to construct more interactive pages. Finally, an increasingly broad set of server-side technologies for generating web content also rose in popularity. Let me walk you through these factors and how they impacted JavaScript’s evolution.

When Firefox initially started to increase its market share, it put pressure on IE and, to a lesser extent, Safari. This led to a significant proliferation of browser compatibility issues, which in turn led to the development of a series of JavaScript libraries to try to mitigate those problems. The underlying web technologies (XHTML, CSS 2.1, etc.) were also changing at this time, causing people to want to adopt the new functionality. But a slower end-user update cadence for browsers meant developers had to handle a much wider variety of cases with feature detection to ensure a site functioned under all supported browsers. This brought about an early generation of libraries to paper over all of the browser differences.
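The feature-detection pattern those early libraries leaned on looked roughly like this sketch: test for a capability at runtime rather than sniffing the browser, falling back to the legacy pre-IE9 attachEvent API when the standard one is missing. The function name here is illustrative, not from any particular library.

```javascript
// Attach an event handler using whichever API the current browser supports.
// Checking for the function's existence (feature detection) is what let one
// codebase run across IE, Firefox, and Safari despite their differences.
function addListener(target, type, handler) {
  if (target.addEventListener) {
    target.addEventListener(type, handler, false); // standards browsers
  } else if (target.attachEvent) {
    target.attachEvent("on" + type, handler);      // legacy IE (pre-9)
  } else {
    target["on" + type] = handler;                 // last-resort DOM0 style
  }
}
```

Multiply this by every DOM, event, and XMLHttpRequest quirk of the era and you get the compatibility layers that jQuery and its contemporaries grew out of.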

The rise of broadband and AJAX combined to cause a significant increase in the amount of JavaScript being used. Broadband allowed page sizes to increase without harming load times, and AJAX took advantage of this to fill the larger pages with JavaScript. This increase in hand-rolled JavaScript created room for more libraries responsive to the new needs. It also created a need for more framework-style libraries to guide developers toward newer ways to implement these more complex architectures.
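The hand-rolled AJAX of that era looked roughly like this sketch: fetch a fragment from the server without a full page reload and hand it to a callback. XMLHttpRequest is the real browser API; the function name and URL are made up for illustration.

```javascript
// Request a fragment of content asynchronously and pass it to a callback
// once it arrives, instead of reloading the whole page.
function loadFragment(url, onDone) {
  var xhr = new XMLHttpRequest();
  xhr.onreadystatechange = function () {
    // readyState 4 = request complete; status 200 = OK
    if (xhr.readyState === 4 && xhr.status === 200) {
      onDone(xhr.responseText);
    }
  };
  xhr.open("GET", url, true); // true = asynchronous
  xhr.send();
}
```

Writing this boilerplate by hand on every page, for every browser quirk, is precisely the pain that the next wave of libraries set out to absorb.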

The broadening of server-side programming technologies brought a number of different perspectives to people interacting with JavaScript. PHP programmers would have a different view from Java programmers, and both would have a different view from Perl programmers. Everyone brought their perspective to the development of even more JavaScript libraries and promoted them in their individual communities. This also helped foster a number of different perspectives on what you should do with JavaScript and how to do it.

These factors combined to set the scene for the explosion of different JavaScript tools and frameworks we’ve seen in the past 10 years. The browser wars have also grown increasingly active, with Chrome taking share from IE and Firefox. Browser technology continued to advance, leading to HTML5 and CSS3 adoption. Broadband adoption continued to spread, and speeds picked up. Mobile added a new dimension: creating standards-compliant pages that look good at many different resolutions and screen sizes. AJAX was taken to its conclusion in single-page applications and their associated frameworks. The existing frameworks have also been cross-pollinating ideas and recombining in interesting ways.

All this movement has driven a lot of investment into JavaScript engines. For example, Chrome got a 5x performance increase between 2008 and 2012, and Chrome’s JavaScript engine was considered fast before that happened. The results of all this investment were reaped by node.js, which spurred the creation of even more JavaScript libraries, most of which could be included in the browser too. That, of course, fed more capabilities back into the browser, enabling even more client-side functionality and continuing to feed the JavaScript ecosystem’s growth.

Stacked on top of all of this are the tools to help manage tools, e.g., grunt and gulp for running tasks on the server side, and require.js to help load all of these libraries on your page. These tools, plus minification and linting tools now written in JavaScript themselves, help bring JavaScript into a fully-tooled ecosystem.

I expect the different libraries to continue to recombine and evolve as time progresses. Assuming the underlying browser technologies keep changing incrementally, like ECMAScript 5 to 6, I think we’ve passed the largest period of flux in JavaScript libraries. There has even been a backlash against frameworks as part of the zero-framework movement, which, if its adherents succeed, should slow the growth significantly.

Management Shouldn’t Be The Only Option

Software engineering management is a complex topic. Finding good managers isn’t easy. Finding good software engineers isn’t easy. Taking a good software engineer and turning them into a manager can work, but in my experience it’s rarely successful. The two roles require very different skill sets. Engineers need attention to detail, computer science fundamentals, and specific technical skills. Managers need leadership, communication, collaboration, and project management skills. Even if you were to find an outstanding individual with both skill sets, you don’t want to try to force their career growth and direction – you want to allow them to take their career where they want to.
