Compliance auditors around the world are often asked to investigate which license terms apply to a given software component found on the Internet. The most troublesome cases are those where an initial analysis indicates that the code has been released under license terms that are friendly to closed-source development, but a deeper analysis eventually reveals that the applicable license terms should (legally speaking) be GPL.

When does this typically happen?

  • When the author states a single license for the software they have written and disregards the GPL nature of the libraries being used
  • When the software component does not describe the libraries it includes, and these are only exposed later through analysis with code auditing tools
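
Both situations come down to a copyleft license hiding somewhere below the top of the dependency tree. The sketch below is only an illustration of that idea: it assumes you already have a manifest that maps each component to its declared license and direct dependencies. The manifest layout, the component names and the license labels are hypothetical, not the output of any particular auditing tool.

```python
# Licenses treated as copyleft for this illustration (labels are assumptions).
COPYLEFT = frozenset({"GPL-2.0", "GPL-3.0"})


def find_copyleft_paths(manifest, root, copyleft=COPYLEFT):
    """Return every dependency path from root to a copyleft-licensed component.

    manifest maps a component name to (declared_license, [direct dependencies]).
    """
    paths = []
    stack = [(root, [root])]
    while stack:
        name, path = stack.pop()
        license_id, deps = manifest.get(name, ("unknown", []))
        if name != root and license_id in copyleft:
            paths.append(path)
        for dep in deps:
            if dep not in path:  # skip cycles
                stack.append((dep, path + [dep]))
    return paths


# Hypothetical manifest: a BSD application that pulls in a GPL library indirectly.
manifest = {
    "app": ("BSD-3-Clause", ["mapping-lib"]),
    "mapping-lib": ("BSD-3-Clause", ["ui-lib"]),
    "ui-lib": ("GPL-3.0", []),
}

for path in find_copyleft_paths(manifest, "app"):
    print(" -> ".join(path))
```

A path in the output does not settle the licensing question by itself. It only tells you where a human needs to look more closely.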

A notable example is MapFish, a software component released under the closed-source-friendly BSD licensing terms that you can find at http://mapfish.org/. The developer is kind enough to mention the license terms right on the front page and provides some details about the libraries being adopted.

In this particular case, everything seems fine at first glance. The libraries being used are open source too. However, one of these components is ExtJS, which is listed as licensed under LGPL terms. There is nothing wrong with LGPL; it is indeed friendly with the spirit of BSD in general. The issue is that recent editions of this software are no longer released as LGPL. For a few years now, this library has been available only under the GPL version 3 or under a commercial license (http://www.sencha.com/products/extjs/license/) that is sold separately. With GPL in the picture, there is uncertainty and risk that the licensing terms listed for ExtJS are not correct. If the GPL edition is the one in use, the copyleft nature of GPL changes the expected licensing terms for MapFish and, by extension, for our own software developments.

We now have a case of non-obvious GPL on our hands. What do we do?
There are some questions to consider:

  • Which version of ExtJS has been used? The one licensed under LGPL, or the one under GPL?
  • How can we verify which software libraries are really being used there?
  • Do we need to buy any commercial licenses?
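
The first two questions can often be narrowed down by looking at the files that actually ship with the component. The following is a minimal sketch, not a definitive check. It assumes that the bundled JavaScript files carry a header comment such as "Ext JS Library 2.0.2", and it assumes a cutoff version at which the license changed. Both the header pattern and the cutoff (`FIRST_GPL_VERSION`) are assumptions that you should verify against the files in front of you and against the vendor's own license history.

```python
import re
from pathlib import Path

# Assumed header format, e.g. "Ext JS Library 2.0.2"; adjust to the real files.
VERSION_PATTERN = re.compile(r"Ext JS Library (\d+)\.(\d+)(?:\.(\d+))?")

# Assumed first release under GPL terms; verify before relying on it.
FIRST_GPL_VERSION = (2, 1, 0)


def detect_extjs_version(text):
    """Return the version found in text as a (major, minor, patch) tuple, or None."""
    match = VERSION_PATTERN.search(text)
    if match is None:
        return None
    return tuple(int(part) if part else 0 for part in match.groups())


def classify_license(version):
    """Map a detected version to the license family it was presumably released under."""
    if version is None:
        return "unknown"
    if version >= FIRST_GPL_VERSION:
        return "GPL or commercial"
    return "LGPL"


def scan_tree(root):
    """Look at the first 2000 characters of every .js file below root."""
    results = {}
    for path in Path(root).rglob("*.js"):
        with path.open(encoding="utf-8", errors="replace") as handle:
            head = handle.read(2000)
        version = detect_extjs_version(head)
        if version is not None:
            results[str(path)] = (version, classify_license(version))
    return results
```

Calling `scan_tree` on an unpacked copy of the component lists each file with a recognizable header together with the presumed license family. A file without a header, or a minified file with the comment stripped, still needs manual inspection.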

What started as a simple component with a BSD license has now become a compliance investigation that requires much more effort to conclude:

  • Uncertainty about whether the GPL or the LGPL version is being used
  • Unknown applicable licensing terms for the whole software
  • Possible costs of acquiring a license to mitigate the uncertainty
  • Additional cost for the investigation effort that is now required to get conclusive answers

This kind of situation surfaces all too often during compliance investigations. If the initial effort estimate predicted that the analysis would be concluded within a given number of hours, a case like this one causes a large increase in time, and the effort required to produce an answer becomes uncertain.

Knowledge databases come to our rescue. They help everyone save time because the investigation results are stored in a place where we can read them again whenever we stumble on the same component in the future. Makes sense, doesn’t it? This is particularly helpful for active organizations where the same components get reused often.
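
To make the idea concrete, here is a minimal sketch of such a store built on SQLite from the Python standard library. The table layout, the function names and the sample entry are all hypothetical. A real knowledge database would track much more, such as who performed the analysis, when, and on which evidence.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the findings store; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS findings (
               component TEXT NOT NULL,
               version TEXT NOT NULL,
               declared_license TEXT,
               concluded_license TEXT,
               notes TEXT,
               PRIMARY KEY (component, version)
           )"""
    )
    return conn


def record_finding(conn, component, version, declared, concluded, notes=""):
    """Store the result of an investigation, replacing any earlier entry."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO findings VALUES (?, ?, ?, ?, ?)",
            (component, version, declared, concluded, notes),
        )


def lookup(conn, component, version):
    """Return (declared, concluded, notes) for a component, or None if never analyzed."""
    return conn.execute(
        "SELECT declared_license, concluded_license, notes "
        "FROM findings WHERE component = ? AND version = ?",
        (component, version),
    ).fetchone()


# Hypothetical entry, for illustration only.
store = open_store()
record_finding(store, "example-lib", "1.0", "BSD", "GPL", "bundles a GPL dependency")
print(lookup(store, "example-lib", "1.0"))
print(lookup(store, "example-lib", "2.0"))
```

Keeping the declared and the concluded license in separate columns preserves exactly the gap discussed in this post: what the front page says versus what the investigation found.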

Either you rely on a database developed in-house, where the results from each component analysis are stored, or you use a professional code auditing database populated by experts. Either option will help considerably. The bottom line is that the quality of compliance analysis and of the answers to these licensing questions is directly related to how much money and time is available for investing in open source compliance.

An affordable (no-cost) solution is to look for information available around the Internet. I’ve quite often used http://olex.openlogic.com and recommend taking a look at it; there you will find a reasonable investigation of some open source components. It is not extensive, but it is useful.

One warning: never take for granted what you find around the web. Different sources will have different results. It is your job to put the pieces together, evaluate your context and come up with a strategy. This is actually one of the reasons why we invest so much effort into open data at TripleCheck: there is no single answer, and the more data we make available, the more complete the picture of what we know about a given software component.

Therefore, it is critically important that you conduct the investigation either by yourself or through someone who is working closely with you and understands the business context.

In my opinion, it is quite efficient when a compliance auditor has a direct communication route with the engineers developing the software. It helps a great deal to understand the context without ambiguity. Yet I’ve also seen cases where management wants to stay relevant and acts as a proxy between the two parties.

If you’re the manager, don’t be that guy:

  • Get yourself truly involved in understanding what is happening
  • Let communication happen between all parties
  • Remember that isolation keeps each party from understanding the big picture and what really matters

In the end, be sure to double-check what is being accepted into your library of trusted components. As you have seen, what started as a harmless component under BSD terms later revealed a risky dependency that would have gone unnoticed until someone publicly pointed out a possible non-compliance situation.

TL;DR: No auditing tool or database is perfect, nor does any fit your context perfectly. Don’t disconnect your brain from the compliance analysis, and be careful to look deeper at the libraries used by your libraries.