The new way of testing for Asset-Importer-Lib

Our problem!

After struggling with bugs in assimp for more than 10 years I have to accept the fact: I need to improve our unittest and regression-test strategy:

  • New patches are breaking older behaviour and I do not notice it until a new issue is created by a frustrated user.
  • I do not have a clue about most of the importers. I didn’t implement them and it is really hard to get into the code ( especially when there is no up-to-date spec ).
  • When you do not know the code of a buggy importer it is also hard to start a debugging session to trace an issue down to its root cause.

Yes, these are all signs of legacy code :-)…

So I started to review our old testing-strategy:

  • We have some unittests, but most of the importers are not covered by these tests.
  • We have a great regression test suite ( many kudos to Alexander Gessler ). When you run this app it imports all our models and generates an md5-checksum of their content. This checksum is saved in our git-repo. On each run the md5-checksum is calculated again and the test suite checks it against the stored value; a mismatch causes a failed test. Unfortunately even a new line-break changes the checksum and the test fails. And all you see is that a model is broken. You do not see which part of its importer or data is affected. But some model importers have more than 1000 lines of code …
  • This regression test suite was part of our CI-service. Without a successful regression-run it was not possible to merge new pull-requests into our master-branch. In other words: we were blocked!
  • We are doing manual tests …
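
To illustrate why a pure checksum comparison is so brittle, here is a minimal sketch. It is not the code of our regression test suite: the FNV-1a hash is only a stand-in for md5, and the names checksum and matchesReference are made up for this example.

```cpp
#include <cstdint>
#include <string>

// FNV-1a, used here only as a stand-in for the md5-checksum of the real suite.
std::uint64_t checksum( const std::string &dump ) {
    std::uint64_t hash = 14695981039346656037ULL; // FNV offset basis
    for ( unsigned char c : dump ) {
        hash ^= c;
        hash *= 1099511628211ULL;                 // FNV prime
    }
    return hash;
}

// Hypothetical regression check: compares the dump of the current import
// with the checksum which was stored for an earlier run.
bool matchesReference( const std::string &currentDump, std::uint64_t storedChecksum ) {
    return checksum( currentDump ) == storedChecksum;
}
```

A single additional line-break in the dump is enough to make matchesReference fail, and the boolean result gives no hint which part of the model has changed.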

So I started to redesign this strategy. A lot of people are spending hours and hours to support us and a lot of software is using Asset-Importer-Lib. So for me as a software-engineer it is my duty to guarantee that we are not breaking all your code with every change we make.

So I will change some things …

The idea

  • Measuring our legacy: how much code is really under test-coverage? I started this last week and the result ( 17% ) is nothing I am proud of.
  • Add one unittest for each Importer / Exporter:
      • For this task I have added a new test-class called AbstractImportExportBase:
    class AbstractImportExportBase : public ::testing::Test {
    public:
        virtual ~AbstractImportExportBase();
        virtual bool importerTest() = 0;
    };
    

    All importer tests will be derived from this class. As you can see, the importerTest-method must be implemented for each new importer test. So you get a starting point when you want to look into a specific importer issue: look for the right unittest. If you can find one: start your investigation. If not: build your own test fixture, derive it from AbstractImportExportBase and make sure that there is an importer test. At the moment this is just a simple Assimp::Importer::ReadFile()-call. But to make sure that the results will not change there is another new class.

  • Introducing SceneDiffer: This class is used to evaluate whether an import-process creates the expected result. You can declare which results you are expecting. So in each unittest the result will be checked against the expected data. When something breaks you will notice it when executing our unit-test-suite. And the best part: you will see which part of the model data has changed.
  • Use Static-Code-Analysis via Coverity: A cron-job will run a Coverity-Analysis once a week to make sure that new commits or pull-requests haven’t introduced too many new issues.
  • Run the Regression-Test-Suite once a week: the same cron-job which triggers the Coverity-run will run the regression test suite. When some files generate different results or just crash you can see it and investigate this model in our unittest-suite. The regression test suite was moved into a separate repo to make it easier to deal with.
  • Run the Performance-Test-Suite once a week: I introduced a new repository with some bigger models. The same cron-job which triggers the static-code-analysis and the regression test suite will trigger this import as well. The idea is to measure the time needed to import a really big file. When the time has increased after a week, someone introduced some buggy code ( for importer code, slow code is buggy code ) and we need to fix it.
  • Release every two weeks: in the last couple of months I got a lot of new issues which were reports of issues already solved on our current master branch. But the released version was outdated ( 2-3 months behind the latest master version ). To avoid this, one release every two weeks could help our users to stay up-to-date without getting all the issues of an experimental branch. The release process shall run automatically. Until now there is no Continuous-Delivery-Service to generate the source release, the binary release for our common platforms and an installer for Windows. Especially the several deliveries to different platforms generated most of our new issues after releasing a new version. So doing this automatically will test our delivery process as well.
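
The weekly performance check from the list above could be sketched like this. It is only an illustration under assumptions: measureImport, isSlowdown and the tolerance in percent are hypothetical names and rules, not part of the performance test suite.

```cpp
#include <chrono>
#include <functional>

// Runs the given import job once and returns the elapsed wall-clock time.
// The job is a placeholder for loading one big model.
std::chrono::milliseconds measureImport( const std::function<void()> &importJob ) {
    const auto start = std::chrono::steady_clock::now();
    importJob();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>( end - start );
}

// Hypothetical rule: report a slowdown when the current run takes longer
// than the previous one by more than tolerancePercent.
bool isSlowdown( std::chrono::milliseconds previous, std::chrono::milliseconds current,
                 int tolerancePercent ) {
    return current.count() * 100 > previous.count() * ( 100 + tolerancePercent );
}
```

The tolerance is there because import times always vary a little between runs; without it every second run would be reported as a regression.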

I already started to implement the unit-tests and the SceneDiffer-class. And I am using our unittests to reproduce new issues. When an issue gets fixed, the test which reproduces it is checked in as well.
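
To give an idea of what such a comparison can report, here is a minimal sketch. It is not the real SceneDiffer: SimpleMesh, SimpleScene and diffScenes are simplified, hypothetical stand-ins, and a real scene contains much more data than names and counters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for the imported data.
struct SimpleMesh {
    std::string name;
    unsigned int numVertices;
    unsigned int numFaces;
};

struct SimpleScene {
    std::vector<SimpleMesh> meshes;
};

// Compares the imported scene with the expected one and returns one readable
// message per difference. An empty result means that both scenes match.
std::vector<std::string> diffScenes( const SimpleScene &expected, const SimpleScene &actual ) {
    std::vector<std::string> diffs;
    if ( expected.meshes.size() != actual.meshes.size() ) {
        diffs.push_back( "mesh count: expected " + std::to_string( expected.meshes.size() ) +
                         ", got " + std::to_string( actual.meshes.size() ) );
        return diffs;
    }
    for ( std::size_t i = 0; i < expected.meshes.size(); ++i ) {
        const SimpleMesh &e = expected.meshes[ i ];
        const SimpleMesh &a = actual.meshes[ i ];
        const std::string prefix = "mesh " + std::to_string( i ) + ": ";
        if ( e.name != a.name ) {
            diffs.push_back( prefix + "name differs" );
        }
        if ( e.numVertices != a.numVertices ) {
            diffs.push_back( prefix + "vertex count differs" );
        }
        if ( e.numFaces != a.numFaces ) {
            diffs.push_back( prefix + "face count differs" );
        }
    }
    return diffs;
}
```

Compared to a checksum, a failing test now tells you which mesh and which property changed, so you know where to start debugging in the importer.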

Hopefully these things will help you Assimp-users to get a better User-Experience with Asset-Importer-Lib.

Feel free to give me any kind of feedback …

Getting started with a Legacy-Code-Project

Day zero

Imagine the following situation: you are starting a new job and you are looking forward to your bright future. Of course you are planning to use the newest technologies and frameworks. And then you are allowed to take a first look at the source you have to work with. No tests, no spec which fits the source, of course no docs, but a lot of ( angry ) customers who are strongly coupled to this mess. And now you are allowed to work with this kind of … whatever.

We call this Legacy-Code and I guess this situation is a common one: every developer, or at least most of them, will face a situation like this during his/her career. So what can we do to get out of this? I want to show you some basic techniques which will help you.

Accept the fact: this is the code to work on and it currently solves real problems!

No developer is planning to create legacy code. There is always a reason like: we needed to get into business or we would have failed. Or the old developers did not have enough resources to solve all upcoming issues or to develop automatic tests. 10 years ago I faced this situation again and again: nobody wanted to write automatic tests because it costs a lot of time and you need some experience in designing your architecture in a way that it is testable. And there were not so many tools out there in those days.

The code is there for a reason and you need to accept this: this working legacy code ensures that you got the job. So even when it’s hard, try to be polite when reading the code. Someone invested a lot of lifetime to keep it up and running. And hopefully this guy is still in the company and you can ask him some questions.

You can kill him later ;-).

Check, if there is any source-control-management

The first thing you should check is the existence of a Source-Control-Management-Tool like Subversion, Git or Perforce. If there is none: get one, learn how to use it and put all your legacy code under source control! Do it now, do not discuss. If any of the other developers are concerned about using one, install an SCM-tool on your own developer-PC and use it there. I promise: it will save your life some day. One colleague accidentally killed his project-files after 6 weeks of work. He forgot the right name of his backup-folder and removed the wrong one, the one containing the current source. He was trying to save disk-space, although even in those old days disk-space was much cheaper than manpower.

To avoid errors like this: use an SCM-tool.

Check-in all your files!

Now that you have a working SCM-tool, check whether all source-files, scripts and Makefiles are checked in. If not: start doing this. The goal of this task is to get a reproducible build. Work on this until you are able to build from scratch after checking out your product. And when this works, write a small KickStarter-Doc on how to build everything from scratch after a clean checkout. Of course this will not work in the beginning. Of course you will face a lot of issues like a broken build, wrong paths or a different environment. But this is also a sign of legacy code: builds that are not reproducible. Often not all related files, like special Makefiles, are checked in. Or the environment differs between the developer-PCs. And this causes a lot of hard-to-reproduce issues.

Do you know the phrase “It worked on my machine”, said after facing a new bug? Sometimes the developer was right. The issue was caused by different environments on the developer machines ( for instance different compiler version, different IDE, different kernel, different whatever … ).

When you have checked in all your files, try to ensure that everyone is using the same tools: same compiler version, same libs, same IDE, and document this in your KickStarter-Doc. Let other developers try to work with this doc and fix all upcoming issues.

This can slow down the ongoing development tasks. To avoid this you can learn how to work with branches with your SCM-tool ( for instance this doc shows how to do branches in git: https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging ).

More Quality-Assurance on GitHub via SAAS

When you are working on GitHub with your project there are a lot of really handy services which you can use. This kind of software usage is called “Software-As-A-Service”. Why? You can use it via a nice Web-API without having to do all the maintenance work.

For instance when you want to use a Continuous-Integration-Service for your project you can set up a new PC and install Jenkins. Or you just use Travis on GitHub instead.

So I just started to use some more services on GitHub for my projects, in particular for Asset-Importer-Lib ( see https://github.com/assimp/assimp and its dependency https://github.com/kimkulling/openddl-parser.git ) of course:

 

Watch your logs in your unittests!

The idea

Unittests and integration-tests are a great tool to avoid breaking your code. They build a safety-net which helps you when you have to add a new feature or fix a bug in an existing codebase.
But of course there will be situations when a bug occurs which was not covered by your test-suite. One way to get an understanding of what went wrong are logfiles. You can use them to write a protocol of what happened during runtime. When something useful happened ( like creating a new entry in a database ) you can write this information with a timestamp into your protocol. When an error like a full disk occurs, an error-entry will be created. And you can use the log to record some internal states of your application. When the log-entries are well maintained they help you to get a better understanding of what happened during a crash ( and of course what went wrong before ). And you can use them to post warnings like: be careful, this API-call is deprecated.
But do you watch your logs in your unit-tests and integration-tests? Maybe there is interesting information stored in them for a test-fixture which you should take care of as well. For instance when you declare an API-call as deprecated, but this call is still in use in a different sub-system, it would be great to get this as an error in a unittest. The same is true when some kind of warning occurs at some point in your log. We observed stuff like that in production code more than once. To take care of these situations we added a functionality called a log-filter: you can use it to define expected log-entries, like an error which must occur in a test because you want to test the error behaviour. When unexpected entries are there, the test will fail. So you will see in your tests what shall happen and what shall not.

Do some coding

Let’s start with a simple logging service for an application. My basic concept for a logging service looks like this:

  • A logger is some kind of service, so there is only one instance. Normally I build a special kind of singleton to implement it ( yes I know, they are bad, shame on me ). You create it on startup and you destroy it at the end of the application ( last call before exit( 0 ) ).
  • Log entries have different severities, for instance:
      • Debug: Internal state messages for debugging
      • Info: All useful messages
      • Warn: Warnings for the developer like “API-call deprecated”
      • Error: External errors like disk full or a user error
      • Fatal: An internal error has occurred caused by a bug
  • You can register log-streams with the logger; each log-stream writes the protocol to a specific output like a log-file or a window of the frontend.

In code this could look like:

#include <string>

// Interface for all log outputs ( log-file, window, ... )
class AbstractLogStream {
public:
  virtual ~AbstractLogStream();
  virtual void write( const std::string &message ) = 0;
};

class LoggingService {
public:
  // the entry severity
  enum class Severity {
    Debug,
    Info,
    Warn,
    Error,
    Fatal
  };
  static LoggingService &create();
  static void destroy();
  static LoggingService &getInstance();
  // registers a new output stream
  void registerStream( AbstractLogStream *stream );
  void log( Severity sev, const std::string &message,
    const std::string &file, unsigned int line );

private:
  static LoggingService *mInstance;
  LoggingService();
  ~LoggingService();
};

With this simple API you can create and destroy your log service, log messages of different severities and register your own log-streams.
Now we want to observe the entries during our unittests. A simple API to do this could look like:

#include <algorithm>
#include <string>
#include <vector>

class TestLogFilter {
public:
  static TestLogFilter &create();
  static void destroy();
  static TestLogFilter &getInstance();
  void addLogEntry( const std::string &message );
  void registerExpectedEntry( const std::string &message );
  // returns true if at least one written entry was not registered as expected
  bool hasUnexpectedLogs() const {
    for ( const auto &entry : m_messages ) {
      auto it = std::find( m_expectedMessages.begin(), m_expectedMessages.end(), entry );
      if ( it == m_expectedMessages.end() ) {
        return true;
      }
    }
    return false;
  }

private:
  TestLogFilter();
  ~TestLogFilter();

private:
  std::vector<std::string> m_expectedMessages;
  std::vector<std::string> m_messages;
};

// forwards all written log-entries to the filter
class TestLogStream : public AbstractLogStream {
public:
  TestLogStream();
  ~TestLogStream();
  void write( const std::string &message ) override {
    TestLogFilter::getInstance().addLogEntry( message );
  }
};

The filter contains two string-arrays:

  • One contains all expected entries, which are allowed for the unittest
  • The other one contains all log-entries which were written by the TestLogStream during the test-fixture

Let’s try it out

You need to set up your test-filter before running your tests. You can use the registerExpectedEntry-method to add an expected entry during your test-execution.
Most unittest-frameworks support some kind of setup-callback mechanism which is called before executing a bundle of tests. I prefer to use gtest. So you can create the singleton-classes here:

#include <gtest/gtest.h>

class MyTest : public ::testing::Test {
protected:
  void SetUp() override {
    LoggingService::create();
    TestLogFilter::create();
    LoggingService::getInstance().registerStream( new TestLogStream );
    TestLogFilter::getInstance().registerExpectedEntry( "Add your entry here!" );
  }

  void TearDown() override {
    // check the collected entries before the filter gets destroyed
    EXPECT_FALSE( TestLogFilter::getInstance().hasUnexpectedLogs() );
    TestLogFilter::destroy();
    LoggingService::destroy();
  }
};

TEST_F( MyTest, do_a_test ) {
  ...
}

First you need to create the logging-service. In this example only the TestLogStream will be registered. Afterwards we register one expected entry for the test-fixture.
When a test has finished, the TearDown-callback checks whether any unexpected log-entries were written.
So when unexpected entries were detected the test will be marked as a failure. And you can see if you forgot to deal with any new behaviour.

What to do next

You can add more useful stuff like:

  • Add wildcards for expected log entries
  • Make this thread-safe
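
For the wildcard idea a minimal sketch could look like this. The function matchesWildcard is a hypothetical helper and not part of the filter shown above; it only supports '*' as a placeholder for any sequence of characters.

```cpp
#include <cstddef>
#include <string>

// Returns true if text matches pattern, where '*' in the pattern stands for
// any sequence of characters ( including an empty one ). All other
// characters must match exactly.
bool matchesWildcard( const std::string &pattern, const std::string &text ) {
    std::size_t p = 0, t = 0;
    std::size_t starPos = std::string::npos; // position of the last '*' seen
    std::size_t retryPos = 0;                // text position to retry from
    while ( t < text.size() ) {
        if ( p < pattern.size() && pattern[ p ] == '*' ) {
            starPos = p++;
            retryPos = t;
        } else if ( p < pattern.size() && pattern[ p ] == text[ t ] ) {
            ++p;
            ++t;
        } else if ( starPos != std::string::npos ) {
            // Mismatch: let the last '*' swallow one more character.
            p = starPos + 1;
            t = ++retryPos;
        } else {
            return false;
        }
    }
    // Only trailing '*' characters may remain in the pattern.
    while ( p < pattern.size() && pattern[ p ] == '*' ) {
        ++p;
    }
    return p == pattern.size();
}
```

With a helper like this, hasUnexpectedLogs could compare each written entry against the expected patterns instead of using an exact string comparison, so an expected entry like "API-call * is deprecated" would cover all deprecated calls.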

Static code analysis with QtCreator-4.0.0, part 1

The latest version of QtCreator brings an option to run static-code-analysis using Clang. I struggled a lot with the setup of Coverity for Asset-Importer-Lib, so I had some hope that the setup for Clang would be a little bit easier. I wanted to run it on Windows 10 first, then move to Linux. So here is the report of my experiences:
The first thing to do is to get the latest QtCreator-version, which is QtCreator-4.0.0. You can find it here: https://www.qt.io/ide/ .
QtCreator is able to open CMake-based projects. What luck: Asset-Importer-Lib is based on a CMake build. So open it and run the clang-analyser, theoretically.

Unfortunately there is a bug in the clang-analyser when you are using Visual Studio to build it. You can find the corresponding bug here: https://bugreports.qt.io/browse/QTCREATORBUG-16234 . When using VS together with the clang-analyser the executable of clang cannot be started in the correct way. The workaround to get it running is easy: add the folder containing clang in the QtCreator-bin-directory to your environment-variable PATH.
I did it, restarted QtCreator, opened Asset-Importer-Lib, and the clang analysis began to work …

To be continued …

Build Asset Importer Lib for 64bit with Visual Studio from source-repo

If you want to generate a 64bit-build for Asset-Importer-Lib by using the Visual Studio project files generated by CMake please follow these instructions:

  • Make sure that you are using a supported CMake version ( 2.8 or higher at the moment ) and a supported Visual Studio version ( on the current master VS2010 is deprecated )
  • Clone the latest master of Asset-Importer-Lib from GitHub
  • Generate the project files with the command: cmake -G "Visual Studio 14 Win64"
  • Open the project and build the whole project
  • Enjoy the 64-bit-version of your famous Asset-Importer-Lib

This should help you if you are struggling with this feature. We just learned that simply switching the code generation to 64bit does not work.

Asset Importer Lib binaries of the latest build

If you are looking for the latest Asset Importer Lib build: we are using AppVeyor
( check their web-site https://ci.appveyor.com, it’s free for open-source projects )
as the Continuous Integration service for Windows. If the build was successful it
will create an archive containing the DLLs, all executables and the export
libraries for Windows. At the moment we are supporting the following versions:
– Visual Studio 2015
– Visual Studio 2013
– Visual Studio 2012
I am planning to support the MinGW version as well. Unfortunately I first have to
update one file which is much too long for the MinGW-compiler ( thanks to the
guys from the Qt-framework ).

Please use only one statement per assert

Do you know the assert-macro? It is an easy tool for debugging: you can use it to
check if a pointer is a NULL-pointer or if your application is in a proper state
for processing. When this is not the case, it will stop your application
when you are using a debug build; in release mode normally nothing happens.
Depending on your platform this can vary a little bit. For instance the
Qt-framework prints a log-message to stderr if an assert fails while
you are using a release build. So assert is a nice tool to check the
pre-conditions of your function / method. And you will see your application crashing
when a precondition is not fulfilled. Thanks to some preprocessor-magic the
statement itself will be printed to stderr. So when you are writing something
like:

void foo( bar_t *ptr ) {
  assert( NULL != ptr );
  ...
}

and your pointer is a NULL-pointer in your application you will get some info on
stderr like:

assert in line 222, file bla.cpp: assert( NULL != ptr );

Great, you see what is wrong and you can start to fix that bug. But sometimes you
have to check more than one parameter or state:

global_state_t MyState = init;

void foo( bar_t *ptr ) {
  assert( NULL != ptr && MyState == init );
  ...
}

Nice one, your application still breaks and you can still see, what went wrong?
Unfortunately not, you will get a message like:

assert in line 222, file bla.cpp: assert( NULL != ptr && MyState == init );

So what went wrong? You will not be able to understand this at first glance,
because the pointer could be NULL or the state could be wrong or both of the tests
could have failed. You need to dig deeper to understand the error.
For a second developer this gets even more complicated, because he will most likely
not know which error case he should check first, since he didn’t write the code.
So when you have to check more than one state please use more than one assert:

global_state_t MyState = init;

void foo( bar_t *ptr ) {
  assert( NULL != ptr );
  assert( MyState == init );
  ...
}
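
The preprocessor-magic behind the printed statement is the stringification operator #. Here is a minimal sketch to show the effect. DESCRIBE_CHECK is a hypothetical macro made up for this example, not the real assert: it returns the message as a string instead of stopping the application.

```cpp
#include <string>

// #expr turns the checked expression into a string literal. The macro
// returns an empty string if the check passes and a message containing
// the expression text if it fails.
#define DESCRIBE_CHECK( expr ) \
    ( ( expr ) ? std::string() : std::string( "check failed: " #expr ) )
```

For a failed combined check like DESCRIBE_CHECK( ptr != nullptr && state == 1 ) the message contains the whole expression, so it cannot tell you which half was wrong. Two separate checks produce two separate messages, and each one names exactly the condition which failed.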

Thanks!