Logging in Java - with users in mind!

Logging in a software application is often considered as a non-functional requirement. I suspect that categorizing logging as 'non-functionality' and assuming it will NOT have any user impact, often lead to hasty and premature requirements handling.

The result might be focusing too soon on implementation details such as which logging framework to use and how to inject logging statements in the code instead of first identifying users and their needs.

In this article I will identify user needs regarding logging, map those needs to specific requirements and propose a solution implemented with one of the more popular Java logging frameworks, SLF4J with Logback.

Some basic knowledge of Java logging frameworks are required to get the most out of this article.

1. Problem analysis

"As a system owner, I want my system to audit log important events, e.g. user creation, to have key business information available. Logging should not affect user experience in any negative way."

"As a system operator, I want my system to log precise information about errors affecting the systems ability to function properly and what to do to correct them. I also want the log to contain as little as possible about normal activity in the system, everything else is just noise."

"As a system developer/tester, I want to configure the system to do trace/debug logging, so I can verify that my code is behaving as expected, at an early stage of development. In production it would be nice if the log could provide as much information as possible about what happens in the system before/during an error occurs, so I can locate the bug causing the error."

The user stories above describes some common needs for application logging. The two first user stories are for production environment only, the developer/tester user story is for both development, test and production environments.

The users needs can be mapped into a set of specific (cross) functional requirements. The requirements are in some ways conflicting and can be summarized as the table below shows:

REQUIREMENT WHO NEEDS IT 
1 Audit log important events Owner
2 Maximize performance (i.e. minimize logging) Owner
Give precise error information for correcting problems Operator/developer/tester
Minimize logging (no noise) during normal operation Operator
Maximize logging during development Developer/tester
Maximize logging (make noise) when in trouble, not otherwise Developer/tester

 

2. Solution

How do you solve all requirements for the different users? A common approach is to add logging statements in the code, evenly distributed over a set of logging levels, with increasing criticality. But that approach will not cover all the different requirements at the same time! An evenly distribution of logging statements will make a lot of unnecessary noise or too little information in the logs, depending on the actual situation. You have to carefully consider WHEN to use WHAT logging level. This is my recommendation:

ERROR  Only errors so critical that the application will malfunction 
WARN All other errors or unexpected events
INFO Only lifecycle events
DEBUG  All other events (which might be worth logging)
TRACE (Optionally) all method invocations with parameter and return values

The code below shows an example of using this approach, using SLF4J API with Logback. Here are the the prerequisites you need to run the example:

If a class named UserManager is implemented like this:

private static final Logger LOGGER = LoggerFactory.getLogger(UserManager.class);
private static final Marker ENTER = MarkerFactory.getMarker("ENTER");
private static final Marker LEAVE = MarkerFactory.getMarker("LEAVE");
private static final Marker AUDIT = MarkerFactory.getMarker("AUDIT");

public static final UserManager INSTANCE = new UserManager();
private final Database database;

private UserManager() {
    LOGGER.trace(ENTER, "constructor");
    try {
        LOGGER.debug("Start creating database object");
        database = new Database("");
        LOGGER.debug("Finished creating database object");
    } catch (Exception e) {
        LOGGER.error("Failed to initialize database, this might be due to an configuration or network error", e);
        throw e;
    }
    // Lifecycle information - happens only once in this JVM - use INFO level for logging
    LOGGER.info("Initialized user manager '{}'", this.toString());
    LOGGER.trace(LEAVE, "constructor");
}
public void registerUser(String name) {
    LOGGER.trace(ENTER, "registerUser: {}", name);
    try {
        boolean userCreated = database.create(name);
        if (userCreated) {
            LOGGER.debug(AUDIT, "Successfully created user '{}'", name);
        } else {
            LOGGER.warn(AUDIT, "Failed to created user '{}'", name);
        }
        } catch (Exception e) {
            LOGGER.warn(AUDIT, "Failed to create user '{}'", name, e);
        }
        LOGGER.trace(LEAVE, "registerUser");
    }

And then you invoke #registerUser like this:

UserManager.INSTANCE.registerUser("John Doe");

Scenario 1 - When you run this code with INFO as configured level for the root logger you will get following output on the console:

INFO UserManager - Initialized user manager 'no.kantega.texts.examples.UserManager@68558426'
AUDIT DEBUG UserManager - Successfully created user 'John Doe'

Scenario 2 - If an error occurs during database initialization you will get this:

ERROR UserManager - Failed to initialize database, this might be due to an configuration or network error
java.lang.RuntimeException: Failed to create connection to database
at no.kantega.texts.examples.UserManager$Database.(UserManager.java:52) ~[classes/:na]
at no.kantega.texts.examples.UserManager$Database.(UserManager.java:46) ~[classes/:na]
at no.kantega.texts.examples.UserManager.(UserManager.java:36) [classes/:na]
at no.kantega.texts.examples.UserManager.(UserManager.java:15) [classes/:na]
at no.kantega.texts.examples.LoggingExample.main(LoggingExample.java:5) [classes/:na]

Scenario 3 - And if an error occurs in database during user creation (because the user already exists), this might be the output:

INFO UserManager - Initialized user manager 'no.kantega.texts.examples.UserManager@68558426'
AUDIT WARN UserManager - Failed to create user 'John Doe'
java.lang.RuntimeException: Failed to execute SQL for inserting user 'John Doe', user already exists
at no.kantega.texts.examples.UserManager$Database.create(UserManager.java:65) ~[classes/:na]
at no.kantega.texts.examples.UserManager.registerUser(UserManager.java:21) ~[classes/:na]
at no.kantega.texts.examples.LoggingExample.main(LoggingExample.java:5) [classes/:na]

Scenario 4 - Running the happy day scenario with TRACE level enabled in Logback configuration will get you this:

ENTER TRACE UserManager - constructor
DEBUG UserManager - Start creating database object
ENTER TRACE UserManager$Database - constructor
DEBUG UserManager$Database - Initialized database with connection string ''
DEBUG UserManager - Finished creating database object
LEAVE TRACE UserManager$Database - constructor
INFO UserManager - Initialized user manager 'no.kantega.texts.examples.UserManager@528c0d5d'
LEAVE TRACE UserManager - constructor
ENTER TRACE UserManager - registerUser: John Doe, {}
ENTER TRACE UserManager$Database - create: John Doe
DEBUG UserManager$Database - Start creating user 'John Doe'
DEBUG UserManager$Database - Created user 'John Doe'
LEAVE TRACE UserManager$Database - create
AUDIT DEBUG UserManager - Successfully created user 'John Doe'
LEAVE TRACE UserManager - registerUser

All together these logging outputs satisfy all requirements except #6 - we will see how to deal with that later.

Requirement #1 'Audit log important events' is solved by using the Marker interface in SLF4J and TurboFilter in Logback. Note that both DEBUG and WARN levels are used in combination with the audit logging. This covers ALL attempts to execute important functionality, both failed and successful ones. The successful audit event is logged with DEBUG level only, because this is not an system lifecycle event and does not 'deserve' INFO level. It will anyway stand out in the log output, because of the AUDIT marker. Also note that audit events are always logged, even if logging level in the root logger is set to OFF.

Another consideration to make is whether audit logging should be solved by the logging framework at all, or if audit logging actually is so important that it should be an application feature. It COULD be solved by putting the audit events in a database and providing a proper GUI for the user (system owner). But that is another story...

Requirement #2-5 are solved by using the recommended logging levels at the correct places in the code, in combination with adjusting the configured logging level. Try adjusting the logging level for yourself and see the resulting output.

3. Maximize logging when in trouble, not otherwise

What about requirement #6 'Maximize information (make noise) when in trouble, not otherwise'? One solution is to buffer all the last X logging events and flush them to the log if an logging event with e.g. level ERROR occurs. An example of an log4j appender providing such buffering is PositronLogger (https://github.com/holmbech/PositronLogger). For Logback I would recommend implementing such a buffering component as an Encoder (http://logback.qos.ch/manual/encoders.html), to make it more versatile for use with all types of appenders.

If you will implement such a buffering Encoder you probably wish to be able to configure at least three factors:

Implementation of a buffering Encoder is not included in this article, but you get the idea how it should be working. Assuming we have implemented such a buffering Encoder in the class 'no.kantega.BufferingEncoder' the STDOUT appender in the Logback configuration used in this example will then look like this:

                       
        %-5marker %-5level %logger{0} - %msg%n   
        50                                 
        ERROR            
        INFO                 
                                                          

With such a buffering Encoder in place (NB! In this case TRACE must be configured logging level for the root logger, to be able to buffer ALL events) the logging output for scenario 1, 3 and 4 would be exactly as before, but in scenario 2, when a critical error occurs, the output will contain information from all logging levels for the steps before the actual error occurred:

ENTER TRACE UserManager - constructor
DEBUG UserManager - Start creating database object
ENTER TRACE UserManager$Database - constructor
LEAVE TRACE UserManager$Database - constructor
ERROR UserManager - Failed to initialize database, this might be due to an configuration or network error
java.lang.RuntimeException: Failed to create connection to database
at no.kantega.texts.examples.UserManager$Database.(UserManager.java:52) ~[classes/:na]
at no.kantega.texts.examples.UserManager$Database.(UserManager.java:46) ~[classes/:na]
at no.kantega.texts.examples.UserManager.(UserManager.java:36) [classes/:na]
at no.kantega.texts.examples.UserManager.(UserManager.java:15) [classes/:na]
at no.kantega.texts.examples.LoggingExample.main(LoggingExample.java:5) [classes/:na]

 

4. Wrap-up

One of the main goals with logging in a production environment is to have buggy code make a lot of 'noise' in your log, and quality code to be 'quiet'. This could be achieved by:

Happy logging!

Litt om forfatteren

HåvardKB

Håvard Kvam Buvik

Håvard har jobbet i Kantega siden 2003 og er opptatt av mange aspekter ved programvareutvikling: Fra kravhåndtering og gode løsninger til effektiv arbeidsmetodikk og koding.
comments powered by Disqus