Friday, November 01, 2013

WinJSLog - A Logging and Reporting JavaScript Module For Windows 8 Store Apps

Background

Usually developers use console.log() or WinJS.Log() function to log error and debug information. But that only helps during debugging in local machine. How about those users who installed your app and had errors or experienced crashes? Microsoft has Quality reports in Windows 8 app's dashboard if its telemetry service is enabled, but that function is quite limited, and it only collects crash data when the user accepts the so called Customer Experience Improvement Program (CEIP). It would be great to have errors and custom data reported to your own server so you can do some investigation at real-time. In this post I will have some discussion on this topic and present an implementation to log and report crashes, errors, and potentially any info you want for a Windows 8 store app.

Why logging and reporting are important for mobile apps?

Is your Mobile app working nicely in different devices? Are there any unexpected issues happening in different environment? The customer feedback is likely your main source of investigation. But that's not something you as a developer should rely on. Some issues may never surface, and people would just uninstall the app when they are not happy on it. Unless for some royal users, no many people would spend their precious time to write feedback or report to help you to improve your application.

Even you are lucky enough to have a nice user having a polite comment like this: "This's a good app! However sometimes it crashes when I am on editing mode. Please fix it.", you have no idea what's the real problem is. You may end up spending countless time just to reproduce the issue in specific situation.

So you want to monitor your app's running issues in real-time, and you want to get as much info when crash or error occurs. When logging service is on board, you may want more data to analyze the customer usage. Then it becomes an analytics framework and you have another communication channel to your app (carefully not to abuse using it).

All that sound to be essential to maintain the app quality for a distributed mobile app. But surprisingly I couldn't find a public available solution online. So I decided to write my own, and I called it WinJSLog. The goal of the implementation is light, effective, and easy to use.

Why an app crashes?

I have discussed WinJS exceptions in a previous post. Basically the unhandled exception will result in app crash by default. Crashes is the worst user experience ever because the running app is suddenly disappear without any prompt and reason to end user. A better experience would be redirect the user to a landing page. People may get confused too "why I am switched to a different place?", but at least the app maintains the running condition, giving user a chance to do other stuff or retry the previous action.

Unhandled exception in Windows 8 can be caught in Windows.Application.onerror global event. Following code snippet shows how to register an event handler to redirect user to a specific page when unhandled exception occurs:

    Windows.Application.onerror = function (e) {
        showErrorPage();
        return true; // telling WinJS that the exception was handled, don't terminate
    }

What data to collect?

Errors like "Object reference is null" is not so helpful. As a development you want as much info as possible to identify an issue, such as the stack trace which provides more context about the error. The device and environment variance could make some issue super hard to figure out. A problem may only happen in one specific device, in one specific language or in one specific screen size.

My implementation includes the error message detail, as well as following context info:

  • application version
  • language
  • device maker
  • device model
  • screen size
  • suspend and resume history
  • page navigation history
  • method invocation history
Above context information could help us to pin point the issues occurring in certain environment. Ideally the memory usage info should be included but I have not figured out how to get that data from WinJS libraries yet.

What's the data format?

Data are collected as key-value JavaScript objects, and JSON is used to transfer the logged data to different media, like file system or cross the network. JSON was born for such usage and it's a total no-brainer. Following JSON data is an example of crash log sent from a Windows 8 test app running in a VMWare virtual machine:

{
    "os" : "Windows 8",
    "version" : "1.0.0.1",
    "manufacturer" : "VMware, Inc.",
    "model" : "VMware Virtual Platform",
    "lang" : "en-CA",
    "screen" : "1680x1050",
    "orientation" : "landscape",
    "timezone" : "-5", 
    "latitude" : "43.666698",
    "longitude" : "-79.416702",
    "logtime" : "November 1, 2013 9:42:56 AM", 
    "pagetrace" : "home[9:42:27AM]=>user[9:42:41AM]=>test[9:42:48AM]",
    "level" : "crash", 
    "log": [ 
            {
             "source" : "promise", 
             "message" : "'username' is undefined", 
             "description" : "'username' is undefined",
             "stacktrace" : "ReferenceError: 'username' is undefined\n 
                at ready (ms-appx://testApp/pages/test/test.js:85:13)\n 
                at Pages_ready (ms-appx://microsoft.winjs.1.0/js/base.js:4511:29)\n
                at notifySuccess (ms-appx://microsoft.winjs.1.0/js/base.js:1404:21)\n 
                at enter (ms-appx://microsoft.winjs.1.0/js/base.js:1091:21)\n
                at _run (ms-appx://microsoft.winjs.1.0/js/base.js:1307:17)\n
                at _completed (ms-appx://microsoft.winjs.1.0/js/base.js:1275:13)\n
                at notifySuccess (ms-appx://microsoft.winjs.1.0/js/base.js:1404:21)\n
                at enter (ms-appx://microsoft.winjs.1.0/js/base.js:1091:21)\n
                at _run (ms-appx://microsoft.winjs.1.0/js/base.js:1307:17)\n
                at _completed (ms-appx://microsoft.winjs.1.0/js/base.js:1275:13)", 
             "level" : "crash", 
             "time":"2013-11-01T14:42:56.438Z"
           }
        ]
}
In next post I will present a simple logging server implementation to handle such requests.

When to send the logged data? What happen if no Internet access?

Crash or unhandled exception is considered serious issue, and the error detail should be sent to the logging server immediately when it happens. For regular error messages or debugging info we could accumulate them and send them out once current page is completed or a different page is visited. There's also a timer process to check and cleanup the pending logs in background (2 minutes by default and is configurable), so the logged message could be delivered even the user stay on one page for a long time.

The logged data will be stored to file system if Internet service is not available. When the application starts up, it will check if such log file(s) exist, if so the log file(s) will be read and send to logging server if Internet is available. Usually there're many operations during the application booting up. To avoid the action congestion, it's a good idea to defer to run the check-and-send action a little bit later than the application launch (30 seconds by default). If the Internet access is not available, the background process will also periodically recheck the network status (2 minutes by default) and send out the file logs when Internet access becomes available.

Other consideration

You app may target to some specific features such as map and camera stuff, and the location and camera info would be good for you to diagnose the issue. In such case, the location and camera data should be obtained right in the spot when they are used, and store them in a global variable like sessionState, so the logging code can read it directly without a redundant code executed again, which makes the logging service lighter and less intrusive.

What happens if logging server requires authentication? As I discussed in WinJS Authentication, it would be better to allow anonymous access for such general logging service. Hard-coded logging username/password is not safe, and it's not feasible to let user to type the username and password. Would it be a potential DOS/DDOS attach when anonymous access is open? No service in Internet can avoid that, but firewall in front of your servers should take care of that threat. However we can still do a bit more to protect ourself in the code level, such as not logging abnormal data with extremely large size of URL or message.

How about the errors inside logging service? In general any errors or potential errors within logging and reporting code should be silently skipped. Optionally those errors can be sent to the logging server, or send to a different server if errors occur during sending data to logging server.

How to use the code?

The WinJSLog module exposes following functions:

    function fatal(error)                // log crash
    function error(description, error)   // log error
    function warning(error, description) // log warning
    function page(pageName)              // navigate to a page
    function method(methodName)          // invoke a method
    function registerLogging()           // register logging
    function unregisterLogging()         // unregister logging

First include the winjslog.js file inside default.html or your custom page, then register the logging service before app.start():

(function () {
    "use strict";

    WinJS.Binding.optimizeBindingReferences = true;

    var app = WinJS.Application;
    var nav = WinJS.Navigation;

    app.addEventListener("activated", function (args) {
      ...// skip for abreviation
    }); 
   
    app.onerror = function (e) {
        Logging.fatal(e);
        Logging.page("homepage");
        nav.navigate("/pages/home/home.html");
        
        return true; // the app will terminate if false
    };

    Logging.registerLogging("http://myloggingserver/");
    
    app.start();

    ...
In above code an onerror event handler is registered to catch the unhandled exception to avoid the app's crash, log the crash by Logging.fatal(e), then redirect to home page. Note that the action of redirecting to home page is also logged as a navigation path by Logging.page() function. Following code snippet shows how to log a function invocation and its potential error message:
    function doComplicatedTask() {
        Logging.method("doComplicatedTask");
        try {
            ... // job with potential error
        } catch (e) {
            logging.error("Error occur!", e);
        }
    }
Alternatively you can log lower level warning, info and debug message by using Logging.warning(), Logging.info() and logging.debug() functions respectively.

Source code of WinJSLog.js

Update: the source code now is available at github and you can get the latest version there.

(function () {
    "use strict";

    var logs = [], pages = [], methods = []; // log data
    var loggingServer, loggingEnabled, debugEnabled; // service setting
    var version, manufacturer, model; // app context
    var currentLevel = 'debug', levels = { 'debug': 1, 'info': 2, 'warning': 3, 'error': 4, 'crash': 5 };

    // register logging service
    function registerLogging(serverUrl, doDebugging, deferRunInSeconds, recheckInSeconds) {
        if (!serverUrl) {
            throw 'registerLogging error: logging server not defined.';
        }

        loggingEnabled = true;
        loggingServer = serverUrl;
        debugEnabled = doDebugging || false;
        deferRunInSeconds = deferRunInSeconds || 30;
        recheckInSeconds = recheckInSeconds || 60;

        // Register event handler for suspend, resume and relaunch (terminated then restarted)
        WinJS.Application.addEventListener("activated", restoreSessionData);
        Windows.UI.WebUI.WebUIApplication.addEventListener("suspending", suspend);
        Windows.UI.WebUI.WebUIApplication.addEventListener("resuming", resume);

        // defer to run the check-log-file-and-send process
        WinJS.Promise.timeout(deferRunInSeconds * 1000).then(function () {
            cleanupLogs(recheckInSeconds);
        });
    }

    // unregister logging service
    function unregisterLogging() {
        loggingEnabled = false;
        WinJS.Application.removeEventListener("activated", restoreSessionData);
        Windows.UI.WebUI.WebUIApplication.removeEventListener("suspending", suspend);
        Windows.UI.WebUI.WebUIApplication.removeEventListener("resuming", resume);
    }

    // restore logging data after app relaunch from terminated state
    function restoreSessionData(args) {
        if (args && args.detail && args.detail.kind === Windows.ApplicationModel.Activation.ActivationKind.launch) {
            if (args.detail.previousExecutionState === Windows.ApplicationModel.Activation.ApplicationExecutionState.terminated) {
                var sessionState = WinJS.Application.sessionState;
                if (sessionState.logPages)
                    pages = sessionState.logPages;
                if (sessionState.logMethods)
                    methods = sessionState.logMethods;
                if (sessionState.logLogs)
                    logs = sessionState.logLogs;
                if (sessionState.logLevel)
                    currentLevel = sessionState.logLevel;
            }
        }
    }

    // log suspending event and store log data into session
    function suspend() {
        if (loggingEnabled) {
            var pageEntry = { 'time': new Date(), 'page': 'suspending' };
            pages.push(pageEntry);
            var sessionState = WinJS.Application.sessionState;
            sessionState.logPages = pages;
            sessionState.logMethods = methods;
            sessionState.logLogs = logs;
            sessionState.logLevel = currentLevel;
        }
    }

    // log resuming event
    function resume() {
        if (loggingEnabled) {
            var pageEntry = { 'time': new Date(), 'page': 'resuming' };
            pages.push(pageEntry);
        }
    }

    // log an event: navigate to a page
    function pageLog(pageId) {
        if (loggingEnabled) {
            try {
                processMemoryLogs();
            } catch (e) { }
            var pageEntry = { 'time': new Date(), 'page': pageId };
            pages.push(pageEntry);
        }
    }

    // log an event: invoke a method
    function methodLog(methodName) {
        if (loggingEnabled) {
            methods.push(methodName);
        }
    }

    // log a crash; a crash or unhandled exception can be caught by WinJS.Application.onerror event
    function crashLog(err) {
        if (loggingEnabled) {
            setLevel('crash');
            var errWrapper = getErrorObject(err);
            errWrapper.level = "crash";
            errWrapper.time = new Date();
            logs.push(errWrapper);
            try {
                processMemoryLogs();
            } catch (e) { }
        }
    }

    // log an error
    function errorLog(description, err) {
        if (loggingEnabled) {
            setLevel('error');
            logs.push(getLogObject('error', description, err));
        }
    }

    // log a warning message
    function warningLog(description, err) {
        if (loggingEnabled && debugEnabled) {
            setLevel('warning');
            logs.push(getLogObject('warning', description, err));
        }
    }

    // log an info message
    function infoLog(description) {
        if (loggingEnabled && debugEnabled) {
            setLevel('info');
            logs.push(getLogObject('info', description));
        }
    }

    // log a debug message
    function debugLog(description) {
        if (loggingEnabled && debugEnabled) {
            setLevel('debug');
            logs.push(getLogObject('debug', description));
        }
    }

    // build a log object
    function getLogObject(level, description, err) {
        var logObject = getErrorObject(err);
        if (logObject.description) {
            logObject.description = logObject.description + description;
        } else {
            logObject.description = description || '';
        }
        logObject.level = level || 'unknown';
        logObject.time = new Date();
        return logObject;
    }

    // build an error object
    function getErrorObject(err) {
        var errObject = {};
        if (err) {
            if (err.detail && typeof err.detail === 'object') {
                var detail = err.detail;
                if (detail.promise) {
                    errObject.source = "promise";
                }
                if (detail.errorMessage) {
                    errObject.message = detail.errorMessage;
                    if (detail.errorLine)
                        errObject.codeline = detail.errorLine;
                    if (detail.errorUrl)
                        errObject.sourcUrl = detail.errorUrl;
                } else if (detail.error && typeof detail.error === 'object') {
                    errObject.message = detail.error.message || 'unknown';
                    if (detail.error.description)
                        errObject.description = detail.error.description;
                    if (detail.error.stack)
                        errObject.stacktrace = detail.error.stack;
                } else {
                    errObject.message = detail.message || 'unknown';
                    if (detail.description)
                        errObject.description = detail.description;
                    if (detail.number)
                        errObject.codeline = detail.number;
                    if (detail.stack)
                        errObject.stacktrace = detail.stack;
                }
            } else {
                errObject.message = err.message || err.exception || err;
            }
        }
        return errObject;
    }

    // determine the highest log level for current log entry
    function setLevel(level) {
        if (levels[level] > levels[currentLevel]) {
            currentLevel = level;
        }
    }

    // periodically check the memory logs and storage logs, and send logs to server if Internet is available
    function cleanupLogs(recheckInseonds) {
        if (loggingEnabled) {
            processMemoryLogs();
            processFileLogs();
            setTimeout(function () {
                cleanupLogs(recheckInseonds);
            }, recheckInseonds * 1000);
        }
    }

    // construct log message and send to server if Internet is available, otherwise save it to local storage
    function processMemoryLogs() {
        if (logs.length > 0) {
            var data = getContext();
            var date = new Date();
            data.logtime = date.toLocaleString() + ' [' + date.toISOString() + ']';
            if (pages.length > 0) {
                var pagetrace = pages.map(function (item) {
                    if (item.time && item.time.toLocaleTimeString)
                        return item.page + "[" + item.time.toLocaleTimeString().replace(' ', '') + ']';
                    else
                        return item.page + "[" + item.time + ']';
                }).join(' => ');
                data.pagetrace = pagetrace;
            }
            if (methods.length > 0) {
                data.methodtrace = methods.join(' => ');
            }
            data.level = currentLevel;
            data.log = logs.slice(0); //(logs.length == 1) ? logs[0] : logs.slice(0);

            if (isConnectedToInternet()) {
                sendLogsToServer(JSON.stringify(data));
            } else {
                saveLogsToFile(data);
            }
        }

        // clean up the logs
        methods = [];
        logs = [];
        currentLevel = 'debug';
    }

    // read all saved log files and send them to server if Internet is available
    function processFileLogs() {
        if (isConnectedToInternet()) {
            var localFolder = Windows.Storage.ApplicationData.current.localFolder;
            localFolder.getFilesAsync().then(function (files) {
                files.forEach(function (file) {
                    if (file && file.displayName && file.displayName.indexOf("logs") == 0) {
                        Windows.Storage.FileIO.readTextAsync(file).then(function (text) {
                            sendLogsToServer(text);
                        }).then(function () {
                            file.deleteAsync();
                        }).done(function () { }, function (err) { });
                    }
                });
            });
        }
    }

    // save a log entry to file system if Internet is not available
    function saveLogsToFile(obj) {
        var fileName = "logs.txt";
        var content = JSON.stringify(obj);
        var localFolder = Windows.Storage.ApplicationData.current.localFolder;
        var saveOption = Windows.Storage.CreationCollisionOption;
        localFolder.createFileAsync(fileName, saveOption.generateUniqueName).then(
            function (file) {
                return Windows.Storage.FileIO.writeTextAsync(file, content);
            }).done(function () {
                console.log("Log saved");
            }, function (error) {
                console.log("Log saved error");
            });
    }

    // send log message to logging server
    function sendLogsToServer(jsonData) {
        WinJS.xhr({
            type: "post",
            url: loggingServer,
            headers: { "Content-type": "application/json" },
            data: jsonData
        }).done(function completed(c) {
            console.log("log sent");
        },
        function error(e) { // One more try? send to different server? or silently skip?
            console.log("log sent error");
        });
    }

    // get current application context
    function getContext() {
        if (!version) {
            var appVersion = Windows.ApplicationModel.Package.current.id.version;
            version = appVersion.major + "." + appVersion.minor + "." + appVersion.build + "." + appVersion.revision;
            try {
                var deviceInfo = new Windows.Security.ExchangeActiveSyncProvisioning.EasClientDeviceInformation();
                manufacturer = deviceInfo.systemManufacturer;
                model = deviceInfo.systemProductName;
            } catch (e) {
                manufacturer = 'unknown';
                model = 'unknown';
            }
        }
        var context = {};
        context.version = version;
        context.manufacturer = manufacturer;
        context.model = model;
        context.os = "Windows 8";
        context.lang = navigator.appName == "Netscape" ? navigator.language : navigator.userLanguage;
        context.screen = screen.width + "x" + screen.height;
        context.orientation = getOrientation();
        context.timezone = (-(new Date()).getTimezoneOffset() / 60).toString();
        return context;
    }

    // determine current orientation
    function getOrientation() {
        var orientation = "unknown";
        switch (Windows.Graphics.Display.DisplayProperties.currentOrientation) {
            case Windows.Graphics.Display.DisplayOrientations.landscape:
                orientation = "landscape";
                break;
            case Windows.Graphics.Display.DisplayOrientations.portrait:
                orientation = "portrait";
                break;
            case Windows.Graphics.Display.DisplayOrientations.landscapeFlipped:
                orientation = "landscapeFlipped";
                break;
            case Windows.Graphics.Display.DisplayOrientations.portraitFlipped:
                orientation = "portraitFlipped";
                break;
        }
        return orientation;
    }

    // check if Internet access is available
    function isConnectedToInternet() {
        var connectivity = Windows.Networking.Connectivity;
        var profile = connectivity.NetworkInformation.getInternetConnectionProfile();
        if (profile) {
            var connected = (profile.getNetworkConnectivityLevel() == connectivity.NetworkConnectivityLevel.internetAccess);
            return connected;
        } else {
            return false;
        }
    }

    WinJS.Namespace.define("Logging", {
        registerLogging: registerLogging,
        unregisterLogging: unregisterLogging,
        page: pageLog,
        method: methodLog,
        fatal: crashLog,
        error: errorLog,
        warning: warningLog,
        info: infoLog,
        debug: debugLog
    });
})();