Using async tokens with JavaScript FileReader

The JavaScript FileReader is a powerful, efficient, asynchronous way to read the binary content of files or Blobs. Because it’s asynchronous, there is no guarantee as to the order in which read operations complete when you are doing high-volume, in-memory processing. This can be a challenge if you need to associate some additional unique information with each file or Blob and persist it all the way through to the end of the process. The good news is there is an easy way to do this using async tokens.

Using an asynchronous token means you can assign a unique Object, Number or String to each FileReader operation. Once you do that, the order in which the results are returned no longer matters. When each read operation completes you can simply retrieve the token uniquely associated with the original file or Blob. There really isn’t any magic. Here is a snippet of the coding pattern. You can test out a complete example on GitHub.


function parse(blob,token,callback){

    // Always create a new instance of FileReader every time.
    var reader = new FileReader();

    // Attach the token as a property to the FileReader Object.
    reader.token = token;

    reader.onerror = function (event) {
        console.error(new Error(event.target.error.code).stack);
    }

    reader.onloadend = function (evt) {
        if(this.token != undefined){

            // The reader operation is complete.
            // Now we can retrieve the unique token associated
            // with this instance of FileReader.
            callback(this.result,this.token);
        }
    };
    reader.readAsBinaryString(blob);
}

Note, it is a very bad practice to simply associate the FileReader result object with the token being passed into the parse() function’s closure. Because the results from the onloadend events can be returned in any order, each parsed result could end up being assigned the wrong token. This is an easy mistake to make and it can seriously corrupt your data.
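One way a caller might consume parse() is sketched below. The createCollector helper is hypothetical (it is not part of the original example). It returns a callback that files each result under its token, so results can arrive in any order and still end up matched to the right file.

```javascript
// Hypothetical helper: returns a callback suitable for parse(blob, token, callback).
// It stores each result under its token and fires onComplete once all have arrived.
function createCollector(expectedCount, onComplete) {
    var resultsByToken = {};
    var received = 0;

    return function (result, token) {
        resultsByToken[token] = result;
        received++;
        if (received === expectedCount) {
            onComplete(resultsByToken);
        }
    };
}

// Browser usage, assuming `files` is a FileList from an <input type="file">
// and each file name is unique enough to serve as a token:
//
// var collect = createCollector(files.length, function (all) {
//     console.log(all);
// });
// for (var i = 0; i < files.length; i++) {
//     parse(files[i], files[i].name, collect);
// }
```

Because the collector keys on the token rather than on arrival order, it does not matter which read finishes first.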

Node.js: batching parallel async HTTP requests

Some DNS providers limit the number of parallel, asynchronous HTTP requests that you can run simultaneously. This feature, which not all providers have, is commonly called DNS RRL, or DNS Response Rate Limiting. These rate limits are typically used for the very good reason of thwarting DDoS attacks. However, they can also limit legitimate (non-spam) IT shops that are simply doing their jobs.

The good news is there is a pattern you can use within a Node.js application that lets you use a blocking timer to wait for one batch of asynchronous parallel requests to complete before the next batch is run. Depending on your provider, this can help you prevent DNS errors and avoid incurring the wrath of your DNS provider. And, it’s still much more efficient than running synchronous requests.

Why not just use async.parallelLimit or something similar? As of the time of this writing, the async library doesn’t have batch capabilities. The pattern I’m proposing lets you fine-tune the throttling of outbound HTTP requests by setting the delay that occurs between batches. Additionally, you can tokenize your batches to get more control over tasks that depend on the order in which tokens are received.

The commercial use cases for this scenario include polling multiple RSS feeds, as well as JSON and XML feeds. These patterns are typical of large news feed aggregators and monitoring software.

Here is the code you’ll need to accomplish this. NOTE: you’ll want to run this code as a child process. I’m skipping the nitty-gritty of how to use the Node async library and child processes so that this post can stay focused. You may also want to read my post on Node.js: moving intensive tasks to a child process.

Step 1. Set up a function that executes async.parallel and allows you to pass in both your data array and a token. Since async.parallel doesn’t let you inject a callback, we use a custom event emitter to announce when each batch is complete. This lets you decouple the blocking timer task from the async task completion event.

this._async = function(/* Array */ data, /* Number */ token){
    try{
        // Run every task in this batch in parallel.
        async.parallel(data,function(err,results){
            if(err){
                console.log("_async task error: " + err.message);
            }
            console.log("Data retrieved! COUNT = " + (results ? results.length : 0));
            try{
                // Pair the results with the batch token and announce completion.
                var object = {
                    "results":results,
                    "token":token
                };
                event.emit("AsyncComplete",object);
            }
            catch(emitErr){
                console.log("_async event.emit() error: " + emitErr.message + "\n" + emitErr.stack);
            }
        });
    }
    catch(err){
        console.log("_async error: " + err.message + ", " + err.stack);
    }
}
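The snippet above relies on an `event` object that is assumed to be an instance of Node's EventEmitter. Here is a minimal, standalone sketch of that wiring. The announceBatch function is a hypothetical stand-in for the async.parallel callback, and the sample data is made up for illustration.

```javascript
var EventEmitter = require("events").EventEmitter;

// Assumed setup: one shared emitter for the whole module.
var event = new EventEmitter();

// Hypothetical stand-in for the batch callback: announce completion with a token.
function announceBatch(results, token) {
    event.emit("AsyncComplete", { results: results, token: token });
}

// A listener elsewhere can react without knowing who ran the batch.
var seen = [];
event.on("AsyncComplete", function (payload) {
    seen.push(payload.token + ":" + payload.results.length);
});

announceBatch(["a", "b"], 10);
announceBatch(["c"], 20);
console.log(seen); // [ '10:2', '20:1' ]
```

The emitter is what keeps the timer loop and the batch runner from calling each other directly.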

Step 2. Here we set up a timer that ticks once per second and checks whether the current batch has completed. Once it has, we fire off the next batch to async.parallel. We also listen for our custom AsyncComplete event, and each time it fires we add the results to an array until we have all the tokens back. Once all tokens are received, we send the final results array back to the parent process via process.send().

this._loopArray = function(/* Array */ arr){

    var batchSize = 10;
    var length = arr.length;

    var previousVal = 0; // start index of the next batch
    var t = 0;           // token (end index) of the most recently dispatched batch
    var token = 0;       // token received back from async
    var received = 0;    // number of batches that have completed

    // Number of batches; Math.ceil accounts for a final partial batch.
    var totalTokens = Math.ceil(length / batchSize);

    var resultsArray = [];

    event.on("AsyncComplete",function(evt){

        console.log("async complete " + evt.token);

        resultsArray.push(evt.results);
        token = parseInt(evt.token, 10);
        received++;

        if(received == totalTokens){
            console.log("total has been reached");
            process.send(resultsArray);
        }
    });

    if(length > 0){
        var timer = setInterval(function(){

            console.log("t= " + t + ", token= " + token + ", " + totalTokens);

            // Only dispatch the next batch once the previous one has reported back.
            if(t == token && previousVal < length){
                t = Math.min(previousVal + batchSize, length);
                var segment = arr.slice(previousVal,t);
                console.log("segment length " + segment.length);
                previousVal = t;

                console.log("remaining = " + (length - t));

                this._async(segment,t);

                // Stop ticking once the final batch has been dispatched.
                if(t >= length) clearInterval(timer);
            }

            console.log("tick");
        }.bind(this),1000);
    }
}
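The slicing arithmetic in the loop can also be looked at in isolation. The sketch below uses a hypothetical toBatches helper (not part of the original code) that splits an array into batches of at most ten items and gives each batch a token equal to its end index, similar to how the loop uses its `t` counter as the token. The feed names are made-up sample data.

```javascript
// Hypothetical helper: split an array into batches of at most `size` items.
// Each batch carries a token equal to the index just past its last item.
function toBatches(arr, size) {
    var batches = [];
    for (var start = 0; start < arr.length; start += size) {
        var end = Math.min(start + size, arr.length);
        batches.push({ token: end, items: arr.slice(start, end) });
    }
    return batches;
}

// Sample data: 25 placeholder feed names.
var feeds = [];
for (var i = 0; i < 25; i++) {
    feeds.push("feed-" + i);
}

var batches = toBatches(feeds, 10);
console.log(batches.map(function (b) { return b.token; })); // [ 10, 20, 25 ]
```

With 25 items and a batch size of 10, the last batch holds the 5 leftover items.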

Complete code snippet. Here’s all the code you’ll need for the child process.


//Retriever.js - batch processor for async.parallel requests

var http = require("http");
var async = require("async");
var Event = require("events").EventEmitter;

// Shared emitter used to announce when each batch completes.
var event = new Event();

process.on('message',function(msg){

    this._async = function(/* Array */ data, /* Number */ token){
        try{
            // Run every task in this batch in parallel.
            async.parallel(data,function(err,results){
                if(err){
                    console.log("_async task error: " + err.message);
                }
                console.log("Data retrieved! COUNT = " + (results ? results.length : 0));
                try{
                    // Pair the results with the batch token and announce completion.
                    var object = {
                        "results":results,
                        "token":token
                    };
                    event.emit("AsyncComplete",object);
                }
                catch(emitErr){
                    console.log("_async event.emit() error: " + emitErr.message + "\n" + emitErr.stack);
                }
            });
        }
        catch(err){
            console.log("_async error: " + err.message + ", " + err.stack);
        }
    }

    this._loopArray = function(/* Array */ arr){

        var batchSize = 10;
        var length = arr.length;

        var previousVal = 0; // start index of the next batch
        var t = 0;           // token (end index) of the most recently dispatched batch
        var token = 0;       // token received back from async
        var received = 0;    // number of batches that have completed

        // Number of batches; Math.ceil accounts for a final partial batch.
        var totalTokens = Math.ceil(length / batchSize);

        var resultsArray = [];

        event.on("AsyncComplete",function(evt){

            console.log("async complete " + evt.token);

            resultsArray.push(evt.results);
            token = parseInt(evt.token, 10);
            received++;

            if(received == totalTokens){
                console.log("total has been reached");
                process.send(resultsArray);
            }
        });

        if(length > 0){
            var timer = setInterval(function(){
                console.log("t= " + t + ", token= " + token + ", " + totalTokens);

                // Only dispatch the next batch once the previous one has reported back.
                if(t == token && previousVal < length){
                    t = Math.min(previousVal + batchSize, length);
                    var segment = arr.slice(previousVal,t);
                    console.log("segment length " + segment.length);
                    previousVal = t;

                    console.log("remaining = " + (length - t));

                    this._async(segment,t);

                    // Stop ticking once the final batch has been dispatched.
                    if(t >= length) clearInterval(timer);
                }

                console.log("tick");
            }.bind(this),1000);
        }
    }

    // this._someArr is a placeholder for your own array of async.parallel task functions.
    this._init = function(){
        this._loopArray(this._someArr);
    };
    this._init();
});

process.on('uncaughtException',function(err){
    console.log("retriever.js uncaught exception: " + err.message + "\n" + err.stack);
});
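For comparison, the same batching idea can be sketched with Promises and async/await instead of the async library and a polling timer. This is a different technique from the one used in this post, shown only as a minimal sketch. The runInBatches and delay functions are hypothetical names, and the tasks here are placeholders that resolve immediately rather than real HTTP requests.

```javascript
// Resolve after `ms` milliseconds.
function delay(ms) {
    return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// Run `tasks` (functions that return Promises) in batches of `batchSize`,
// pausing `delayMs` between batches. Resolves with one entry per batch,
// each tagged with a token equal to the batch's end index.
async function runInBatches(tasks, batchSize, delayMs) {
    var output = [];
    for (var start = 0; start < tasks.length; start += batchSize) {
        var end = Math.min(start + batchSize, tasks.length);
        var pending = tasks.slice(start, end).map(function (task) { return task(); });
        output.push({ token: end, results: await Promise.all(pending) });
        if (end < tasks.length) {
            await delay(delayMs);
        }
    }
    return output;
}

// Placeholder tasks: each one just resolves with its own index.
var sampleTasks = [0, 1, 2, 3, 4].map(function (i) {
    return function () { return Promise.resolve(i); };
});

runInBatches(sampleTasks, 2, 50).then(function (batches) {
    console.log(batches.map(function (b) { return b.token; })); // [ 2, 4, 5 ]
});
```

Note that Promise.all rejects as soon as any task in a batch fails, so real request code would need its own error handling per task.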

Using ActionScript Tokenized Asynchronous HTTP Requests

My recent Antarctica Flex/ActionScript app had a requirement for tokenized asynchronous requests. That way I could use a centralized HTTP controller through which every outbound request was submitted. By attaching a “token” to each request, I could properly manage the response payload for the dozen or so different processes that were going on. In other words, you attach a unique identifier to the outbound request. When the server sends back its response to the client application, this unique identifier is passed along in the payload. Quite cool, right?!

I’ve used this technique in heavy-duty, server-side applications before but only a few times in a web client. In practice it works brilliantly and it allowed me to easily organize the HTTP responses from many different types of requests and keep them all straight. At the heart of the controller was this pseudo-code. If you haven’t done this before, there are just a few tricks to make it work right. I’ve included all the code to make your life easier. The token variable is a String that you create and then pass to the AsyncResponder.

                               
_http = new HTTPService();
_http.concurrency = "multiple";
_http.requestTimeout = 20;
_http.method = "POST";

var asyncToken:AsyncToken = _http.send( paramsObject );  
                                     
//you pass the token variable in as a String
var token:String = "someValue";
var responder:AsyncResponder = new AsyncResponder(resultHandler,faultHandler,token);
asyncToken.addResponder(responder);

Elsewhere in the app, the other classes that used the controller received the response payload via the event bus and then filtered the response by the tokens using a switch/case statement. AppEvent is my custom event bus that would broadcast the payload to the entire application via an Event. This allowed me to fully decouple the HTTP controller from being directly wired into my other classes. It made the app very flexible in that action would only be taken when the response came back. If you want a few more details about this architecture, then check out my blog post on it. Here’s the HTTP response handler pseudo-code that is inside the controller.

Just a note, the HTTPData class is a custom object I wrote to manage the response data. You could manage the data any way you like. This is just one example of how to do it.

private function resultHandler(result:Object, token:Object = null):void
{	
	var httpData:HTTPData = new HTTPData();
	httpData.result = result.result;
	httpData.token = token; 
	AppEvent.dispatch(AppEvent.HTTP_RESULT,httpData);
}

And, here’s the response handler that’s inside one of the application’s pages (views) that receive the payload via my event bus:

AppEvent.addListener(AppEvent.HTTP_RESULT,httpResultHandler);

/**
 * Handles setting up many of the user variables based on tokenized,
 * asynchronous HTTP requests.
 * @param event AppEvent from HTTP request.
 */
private function httpResultHandler(event:AppEvent):void
{
    var json:String = event.data.result as String;	
    
    //route the tokens through a switch/case statement
    switch(event.data.token)
    {
         case "getallgpspoints":
              parseGPSPoints2(json);
              break;
    }
}

You can download the entire controller example here. If you use it for your own work, you’ll have to comment out anything you don’t need like some of the import statements and the custom events. Have fun!