Using async tokens with JavaScript FileReader

The JavaScript FileReader is a powerful, efficient and asynchronous way to read the binary content of files or Blobs. Because it’s asynchronous, if you are doing high-volume, in-memory processing there is no guarantee as to the order in which read operations complete. This can be a challenge if you need to associate some additional unique information with each file or Blob and persist it all the way through to the end of the process. The good news is there is an easy way to do this using async tokens.

Using an asynchronous token means you can assign a unique Object, Number or String to each FileReader operation. Once you do that, the order in which the results are returned no longer matters. When each read operation completes you can simply retrieve the token uniquely associated with the original file or Blob. There really isn’t any magic. Here is a snippet of the coding pattern. You can test out a complete example on GitHub.


function parse(blob,token,callback){

    // Always create a new instance of FileReader every time.
    var reader = new FileReader();

    // Attach the token as a property to the FileReader Object.
    reader.token = token;

    // Log read failures; onloadend still fires afterwards with a null result.
    reader.onerror = function (event) {
        console.error(new Error(event.target.error.code).stack);
    };

    reader.onloadend = function (evt) {
        if(this.token != undefined){

            // The reader operation is complete.
            // Now we can retrieve the unique token associated
            // with this instance of FileReader.
            callback(this.result,this.token);
        }
    };
    reader.readAsBinaryString(blob);
}

Note, it is a very bad practice to associate the FileReader result with a token stored in a variable that is shared between read operations, for example a variable outside the parse() function that is overwritten for each new file. Because the results from the onloadend events can be returned in any order, each parsed result could end up being assigned the wrong token. This is an easy mistake to make and it can seriously corrupt your data.
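
To see the token survive out-of-order completion, here is a minimal sketch that runs without a browser. FakeReader is a hypothetical stand-in for FileReader (it is not a real API); it queues each read so the sketch can finish them in reverse order. parseWithToken is also a made-up name that mirrors the shape of parse() above.

```javascript
// FakeReader is a hypothetical stand-in for FileReader, used only so
// this sketch runs without a browser. Reads are queued, not performed.
function FakeReader() {
    this.result = null;
    this.onloadend = null;
}

// Queue of unfinished reads, so the completion order can be controlled.
FakeReader.pending = [];

FakeReader.prototype.readAsText = function (text) {
    var self = this;
    FakeReader.pending.push(function () {
        self.result = text.toUpperCase(); // pretend this is the file content
        self.onloadend();                 // invoked with `this` set to the reader
    });
};

// Same shape as parse(): one new reader per call, token stored on the reader.
function parseWithToken(text, token, callback) {
    var reader = new FakeReader();
    reader.token = token;
    reader.onloadend = function () {
        callback(this.result, this.token);
    };
    reader.readAsText(text);
}

var results = [];
function collect(result, token) {
    results.push({ token: token, result: result });
}

parseWithToken("first", 101, collect);
parseWithToken("second", 102, collect);
parseWithToken("third", 103, collect);

// Finish the reads in reverse order to mimic unpredictable completion.
while (FakeReader.pending.length > 0) {
    FakeReader.pending.pop()();
}

console.log(results);
```

Even though the reads finish last-to-first, each result arrives paired with the token that was attached to its own reader instance.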

Fastest way to find an item in a JavaScript Array

There are many different ways to find an item in a JavaScript array. With a little bit of testing and tinkering, I found that the fastest approach took roughly 98% less time than the slowest, which makes it about 44 times faster!

I’ve been doing some performance tweaking on a very CPU intensive JavaScript application and I needed really fast in-memory searching on a temporary array before writing that data to IndexedDB. So I did some testing to decide on an approach with the best search times. My objective was to coax out every last micro-ounce of performance. The tests were completed using a pure JavaScript methodology, and no third party libraries were used, so that I could see exactly what was going on in the code.

I looked at five ways to search what I’ll call a static Array. This is an array that, once it is written, you aren’t going to add anything new to; you simply access its data as needed and when you are done you delete it.

  1. Seek. Create an index Array based exactly on the primary Array. It contains only the names or unique ids, in exactly the same order as the primary. Then search with indexArray.indexOf("some unique id") and apply the returned integer to the primary Array, for example primaryArray[17], to get your result. If this doesn’t make sense take a look at the code in my JSFiddle.
  2. Loop. Loop through every element until I find the matching item, then break out of the loop. This pattern should be the most familiar to everyone.
  3. Filter. Use Array.prototype.filter.
  4. Some. Use Array.prototype.some.
  5. Object. Create an Object and access its key/value pairs directly using an Object pattern such as parsedImage.image1 or parsedImage["image1"]. It’s not an Array, per se, but it works with the static access pattern that I need.
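
The five approaches can be sketched roughly as follows. This is an illustration, not the code from the JSFiddle: the sample data and the findBy... function names are hypothetical, and a real array would hold base64 image strings rather than short placeholders.

```javascript
// Hypothetical sample data: each item has a unique id and a payload.
var primaryArray = [
    { id: "image1", data: "AAAA" },
    { id: "image2", data: "BBBB" },
    { id: "image3", data: "CCCC" }
];

// 1. Seek: a parallel index of ids in the same order as primaryArray.
var indexArray = primaryArray.map(function (item) { return item.id; });
function findBySeek(id) {
    var i = indexArray.indexOf(id);
    return i === -1 ? undefined : primaryArray[i];
}

// 2. Loop: walk the array and break out on the first match.
function findByLoop(id) {
    var found;
    for (var i = 0; i < primaryArray.length; i++) {
        if (primaryArray[i].id === id) {
            found = primaryArray[i];
            break;
        }
    }
    return found;
}

// 3. Filter: always visits every element, then takes the first match.
function findByFilter(id) {
    return primaryArray.filter(function (item) {
        return item.id === id;
    })[0];
}

// 4. Some: stops iterating as soon as the callback returns true.
function findBySome(id) {
    var found;
    primaryArray.some(function (item) {
        if (item.id === id) {
            found = item;
            return true;
        }
        return false;
    });
    return found;
}

// 5. Object: key/value pairs keyed by id, accessed directly.
var parsedImage = {};
primaryArray.forEach(function (item) { parsedImage[item.id] = item; });
function findByObject(id) {
    return parsedImage[id];
}

console.log(findBySeek("image2"), findByObject("image2"));
```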

I used the Performance Interface to get high precision, sub-millisecond numbers needed for this test. Note, this Interface only works on Chrome 20 and 24+, Firefox 15+ and IE 10. It won’t run on Safari or Chrome on iOS. I bolted in a shim so you can also run these tests on your iPad or iPhone.
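
The shim used in the JSFiddle isn’t shown in this post, so the following is only a guess at what such a fallback could look like. It assumes that falling back to Date.now() is acceptable; the names now and startTime are hypothetical.

```javascript
// Hypothetical fallback timer, not the shim from the JSFiddle.
var startTime = Date.now();

var now = (typeof performance !== "undefined" &&
           typeof performance.now === "function")
    // High-resolution timer, in milliseconds with a fractional part.
    ? function () { return performance.now(); }
    // Date.now() only has millisecond resolution, so sub-millisecond
    // averages are coarser on platforms that take this branch.
    : function () { return Date.now() - startTime; };

var t0 = now();
var t1 = now();
console.log("elapsed ms:", t1 - t0);
```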

My JSFiddle app creates an Array containing many base64 images and then runs hundreds of tests against it using each of the five approaches. It performs a random seek on the Array or Object during each iteration. This offers a better reflection of how each search approach would work under production conditions. After the loops are finished, it spits out an average completion time for each approach.
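
A stripped-down version of that kind of test harness might look like the sketch below. It is not the JSFiddle code: the data, the averageSeekTime helper and the two finder functions are hypothetical, and only two of the five approaches are timed to keep it short.

```javascript
// Timer: high-resolution where available, Date.now() otherwise.
function now() {
    return (typeof performance !== "undefined" && performance.now)
        ? performance.now()
        : Date.now();
}

// Hypothetical test data: 300 items that all share one placeholder payload.
var items = [];
var lookup = {};
for (var i = 0; i < 300; i++) {
    var item = { id: "image" + i, data: "base64-placeholder" };
    items.push(item);
    lookup[item.id] = item;
}

function findByLoop(id) {
    var found;
    for (var j = 0; j < items.length; j++) {
        if (items[j].id === id) {
            found = items[j];
            break;
        }
    }
    return found;
}

function findByObject(id) {
    return lookup[id];
}

// Run `iterations` random seeks with `finder` and return the average
// time per seek in milliseconds.
function averageSeekTime(finder, iterations) {
    var total = 0;
    for (var n = 0; n < iterations; n++) {
        var id = "image" + Math.floor(Math.random() * items.length);
        var start = now();
        finder(id);
        total += now() - start;
    }
    return total / iterations;
}

var loopAverage = averageSeekTime(findByLoop, 300);
var objectAverage = averageSeekTime(findByObject, 300);
console.log("LOOP Average", loopAverage);
console.log("OBJECT Average", objectAverage);
```

Timing each seek individually keeps the random id generation out of the measurement, at the cost of some timer overhead per iteration.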

The results are very interesting in terms of which approach is more efficient. Now, I understand in a typical application you might only loop an Array a few times. In those cases a tenth or even hundredth of a millisecond may not really matter. However if you are doing hundreds or even thousands of manipulations repetitively, then having the most efficient algorithm will start to pay off for your app performance.

Here are some of the test results based on 300 random array seeks against a decent-size array that contained 300 elements. It’s actually the same base64 image copied into all 300 elements. The averages are in milliseconds, the unit the Performance Interface reports. You can tweak the JSFiddle and experiment with different size arrays and number of test loops. I saw similar performance between Firefox 29 and Chrome 34 on my MacBook Pro as well as on Windows. SEEK (Approach #1) is consistently the fastest of the Array approaches, and OBJECT is by far the fastest of any of the approaches:

OBJECT Average 0.0005933333522989415* (Fastest. ~98% less time than LOOP, about 44x faster)
SEEK Average 0.0012766665895469487 (~95% less time than LOOP, about 20x faster)
SOME Average 0.010226666696932321
FILTER Average 0.019943333354603965
LOOP Average 0.02598666658741422 (Slowest)

————–

OBJECT Average 0.0006066666883028423* (Fastest. ~98% less time than LOOP, about 43x faster)
SEEK Average 0.0012900000368244945 (~95% less time than LOOP, about 20x faster)
SOME Average 0.012076666820018242
FILTER Average 0.020773333349303962
LOOP Average 0.026383333122745777 (Slowest)
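
For anyone who wants to check the comparisons, here is a small sketch of the arithmetic using the averages from the first run. Percentages near 191 and 181 come out of the symmetric percent difference (the gap divided by the mean of the two values), while the time saved relative to LOOP works out to roughly 98% and 95%. The function names are hypothetical, and which formula was originally used for this post is an inference from the numbers.

```javascript
// Averages copied from the first run above (milliseconds per seek assumed).
var loopTime = 0.02598666658741422;
var objectTime = 0.0005933333522989415;
var seekTime = 0.0012766665895469487;

// Gap between the two times relative to their mean, as a percentage.
function symmetricDifference(fast, slow) {
    return (slow - fast) / ((slow + fast) / 2) * 100;
}

// How much less time `fast` takes compared with `slow`, as a percentage.
function percentLessTime(fast, slow) {
    return (slow - fast) / slow * 100;
}

// How many times faster `fast` is than `slow`.
function speedup(fast, slow) {
    return slow / fast;
}

console.log("OBJECT vs LOOP:",
    symmetricDifference(objectTime, loopTime).toFixed(1) + "% symmetric difference,",
    percentLessTime(objectTime, loopTime).toFixed(1) + "% less time,",
    speedup(objectTime, loopTime).toFixed(1) + "x faster");

console.log("SEEK vs LOOP:",
    symmetricDifference(seekTime, loopTime).toFixed(1) + "% symmetric difference,",
    percentLessTime(seekTime, loopTime).toFixed(1) + "% less time,",
    speedup(seekTime, loopTime).toFixed(1) + "x faster");
```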

As for testing on Android, I used my Nexus 4 running Android 4.4.2. It’s interesting to note that the OBJECT approach was still the fastest, while the LOOP approach (Approach #2) was consistently dead last.

On my iPad 3 Retina using both Safari and Chrome, the OBJECT approach was also the fastest, but FILTER (Approach #3) seemed to come in dead last.

I ran out of time and wasn’t able to test this on IE 10 before publishing this post.

Conclusion

Some folks have blogged that you should never use Arrays for associative search. I think this depends on exactly what you need to do with the array, for example if you need to do things like slice(), shift() or pop() then sticking to an Array structure will make your life easier. For my requirements where I’m using a static Array pattern, it looks like using the Object pattern has a significant performance advantage. If you do need an actual Array then the SEEK pattern was a close second in terms of speed.

References:

JSFiddle Array Parse tests
Performance Interface

[Updated: May 18, 16:06, fixed incorrect info]

Two languages that software developers should be familiar with

The question that I get asked the most these days is “what development languages should I be learning [to stay competitive/excited/motivated/etc] ?” The high-tech industry, and software in particular, is changing at a ridiculously fast pace and that introduces a lot of uncertainty and confusion as well as great opportunity. My answer to this question is unequivocal: I think for the foreseeable future developers should be learning the basic concepts of JavaScript and Python. If you don’t already have these skills then you simply cannot go wrong with this approach. If you’ve been a long-time server-side developer, or you are just getting started with software development then knowing the patterns and practices for JavaScript and Python will serve you well.

Why?

There are three primary reasons and I’ll try to be short and to the point. First, there are at least 2.5 billion internet users worldwide, and that number is growing. Their primary method of accessing the web is a browser, and JavaScript is the lingua franca of the browser world. JavaScript is a scripting language and it is “the” fundamental building block that allows web pages to “do” things such as submitting your search request to a server, or helping to find your location from your phone. Almost all web pages being served up around the world have JavaScript in them.

Second, the majority of retail, commercial and governmental web applications have a requirement that calls for the use of “server-side” code. This is code, such as Python, that runs on a server and not in the browser application. The most common functionality of server-side code is passing data back and forth between a database and a web application. For example, if a web app asks for a username and password, that username and password are almost always stored on a server somewhere and not, for security reasons, in the web page or in the client browser where they could be very easily stolen.

Third, you can absolutely apply these client-server patterns and practices to other languages used within the realm of web development. A User Interface designer who has been solely focused on layout and styling via Cascading Style Sheets (CSS) can now understand and appreciate how the underlying JavaScript code can affect the look, feel and behavior of a web page. Python skills can also be used as a springboard for more quickly learning other powerful web development platforms such as Ruby on Rails.

The bottom line is if you understand both client development (JavaScript) and server development (Python), then you start to gain considerable value as someone who understands how to help the entire system work together in harmony.

A short note on jQuery.

Many (most?) new web developers learn jQuery first. However, even if you know jQuery you don’t necessarily understand JavaScript. The awesome jQuery libraries provide an interface that hides and simplifies a lot of native JavaScript hoopla, and in general can really make life significantly easier and save time when building modern cross-browser web apps. jQuery is built using JavaScript (and CSS3), but it is not JavaScript. Because of that, when something goes wrong or not as you expected (not if, but when!), a general understanding of how JavaScript works gives you a much better chance of figuring out a timely work-around, with significantly less stress, frustration and time wasted as your project’s deadline approaches.

The absolute minimum recommended reading list:

JavaScript.

  • Douglas Crockford’s book “JavaScript: the good parts”.
  • Douglas Crockford’s website – He is considered a key figure behind the ongoing development and understanding of JavaScript.
  • W3schools – An excellent website for anyone using or learning JavaScript. It has tutorials and online Try It Yourself sample apps.

Python.