Easily find image type in JavaScript

There are two easy ways to determine an image’s type using JavaScript: using an html Input tag with a type file, and using the DataView API. I’ve put together a github repository that contains all the code shown below. The sample app detects PNG, GIF, JPEG and BMP: https://github.com/andygup/DetectImageType.js

Here’s how to do it with an Input tag:

    <input type="file" id="fileInput" name="file"/>
    <script>
    var fileInput = document.getElementById("fileInput");
    fileInput.addEventListener("change",function(event){
        document.getElementById("name").innerHTML = "NAME: " + event.target.files[0].name;
        document.getElementById("type").innerHTML = "TYPE: " + event.target.files[0].type;
        document.getElementById("size").innerHTML = "SIZE: " + event.target.files[0].size;
    });
    </script>

And, here’s how to do it with the DataView API. The concept is to retrieve the image via an HTTP request with the response type set to “arraybuffer.” And then extract the hexadecimal signature or magic number of the image type. I’ve taken the liberty of only reading the first 2 bytes to get the magic numbers in my example. If you need more precision here’s a great site to use for more information on image signatures: https://www.filesignatures.net/index.php?page=search

  function getImageType(arrayBuffer){
        var type = "";
        var dv = new DataView(arrayBuffer,0,5);
        var nume1 = dv.getUint8(0,true);
        var nume2 = dv.getUint8(1,true);
        var hex = nume1.toString(16) + nume2.toString(16) ;

        switch(hex){
            case "8950":
                type = "image/png";
                break;
            case "4749":
                type = "image/gif";
                break;
            case "424d":
                type = "image/bmp";
                break;
            case "ffd8":
                type = "image/jpeg";
                break;
            default:
                type = null;
                break;
        }
        return type;
    }

The DataView API has really good browser support. The only issues you’ll have, not surprisingly, are with IE 8 and 9. For more info on support of the DataView API go here: https://caniuse.com/#search=dataview

Using async tokens with JavaScript FileReader

The JavaScript FileReader is a very powerful, efficient and asynchronous way to read the binary content of files or Blobs. Because it’s asynchronous, if you are doing high-volume, in-memory processing there is no guarantee as to the order in which reading events are completed. This can be a challenge if you have a requirement to associate some additional unique information with each file or Blob and persist it all the way thru to the end of the process. The good news is there is an easy way to do this using async tokens.

Using an asynchronous token means you can assign a unique Object, Number or String to each FileReader operation. Once you do that, the order in which the results are returned no longer matters. When each read operation completes you can simply retrieve the token uniquely associated with the original file or Blob.  There really isn’t any magic. Here is a snippet of the coding pattern. You can test out a complete example on github.


function parse(blob,token,callback){

    // Always create a new instance of FileReader every time.
    var reader = new FileReader();

    // Attach the token as a property to the FileReader Object.
    reader.token = token;

    reader.onerror = function (event) {
        console.error(new Error(event.target.error.code).stack);
    }

    reader.onloadend = function (evt) {
        if(this.token != undefined){

            // The reader operation is complete.
            // Now we can retrieve the unique token associated
            // with this instance of FileReader.
            callback(this.result,this.token);
        }
    };
    reader.readAsBinaryString(blob);
}

Note, it is a very bad practice to simply associate the FileReader result object with the token being passed into the parse() function’s closure. Because the results from the onloadend events can be returned in any order, each parsed result could end up being assigned the wrong token. This is an easy mistake to make and it can seriously corrupt your data.

Fastest way to find an item in a JavaScript Array

There are many different ways to find an item in a JavaScript array. With a little bit of testing and tinkering, I found some methodologies were faster than others by close to 200%!

I’ve been doing some performance tweaking on a very CPU intensive JavaScript application and I needed really fast in-memory searching on a temporary array before writing that data to IndexedDB. So I did some testing to decide on an approach with the best search times. My objective was to coax out every last micro-ounce of performance. The tests were completed using a pure JavaScript methodology, and no third party libraries were used, so that I could see exactly what was going on in the code.

I looked at five ways to parse what I’ll call a static Array. This is an array that once it is written you aren’t going to add anything new too it, you simply access its data as needed and when you are done you delete it.

  1. Seek. Create an index Array based exactly on the primary Array. It only contains names or unique ids in the same exact order as the primary. Then search for indexArray.indexOf(“some unique id”) and apply that integer against the primary Array, for example primaryArray[17] to get your result. If this doesn’t make sense take a look at code in my JSFiddle.
  2. Loop. Loop thru every element until I find the matching item, then break out of the loop. This pattern should be the most familiar to everyone.
  3. Filter. Use Array.prototype.filter.
  4. Some. Use Array.prototype.Some.
  5. Object. Create an Object and access it’s key/value pairs directly using an Object pattern such as parsedImage.image1 or parseImage[“image1”]. It’s not an Array, per se, but it works with the static access pattern that I need.

I used the Performance Interface to get high precision, sub-millisecond numbers needed for this test. Note, this Interface only works on Chrome 20 and 24+, Firefox 15+ and IE 10. It won’t run on Safari or Chrome on iOS. I bolted in a shim so you can also run these tests on your iPad or iPhone.

My JSFiddle app creates an Array containing many base64 images and then loops thru runs hundreds of tests against it using the five approaches. It performs a random seek on the Array, or Object during each iteration. The offers a better reflection of how the array parse algorithm would work under production conditions. After the loops are finished, it then spits out an average completion time for each approaches.

The results are very interesting in terms of which approach is more efficient. Now, I understand in a typical application you might only loop an Array a few times. In those cases a tenth or even hundredth of a millisecond may not really matter. However if you are doing hundreds or even thousands of manipulations repetitively, then having the most efficient algorithm will start to pay off for your app performance.

Here are some of the test results based on 300 random array seeks against a decent size array that contained 300 elements. It’s actually the same base64 image copied into all 300 elements. You can tweak the JSFiddle and experiment with different size arrays and number of test loops. I saw similar performance between Firefox 29 and Chrome 34 on my MacBook Pro as well as on Windows. Approach #1 SEEK seems to be consistently the fastest on Arrays and Object is by far the fastest of any of the approaches:

OBJECT Average 0.0005933333522989415* (Fastest.~191% less time than LOOP)
SEEK Average 0.0012766665895469487 (181% less time than LOOP)
SOME Average 0.010226666696932321
FILTER Average 0.019943333354603965
LOOP Average 0.02598666658741422 (Slowest)

————–

OBJECT Average 0.0006066666883028423* (Fastest.~191% less time than slowest)
SEEK Average 0.0012900000368244945 (181% less time than LOOP)
SOME Average 0.012076666820018242
FILTER Average 0.020773333349303962
LOOP Average 0.026383333122745777 (Slowest)

As for testing on Android, I used my Android Nexus 4 running 4.4.2. It’s interesting to note that the OBJECT approach was still the fastest, however the LOOP approach (Approach #2) was consistently dead last.

On my iPad 3 Retina using both Safari and Chrome, the OBJECT approach was also the fastest, however the FILTER (Approach #3) seemed to come in dead last.

I wasn’t able to test this on IE 10 at the time I wrote this post and ran out of time.

Conclusion

Some folks have blogged that you should never use Arrays for associative search. I think this depends on exactly what you need to do with the array, for example if you need to do things like slice(), shift() or pop() then sticking to an Array structure will make your life easier. For my requirements where I’m using a static Array pattern, it looks like using the Object pattern has a significant performance advantage. If you do need an actual Array then the SEEK pattern was a close second in terms of speed.

References:

JSFiddle Array Parse tests
Performance Interface

[Updated: May 18, 16:06, fixed incorrect info]