Easily find image type in JavaScript

There are two easy ways to determine an image’s type using JavaScript: using an HTML input tag with type="file", and using the DataView API. I’ve put together a GitHub repository that contains all the code shown below. The sample app detects PNG, GIF, JPEG and BMP: https://github.com/andygup/DetectImageType.js

Here’s how to do it with an input tag:

    <input type="file" id="fileInput" name="file"/>
    <div id="name"></div>
    <div id="type"></div>
    <div id="size"></div>
    <script>
    var fileInput = document.getElementById("fileInput");
    fileInput.addEventListener("change",function(event){
        // The first selected file; undefined if the selection was cleared.
        var file = event.target.files[0];
        if(!file) return;
        // textContent avoids treating the file name as markup.
        document.getElementById("name").textContent = "NAME: " + file.name;
        document.getElementById("type").textContent = "TYPE: " + file.type;
        document.getElementById("size").textContent = "SIZE: " + file.size;
    });
    </script>

And here’s how to do it with the DataView API. The concept is to retrieve the image via an HTTP request with the response type set to “arraybuffer”, and then extract the hexadecimal signature, or magic number, of the image type. I’ve taken the liberty of only reading the first 2 bytes to get the magic numbers in my example. If you need more precision, here’s a great site for more information on image signatures: http://www.filesignatures.net/index.php?page=search

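    function getImageType(arrayBuffer){
        // Two bytes are needed for the signature; anything shorter can't match.
        if(arrayBuffer.byteLength < 2) return null;
        var type = "";
        var dv = new DataView(arrayBuffer,0,2);
        // getUint8() reads a single byte, so it takes no endianness argument.
        var nume1 = dv.getUint8(0);
        var nume2 = dv.getUint8(1);
        // Zero-pad each byte to two hex digits so the string is always 4 characters.
        var hex = ("0" + nume1.toString(16)).slice(-2) + ("0" + nume2.toString(16)).slice(-2);

        switch(hex){
            case "8950":
                type = "image/png";
                break;
            case "4749":
                type = "image/gif";
                break;
            case "424d":
                type = "image/bmp";
                break;
            case "ffd8":
                type = "image/jpeg";
                break;
            default:
                type = null;
                break;
        }
        return type;
    }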
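To show where the ArrayBuffer comes from, here is a minimal sketch of the request step described above. The helper name fetchArrayBuffer and its callback parameters are made up for illustration and are not taken from the repository. It assumes a browser that supports XMLHttpRequest with a responseType of "arraybuffer".

```javascript
// Hypothetical helper (not from the original repository): requests a URL
// and hands the raw bytes to onSuccess as an ArrayBuffer.
function fetchArrayBuffer(url, onSuccess, onError) {
    var xhr = new XMLHttpRequest();
    xhr.open("GET", url, true);
    // Ask for raw binary data instead of text.
    xhr.responseType = "arraybuffer";
    xhr.onload = function () {
        if (xhr.status === 200) {
            onSuccess(xhr.response);
        } else {
            onError(new Error("Request failed with status " + xhr.status));
        }
    };
    xhr.onerror = function () {
        onError(new Error("Network error while requesting " + url));
    };
    xhr.send();
}

// Possible usage in a page that also defines getImageType():
// fetchArrayBuffer("images/photo.png", function (buffer) {
//     console.log(getImageType(buffer));
// }, function (error) {
//     console.error(error);
// });
```

The image path in the usage comment is a placeholder; any URL the page is allowed to request would work the same way.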

The DataView API has really good browser support. The only issues you’ll have, not surprisingly, are with IE 8 and 9. For more info on support of the DataView API go here: http://caniuse.com/#search=dataview
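If two bytes are not precise enough, one option is to compare against longer signatures. The sketch below is not from the sample app: the function name and the table layout are illustrative, and the byte sequences are the commonly documented ones for each format, so check them against a signature reference before relying on them.

```javascript
// Illustrative signature table; each entry lists the leading bytes of a format.
var SIGNATURES = [
    { type: "image/png",  bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
    { type: "image/gif",  bytes: [0x47, 0x49, 0x46, 0x38] },  // "GIF8"
    { type: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
    { type: "image/bmp",  bytes: [0x42, 0x4d] }               // "BM"
];

// Returns the MIME type whose full signature matches the start of the
// buffer, or null when nothing matches.
function getImageTypeStrict(arrayBuffer) {
    var view = new Uint8Array(arrayBuffer);
    for (var i = 0; i < SIGNATURES.length; i++) {
        var sig = SIGNATURES[i].bytes;
        // A buffer shorter than the signature cannot match it.
        if (view.length < sig.length) continue;
        var matches = true;
        for (var j = 0; j < sig.length; j++) {
            if (view[j] !== sig[j]) {
                matches = false;
                break;
            }
        }
        if (matches) return SIGNATURES[i].type;
    }
    return null;
}
```

The trade-off is that a truncated file that still starts with the right two bytes is no longer reported as an image, which may or may not be what you want.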

Posted in JavaScript | Comments Off

This is my 2014 wish list for where operating systems (OS) should be headed with laptops, tablets, smartphones and smart devices. Now before you lambast me or fill my ears with technical mumbo-jumbo about why some of these ideas aren’t possible, just take a slow, deep breath. I offer these concepts up as a challenge to take things to the next level, and not as fodder for a debate contest of what’s possible and what’s not. I hope these ideas are viewed as worthy goals rather than existing only in our imaginations thru science fiction.

I suggest it’s time we rethink operating system kernel theory and discard some of our historical notions of how operating systems are supposed to work. I’m continually amazed that even the newest operating systems, such as Android, have fundamental problems similar to what we’ve had since the earliest versions of Windows! So here’s my list…

No more OS lockups – It’s 2014 and computers still experience software-related operating system crashes. In the last year, I’ve personally had brand new Windows machines, Macs and smartphones lock up in one way or another. No, it’s not just bad luck. I put the onus, and ultimately the responsibility, back on the OS vendors. A 21st century OS should be hyper-intelligent about memory allocation and reclamation. The OS should be able to gracefully self-recover from everything short of a fatal hardware failure.

No more app crashes – I’m sure the OS developers will blame this on the application developers and vice-versa.  My take on this: app crashes should never happen. There are many well-known bad patterns that operating systems can monitor for and avoid. The OS should be able to detect bad application code and handle it without coming to a screeching halt. Examples that I’m thinking about include:

  • being aggressive about detecting and providing programmatic feedback on memory leak conditions,
  • automatically isolating run-away code blocks so they don’t lock up an entire application,
  • giving applications feedback on whether or not they are on a trajectory to run out of memory rather than simply killing them off,
  • providing not just guidelines but also build-time test tools that analyze applications and give pointed feedback on best practices,
  • being more assertive about failing builds that don’t meet a minimum best-practice standard set by the OS manufacturer, even if some may consider it draconian.

Dynamic updates – We should be able to update the OS and apps while they are running. I really don’t like having to reboot any device that gets updated, and in the case of Windows this can lead to multiple reboots, which is a major pain. This includes phones and computers, as well as TiVos, Hoppers and more. I’d like to see OSs model themselves after web pages that can replace specific content on-the-fly without having to refresh the entire page.

Instant boot – The OS should allow smart, lazy loading of modules and applications as needed. Do we really need to wait for everything under the sun to load up front while we wait…and wait? My iPad takes some time to boot, my Android Nexus takes even longer, but my MacBook boots within seconds.

So that’s my short list. I hope some OS engineers have a chance to read this and give my suggestions thoughtful consideration.

Posted in Innovation | Comments Off

Deleting an HTML Application Cache

When you are testing web applications that use an Application Cache, also sometimes called the manifest file, you have to delete the cache every time you make a change to the application. If you don’t, none of the changes you make to the application will show up. The very purpose of the Application Cache is to semi-permanently store your HTML, CSS, JavaScript and images. It’s becoming increasingly popular for speeding up web app performance, and it’s a requirement for taking web apps offline. In fact, Google now uses an application cache for their home page.

Simply trying to delete your browser cache in the normal way won’t necessarily clear the Application Cache and its associated files. So here’s a quick rundown that will hopefully save you some time.

Chrome – browse to chrome://appcache-internals/.  There may be a number of different caches listed. Select ‘Remove’ for any cache that you want to go bye-bye.

Chrome (Mobile Android) – go to Settings > Privacy (under Advanced) > CLEAR BROWSING DATA, check the ‘Clear the cache’ option and then select the ‘Clear’ button.

IE 10 – go to Tools > Internet Options > Settings > Caches and databases tab. Select the cache that you want to delete and then click the ‘Delete’ button.

Safari (Mobile) – For Safari iPhone and iPad go to Settings and select “Clear Cookies and Data.”

Safari (Desktop) – Simply attempting Develop > Empty Caches may not work. On a Mac you may have to close your browser, manually delete the .db file by going to //library/Caches/com.Apple.Safari and moving any item ending in .db to the trash, and then restart the browser. If this doesn’t work then try restarting your machine. Yep, it’s an awful workflow and it’s been a known bug in Safari dating back to at least version 6.

Firefox (Desktop) – go to Tools > Options > Advanced > Network > Offline data > Clear Now.
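During testing, an alternative to deleting the cache by hand is to change the manifest file itself, for example by editing a version comment in it. Browsers check the manifest for changes and download the cached files again when it differs. The sketch below shows one way a page could pick up such an update. The helper name and its parameters are hypothetical: it assumes you pass in window.applicationCache and a function that reloads the page.

```javascript
// Hypothetical helper: `appCache` is expected to be window.applicationCache
// and `reload` something like function () { window.location.reload(); }.
function reloadOnCacheUpdate(appCache, reload) {
    function swapAndReload() {
        // UPDATEREADY means a newer version of the cache has been downloaded.
        if (appCache.status === appCache.UPDATEREADY) {
            appCache.swapCache();
            reload();
        }
    }
    appCache.addEventListener("updateready", swapAndReload, false);
    // Cover the case where the update finished before the listener was attached.
    swapAndReload();
}

// Possible usage in a page served with a manifest:
// reloadOnCacheUpdate(window.applicationCache, function () {
//     window.location.reload();
// });
```

This only helps once the manifest has actually changed. If the browser is stuck on a stale cache for other reasons, the manual steps above still apply.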

Want to learn more about Application Caches? Here’s a good technical overview from WHATWG describing what an application cache is. And MDN has a good article on using the application cache.

Posted in Browsers | Comments Off

Yay, I’ll be at OSCON again this year! My presentation is on July 23, 2014 at 5pm in Portland Room 252. For those of you who aren’t familiar with OSCON, it’s one of the largest [if not ‘the’ largest] Open Source conventions in the U.S. Just take a look at the program schedule and you’ll see topics covering just about every open source project or initiative in existence.

I’ve learned a ton every time I’ve attended OSCON and I’m always happy to give back to the community in the form of presenting on lessons learned over the previous year.  In the past I’ve talked about HTML5 Geolocation and Android GPS. This time I’m presenting on best practices for IndexedDB.

If you’ve ever wanted to store large amounts of data in the browser then you’ve most likely read about IndexedDB. It’s a transactional database whereby you retrieve items via a key.  It’s an especially useful tool for taking data offline. While I will spend some time discussing what it is, I’ll spend most of my time on how to best use it. I’ll also examine the fastest way to retrieve data from the database, and look at considerations for pre- and post-processing which is something that is rarely discussed but can dramatically affect application performance.

I hope to see you there!

Posted in Conferences | Comments Off

Using async tokens with JavaScript FileReader

The JavaScript FileReader is a very powerful, efficient and asynchronous way to read the binary content of files or Blobs. Because it’s asynchronous, if you are doing high-volume, in-memory processing there is no guarantee as to the order in which reading events are completed. This can be a challenge if you have a requirement to associate some additional unique information with each file or Blob and persist it all the way thru to the end of the process. The good news is there is an easy way to do this using async tokens.

Using an asynchronous token means you can assign a unique Object, Number or String to each FileReader operation. Once you do that, the order in which the results are returned no longer matters. When each read operation completes you can simply retrieve the token uniquely associated with the original file or Blob. There really isn’t any magic. Here is a snippet of the coding pattern. You can test out a complete example on GitHub.


function parse(blob,token,callback){

    // Always create a new instance of FileReader every time.
    var reader = new FileReader();

    // Attach the token as a property to the FileReader Object.
    reader.token = token;

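    reader.onerror = function (event) {
        console.error(new Error(event.target.error.code).stack);
    };

    // onloadend fires after every read, whether it succeeded or failed;
    // after a failed read this.result is null.
    reader.onloadend = function (evt) {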
        if(this.token != undefined){

            // The reader operation is complete.
            // Now we can retrieve the unique token associated
            // with this instance of FileReader.
            callback(this.result,this.token);
        }
    };
    reader.readAsBinaryString(blob);
}

Note, it is a very bad practice to simply associate the FileReader result object with the token being passed into the parse() function’s closure. Because the results from the onloadend events can be returned in any order, each parsed result could end up being assigned the wrong token. This is an easy mistake to make and it can seriously corrupt your data.
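Here is a rough usage sketch of the pattern. The parseAll helper is my own illustration and is not part of the GitHub example. It assumes it is handed a function with the same signature as parse() above, uses each blob's position in the input array as its token, and relies only on the token handed back to the callback to file each result.

```javascript
// Illustrative helper: reads every blob with the supplied parse function and
// calls onComplete with the results in the same order as the input array.
function parseAll(blobs, parse, onComplete) {
    var results = new Array(blobs.length);
    var remaining = blobs.length;

    if (remaining === 0) {
        onComplete(results);
        return;
    }

    blobs.forEach(function (blob, index) {
        // The token is the blob's index in the input array.
        parse(blob, index, function (result, token) {
            // Store by the returned token, so completion order doesn't matter.
            results[token] = result;
            remaining--;
            if (remaining === 0) {
                onComplete(results);
            }
        });
    });
}

// Possible usage with the parse() function from this post:
// parseAll([blobA, blobB, blobC], parse, function (results) {
//     console.log(results.length + " blobs read");
// });
```

Because the index travels with each read, the results array comes back in input order even when the reads finish out of order.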

Posted in JavaScript | Comments Off

Fastest way to find an item in a JavaScript Array

There are many different ways to find an item in a JavaScript array. With a little bit of testing and tinkering, I found some methodologies were faster than others by close to 200%!

I’ve been doing some performance tweaking on a very CPU-intensive JavaScript application and I needed really fast in-memory searching on a temporary array before writing that data to IndexedDB. So I did some testing to decide on an approach with the best search times. My objective was to coax out every last micro-ounce of performance. The tests were completed using a pure JavaScript methodology, and no third-party libraries were used, so that I could see exactly what was going on in the code.

I looked at five ways to parse what I’ll call a static Array. This is an array that, once it is written, you aren’t going to add anything new to; you simply access its data as needed and when you are done you delete it.

  1. Seek. Create an index Array based exactly on the primary Array. It only contains names or unique ids in the same exact order as the primary. Then search for indexArray.indexOf("some unique id") and apply that integer against the primary Array, for example primaryArray[17], to get your result. If this doesn’t make sense take a look at the code in my JSFiddle.
  2. Loop. Loop thru every element until I find the matching item, then break out of the loop. This pattern should be the most familiar to everyone.
  3. Filter. Use Array.prototype.filter.
  4. Some. Use Array.prototype.some.
  5. Object. Create an Object and access its key/value pairs directly using an Object pattern such as parsedImage.image1 or parsedImage["image1"]. It’s not an Array, per se, but it works with the static access pattern that I need.
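The five approaches can be sketched side by side as follows. This is not the JSFiddle code: the sample data and the function names are invented for illustration, and each function returns the matching item or null.

```javascript
// Sample data, invented for illustration: each item has a unique id.
var primaryArray = [
    { id: "image1", data: "AAAA" },
    { id: "image2", data: "BBBB" },
    { id: "image3", data: "CCCC" }
];

// 1. Seek: a parallel array of ids in the same order as primaryArray.
var indexArray = primaryArray.map(function (item) { return item.id; });
function findBySeek(id) {
    var i = indexArray.indexOf(id);
    return i === -1 ? null : primaryArray[i];
}

// 2. Loop: stop as soon as the item is found.
function findByLoop(id) {
    for (var i = 0; i < primaryArray.length; i++) {
        if (primaryArray[i].id === id) return primaryArray[i];
    }
    return null;
}

// 3. Filter: always visits every element, then takes the first match.
function findByFilter(id) {
    var matches = primaryArray.filter(function (item) { return item.id === id; });
    return matches.length > 0 ? matches[0] : null;
}

// 4. Some: stops iterating once the callback returns true.
function findBySome(id) {
    var found = null;
    primaryArray.some(function (item) {
        if (item.id === id) {
            found = item;
            return true;
        }
        return false;
    });
    return found;
}

// 5. Object: key/value pairs keyed by the unique id.
var parsedImage = {};
primaryArray.forEach(function (item) { parsedImage[item.id] = item; });
function findByObject(id) {
    return Object.prototype.hasOwnProperty.call(parsedImage, id) ? parsedImage[id] : null;
}
```

Note that the Seek and Object approaches both pay an up-front cost to build their lookup structure, which is why they suit an array that is written once and then only read.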

I used the Performance Interface to get high precision, sub-millisecond numbers needed for this test. Note, this Interface only works on Chrome 20 and 24+, Firefox 15+ and IE 10. It won’t run on Safari or Chrome on iOS. I bolted in a shim so you can also run these tests on your iPad or iPhone.

My JSFiddle app creates an Array containing many base64 images and then loops thru hundreds of tests against it using the five approaches. It performs a random seek on the Array or Object during each iteration. This offers a better reflection of how the array parse algorithm would work under production conditions. After the loops are finished, it spits out an average completion time for each approach.
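A stripped-down version of that kind of test loop might look like the sketch below. It is not the JSFiddle code: the function names and the sample data are made up, and the Date.now() fallback is a simple stand-in rather than the shim mentioned above.

```javascript
// Use the Performance Interface when available, otherwise fall back to
// the coarser Date.now(). This fallback is only a stand-in for a real shim.
var now = (typeof performance !== "undefined" && typeof performance.now === "function")
    ? function () { return performance.now(); }
    : function () { return Date.now(); };

// Calls search(key) once per iteration with a randomly chosen key and
// returns the average elapsed time per call in milliseconds.
function averageSeekTime(search, keys, iterations) {
    var total = 0;
    for (var i = 0; i < iterations; i++) {
        var key = keys[Math.floor(Math.random() * keys.length)];
        var start = now();
        search(key);
        total += now() - start;
    }
    return total / iterations;
}

// Example with invented data: 300 keys and 300 random lookups on an Object.
var keys = [];
var lookup = {};
for (var k = 0; k < 300; k++) {
    keys.push("image" + k);
    lookup["image" + k] = k;
}
var objectAverage = averageSeekTime(function (key) { return lookup[key]; }, keys, 300);
console.log("OBJECT Average " + objectAverage);
```

Timing each call separately keeps the random key selection out of the measurement, at the price of calling the timer twice per iteration.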

The results are very interesting in terms of which approach is more efficient. Now, I understand in a typical application you might only loop an Array a few times. In those cases a tenth or even hundredth of a millisecond may not really matter. However if you are doing hundreds or even thousands of manipulations repetitively, then having the most efficient algorithm will start to pay off for your app performance.

Here are some of the test results based on 300 random array seeks against a decent-size array that contained 300 elements. It’s actually the same base64 image copied into all 300 elements. You can tweak the JSFiddle and experiment with different size arrays and number of test loops. I saw similar performance between Firefox 29 and Chrome 34 on my MacBook Pro as well as on Windows. Approach #1 (SEEK) seems to be consistently the fastest of the Array-based approaches, and OBJECT is by far the fastest of any of the approaches:

OBJECT Average 0.0005933333522989415* (Fastest.~191% less time than LOOP)
SEEK Average 0.0012766665895469487 (181% less time than LOOP)
SOME Average 0.010226666696932321
FILTER Average 0.019943333354603965
LOOP Average 0.02598666658741422 (Slowest)

————–

OBJECT Average 0.0006066666883028423* (Fastest.~191% less time than slowest)
SEEK Average 0.0012900000368244945 (181% less time than LOOP)
SOME Average 0.012076666820018242
FILTER Average 0.020773333349303962
LOOP Average 0.026383333122745777 (Slowest)
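The post doesn’t say how the percentages in parentheses were calculated. They appear to be consistent with a symmetric percent difference, meaning the gap between two averages divided by their mean. That reading is my inference rather than something stated above, and the function name below is made up for the sketch.

```javascript
// Symmetric percent difference between two timings, relative to their mean.
function percentDifference(slow, fast) {
    return ((slow - fast) / ((slow + fast) / 2)) * 100;
}

// Averages copied from the first test run above.
var loop = 0.02598666658741422;
var seek = 0.0012766665895469487;
var object = 0.0005933333522989415;

console.log(Math.round(percentDifference(loop, object))); // 191
console.log(Math.round(percentDifference(loop, seek)));   // 181

// For comparison, the plain ratio between the slowest and fastest averages.
console.log((loop / object).toFixed(1) + "x");
```

Read as a plain ratio instead, the LOOP average in that run is more than 40 times the OBJECT average.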

As for testing on Android, I used my Android Nexus 4 running 4.4.2. It’s interesting to note that the OBJECT approach was still the fastest, however the LOOP approach (Approach #2) was consistently dead last.

On my iPad 3 Retina using both Safari and Chrome, the OBJECT approach was also the fastest, however the FILTER (Approach #3) seemed to come in dead last.

I ran out of time and wasn’t able to test this on IE 10 before publishing this post.

Conclusion

Some folks have blogged that you should never use Arrays for associative search. I think this depends on exactly what you need to do with the array. For example, if you need to do things like slice(), shift() or pop(), then sticking to an Array structure will make your life easier. For my requirements, where I’m using a static Array pattern, it looks like using the Object pattern has a significant performance advantage. If you do need an actual Array, then the SEEK pattern was a close second in terms of speed.

References:

JSFiddle Array Parse tests
Performance Interface

[Updated: May 18, 16:06, fixed incorrect info]

Posted in JavaScript | 2 Comments »