getByName, a Javascript Function that Gave Me Trouble

Today I was working on a site that needs to be re-written with a focus on optimization. The old site runs on wordpress, and the task is to take if off from wordpress and build a custom mini engine to run that site and a few others. Since the site needs to load fast, I decided to write all the javascript by hand as opposed to using some of those frameworks that people seem to love.

This morning, as I set to work on the javascript, I thought it would be a good thing to forgo all loop usage and use recursive functions instead, even if just for the sake of doing it. As you know from my posts, I try to be functional on my programming. I believe functions are far superior to objects, although, objects do come in handy sometimes. I also try to follow the little functional programing principles that I know, but I’m still learning. All that I’m going to show you here can be easily, and painlessly accomplished using loops, but what is the fun in that? Also, as you read my code, keep in mind that I am not a pure functional kind of guy. I mix stuff, sometimes out of convenience, and others out of ignorance. I will leave it to you to decided when it might be one or the other.

One of the things that I need to do in the site is to validate a form. I like to set up an array of the fields that are required, this way if I need to add or take out required fields, I can do it easily. In this case the array contains the name of the fields that are required, so at some point I need to get elements by their name. That is where the getByName function comes in handy. (Yes, I know we have query selectors now, but IE is a must on this project).

A first naive attempt to write the function would be this:


var getByName = (function(){
   var iteration = 0;
   var els;
   return function(nme, ctx){
     var c = ctx || document;
     if(typeof els == 'undefined') els =  c.getElementsByTagName('*');// get them only once for better performance.
     var currEl = els[iteration];

     if(!currEl.name){
       iteration ++;
       getByName(nme, c);
     }

     if(currEl.name == nme){
       return currEl;
     }
    
     iteration ++;

     if(iteration == els.length){
       return false;
     }else{
       getByName(nme, c);
     }
   }
 })();

I don’t know if you can spot the problems with that code. They are easy to miss, and cause a lot of trouble. One hint is that the function will most likely not return the element that you are expecting. It took me a long time to figure out what was wrong.

The function looks right. It felt right at some point, and it was frustrating to see it fail over and over again. I felt like those breaking points in firebug were laughing at me so hard that I could almost hear them.

The function that gets executed when you call getByName is the returned function, and this is why I though happened when the function was executed:

1) Determine the context.
2) Fill the els variable with the live object returned by getElementsByTagName, but only if els is undefined. (This function is expensive here, and overkill, read on to see how I solved that problem).
3) Get the element that corresponds to this iteration of the function.
4) If the element has no name attribute, increase the iteration index and call the function again. Start from 1.
5) If the element has a name attribute, check if is the same as the name we are looking for. If so, return the current element.
6) If the name attribute is not the one we are looking for, increase the iteration index.
7) If the index is equal to the amount of element we have, give up and return false. Otherwise, call the function again and start form 1

There is a big hole there. This function will not work unless the very first element in the list of elements is the one we are looking for. In that case this is what happens:

1) current element (first element) has a name attribute. Skip to the next if statement.
2) current element (first element) has a name that is equal to the one we are looking for. Return current element.

There is no fail there.

But if the first element is not the one we are looking for, the function fails. This is what happens:
(Imagine the element we are looking for is the second in the available elements)

IF element has not name attribute:
1) The element (first element) has no name, increase iteration (value: 1) and call the function again.
2) Function is called again. This is the element (second element) we are looking for. Return it.
3) the function call inside the if statement returns the element we are looking for (second element) but it is never caught, and the first function call continues execution.
4) current element (first element) has no name, so currEl.name == nme is false.
5) Increase the iteration index (value: 2).
6) If iteration is not equal to amount of elements available, call the function again. From now on, all calls will fail.
7) Once all elements have been exhausted, continue execution of first call
8) Noting else to do, return nothing.

If element has name attribute, but not the one we are looking for:
1) Element has name attribute. Skip to the next if statement
2) element’s name is not the one we are looking for.
4) increase iteration index (value: 1)
5) If iteration is not equal to available elements, call the function again, else return false.
6) Function is called again. Element is the one we are looking for. Return it.
7) Element is not caught.
8) First function call continues. Nothing else to do, return nothing.

We were able to find the element we were looking for, but it was never caught. A lot more functions to the call are made, and until the last one returns, then the first one continues. But there is nothing else to do, and nothing to return. The function dies, and the element is never returned.

I have oversimplified what happens. In reality, functions can be called on the second if statement, or on the else statement, and every time a function is called, the very first execution is paused, and continues only when the other call returns. This can create a big mess. You can end up with a first call waiting for 2, 3 or maybe more execution to return before continuing. And what is worse, even if any of this subsequent function calls find the element we are looking for, it will never be returned because the one function that returns a value to be caught is the very first execution, and this one is dealing with the first element in the list, and that one is not the one we are looking for. If you are lost, please tell me about it on the comments. IF I see too many people are getting lost, I will make a video explaining this in more detail.

Imagine you have this:
var el = getByName('email', formElement);//formElement is a reference to form element dom node.

el will receive the value returned by the first execution of getByName, and not by the others that are called recursively.

How do we fix this?

A Naive attempt to solve it is to place a return after calling the function again:


var getByName = (function(){
   var iteration = 0;
   var els;
   return function(nme, ctx){
     var c = ctx || document;
     if(typeof els == 'undefined') els =  c.getElementsByTagName('*');// get them only once for better performance.
     var currEl = els[iteration];

     if(!currEl.name){
       iteration ++;
       getByName(nme, c);
       return; //This the line added.
     }

     if(currEl.name == nme){
       return currEl;
     }
    
     iteration ++;

     if(iteration == els.length){
       return false;
     }else{
       getByName(nme, c);
     }
   }
 })();

This is how I initially tried to solve this problem. This fails. I won’t go over the details of how it fails, but it is pretty much a simple variation of why the first one fails.

At this point I still think that what happens is what I explain on my first list of steps:

1) Determine the context.
2) Fill the els variable with the live object returned by getElementsByTagName, but only if it was undefined. (This function is expensive here, and overkill, read on to see how I solved that problem).
3) Get the element that corresponds to this iteration of the function.
4) If the element has no name attribute, increase the iteration index and call the function again. Start from 1.
5) If the element has a name attribute, check if is the same as the name we are looking for. If so, return the current element.
6) If the name attribute is not the one we are looking for, increase the iteration index.
7) If the index is equal to the amount of element we have, give up and return false. Otherwise, call the function again and start form 1

All I get is an error that the field is not defined when I try to use it, so naturally, I think the function is not finding the field, but I want to make sure:


var getByName = (function(){
   var iteration = 0;
   var els;
   return function(nme, ctx){
     var c = ctx || document;
     if(typeof els == 'undefined') els =  c.getElementsByTagName('*');// get them only once for better performance.
     var currEl = els[iteration];

     if(!currEl.name){
       iteration ++;
       getByName(nme, c);
       return;
     }

     if(currEl.name == nme){
       alert(currEl); //This is the line added.
       return currEl;
     }
    
     iteration ++;

     if(iteration == els.length){
       return false;
     }else{
       getByName(nme, c);
     }
   }
 })();

This makes everything more confusing to me because I see an alert. Remember that at this point I haven’t figured out what is happening. In other words, I don’t know what I explained on my other two lists of steps. So I’m thinking that the element is returned, because I see an alert, which means that the content of that if statement is being executed, so the return is happening.

The return is indeed happening, but it happens at a point where it does not get caught. In our analysis we are thinking that the element we are looking for is the second in the list, so when it gets returned, the initial function call continues. The value returned is never caught. So this is what happens:

var el = getByName(); <- does not find element. Eventually return nothing.
            getByName(); <-this one finds the element, and returns it.

Notice how the second function call returns the element to the first one. The first one does not catch and return the returned value, but instead continues executing, and returns nothing because it finds nothing. That “diagram” is very simple, but it helps visualize what is happening. In reality, it is likely there will be more nested calls. The first call will just ignored their returned value, and at the end return nothing.

Once I figured that out, it was easy to solve the problem. All you need to do is store the value somewhere that will not get lost, then later in the function check if that value has been set, if so, return it. This way, the first call will have access to that value:


var getByName = (function(){
  var iteration = 0;
  var els;
  var el;
  return function(nme, ctx){
    var c = ctx || document;
    if(typeof els == 'undefined') els =  c.getElementsByTagName('*');// get them only once for better performance.
    var currEl = els[iteration];

    if(!currEl.name){
      iteration ++;
      getByName(nme, c);
    }

    if(currEl.name == nme){
      el = currEl;
      iteration = 0;
    }
    if(typeof el !== "undefined" && el.name !== nme){
      el = undefined;
    }
    if(typeof el !== "undefined"){
      return el;
    }

    iteration ++;

    if(iteration == els.length){
      return false;
    }else{
      getByName(nme, c);
    }
    return el;
  }
})();

I said I would tell you how to solve the getElementsByTagName problem. It is quite simple, if no context is provided (ctx) then there is no other option. IF context is provided, and it is a form, use form.elements instead. Otherwise no option.

I will not go into much detail about the function. That function seems to work, but I don’t like how it has a few points of potential failure, especially because of all the calls that happen. I think you could even end up with a calls tree that would be hard to follow. But yesterday, when I started writing this post, I thought it was good enough. I was ready to leave it like that, but the more I though about it, the more it bothered me. There must be a better way to do it. Today I wrote that better function.

A better getByName:

Today I decided to write a better function. First, lets get rid of those persistent values.

The closure creates a few persistent values. That is what we wanted, because we could access the same values from different function calls, but it creates problem because the values would be kept even after the value was returned. That is why we had to reset the iteration index. This could potentially create problem if the iteration index is not reset properly for whatever reason. Whit that mess of a code that I had, it could easily happen.

In order to get rid of the persistent values, we need to get rid of the self-executing functions, which actually makes the code easier to read. In fact, we get rid of the whole function, and start from scratch.

This is the new function:


function getByName2(name, context, i){
  i = i || 0;
  var c = ( (typeof context === 'string') && document.getElementById(context) ) || context;

  if(!c) return false;

  if(c.elements[i].name === name){
    return c.elements[i];
  }else{
    i++;
    return getByName2(name, c, i);
  }
}

Notice that it looks noting at all like the previous one. That is because it was written entirely from scratch.

There are a few trade-offs with this function.
1) context is required, either as the id for the form element where the element is, or as a DOM element object reference.
2) since there are no persistent values, the collection of elements is searched on every iteration. This could present a little performance penalty, but the function is still potentially faster than the previous one.

There is some room for improvement. For example, we could make context optional, or find a way to cache the list of elements so they are not searched for on every iteration.

What do you think of this new function? Definitely much better than my first naive attempts. Do you see a point of failure that I’ve missed?