using Declarations in C++

Now that we have a complete understanding of using directives in C++, let’s take a look at using declarations. Using declarations are easier to understand, especially now that we’ve gained an understanding of other concepts such as scope and namespaces.

using declarations serve a similar purpose to using directives: we want to use something that is within a namespace without having to write the fully qualified name every time we want to use it. A using declaration looks like this:

using nsName::entityName;

As you can see, all you do is refer to an entity using its fully qualified name. After this, you can use that entity by calling it by its simple name.
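Here is a minimal sketch of that syntax in use. The namespace greetings and the functions hello and greet are hypothetical names made up for this illustration.

```cpp
#include <string>

namespace greetings {              // hypothetical namespace, for illustration only
  std::string hello() { return "hello"; }
}

using greetings::hello;            // using declaration: brings in this one name

std::string greet() {
  return hello();                  // same entity as greetings::hello()
}
```

After the using declaration, hello() and greetings::hello() name the same function, so either spelling can be used.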

How using Declarations Work

As with using directives, we need to understand what happens when we use a using declaration. We already know that a using directive works by placing everything within the referred namespace into the closest namespace shared by the referred namespace and the current scope. using declarations, however, work by placing the referred entity into the current scope. This distinction is important because knowing it will help us determine where to place our using declarations.
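A small sketch can make the "current scope" part concrete. The function names from_declaration and from_global are hypothetical, and the values are picked only so that the two variables are easy to tell apart.

```cpp
int number = 0;                    // ::number

namespace myNs {
  int number = 1;                  // myNs::number
}

int from_declaration() {
  using myNs::number;              // declares number in this function's own scope
  return number;                   // the function scope is searched first: myNs::number
}

int from_global() {
  return number;                   // no using declaration here, so this is ::number
}
```

Because the using declaration lands in the function's scope, name lookup finds myNs::number there and never reaches the global variable, so there is no ambiguity.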

A Rule of Thumb

A good rule of thumb when working in C++ is to declare variables close to where they will be used. For example, if a variable will be used in a function, then you should declare it in that function. If it will be used in an if statement, then declare it in that if statement. We can apply the same rule to using declarations because they place the referred entity into the current scope.
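As a sketch of that rule applied to using declarations, the example below keeps a using declaration inside the only function that needs it. The functions largest and smallest are hypothetical helpers written for this illustration.

```cpp
#include <algorithm>

int largest(int a, int b, int c) {
  using std::max;                  // visible only inside this function
  return max(a, max(b, c));
}

int smallest(int a, int b) {
  // The using declaration above does not reach this function,
  // so the name is written in its fully qualified form.
  return std::min(a, b);
}
```

Keeping the declaration local means no other function in the file is affected by it.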

This time I won’t go into much detail, because I think that understanding how using declarations work is very easy. However, I do want to show an example of when and why using declarations could be problematic. Consider the following code:


namespace myNs {
  int number = 0;
}

int main() {
  int number = 1;
  using myNs::number;
  return 0;
}

If you tried to compile this code, you would get an error. A using declaration is itself a declaration: it tries to introduce the name number into the scope of the main function, but that scope already contains a variable declared with the same name.
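One way around the conflict, sketched below, is to put the using declaration in a nested block, so that it lives in a different scope than the local variable. The function name pick_numbers is hypothetical, and myNs::number is set to 10 here (instead of 0) only to make the result easy to read.

```cpp
namespace myNs {
  int number = 10;                 // value changed from the example above for readability
}

int pick_numbers() {
  int number = 1;                  // local to the function body
  int sum = number;                // uses the local variable
  {
    using myNs::number;            // new block, new scope: hides the local number
    sum += number;                 // uses myNs::number
  }
  return sum;
}
```

Writing myNs::number with its fully qualified name would work just as well, and is usually the simpler choice when only one use is needed.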

I want to end this short entry with a few links where you can read some more about real-world instances where using directives and declarations come into play:

http://stackoverflow.com/questions/16152750/using-directive-vs-using-declaration-swap-in-c
http://stackoverflow.com/questions/3039615/exceptional-cbug
http://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice


using Directives by Example

A deep understanding of the features of a programming language is key to becoming a good programmer. With that philosophy in mind, let’s take a closer look at using directives in C++.

Last time we said that:

A using directive works by placing everything belonging to the referred namespace inside the closest namespace shared by the referred namespace and the current scope.

The Referred Namespaces and the Current Scope

In order to better understand what that sentence means, let’s take a look at what the current scope and the referred namespace are. Consider the following code:


int main() {
  using namespace std;
}

This code doesn’t do anything, but it will serve well for our purpose. In the using directive, the referred namespace is std. This is the namespace that we want to use. Note that you can use more than one using directive to refer to different namespaces. The using directive appears inside the body of the main function, so the scope of the main function is the current scope. In other words, the current scope is the scope of the block where the using directive is.

Now, you should be able to understand a little bit better what I meant. However, I’m going to go the extra mile here and show some examples that both demonstrate that the statement is true and clarify it even further.

The Examples

Consider the following code:


#include <iostream>
int number = 0;

namespace outer {
  int number = 1;
  namespace inner_one {
    int number = 2;
  }
  namespace inner_two {
    int number = 3;
  }
}

int main() {
  return 0;
}

Here we have a namespace outer which contains a variable declaration, and two more namespaces. Each of these nested namespaces contains a variable declaration as well. There is also a variable defined in the global scope, outside of any namespace. All variables share the same name, but since they are in different scopes, and more importantly, in different namespaces, they are all different entities. This example doesn’t do anything, and if you compile this program and run it, it won’t do anything. This is just our basic code. We will use this for all of our examples.

Let’s first use a using directive that applies to the whole document:


#include <iostream>
int number = 0;

namespace outer {
  int number = 1;
  namespace inner_one {
    int number = 2;
  }
  namespace inner_two {
    int number = 3;
  }
}
using namespace outer;

int main() {
  std::cout << number;
  return 0;
}

Notice we’ve also added a cout statement. We are using the qualified name of cout so that we don’t have to use a using directive. I’m doing this so that the examples are not confusing.

If you try to compile that code you will get an error. Depending on the compiler you are using, the error may be different. In the g++ compiler, the error says “reference to ‘number’ is ambiguous”, and then it gives the line numbers where the two candidates are: 2 and 5. It also gives the fully qualified names of the two candidates: int number, and int outer::number.

What does it mean anyway? The message is basically telling us that there are two different number variables, and that there is no way to know which one you mean. At first sight you would think that this happens because we have the using directive in the global scope, so when the members of outer are “imported” to be used without specifying their fully qualified name, they come to the global scope, and that is where the conflict happens. Following this logic, we would think that placing the using directive in the main function would solve the problem, since the members of the referred namespace would be placed in the scope of the main function, thus preventing the variables from conflicting with each other:


#include <iostream>
int number = 0;

namespace outer {
  int number = 1;
  namespace inner_one {
    int number = 2;
  }
  namespace inner_two {
    int number = 3;
  }
}

int main() {
  using namespace outer;
  std::cout << number;
  return 0;
}

If you do this, you may be surprised when you try to compile the program to see the same error you were seeing before. It made no difference. This means, then, that the members of the referred namespace are not being placed in the current scope, but if they don’t go into the current scope, where do they go? As I said before, they go into the closest namespace shared by the referred namespace and the current scope.

The Global Namespace

Let’s take a break from the examples, and talk about the global namespace. Everything in C++ is in a namespace, even if you don’t explicitly put it there. If you don’t specify a namespace, things go into the global namespace, which is referred to as ::. This means that the fully qualified name of the variable defined in the global scope (int number = 0) is ::number.

In our previous example, the closest shared namespace between the referred namespace and the current scope is the global namespace, so all the members of the referred namespace go into the global namespace, but in the global namespace there is already a variable named number, hence the ambiguity.

But how do we know that everything in fact goes into the closest shared namespace rather than the global namespace all the time? We can prove that quite easily:
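The ambiguity only affects the unqualified name. As a sketch, the fully qualified names still select each variable without any conflict, even with the using directive in place. The functions global_number and outer_number are hypothetical names used for this illustration.

```cpp
int number = 0;                    // ::number

namespace outer {
  int number = 1;                  // outer::number
}

using namespace outer;             // plain "number" is now ambiguous at this level

int global_number() {
  return ::number;                 // qualified: the variable in the global namespace
}

int outer_number() {
  return outer::number;            // qualified: the variable in outer
}
```

Writing just number in either function would bring back the "reference to 'number' is ambiguous" error.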


#include <iostream>
int number = 0;

namespace outer {
  int number = 1;
  namespace inner_one {
    int number = 2;
  }
  namespace inner_two {
    int func() {
      using namespace inner_one;
      return number;
    }
  }
}

int main() {
  std::cout << outer::inner_two::func();
  return 0;
}

Here we have created a function in inner_two. We also removed the variable declaration, otherwise name lookup would find that variable first, and use it without giving us a chance to get to the ambiguous definitions.

If you try to compile this program you will see the same ambiguity problem, but this time the candidates are different: outer::number and outer::inner_one::number. This effectively demonstrates that the members of the referred namespace were placed in outer, which is the closest shared namespace, rather than in the global namespace.
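To see the other half of that explanation, here is a sketch of the same code with the variable in inner_two put back. Under the rules described above, name lookup should find inner_two::number before it ever reaches outer, so the program compiles.

```cpp
namespace outer {
  int number = 1;
  namespace inner_one {
    int number = 2;
  }
  namespace inner_two {
    int number = 3;                // the declaration we had removed
    int func() {
      using namespace inner_one;   // inner_one's members become visible at the level of outer
      return number;               // lookup stops at inner_two::number
    }
  }
}
```

Calling outer::inner_two::func() returns 3, because lookup ends at the first scope where the name is found and never gets to the two ambiguous candidates in outer.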

Getting Real

Why do conceptual things like this matter? Well, it all comes down to understanding the language. The better you understand a language, the better prepared you are to write programs that are less likely to do unexpected things. There are real-life problems that derive from a poor understanding of the features of a language, and these are some of the worst kinds of problems. If you don’t understand how something works, you may believe that it works in a different way, and that makes bugs really hard to fix because you cannot see why something is not working.

This is not just trivia knowledge. In fact, chances are no one will ever ask you to explain what a using directive does or how it works. This is knowledge that will prevent you from making mistakes. Knowing where to put something, like a using directive or a variable declaration within your code is essential in order to create good programs.

Next time, we will take a look at using declarations, how they work, and why you may want to use them instead of using directives in some instances.

Using Directives in C++

Last time I talked about namespaces in C++ and I touched a little on the subject of using directives and using declarations. Since I think it is very important to understand the difference between the two, I’ve decided to talk about them. Before you continue, however, you need to have really understood the concepts that we talked about before, especially those of namespaces and scope, as well as how name resolution works.

Scope — A Second Look

Let’s give scope a second look. We’ve already said that the scope of a variable is everywhere that variable is available, or visible. This time, we will talk about variables, but you should assume that this applies to all kinds of entities that have an identifier, such as functions, constants, and classes.

C++ has block scope. This means that a new scope is defined for every block of code. A block of code is usually delimited by curly braces. Following this logic, we can see that if you create an if statement, everything in the if branch is in its own scope, and if you have an else branch, everything in it is in its own scope as well. If you define a function, that function creates its own scope, and everything inside that function is in the scope of that function.

It is really important to understand this, because scope will usually determine where the best place to define a variable is. For example, if you want a variable to be accessible only from within a function, then the best place to define that variable is inside the function. Scope will prevent the variable from being accessed from outside the function.

Another example would be if you want to create a variable and work with it only if some condition is met. In this case, the best place to define that variable is inside the if statement. However, if you want to have a variable and change its value depending on a certain condition being met, then you would have to define the variable outside of the if statement, and only change its value if the condition is met. This way you can access the variable from outside the if statement once its value has been changed.
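Both cases can be sketched in one small function. The function describe, its parameter, and the strings are hypothetical and only serve the illustration.

```cpp
#include <string>

std::string describe(int temperature) {
  std::string label = "mild";      // needed after the if statement, so defined outside it
  if (temperature > 30) {
    std::string suffix = "!";      // only needed when the condition is met
    label = "hot" + suffix;        // changing the outer variable from inside the block
  }
  // suffix is out of scope here, but label is still available.
  return label;
}
```

Here describe(35) gives "hot!" and describe(20) gives "mild".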

The Scope Chain

We didn’t talk about the scope chain last time, and it is time we talk about it. As blocks get nested, scopes get nested as well, and this forms a scope chain. Think for example of a function that has an if statement inside of it:


int main() {
  if (1 == 1) {
    //do something here.
  }
}

In this code we have two different levels of scope. One is the scope defined by the main function, and the other one is defined by the block of the if statement. The scope of the if statement is inside of that of the main function. We can say that the scope of the if statement is below the scope of the function in the scope chain.

In most programming languages you can move up the scope chain, but not downwards. This means that the code in the if statement has access to anything defined before the if statement in the main function. However, nothing in the main function can access anything defined inside the if statement unless it is also in the if statement. Let’s take a look at some examples:


int main() {
  int number = 1;
  if (1 == 1) {
    int if_number = 2;
  }
  cout << number << endl;
}

Note that I’ve omitted some boilerplate code here, like the inclusion of the iostream header, as well as the return statement of the main function.

In this example, cout can access the variable number because it was defined in the same scope, that of the main function. Now take a look at the next example:


int number = 1;
int main() {
  if (1 == 1) {
    int if_number = 2;
  }
  cout << number << endl;
}

In this example, number was defined outside of the scope of the main function, but cout can still navigate up the scope chain and have access to the number variable. Now, consider the following:


int number = 1;
int main() {
  if (1 == 1) {
    int if_number = 2;
  }
  cout << if_number << endl;
}

Here, there is a problem. cout is now trying to access if_number, which was defined in the scope of the if statement. cout cannot navigate down the scope chain, therefore it does not have access to the if_number variable. Now consider this example:


int number = 1;
int main() {
  if (1 == 1) {
    int if_number = 2;
  }
  if (2 == 2) {
    cout << if_number << endl;
  }
}

In this example, it may look as if cout has access to the if_number variable, because both if statements sit inside the main function. However, that is not the case. When name lookup starts in the second if statement, it goes up the scope chain to the scope of the main function. To reach the variable, it would then have to go down into the scope of the first if statement, and that is not possible.
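A sketch of one possible fix: declare the variable in the scope that both if statements share, so that each of them can reach it by going up the scope chain. The function name shared_between_blocks is hypothetical.

```cpp
int shared_between_blocks() {
  int if_number = 0;               // declared in the function's scope
  if (1 == 1) {
    if_number = 2;                 // found by going up one level
  }
  if (2 == 2) {
    return if_number;              // also found by going up one level
  }
  return -1;                       // not reached, since 2 == 2 is always true
}
```

Both blocks now look upwards to find if_number, which is the only direction name lookup can travel.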

The using Directive

Now that we have a better understanding of scope, let’s look at the using directive.

A using directive is a way to specify that we are going to be accessing members contained within a namespace using only their name, rather than their fully qualified names, which would include the namespace as well as the name. Consider this:


namespace myNs {
  int myNumber = 9001; // myNumber is over 9000!
}

The name of the variable in this case is “myNumber”, and its fully qualified name is “myNs::myNumber”.

By using the using directive using namespace myNs we can refer to the variable by its name without specifying the namespace that it belongs to.
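As a minimal sketch, both spellings can be compared side by side. The functions qualified and unqualified are hypothetical names chosen for the illustration.

```cpp
namespace myNs {
  int myNumber = 9001;
}

int qualified() {
  return myNs::myNumber;           // fully qualified name, no using directive needed
}

int unqualified() {
  using namespace myNs;            // using directive
  return myNumber;                 // the simple name is enough now
}
```

Both functions read the same variable, so both return 9001.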

How does this work anyway?

A using directive works by placing everything belonging to the referred namespace inside the closest namespace shared by the referred namespace and the current scope. Good luck understanding what that means…

But if you don’t understand it, don’t worry. I will explain it by example in my next post.

Namespaces in C++

Last time we talked about namespaces in general, and I gave an introduction to what they are, and why we need them. This time, let’s look at how to implement namespaces in C++.

Implementing namespaces in C++ is very simple. All you need to do is use the namespace keyword plus an identifier that will be the name of your namespace:


namespace my_namespace {
  //code here
  int my_var = 1;
}

Inside the curly braces ({ and }) you put your code. Everything you declare in there, such as functions, variables, and classes, will be “inside” the namespace my_namespace. To access entities inside a namespace we use the :: scope operator:


my_namespace::my_var;

You can even add namespaces inside namespaces:


namespace my_namespace {
  //code here
  int my_var = 1;
  namespace internal {
  }
}

That is all there is to creating namespaces.
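Here is a short sketch of how entities in a nested namespace can be reached. The names hidden_var, sum, and read_nested are hypothetical additions that are not part of the snippets above.

```cpp
namespace my_namespace {
  int my_var = 1;
  namespace internal {
    int hidden_var = 2;            // hypothetical variable, for illustration
    int sum() {
      // my_var is found in the enclosing namespace without qualification.
      return my_var + hidden_var;
    }
  }
}

int read_nested() {
  // From outside, each level is spelled out with the :: operator.
  return my_namespace::internal::hidden_var;
}
```

Code inside internal can see the names of the namespace that encloses it, while code outside has to write the full path.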

You should always try to use a unique identifier for your namespace so that it won’t conflict with another namespace.

Using “using”

In C++ there are using declarations and using directives. They both do different things, and you may want to use one or the other depending on what you are trying to accomplish.

In my C++ study group, the professor has recommended placing the following directive in every program:

using namespace std;

If you’ve been reading the tutorials I recommended a couple of posts back, you will see that they also use the same directive. However, the first time I saw that, something told me there was something wrong with that. Sure, if you are building simple 10 line hello world programs you won’t have any problem, but if you plan on actually writing large, complex programs, you may want to reconsider adding that directive at the top of every source file you write.

I won’t go into details on why using that line of code can potentially be a bad idea because, for now, I think it is safe to use it. But keep in mind that when you start writing actual programs, you may want to use the fully qualified names of your identifiers.

I mentioned there are two kinds of using statements: using directives and using declarations. Using directives “import” everything within a namespace into a namespace that is shared by the namespace we are “importing” and the current scope, while using declarations introduce a name directly into the current scope. Let’s look at how both of these statements look:


// using directive
using namespace std;
// using declaration
using std::cout;

A simple way to remember which one is a directive and which one is a declaration is to think of a using declaration as a statement that is declaring a name into the current scope. Or maybe just remember that if it has the namespace keyword it is most likely a using directive.
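The sketch below puts the two forms next to each other inside functions, so that their reach is easy to compare. The function names with_directive and with_declaration are hypothetical.

```cpp
#include <sstream>
#include <string>

std::string with_directive() {
  using namespace std;             // using directive: all of std can be used unqualified
  ostringstream out;
  out << string("dir");
  return out.str();
}

std::string with_declaration() {
  using std::string;               // using declaration: only the name string
  std::ostringstream out;          // everything else still needs std::
  out << string("decl");
  return out.str();
}
```

The directive opens up the whole namespace, while the declaration brings in exactly one name.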

So, what is the actual difference between both of them? The difference is hard to understand if you haven’t fully understood the concept of scope, plus you need to understand another bit of how languages work.

Name Resolution

When you do something like


#include <iostream>
using namespace std;
int two = 2;
int main() {
  int four = two + two;
  cout << four;
  return 0;
}

The following is happening:

When the compiler reaches the body of main, it sees that you are trying to add the value of two to itself and store the result in a variable called four. This starts a lookup process which looks inside the function for a variable called “two”. It does not find it there, and since there is no other enclosing scope, such as a class or a namespace, it goes up to the global scope and finds two in there. At run time, the program reads the value of that variable, adds it to itself, and stores the result in the variable four.

This lookup process happens every time you use an identifier. Lookup will look for the entity identified by the name, first in the current scope and then in each enclosing scope, until it finds it. In other words, it will go up the scope chain until it finds what it is looking for. Once it finds the first occurrence of the name, it uses that entity, and lookup ends. Why does this matter anyway?

The importance of this process becomes obvious when you try to understand the difference between using directives and using declarations, but we will leave that for another time.
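As a small sketch of "the first occurrence wins", the two functions below use the same name but end up with different variables. The function names from_global and from_local, and the value 10, are hypothetical.

```cpp
int two = 2;                       // global variable

int from_global() {
  int four = two + two;            // nothing called two in this function: ::two is used
  return four;
}

int from_local() {
  int two = 10;                    // found first, so lookup never reaches the global
  return two + two;
}
```

from_global() returns 4, while from_local() returns 20, because lookup stops at the local variable.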

Here are some documents you may want to read next:

http://www.devx.com/tips/Tip/12534 – Using Declarations and using Directives
http://www.cplusplus.com/doc/tutorial/namespaces/#namespace – The section on namespaces in one of the tutorials I’m following.
http://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice – A closer look at why “using namespace std;” Might not be a good idea.

About Namespaces

Last Wednesday I had the first meeting with my C++ study group. If you read my last post, you will know what I’m talking about. In this first session, I noticed the professor had a little bit of trouble explaining namespaces mainly because explaining namespaces in 5 minutes to people with little programming background is hard. In this post, I will try to explain namespaces taking more than 5 minutes, and hope that it may be helpful for some people.

I think my first encounter with namespaces was back when I was doing Flash and ActionScript. They also came back when I was studying XML, and also in PHP and Python. JavaScript does not have namespaces, but pseudo-namespaces that help us accomplish the same basic task. Since implementation of namespaces varies from language to language, this post is language agnostic, concentrating on the concept rather than on any particular implementation.

Identifiers

Before we can actually talk about namespaces, we need to look at a different concept: Identifiers. Understanding identifiers is key in understanding other concepts, among which is that of namespaces.

Identifiers are nothing more than names that we give to variables, constants, functions and classes (there are other “things” that we also give names to, but I’m not going to talk about them for the sake of simplicity). When we declare a variable, we normally do it using a name, and then we assign a value. When we create a function, we normally give it a name that we can use to invoke the function later. When we declare a constant, we give it a name so we can refer to it later. When we create a class, we give it a name so we can instantiate it later.

All of these names that we give to “things” are identifiers, and they should be unique to the thing we are naming. There are cases where the name is not unique, like in the case of overloading, but for now, let’s not consider those cases.

Scope

Scope is another concept we need to understand. If you understand scope, you’ve understood half of closures, another important and powerful concept available in some programming languages, but we will talk about that some other time.

I like to think of scope as the reach that an identifier has within a program. Although this is not the proper definition, and it may not be the most accurate, it is one that I’ve found makes it easy to understand the concept. Simply put, the scope of an identifier is anywhere you can use that identifier to find the entity that it refers to, the entity being the value of a variable, a function, a class or the value of a constant.

There are different kinds of scope, and knowing the kind of scope that applies to the language that you are learning is important in order to become a better programmer. For example, JavaScript variables declared with var have function scope, meaning that all variables declared within a function are available only within that function. Knowing this can save you from a lot of headaches. Some other languages, like C, have block scope, meaning that a variable is available (sometimes also called visible) only within the block where it was declared, the block being delimited by curly braces ({ and }). There are other kinds of scope, but we don’t need to know all of them. We only need to understand what scope is.

Scope by Example

Suppose you have a function f1, and a function f2 declared in a language that has function scope. Now, in f1, you declare a variable called v1, and in f2 you try to reference that variable. What you will get is an error because the variable v1 is local to the function f1, which prevents function f2 from accessing v1. That is scope.

Namespaces

Now that you have an understanding of identifiers and scope, we can begin to talk about namespaces. First, let’s consider the following problem:

Suppose you are working on a program that has the functionality of an email client, and of a news reader. At some point during the program’s execution, you will need to fetch information from the net such as news and emails. You have divided your program in two different parts. One is the part that handles all of the email functionality, and the other one is the one that handles all of the reader functionality. Since you need to fetch information from the net, you have created two classes, one for each part of the program. One class fetches information from the email server, and the other one from the news server. You call both of these classes Fetcher.

I hope at this point you can see the problem, but if you can’t, look a bit closer and realize that we have two classes named Fetcher, and this will be a problem.

We will assume that the scope of these classes is global, meaning you can reach them from anywhere in the program. Depending on the language and the compiler, you can have one of multiple possible errors, but they all boil down to the same issue. You have two classes that are named the same, and no way to distinguish one from the other.

Why would anyone do that? Why not just call them EmailFetcher, and NewsFetcher? Well, sometimes you did not write those classes. Maybe you are using a library that comes with those classes, so you have no real say in what the classes are named.

To solve that problem, some clever people came up with namespaces. Namespaces are, like the name suggests, spaces for names, or identifiers. Namespaces help us distinguish between two identifiers that have the same name but belong to different contexts.

The simplest example of namespaces is computer directories. A computer directory, most commonly known as a folder, is a place where you can store files. Suppose you have a directory dir1. Inside dir1 you create a file called myFile.txt. Now, what would happen if you want to create a new file, with different content, but that is also called myFile.txt? The computer will either complain that you cannot have the same name on two different files, or worse, overwrite the original file. One solution to this problem would be to create a second directory called dir2, and in dir2 put the new myFile.txt file. Now you have two files that have the same name, but in different contexts (think of every directory as a separate context).

This is clearer with a more real-life-like example. Suppose you have your Music directory, where all your music is stored. Two of your favorite bands have a song called “The Wild Loop”, so the file has a name like the_wild_loop.mp3. What happens when you add both of these songs to your Music library? Do they get overwritten? No. What happens is that your music player program may be using a different directory to store the songs of each band.

Let’s say one of the bands is called The Pythonist, and the other one is called The Rubyist. The files in your Music directory may be organized like this:

Music
 |- The Pythonist
    |- Album Name
       |- the_wild_loop.mp3
 |- The Rubyist
    |- Album Name
       |- the_wild_loop.mp3

As you can see, you have two the_wild_loop.mp3 files, and not only that, but you also have two directories called Album Name, but one is in the context of The Pythonist, and the other one in the context of The Rubyist.

That is basically what namespaces do in programming languages. They allow us to have entities referred by the same name but in a different context. This context lets us distinguish one entity from the other.
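Since my study group is about C++, here is a rough sketch of how the two Fetcher classes from the earlier problem could coexist in that language. The namespace names email and news, the fetch method, and the strings it returns are all hypothetical.

```cpp
#include <string>

namespace email {                  // hypothetical context for the email part
  class Fetcher {
  public:
    std::string fetch() const { return "emails"; }
  };
}

namespace news {                   // hypothetical context for the news reader part
  class Fetcher {
  public:
    std::string fetch() const { return "news"; }
  };
}

std::string fetch_everything() {
  email::Fetcher mail;             // the context tells the two classes apart
  news::Fetcher feed;
  return mail.fetch() + " and " + feed.fetch();
}
```

Both classes keep the name Fetcher, and the namespace in front of it plays the same role as the band directory does for the two mp3 files.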

Since my study group is about C++, I will publish another post about the implementation of namespaces in C++ later today or tomorrow, but I hope this article has been a good introduction to namespaces.