Posted 9 Mar 2006

An Insight to References in C++

18 Apr 2006 | CPOL | 4 min read
How the C++ compiler handles references

Introduction

I chose to write about references in C++ because I feel that many people have misconceptions about them. I got this impression from conducting many C++ interviews, in which I seldom received correct answers about references.

What is meant by a reference in C++? A reference is generally thought of as an alias for the variable it refers to. I dislike this definition. In this article, I will try to explain that there is no such thing as aliasing in C++.

Background

In both C and C++, there are only two ways in which a variable can be accessed, passed, or returned:

  1. Accessing/passing the variable by value
  2. Accessing/passing the variable by address, in which case pointers come into the picture

There is no third way of accessing/passing variables. In practice, a reference variable is typically implemented as just another pointer variable, which takes its own space in memory. The most important thing about a reference is that it is a kind of pointer that the compiler dereferences automatically. Hard to believe? Let's see.

A Sample C++ Code using References

Let's write a simple C++ program that uses references:

C++
#include <iostream>
using namespace std;

int main()
{
    int i = 10;   // A simple integer variable
    int &j = i;   // A reference to the variable i

    j++;   // Incrementing j increments i, because j refers to i.

    // Check by printing the values of i and j
    cout << i << " " << j << endl; // prints 11 11

    // Now print the addresses of both i and j
    cout << &i << " " << &j << endl;
    // Both print the same address, which makes it look as if they are
    // aliases for the same memory location.
    // The example below shows what is really going on.
    return 0;
}

References are nothing but constant pointers in C++. A statement int &i = j; is, in a typical implementation, treated by the compiler as int *const i = &j;. References need initialization because constants must be initialized, and since the pointer is constant, it cannot be made to point to anything else afterwards. Let's take the same example and this time write it in the syntax the compiler uses when it sees references.

A Sample C++ Code using References (Compiler Generated Syntax)

C++
#include <iostream>
using namespace std;

int main()
{
    int i = 10;          // A simple integer variable
    int *const j = &i;   // Stands in for the reference to i

    (*j)++;   // j++ with the automatic dereference written out by hand

    // Check by printing the values of i and *j
    cout << i << " " << *j << endl; // prints 11 11
    // A * is placed before j because it used to be a reference variable,
    // which would have been dereferenced automatically.
    return 0;
}

You may be wondering why I skipped printing the addresses in the example above. This needs some explanation. Since reference variables are automatically dereferenced, what happens to a statement like cout << &j << endl;? The compiler converts it into cout << &*j << endl;, because the variable is automatically dereferenced. The & and * then cancel each other out, so cout prints the value stored in j, which is the address of i because of the statement int *const j = &i;.

So the statement cout << &i << &j << endl; becomes cout << &i << &*j << endl;, which prints the address of i in both cases. This is why the same address is displayed for the normal variable and for the reference variable.

A Sample C++ Code using Reference Cascading

Here we will look at a more complex scenario and see how references work when they are cascaded. Follow the code below:

C++
#include <iostream>
using namespace std;

int main()
{
    int i = 10; // A Simple Integer variable
    int &j = i; // A Reference to the variable
    // Now we can also create a reference to reference variable. 
    int &k = j; // A reference to a reference variable
    // Similarly we can also create another reference to the reference variable k
    int &l = k; // A reference to a reference to a reference variable.

    // Now if we increment any one of them the effect will be visible on all the
    // variables.
    // First print original values
    // The print should be 10,10,10,10
    cout<<  i  <<  ","  <<  j  <<  ","  <<  k  <<  ","  <<  l  <<endl;
    // increment variable j
    j++; 
    // The print should be 11,11,11,11
    cout<<  i  <<  ","  <<  j  <<  ","  <<  k  <<  ","  <<  l  <<endl;
    // increment variable k
    k++;
    // The print should be 12,12,12,12
    cout<<  i  <<  ","  <<  j  <<  ","  <<  k  <<  ","  <<  l  <<endl;
    // increment variable l
    l++;
    // The print should be 13,13,13,13
    cout<<  i  <<  ","  <<  j  <<  ","  <<  k  <<  ","  <<  l  <<endl;
    return 0;
}

A Sample C++ Code using Reference Cascading (Compiler Generated Syntax)

Here we will see that we can achieve the same results without relying on the compiler to generate constant pointers in place of references and to dereference them automatically.

C++
#include <iostream>
using namespace std;

int main()
{
    int i = 10;            // A simple integer variable
    int *const j = &i;     // Stands in for a reference to i
    // The variable j holds the address of i

    // Now we can also create a reference to the reference variable.
    int *const k = &*j;    // Stands in for a reference to j
    // k also holds the address of i: j is a reference variable and gets
    // automatically dereferenced, then & and * cancel each other out,
    // so k receives the value of j, which is nothing but the address of i.

    // Similarly, we can create another reference to the reference variable k.
    int *const l = &*k;    // Stands in for a reference to k
    // l also holds the address of i, because k holds the address of i after
    // & and * cancel each other out.

    // So all the reference variables actually hold the same address.

    // Now if we increment any one of them, the effect is visible through all
    // of them. The reference variables are prefixed with * because they
    // would otherwise be dereferenced automatically.

    // First print the original values: 10,10,10,10
    cout << i << "," << *j << "," << *k << "," << *l << endl;
    // Increment through j
    (*j)++;
    // Prints 11,11,11,11
    cout << i << "," << *j << "," << *k << "," << *l << endl;
    // Increment through k
    (*k)++;
    // Prints 12,12,12,12
    cout << i << "," << *j << "," << *k << "," << *l << endl;
    // Increment through l
    (*l)++;
    // Prints 13,13,13,13
    cout << i << "," << *j << "," << *k << "," << *l << endl;
    return 0;
}

A Reference Takes its Own Space in Memory

We can see this by checking the size of a class that has only reference members. The example below shows that a C++ reference is not an alias and takes its own space in memory.

C++
#include <iostream>
using namespace std;

class Test
{
    int &i;   // int *const i;
    int &j;   // int *const j;
    int &k;   // int *const k;
};

int main()
{
    // Prints 12 (the size of three pointers) on a typical 32-bit build;
    // a typical 64-bit build prints 24.
    cout << "size of class Test = " << sizeof(Test) << endl;
    return 0;
}

Conclusion

I hope that this article explains everything about C++ references. However, I would like to mention that the C++ standard does not specify how reference behaviour should be implemented by the compiler. It is up to the compiler to decide, and most of the time a reference is implemented as a constant pointer.

Additional Notes to Support this Article

In the discussion forums for this article, some readers argued that references are not constant pointers but aliases. Here is one more example to support my position. Look carefully at the example below:

C++
#include <iostream>
using namespace std;

class A
{
public:
	virtual void print() { cout<<"A.."<<endl; }
};

class B : public A
{
public:
	virtual void print() { cout<<"B.."<<endl; }
};

class C : public B
{
public:
	virtual void print() { cout<<"C.."<<endl; }
};

int main()
{
	C c1;
	A &a1 = c1;
	a1.print(); // prints C.. (dynamic dispatch through the reference)

	A a2 = c1;  // copies only the A part of c1 (slicing)
	a2.print(); // prints A..
	return 0;
}

The call through the reference uses the virtual mechanism, i.e. it looks into the virtual pointer to get hold of the correct function pointer. The interesting question is how the virtual mechanism could be supported by a static type that is simply an alias. The virtual mechanism relies on dynamic information, which comes into the picture only when a pointer is involved. I hope this clarifies most of the doubts.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Architect
India
A programmer by heart since 1998. Written code in C++, Java, JavaScript, Python & Ruby, Worked on Stack Development to Web Development. Data Specialist with SQL and NoSQL DBs

Comments and Discussions

 
Re: Confusing article
jefito, 23-Mar-06 8:30
Gatik G wrote:
I can buy your arguments only when if you can proof that compilers implements the references in any way, other than const pointers.

For your claim that in case of functions it might go into registers. This can be the case for simple variable also for efficiency reasons. Try to dig into assemble codes generated.



I'm not quite sure that I understand -- are you asking me to provide you with examples of when references may not take up memory? Here's one that I found:
int array[10] = {0}; // simple array
int& ri = array[3]; // reference to array member
ri = 4; // change array member through reference

It should be fairly straightforward for a compiler to optimize away the existence of the reference variable 'ri', and do the assignment directly to the appropriate array member. The compiler that I am using (Microsoft C/C++ 12/VS2003) generated a single x86 assembly instruction for the assignment:

mov DWORD PTR _array$[esp+84], 4

No use of the variable 'ri', and it takes up no space. QED.

Further, a reference to a static object can generally be optimized away (or at least into machine code offsets). References used in functions that are inlined may be optimized into register-only values. Yes, I could post examples of these, but what's the point? I just showed one case where a reference doesn't take up any space.

Gatik G wrote:
Finally, Standards may also not ask you how to implement virtual table in the memory but more or less it is done in a same way by most of the compilers (you can’t argue till you get proper functionality).


Whatever this means, it has nothing to do with references.

Gatik G wrote:
if you can prove that I am spreading confusion about C++ and audience can support that. I will withdraw this article very next day along with my written apology for this.


It's my opinion that your article is confusing, because it lacks clarity with respect to C++ concepts, and makes incorrect assertions about references. Of course, since it's my opinion, it is not provable. But let me try to explain. From your article:

Gatik G wrote:
I choose to write about the references in C++ because I feel most of the people are having misconception about the references and I have got this feeling because I took many C++ interviews and I seldom get correct answers about references in C++.


So what are the questions you've asked about references, and what are the answers that you get that are so incorrect and so misconceived?

By the way, one of the interview questions that I ask is: What's the difference between a pointer and a reference? If the candidate told me that they are identical, then I would have a big problem recommending that we hire them.

Gatik G wrote:
What is meant by references in C++? A reference is generally thought of an aliasing of the variable it refers to. I hate the definition of references being and Alias of a variable in C++. In this article I will try to explain that there is nothing known as aliasing in C++.

Here, you begin by asking a question about the nature of references, then you introduce the notion of aliasing, that you somehow dislike. For the third time, what do you mean by aliasing, with respect to references? And then why do you claim that there is no such thing as aliasing in C++? As I read the Standard, there is no official notion of aliasing in C++, but I don't think that's what you mean. You could clarify by defining what you mean by aliasing. By leaving such an important concept (to the stated purpose of your article) undefined, you create confusion.

Gatik G wrote:
Background
Both in C and in C++ there are only two ways by which a variable can be accessed, passes, or retrieved. The two ways are

1. Accessing/passing variable by value.
2. Accessing/Passing variable by address. In this case pointers will come into picture.

There is no 3rd way of Accessing/Passing variable. A reference variable is just another pointer variable which will take its own space in memory. The most important thing about the references is that its a type of pointer which gets automatically dereferenced (by compiler..). Hard to believe? Let see....

Why this strange excursion into accessing/passing variables? Why not just go to the Standard for the definition? It's very simple, and quite clear: "A reference can be thought of as the name of an object". Nowhere does it say that references are pointers; indeed, pointers are described in a separate section.

Gatik G wrote:
References are nothing but constant pointer in C++. A Statement int &i = j; will be converted by compiler to int *const i = &j; i.e References are nothing but constant pointers. They need initialization because constants must be initialized and since the pointer is constant they can't point to anything else. Lets take the same example of references in C++ and this time we will use the syntax what compiler uses when it sees references.

References are not constant pointers. A common implementation of references is indeed by using pointers, but the Standard does not require that. Moreover, the compiler does not convert the statement int &i = j; into int *const i = &j;. This is nonsensical. Don't believe me? Try this code:
double d;
double& rd = d;
cout << "size of d = " << sizeof d << "  size of rd = " << sizeof rd << endl;


If rd were converted by the compiler into a pointer, then what would you expect its size to be? Moreover, references cannot be null, while pointers can be null. You cannot have a reference to void, but you certainly can have a pointer to void. You can't even have an array of references. A reference is not a pointer, period. There is no such rule as you cite; it is not like the rule "the subscript operator [] is interpreted in such a way that E1[E2] is identical to *((E1)+(E2))", which is in the Standard. You made your rule up.

Gatik G wrote:
We can see this by checking the size of the class which is having only reference variables. The example below proofs that a C++ reference is not an Alias and takes its own space into the memory.

We have also seen by my examples above that a reference does not actually need to take up memory space. And by the way, this accords with what the Standard says. And again, what is an alias?

So, by my reckoning, your article introduces a concept, aliasing, that is never defined; claims that many people believe that references are aliases (if true, why is that a bad thing?); complains that that belief is incorrect; and then goes on to make claims about references that are unsupported both by the Standard and by at least one actual implementation. And then it forgets to prove that aliases don't exist in C++. It proves nothing about references except that they can take up memory space (if you've read the Standard's description, then that should come as no surprise), and I'm sorry, but in my opinion, that's confusing and unhelpful.

My suggestion: if you wish to describe why a reference might take up memory space, then you should begin by citing the definition from the Standard and that the Standard says that a reference might take up space. Contrast it against when a reference might not take up space (now that you know that they need not). Explain how a compiler might implement a reference as a pointer, sure, but don't claim that a reference is a pointer. And forget about the concept of aliasing, at least until you define it, and why it matters to a reference's size.

I'm sorry; you get points for trying to be helpful in this forum. But that's not enough if the material you provide isn't clear and correct.

regards,

Jeff