    class Student {
      public:
        float calcTuition();
        //...other stuff
    };
    class GraduateStudent: public Student {
      public:
        float calcTuition();
        //...other stuff
    };
    int main()
    {
        Student s;
        GraduateStudent gs;
        s.calcTuition();    //calls Student::calcTuition()
        gs.calcTuition();   //calls GraduateStudent::calcTuition()
    }

But what if the exact class of the object can't be determined at compile time? To demonstrate how this can occur, let's change the preceding program in a seemingly trivial way:
    class Student {
      public:
        float calcTuition();
        //...other stuff
    };
    class GraduateStudent: public Student {
      public:
        float calcTuition();
        //...other stuff
    };
    void func(Student &x)
    {
        x.calcTuition();    //to which calcTuition does this refer?
    }                       //(answer: base class Student::calcTuition)
    int main()
    {
        Student s;
        GraduateStudent gs;
        func(s);
        func(gs);
    }

Instead of calling calcTuition() directly, the call is now made through an intermediate function, func(Student &x). Depending on how func(Student &x) is called, x can be a Student or a GraduateStudent. You would like x.calcTuition() to call Student::calcTuition() when x is a Student but call GraduateStudent::calcTuition() when x is a GraduateStudent.
Normally the compiler decides which function to call at compile time.
Even when a function is overloaded, the compiler uses the different
argument types to decide at compile time. But here the decision cannot
be made until run time, when the actual type of the object can be
determined.
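The following is a minimal sketch of the problem, not part of the original program. It assumes each calcTuition returns a made-up amount (1000 and 2000 are placeholders) so that the two versions can be told apart. The helper tuitionViaFunc is likewise hypothetical. Because calcTuition is not virtual here, the call inside func is bound at compile time to the declared type, Student.

```cpp
// Hypothetical tuition amounts, chosen only to tell the two versions apart.
class Student {
public:
    float calcTuition() { return 1000.0f; }
};

class GraduateStudent : public Student {
public:
    float calcTuition() { return 2000.0f; }   // hides Student::calcTuition
};

// x is declared as Student&, so this non-virtual call is bound early
// to Student::calcTuition, whatever object is actually passed in.
float func(Student &x) {
    return x.calcTuition();
}

// Hypothetical helper: the tuition as seen through func.
float tuitionViaFunc(bool graduate) {
    Student s;
    GraduateStudent gs;
    return graduate ? func(gs) : func(s);
}
```

Called directly on a GraduateStudent, calcTuition gives the graduate amount, but through func both objects give the Student amount.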
The capability to decide at run time which of several overridden member functions to call, based on the actual type of the object, is called polymorphism. Poly means many and morph means form (as in amorphous). Late binding is the mechanism C++ uses to implement polymorphism. I will tend to use the two terms interchangeably. Deciding at compile time which member function to call is called early binding because that sounds like the opposite of late binding. The default for C++ is early binding because polymorphism adds a small amount of overhead, both in data storage and in the code needed to perform the call.
Terminology: Another name for the actual type is the run-time type. In the previous example, the run-time type of x is Student in the call func(s) and the run-time type of x is GraduateStudent in the call func(gs). The declared type of x is Student because that's what the declaration of func says.
To indicate polymorphism, the programmer must flag the member function with the keyword virtual. Virtual functions allow the programmer to declare the functions in a base class that can be redefined in each derived class.
    class Base {
      public:
        virtual void func();
    };
    class DerivedClass: public Base {
      public:
        virtual void func();
    };
    void extfunc(Base &b)
    {
        b.func();
    }
    int main()
    {
        Base bc;
        DerivedClass dc;
        extfunc(bc);    //calls Base::func()
        extfunc(dc);    //calls DerivedClass::func()
    }

You need to declare the function virtual only in the base class. The "virtualness" is carried down to the derived class automatically. So we could have written:
    class Base {
      public:
        virtual void func();
    };
    class DerivedClass: public Base {
      public:
        void func();    //implicitly virtual
    };

To allow a virtual function declaration to act as an interface to functions defined in derived classes, the argument types specified for a function in a derived class cannot differ from the argument types declared in the base class. If the arguments don't match, there is no late binding and the function is chosen at compile time just like any ordinary overloaded function. This is true even if the keyword "virtual" is used. Only very slight changes are allowed for the return type. In particular, if the member function of the base class returns a pointer or reference to a base class object, the overriding member function in a subclass may return a pointer or reference to an object of the subclass. In other words, the following is allowed:
    class Base {
      public:
        virtual Base* func();
    };
    class DerivedClass: public Base {
      public:
        DerivedClass* func();    //overrides Base::func
    };
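The argument-matching rule and the return-type exception can be seen side by side in a small sketch. The member names clone and name, and the helpers callFunc and cloneName, are hypothetical and were added only for this illustration.

```cpp
#include <string>

class Base {
public:
    virtual std::string func(int) { return "Base::func(int)"; }
    virtual Base* clone()         { return new Base(*this); }
    virtual std::string name()    { return "Base"; }
    virtual ~Base() {}
};

class DerivedClass : public Base {
public:
    // Different argument type: a new function that hides Base::func(int).
    // It does not override it, even though it is marked virtual.
    virtual std::string func(double) { return "DerivedClass::func(double)"; }
    // Same arguments, return type narrowed to the subclass: this does override.
    DerivedClass* clone()            { return new DerivedClass(*this); }
    std::string name()               { return "DerivedClass"; }
};

// Only Base::func(int) is visible through a Base&, so that is what runs.
std::string callFunc(Base &b) { return b.func(1); }

// clone() is bound late, so the copy has the run-time type of b.
std::string cloneName(Base &b) {
    Base *copy = b.clone();
    std::string result = copy->name();
    delete copy;
    return result;
}
```

In C++11 and later, writing override after the derived declaration makes the compiler reject a mismatch such as func(double) instead of silently creating a new function.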
A virtual function must be defined for the class in which it is first declared (unless it is declared to be a pure virtual function). A virtual function can be used even if no class is derived from its class, and a derived class that does not need its own version of a virtual function need not provide one. When deriving a class, simply provide an appropriate function, if it is needed.
The great benefit of polymorphism is that one can add new derived classes without changing the base class or the older derived classes. In some cases you don't even have to recompile the existing classes. Polymorphism greatly facilitates encapsulation of the code. The details of a function can be dealt with in the base class and its derived classes, while the application can call the generic function which is bound at run time. The generic function acts as a buffer or an interface between the details of the function and the applications which will use various versions of the function.
    class StuffYouCook{};
    class Nachos: public StuffYouCook {};
    class Oven {
      public:
        virtual void cook(Nachos &nachos);
        //other stuff...
    };
    class Microwave: public Oven {
      public:
        virtual void cook(Nachos &nachos);
        //other stuff...
    };
    Nachos makeNachos(Oven &oven)    //external function
    {
        Nachos n;
        oven.cook(n);                //bound at run time
        return n;
    }

The function makeNachos is passed an Oven of some type. Given that oven, it assembles all the stuff into an object n and then cooks it by calling oven.cook. Exactly which function is used, Oven::cook or Microwave::cook, depends on the run-time type of oven. The function makeNachos has no idea (and doesn't want to know) what the run-time type of oven is. Notice that makeNachos doesn't need to know the details of oven. That's what we mean by encapsulation and hiding the details of one part of the program from another part of the program. Polymorphism has allowed us to greatly simplify the code since makeNachos doesn't have any of the oven or Nachos details. The result is extensible. If a new class of oven comes along, e.g. ToasterOven, with a new cook function (ToasterOven::cook(Nachos&)), we do not need to change makeNachos to incorporate the new function. Polymorphism automatically includes the new function and calls it when necessary.
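To make the extensibility claim concrete, here is a sketch in which ToasterOven is added after makeNachos has been written. The cookedBy field and the bodies of the cook functions are assumptions added so that the effect can be observed. They are not part of the original classes.

```cpp
#include <string>

class StuffYouCook {};

class Nachos : public StuffYouCook {
public:
    std::string cookedBy;    // hypothetical field: records which cook() ran
};

class Oven {
public:
    virtual void cook(Nachos &nachos) { nachos.cookedBy = "Oven"; }
    virtual ~Oven() {}
};

class Microwave : public Oven {
public:
    virtual void cook(Nachos &nachos) { nachos.cookedBy = "Microwave"; }
};

// Written before ToasterOven existed; it is not modified below.
Nachos makeNachos(Oven &oven) {
    Nachos n;
    oven.cook(n);            // bound at run time
    return n;
}

// Added later. makeNachos picks it up without any change.
class ToasterOven : public Oven {
public:
    virtual void cook(Nachos &nachos) { nachos.cookedBy = "ToasterOven"; }
};
```

Passing a ToasterOven to makeNachos runs ToasterOven::cook even though makeNachos was written without any knowledge of that class.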
Polymorphism is the key to the power of object-oriented programming. It's so important that languages that don't support polymorphism cannot advertise themselves as object-oriented languages. Languages that support classes but not polymorphism are called object-based languages. The original Ada (Ada 83) is an example of such a language.
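A call that names the class explicitly with the scope resolution operator is bound early. In the following example, b.Base::func() always calls Base::func because that's what the programmer indicated, even if func is declared virtual: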
    class Base {
      public:
        virtual void func();
    };
    class DerivedClass: public Base {
      public:
        void func();        //implicitly virtual
    };
    void test(Base &b)
    {
        b.Base::func();     //This call is not bound late
    }
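A runnable sketch of the difference between the two call forms follows. The helper names lateBound and earlyBound and the class Logging are hypothetical. Logging shows one typical reason to write a qualified call: a derived version that builds on the base version.

```cpp
#include <string>

class Base {
public:
    virtual std::string func() { return "Base::func"; }
    virtual ~Base() {}
};

class DerivedClass : public Base {
public:
    std::string func() { return "DerivedClass::func"; }   // implicitly virtual
};

std::string lateBound(Base &b)  { return b.func(); }        // chosen at run time
std::string earlyBound(Base &b) { return b.Base::func(); }  // always Base::func

// Hypothetical class: its func() extends the base version with a qualified call.
class Logging : public Base {
public:
    std::string func() { return "Logging+" + Base::func(); }
};
```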
    class Base {
      public:
        ~Base();
    };
    class DerivedClass: public Base {
      public:
        ~DerivedClass();
    };
    void finishWithObject(Base *pHeapObject)
    {
        //work with object...
        //now return it to heap memory
        delete pHeapObject;    //this calls ~Base() no matter
                               //what the run-time type of
                               //pHeapObject is
    }

If the pointer passed to finishWithObject really points to a DerivedClass object, the DerivedClass destructor is not invoked. Declaring the destructor virtual solves the problem:
    class Base {
      public:
        virtual ~Base() {}     //even an empty destructor will do
    };
    class DerivedClass: public Base {
      public:
        ~DerivedClass();       //implicitly virtual
    };
    void finishWithObject(Base *pHeapObject)
    {
        //work with object...
        //now return it to heap memory
        delete pHeapObject;    //this calls the correct destructor
    }

When an object is deleted, C++ runs the object's destructor. Given that we only have a pointer to the Base object, which destructor should C++ run, ~Base() or ~DerivedClass()? The answer depends on which kind of object the pointer points to. Such behavior is exactly what declaring a member to be virtual arranges. Since one can't predict when writing a base class whether any class eventually derived from the base class will have a destructor member, it is safest to anticipate the possibility by providing a virtual destructor in the public interface of the base class. You should definitely use a virtual destructor if you use virtual functions and pointers to the base class. In short, the presence of a virtual destructor in Base ensures that every class derived from it will be supplied with a destructor (thus getting the size of the object right), even if the derived class doesn't have a user-defined destructor. Even an empty destructor will do.
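The effect of the virtual destructor can be observed with a small sketch. The global destructorLog and the helper destroyDerivedThroughBase are hypothetical. They exist only to record which destructors run and in what order.

```cpp
#include <string>

std::string destructorLog;   // hypothetical global: records destructor calls

class Base {
public:
    virtual ~Base() { destructorLog += "~Base "; }
};

class DerivedClass : public Base {
public:
    ~DerivedClass() { destructorLog += "~DerivedClass "; }   // implicitly virtual
};

void finishWithObject(Base *pHeapObject) {
    delete pHeapObject;      // bound late: starts with the run-time type's destructor
}

// Deletes a DerivedClass through a Base* and reports what ran.
std::string destroyDerivedThroughBase() {
    destructorLog.clear();
    finishWithObject(new DerivedClass);
    return destructorLog;
}
```

The derived destructor runs first and the base destructor second, which is the same order as when a DerivedClass object is destroyed directly.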
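Every class that has virtual functions has a table called the v_table, and each object of such a class holds a pointer to its class's v_table. The v_table contains a list of pointers to all the virtual functions defined in the class. Suppose we have the following class hierarchy: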
    class Base {
      public:
        virtual void f1();
        virtual void f2();
        void nonVirtualFn();
        int d1;
    };
    class DC: public Base {
      public:
        virtual void f3();
        virtual void f1();
        int d2;
    };
    void func(Base &b)
    {
        b.f1();
        b.f2();
    }
    int main()
    {
        Base base;
        DC subclass1;    //declare 2 subclass elements
        DC subclass2;
        func(base);
        func(subclass1);
    }

The v_table configuration is shown in the figure. Notice that the two DC objects have their own v_table pointers, but they share the same v_table.
When func(base) is called in main, it is passed the object base. The call to b.f1 leads to Base::f1 if you just follow the arrows in the figure. The second time func is called in main, it is passed the object subclass1. If you follow the arrows, the call to b.f1 now leads to DC::f1. The call to b.f2 leads to Base::f2 for both the object base and the object subclass1. So virtual functions introduce a little "overhead" (a v_table pointer in each object and an extra lookup on each call), but not much.
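The lookup described above can be imitated by hand with ordinary function pointers. This is only an illustration of the idea. The real layout is chosen by the compiler and is not visible from C++ code. All names in the sketch (Object, VTable, vptr, callF1, callF2) are hypothetical.

```cpp
#include <string>

struct Object;                                  // stand-in for an object with a v_table pointer
typedef std::string (*VirtualFn)(Object &);     // every slot holds a function pointer

struct VTable {
    VirtualFn f1;                               // slot for f1
    VirtualFn f2;                               // slot for f2
};

struct Object {
    const VTable *vptr;                         // the per-object v_table pointer
};

std::string Base_f1(Object &) { return "Base::f1"; }
std::string Base_f2(Object &) { return "Base::f2"; }
std::string DC_f1(Object &)   { return "DC::f1"; }

// One table per class, shared by every object of that class.
const VTable baseTable = { Base_f1, Base_f2 };
const VTable dcTable   = { DC_f1,   Base_f2 }; // f1 overridden, f2 inherited

// Roughly what a call such as b.f1() becomes: fetch the table, then call the slot.
std::string callF1(Object &b) { return b.vptr->f1(b); }
std::string callF2(Object &b) { return b.vptr->f2(b); }
```

Two objects set up with &dcTable have separate vptr members that hold the same address, which mirrors the two DC objects sharing one v_table.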
    class Shape {    //abstract class
      public:
        virtual void rotate(int) = 0;    //pure virtual function
        virtual void draw() = 0;         //pure virtual function
        virtual bool is_closed() = 0;    //pure virtual function
        //...
    };

A class with one or more pure virtual functions is an abstract class, and no objects of that abstract class can be created:
    Shape s;    //error: can't create object of abstract class Shape

But it's ok to create a pointer or a reference to an abstract class:
    void func1(Shape *s);    //legal
    void func2(Shape &s);    //legal

An abstract class can be used only as an interface and as a base for other classes. For example:
    class Point{ ... };
    class Circle: public Shape {
      public:
        void rotate(int) { }                 //overrides Shape::rotate
        void draw();                         //overrides Shape::draw
        bool is_closed() { return true; }    //overrides Shape::is_closed
        Circle(Point p, float r);
      private:
        float radius;
        Point center;
    };

A pure virtual function is a placeholder in the base class for the derived class to override with its own implementation. Without that placeholder in the base class, there is nothing to override. A pure virtual function that is not defined in a derived class remains a pure virtual function, so the derived class is also an abstract class. This allows us to build implementations in stages:
    class Polygon: public Shape {    //abstract class
      public:
        bool is_closed() { return true; }    //overrides Shape::is_closed
        //... draw and rotate not overridden...
    };

    Polygon b;    //error: declaration of an object of abstract class Polygon

    class Square: public Polygon {
      public:
        void draw();         //overrides Shape::draw
        void rotate(int);    //overrides Shape::rotate
        //....
    };

    Square s;     //fine (assume some suitable constructor)

An important use of abstract classes is to provide an interface without exposing any implementation details. For example, an operating system might hide the details of its device drivers behind an abstract class:
    class Device {    //abstract class
      public:
        virtual int open(int opt) = 0;
        virtual int close(int opt) = 0;
        virtual int read(char* p, int n) = 0;
        virtual int write(const char* p, int n) = 0;
        virtual ~Device() {}    //virtual destructor
    };

We can then specify drivers as classes derived from Device, and manipulate a variety of drivers through that interface.
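As a sketch of what such a driver might look like, the hypothetical MemoryDevice below keeps the most recent write in a fixed buffer. The class, its 64-byte capacity, its return-value conventions, and the echo function are all assumptions made for illustration. Only the Device interface comes from the text.

```cpp
#include <cstring>

class Device {                              // abstract class, as in the text
public:
    virtual int open(int opt) = 0;
    virtual int close(int opt) = 0;
    virtual int read(char* p, int n) = 0;
    virtual int write(const char* p, int n) = 0;
    virtual ~Device() {}
};

// Hypothetical driver: remembers the most recent write in a fixed buffer.
class MemoryDevice : public Device {
public:
    MemoryDevice() : length(0) {}
    int open(int)  { length = 0; return 0; }
    int close(int) { return 0; }
    int write(const char* p, int n) {
        if (n < 0) return -1;
        if (n > Capacity) n = Capacity;     // keep only what fits
        std::memcpy(buffer, p, n);
        length = n;
        return n;                           // number of bytes stored
    }
    int read(char* p, int n) {
        if (n < 0) return -1;
        if (n > length) n = length;
        std::memcpy(p, buffer, n);
        return n;                           // number of bytes copied out
    }
private:
    enum { Capacity = 64 };
    char buffer[Capacity];
    int  length;
};

// Client code sees only the Device interface, never the concrete driver.
int echo(Device &dev, const char* msg, char* out, int outSize) {
    dev.open(0);
    dev.write(msg, static_cast<int>(std::strlen(msg)));
    int n = dev.read(out, outSize);
    dev.close(0);
    return n;
}
```

Any other class derived from Device could be passed to echo without changing it.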
    class EmployeeList : public Array<Employee *> { ... };

    EmployeeList myDept(30);
    WageEmployee *wagePtr;
    SalesPerson *salePtr;
    Manager *mgrPtr;

    wagePtr = new WageEmployee("Bill Shapiro");
    myDept[0] = wagePtr;
    salePtr = new SalesPerson("John Smith");
    myDept[1] = salePtr;
    myDept[2] = new Manager("Mary Brown");

    for(int i=0; i < myDept.numElts(); i++)
        cout << myDept[i]->getName() << endl;

    //Now try:
    for(int i=0; i < myDept.numElts(); i++)
        cout << myDept[i]->computePay() << endl;    // error, computePay not a member of Employee

If Employee did have a computePay, it would not be the right one: we need a different computePay for each employee type.
    class Employee    //abstract class
    {
      public:
        Employee(const char* nm);
        const char *getName() const;
        virtual float computePay() const = 0;    //placeholder fcn that is never called
        virtual ~Employee() {}
      private:
        char name[30];
    };

Now make sure WageEmployee, SalesPerson, and Manager each have the line:
    float computePay() const;    // implicitly virtual

Now to use it:
    //assume aWorker, aSeller, and aBoss are a WageEmployee,
    //a SalesPerson, and a Manager declared elsewhere
    Employee *empPtr;
    float salary;

    empPtr = &aWorker;
    salary = empPtr->computePay();    // call WageEmployee::computePay
    empPtr = &aSeller;
    salary = empPtr->computePay();    // call SalesPerson::computePay
    empPtr = &aBoss;
    salary = empPtr->computePay();    // call Manager::computePay

    //Or use our array of pointers:
    for(int i=0; i < myDept.numElts(); i++)
        cout << myDept[i]->computePay() << endl;    // No problem; this works fine.
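For readers who want something that compiles on its own, here is a reduced sketch of the same design. It makes several substitutions that are not in the original: std::vector stands in for the Array template, std::string for the char array, only two employee types are shown, and the pay formulas and constructor parameters are invented. The function totalPay is also hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class Employee {                               // abstract class
public:
    explicit Employee(const std::string &nm) : name(nm) {}
    const std::string &getName() const { return name; }
    virtual float computePay() const = 0;
    virtual ~Employee() {}
private:
    std::string name;
};

// The pay formulas below are invented for illustration.
class WageEmployee : public Employee {
public:
    WageEmployee(const std::string &nm, float hrs, float rate)
        : Employee(nm), hours(hrs), wage(rate) {}
    float computePay() const { return hours * wage; }      // implicitly virtual
private:
    float hours, wage;
};

class Manager : public Employee {
public:
    Manager(const std::string &nm, float sal) : Employee(nm), salary(sal) {}
    float computePay() const { return salary; }            // implicitly virtual
private:
    float salary;
};

// Sums the pay of a mixed list; each computePay call is bound at run time.
float totalPay(const std::vector<Employee*> &dept) {
    float total = 0.0f;
    for (std::size_t i = 0; i < dept.size(); i++)
        total += dept[i]->computePay();
    return total;
}
```

The loop in totalPay never names WageEmployee or Manager, so a new employee type only needs to supply its own computePay.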