C++ for professionals
(Selected chapters)
By Yuri Putivsky
2007
Introduction
A respectable reader would probably ask herself: “Why has another book been
published about a programming language that is well known only to a few, when
the majority of software developers predict that the language will soon be dead
and forgotten forever?”
What is modern software development? Is it a routine procedure, a conveyor belt
that anybody could work on after a few months of community college courses and
where nobody is irreplaceable? Or is it a profession that requires
state-of-the-art creative thinking and daily learning, where the same amount of
creativity is required of the programmer as of a bestselling writer or a movie
star? Maybe it is a combination of both, and each person chooses the
proportions according to his ability and taste?
This book has been written for readers who are very familiar with the C++
programming language and who plan to develop new programs in the future.
Therefore, you will not find a basic introduction to C++ with comprehensive
examples in this book. The author assumes the reader is almost fluent in C++
and can read the code examples as well as she reads a text written in her
native language.
The C++ programming language syntax is very compact.
However, the list of problems that can be solved using C++ is almost infinite.
Of course, the author knows that many problems can be solved more quickly when
other programming languages are exploited. This is why in this book we will try
to pay attention to the areas where C++ demonstrates obvious advantages.
About analogies
Analogies are a very useful and widespread method for explaining different
ideas and aspects of human knowledge. The author suspects (he does not pretend
to be an expert on the human brain) that our brain is a store of all types of
information, with an enormous number of connections between these pieces of
information at the level of neurons. Those connections are actually what
creates new knowledge. The knowledge that some entity is “similar” to another
entity is a pure example of analogy. Software development is an area of human
activity where the use of analogies can be invaluable.
All in all, we are trying to build an analogy for each specific task chosen
for C++ development. Moreover, when we are solving abstract mathematical,
physics, or business problems, we can always find something similar in our
real-world experience. Even when teaching students, it is extremely useful to
come up with analogies that the students are more or less familiar with, so
that they can picture the process almost immediately.
Multi-platform code
The C++ programming language itself (we won’t discuss deviations from the
standard here; each compiler has them) does not depend on the operating system
or the CPU that we will be using. The possibility of embedding assembler code
directly in C++ code is also out of our scope. Many non-trivial programs use
dynamic memory allocation, synchronization objects for threads and processes,
socket connections, etc. Despite the attempts of many operating system
developers to keep to a standard, at least at the level of the “C” API, some
APIs and libraries are not portable. It might look like a real challenge to
develop a program that can be easily compiled on multiple operating systems and
with multiple compilers (the latter is, in practice, an even bigger challenge).
However, one of the powerful features of the C++ language, encapsulation, will
help us a lot. We will deal with classes and concepts such as “thread”, “memory
allocator”, “database connection”, etc. For each operating system the internal
implementation can be different and hidden inside the code, but from the
outside those classes will look identical.
Programming styles
“Tastes differ”. This is very true of programming styles. Some like to use a
combination of capital and lowercase letters for variable names; others use the
underscore symbol as a separator. Some people like tab indentation for the
code; others insist on two or three spaces only. The author belongs to the
group of developers who think that using “Hungarian notation” or any other
particular “standard” is not so important. The most important thing is the
readability of your code.
1) A variable name should be self-explanatory on the one hand, and on the
other hand should not exceed a reasonable length.
2) The formatting should present the code in an easily readable form. For
instance, the author has seen formatting like this:
if ( obj -> properties . ident != input . ptr -> count )
The golden rule of code formatting is to show your code to somebody else, or to
take a look at your own code after 6 months. If you are not able to easily
understand what you wrote 6 months ago, change your style immediately.
One of the most important attributes of program code is the presence of
comprehensive comments. In fact, the comments inside the code are part of the
documentation, and sometimes the only available part. It is very useful to
describe public members and methods in detail, because other developers will
likely use them. Even here, the amount of comments should stay reasonable. When
the number of comment lines exceeds the number of lines of code itself, the
interface becomes hard to read, understand, and maintain.
Memory management
It would not be an exaggeration to say that optimal custom memory management is
the cornerstone of high-performance multithreaded software development. In
practice, it is almost impossible to develop a process-level memory heap with
an efficient allocation/deallocation algorithm that does not block multiple
threads issuing many memory allocation requests of different sizes. What would
be an acceptable solution in many cases? Let’s split the problem into two
parts. The first is extremely fast allocation of small blocks of memory, and
the second is independent access to the memory repository from different
threads: each thread works with its own independent memory space.
One of the obvious solutions to this problem is a special allocator class. Any
instance of the allocator class will be used by one thread exclusively, so
multithreaded synchronization is not required. The allocator will go to the
system heap very rarely, grabbing a significant chunk of memory and chopping it
into smaller pieces on demand. Let’s start with an example of a very simple
allocator. Such an allocator cannot reuse or deallocate individual small chunks
of memory, but it can reuse or deallocate all of its memory at once: “all or
nothing” could be written on its banner.
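The idea can be illustrated with a minimal sketch. It is not the author's original listing: the class name arena_allocator, its methods, and the default chunk size are assumptions made for this example, and it uses C++11 features for brevity. One instance is meant to be owned by one thread, so there is no locking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical "all or nothing" allocator; one instance per thread, no locking.
class arena_allocator {
public:
    explicit arena_allocator(std::size_t chunk_size = 64 * 1024)
        : chunk_size_(chunk_size), current_(0) {
        assert(chunk_size > 0);
    }
    arena_allocator(const arena_allocator&) = delete;
    arena_allocator& operator=(const arena_allocator&) = delete;
    ~arena_allocator() { release(); }

    // Carve `size` bytes out of an owned chunk; visit the system heap only
    // when no already-owned chunk has enough room left.
    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        if (size == 0) size = 1;
        size = (size + align - 1) / align * align;  // keep every block aligned
        for (; current_ < chunks_.size(); ++current_) {
            chunk& c = chunks_[current_];
            if (c.capacity - c.used >= size) {
                void* p = c.data + c.used;
                c.used += size;
                return p;
            }
        }
        chunks_.emplace_back();  // bookkeeping slot first, raw memory second
        chunk& c = chunks_.back();
        c.capacity = size > chunk_size_ ? size : chunk_size_;
        c.data = static_cast<char*>(std::malloc(c.capacity));
        if (!c.data) { chunks_.pop_back(); throw std::bad_alloc(); }
        c.used = size;
        return c.data;
    }

    // "All": forget every block at once but keep the chunks for reuse.
    void reset() {
        for (chunk& c : chunks_) c.used = 0;
        current_ = 0;
    }

    // "Nothing": give all chunks back to the system heap.
    void release() {
        for (chunk& c : chunks_) std::free(c.data);
        chunks_.clear();
        current_ = 0;
    }

    std::size_t chunk_count() const { return chunks_.size(); }

private:
    struct chunk {
        char* data = nullptr;
        std::size_t capacity = 0;
        std::size_t used = 0;
    };
    std::size_t chunk_size_;
    std::size_t current_;  // index of the chunk currently being carved
    std::vector<chunk> chunks_;
};
```

After reset() the next allocation starts again from the beginning of the first chunk, so a loop that allocates and resets repeatedly does not touch the system heap after its first pass.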
The main challenge in reusing previously used blocks is that such chunks have
different sizes. What if we knew in advance that the size of the requested
chunks is fixed and that only one chunk is requested at a time? Then we could
build a more sophisticated allocator. Let’s do it.
As the reader will notice, we have a template class that is able to allocate
and reuse memory chunks of a fixed size.
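A minimal sketch of such a fixed-size allocator might look as follows. The name block_allocator, the BlocksPerChunk parameter, and the free-list layout are assumptions of this example rather than the author's code. Freed blocks are threaded onto a singly linked free list and handed out again before any new chunk is requested.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-size allocator: every block can hold one T.
// Intended for use by a single thread; no synchronization.
template <class T, std::size_t BlocksPerChunk = 256>
class block_allocator {
    static_assert(BlocksPerChunk > 0, "a chunk must hold at least one block");

    union slot {
        slot* next;                                   // used while the block is free
        alignas(T) unsigned char storage[sizeof(T)];  // used while it is rented out
    };

public:
    block_allocator() : free_(nullptr) {}
    block_allocator(const block_allocator&) = delete;
    block_allocator& operator=(const block_allocator&) = delete;

    // Chunks are returned wholesale; destructors of live T objects are NOT run.
    ~block_allocator() {
        for (slot* chunk : chunks_) delete[] chunk;
    }

    // Returns raw storage for one T; construct the object with placement new.
    void* allocate() {
        if (!free_) grow();
        slot* s = free_;
        free_ = s->next;
        return s;
    }

    // Puts the block back on the free list; the caller destroys the T first.
    void deallocate(void* p) {
        assert(p != nullptr);
        slot* s = static_cast<slot*>(p);
        s->next = free_;
        free_ = s;
    }

    std::size_t chunk_count() const { return chunks_.size(); }

private:
    void grow() {
        chunks_.reserve(chunks_.size() + 1);  // so push_back below cannot throw
        slot* chunk = new slot[BlocksPerChunk];
        chunks_.push_back(chunk);
        for (std::size_t i = 0; i < BlocksPerChunk; ++i) {
            chunk[i].next = free_;
            free_ = &chunk[i];
        }
    }

    slot* free_;                 // head of the list of free blocks
    std::vector<slot*> chunks_;  // every chunk taken from the system heap
};
```

Because the free list is last-in, first-out, a block that was just released is the first one to be handed out again, which tends to keep recently used memory in the cache.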
Before we go further with different and more sophisticated allocator classes,
let’s discuss performance issues of the standard template library (STL). The
classic implementation is the most generic one, but when you need the best
performance you can achieve, a more specific solution is the better choice. As
an example, consider the design of the template class “list”. What if we need
to create a temporary list of a million pretty simple elements, such as
integers? The classical implementation, even with a custom external allocator,
will call the destructor a million times. That is a waste of time if we know
that all the memory for the million elements was taken from the allocator. All
we need to do in order to reuse the memory is call the reset method of the
allocator.
Let’s try to design the list class in such a way that we benefit from custom
memory management.
Let’s do it.
The TYPENAME macro in most cases expands into typename, but this won’t work
with some old compilers, such as Microsoft VC6; for them the macro expands to
nothing. The major difference from the classical STL implementation is the
absence of dynamically allocated static members, which create problems when
object instances are reused together with an allocator memory reset.
For instance, some STL implementations do not destroy dynamically allocated
static members even when the allocation was made on the external allocator. As
a result, we cannot reset and reuse the memory without corrupting those static
members. The second advantage is the size of the empty container. It is only 8
bytes (here and further in the text we assume a 4-byte default struct
alignment) regardless of the type of the stored items. Also, you will not find
the methods that require memory allocation or deallocation, such as push_back,
insert, etc. This is because we do not have a memory model yet. It will be
introduced in the derived classes. At the same time, all iterator functionality
is there, because the internal structure of the future list container is
already clear.
Some static and even const casting is required. It makes the code look a
little bit ugly, but it is the price we pay for the performance and
compactness we are fighting for.
The next logical step would be to build a fully functional “list” class with
memory management. But which type of memory management is best? The correct
answer is “it depends”. It depends on who will be responsible for the lifetime
of the created nodes and items. Let’s take two radically different cases: the
first delegates memory allocation completely to code outside the class
interface; the second, in contrast, encapsulates the memory allocation, hiding
it inside the class implementation. Let’s take a look at the first one.
Let’s do it.
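As an illustration of the first case, here is a minimal sketch in which the list never allocates by itself: every operation that needs memory receives the allocator from the caller. The names bump_arena and ext_list are assumptions of this example, not the author's _list class, and the arena is a deliberately tiny stand-in over one fixed buffer. Because the element type is required to be trivially destructible, clearing the list and resetting the arena are two constant-time commands, however many nodes were created.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <vector>

// Stand-in external allocator over one fixed buffer (hypothetical).
class bump_arena {
public:
    explicit bump_arena(std::size_t bytes)
        : buffer_((bytes + sizeof(std::max_align_t) - 1) / sizeof(std::max_align_t)),
          used_(0) {}

    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        size = (size + align - 1) / align * align;
        const std::size_t capacity = buffer_.size() * sizeof(std::max_align_t);
        if (capacity - used_ < size) throw std::bad_alloc();
        void* p = reinterpret_cast<unsigned char*>(buffer_.data()) + used_;
        used_ += size;
        return p;
    }

    void reset() { used_ = 0; }  // all memory becomes reusable at once

private:
    std::vector<std::max_align_t> buffer_;  // guarantees maximal alignment
    std::size_t used_;
};

// Singly linked list whose nodes live in memory owned by the caller.
template <class T>
class ext_list {
    static_assert(std::is_trivially_destructible<T>::value,
                  "clear() drops nodes without running destructors");
    struct node { node* next; T value; };

public:
    ext_list() : head_(nullptr) {}

    template <class Allocator>
    void push_front(Allocator& memory, const T& value) {
        static_assert(alignof(node) <= alignof(std::max_align_t),
                      "over-aligned element types are not supported");
        head_ = new (memory.allocate(sizeof(node))) node{head_, value};
    }

    template <class Visitor>
    void for_each(Visitor visit) const {
        for (const node* n = head_; n; n = n->next) visit(n->value);
    }

    bool empty() const { return head_ == nullptr; }

    // One command: the nodes are simply forgotten; the allocator owns them.
    void clear() { head_ = nullptr; }

private:
    node* head_;
};
```

A typical round trip is: fill the list from the arena, walk it, call clear() on the list and reset() on the arena, and start over with the same two objects.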
The second approach, in which the memory allocation policy is deeply
encapsulated, leads us to a different “list” implementation.
Code.
We will use the same node allocator as for the _list class, only we will keep
it inside the class. We will also keep the length of the container. The
previous _list class does not have an internal item counter. This is a
classical trade-off between performance and compactness. The _list and list
classes make different choices.
Exactly the same approach can be used for other STL containers such as stack,
map, bitset, etc. Whenever a developer needs a container to perform a
particular small task, she can use the above classes and external allocators.
When the container is not needed anymore, simply clear it and reset the
allocator.
Stack class.
Map class.
Vector class.
Multithreaded environment
As many of you know, multithreaded synchronization objects are platform
dependent, as is the thread API.
Before we continue discussing allocators, we will have to leave this
interesting topic and look at examples of multithreaded programs. As we
already know, threads and synchronization objects are platform-dependent
objects, which we want to encapsulate in a platform-independent C++ interface.
To do this, we will have to begin at the end and define these interfaces.
An event is an object that lets us wait for a signal in one thread, while the
signal is raised in another thread.
Event class.
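The following is a minimal sketch of such an event, not the author's class. It assumes a C++11 compiler and builds the event on std::mutex and std::condition_variable instead of the platform APIs discussed in the text; the name event, the auto-reset behavior, and the helper signal_crosses_threads are choices made for this example.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical auto-reset event: one successful wait consumes one signal.
class event {
public:
    event() : signaled_(false) {}

    // Called from the thread that raises the signal.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(m_);
            signaled_ = true;
        }
        cv_.notify_one();
    }

    // Called from the waiting thread. Returns false if the timeout expires
    // before the event is signaled.
    bool wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_);
        if (!cv_.wait_for(lock, timeout, [this] { return signaled_; }))
            return false;
        signaled_ = false;  // auto-reset
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_;
};

// Usage: one thread raises the signal, another one waits for it.
inline bool signal_crosses_threads() {
    event ready;
    std::thread producer([&ready] { ready.signal(); });
    const bool seen = ready.wait(std::chrono::seconds(5));
    producer.join();
    return seen;
}
```

The flag is what makes the event reliable: a signal raised before the other thread starts waiting is remembered, and spurious wake-ups of the condition variable are ignored because the predicate is rechecked.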
Some synchronization objects can be used both between threads in one process
and between threads in multiple processes. However, we will not look at
"global" interprocess synchronization objects here.
As the attentive reader might have noticed, the difference between a semaphore
and a mutex is the number of threads that can hold the resource simultaneously
and the ability to wait for the resource to become available for a finite
amount of time.
Semaphore class.
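A minimal sketch of a counting semaphore with a timed wait is given below. It is an illustration under assumptions, not the author's class: it relies on the C++11 standard library rather than on platform APIs, and the names semaphore, try_acquire_for, and release are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical counting semaphore: up to `initial` threads may hold the
// resource at the same time.
class semaphore {
public:
    explicit semaphore(std::size_t initial) : count_(initial) {}

    // Waits at most `timeout` for a free slot; returns false on timeout.
    bool try_acquire_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_);
        if (!cv_.wait_for(lock, timeout, [this] { return count_ > 0; }))
            return false;
        --count_;
        return true;
    }

    // Frees one slot and wakes one waiting thread, if any.
    void release() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::size_t count_;  // number of free slots
};
```

With an initial count of 1, only one thread at a time can hold the resource, which is the mutex-like case described in the text.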
Mutex class.
Threads.
Let’s think about what type of "thread" class we would like to create, using
analogies from real life.
1. Opening and closing a thread is a very costly action. This is why we would
like to use already open threads multiple times. For instance, when you rent a
TV, you don't expect it to be brand new, or to be destroyed after you return
it.
2. The methods start and stop are necessary to open and close a thread. We
don't know ahead of time exactly what the thread will have to do, and that is
why we need to be able to assign and cancel the thread's job: "assign_job",
"cancel_job". But how will the thread know which job it has to perform? The
solution to this problem is a standard interface of a class that will act as
the "employer" of our thread.
Job employer class.
We can make the thread "ask" the employer whether there is any work available
and, if there is, immediately begin performing the job. We can also make the
thread do this multiple times.
What if the employer has no work for the thread at the time the thread "asks"?
The thread will need to "sleep" until there is a job, at which time the
employer will send a message/event to the thread and it will wake up.
To make the thread even smarter, we can tell it to sleep for a certain period
of time, so that when the thread wakes up on its own it once again goes to our
employer and "asks" for a job. In this case, if the job is not time dependent
and does not need to be started right away, the employer does not need to wake
the thread up, but can simply wait until the thread wakes up by itself, after
the specified amount of time, and asks for a job.
The reader has probably noticed that we spend a lot of time discussing the
subject and not writing code. As an excuse, I want to remind the reader that
the code is only a formal description of the processes that we want to
program, and the better we understand those processes by finding real-life
analogies, the better the code will be.
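The "thread asks its employer" process described above can be sketched as follows. This is an illustration, not the author's classes: the names job_employer, worker_thread, queue_employer, get_job, nap_time, and wake_up are assumptions, and the sketch uses std::thread from C++11 in place of the platform thread API.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

typedef std::function<void()> job;

// The "employer": the interface through which a thread asks for work.
class job_employer {
public:
    virtual ~job_employer() {}
    virtual bool get_job(job& out) = 0;                     // false: no work now
    virtual std::chrono::milliseconds nap_time() const = 0; // how long to sleep
};

class worker_thread {
public:
    explicit worker_thread(job_employer& employer)
        : employer_(employer), stop_(false), wake_(false) {}
    ~worker_thread() { stop(); }

    void start() {
        if (thread_.joinable()) return;
        { std::lock_guard<std::mutex> lock(m_); stop_ = false; }
        thread_ = std::thread(&worker_thread::run, this);
    }
    void stop() {  // returns after the job in progress has finished
        { std::lock_guard<std::mutex> lock(m_); stop_ = true; }
        cv_.notify_one();
        if (thread_.joinable()) thread_.join();
    }
    void wake_up() {  // the employer's "message/event": urgent work arrived
        { std::lock_guard<std::mutex> lock(m_); wake_ = true; }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            job next;
            while (employer_.get_job(next)) next();  // ask until no work is left
            std::unique_lock<std::mutex> lock(m_);
            // Sleep until woken or until the nap is over, then ask again.
            cv_.wait_for(lock, employer_.nap_time(),
                         [this] { return stop_ || wake_; });
            if (stop_) return;
            wake_ = false;
        }
    }

    job_employer& employer_;
    std::mutex m_;
    std::condition_variable cv_;
    bool stop_;
    bool wake_;
    std::thread thread_;
};

// A simple employer that keeps its jobs in a queue.
class queue_employer : public job_employer {
public:
    void add(job j) {
        std::lock_guard<std::mutex> lock(m_);
        jobs_.push(std::move(j));
    }
    bool get_job(job& out) override {
        std::lock_guard<std::mutex> lock(m_);
        if (jobs_.empty()) return false;
        out = std::move(jobs_.front());
        jobs_.pop();
        return true;
    }
    std::chrono::milliseconds nap_time() const override {
        return std::chrono::milliseconds(50);
    }

private:
    std::mutex m_;
    std::queue<job> jobs_;
};
```

A job that is not urgent can simply be added to the queue; the thread will find it after its nap. Calling wake_up() after add() makes the thread ask immediately.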
As a matter of fact, one "employer" can hire multiple threads that will
perform the same or different jobs. Let’s widen our scope and use some
analogies. In real life, if an employer cannot predict how many
threads/workers she will need, then one of the more effective ways to use
resources is a manager of threads. The employer could then "hire" a thread for
an unlimited period of time, use that thread, and after completing all the
jobs the thread would return to the manager automatically. But how would we
automate the return process? What if we return the thread to the manager as
soon as the employer has no jobs for it? But what if a job appears a second
after the thread has been returned? In this case, we will spend a lot of time
getting the thread from the manager and returning it, maybe even every second.
So what if the thread, when there is no job available, falls asleep for a bit
(the period of time is dictated by the employer) and, if the employer still
has no job for it after the nap, returns to the manager automatically?
The manager of the threads needs not only to open new threads when all the
already open ones have been taken and to take back used threads, but also to
close threads that have not been used for some time, in order to conserve
resources.
Threadpool class.
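The three duties of the manager can be sketched in a simplified form. This is not the author's Threadpool class: the name thread_manager and its interface are assumptions, the employer is reduced to a plain job queue, and std::thread from C++11 replaces the platform API. A new thread is opened when more jobs are waiting than threads are idle, and a thread closes itself after staying idle for idle_timeout.

```cpp
#include <cassert>
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical manager of threads (simplified sketch).
class thread_manager {
public:
    thread_manager(std::size_t max_threads, std::chrono::milliseconds idle_timeout)
        : max_threads_(max_threads), idle_timeout_(idle_timeout),
          idle_(0), alive_(0), stopping_(false) {
        assert(max_threads > 0);
    }

    // Lets the queued jobs finish, then joins every thread ever opened.
    // submit() must not be called while the manager is being destroyed.
    ~thread_manager() {
        { std::lock_guard<std::mutex> lock(m_); stopping_ = true; }
        cv_.notify_all();
        for (std::thread& t : threads_) t.join();
    }

    void submit(std::function<void()> job) {
        std::lock_guard<std::mutex> lock(m_);
        jobs_.push(std::move(job));
        if (jobs_.size() > idle_ && alive_ < max_threads_) {
            // All open threads are taken: open a new one.
            threads_.emplace_back(&thread_manager::run, this);
            ++alive_;
        } else {
            cv_.notify_one();  // an idle (or soon free) thread will take the job
        }
    }

    std::size_t alive() const {
        std::lock_guard<std::mutex> lock(m_);
        return alive_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            if (!jobs_.empty()) {
                std::function<void()> job = std::move(jobs_.front());
                jobs_.pop();
                lock.unlock();
                job();  // run the job without holding the lock
                lock.lock();
                continue;
            }
            if (stopping_) break;
            ++idle_;
            const bool woken = cv_.wait_for(lock, idle_timeout_, [this] {
                return stopping_ || !jobs_.empty();
            });
            --idle_;
            if (!woken) break;  // unused for too long: close the thread
        }
        --alive_;
    }

    const std::size_t max_threads_;
    const std::chrono::milliseconds idle_timeout_;
    mutable std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()> > jobs_;
    std::vector<std::thread> threads_;  // finished threads are joined at the end
    std::size_t idle_;
    std::size_t alive_;
    bool stopping_;
};
```

Two simplifications are worth naming: std::thread objects of closed threads stay in the vector until the manager is destroyed, and a job that throws an exception terminates the program. A production version would have to handle both.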
Together, we have created a somewhat universal and somewhat "smart" manager
of threads that can perform any tasks given to it by the employers. But where
will we use this manager? Well, the answer to this question is "almost
everywhere we need multiple threads for the high performance of a program".
Memory management 2
Let’s return to memory management for a while. We covered this topic in
previous chapters. What if we take the idea of a manager of threads and apply
it to a memory allocator? This will create a manager that is able to give out
memory allocators and to do so in a multithreaded environment. For this we
will need the synchronization objects that we looked at in the previous
chapter.
Moreover, what if we turn on our imagination and use the method of
generalization, the best friend of our analogies? The idea of renting
high-cost items has won over the minds of the inhabitants of this planet.
Right now you can rent anything from an ordinary TV to a space shuttle. Why
would we invent something that has already been invented for us, if we can
just find a new use for an old idea? Let us imagine that we are the creators
of a renting bureau that will rent out anything that programmers need.
Resource pool – Universal renting bureau.
We will use templates to turn our idea into reality. All we need are some
basic functions to create, destroy, activate, and deactivate an object.
Splitting the functionality into separate creation and activation functions
adds flexibility to the resources.
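A minimal sketch of such a renting bureau is shown below. The names resource_pool, rent, give_back, and string_buffer_traits are assumptions made for this illustration, and std::mutex from C++11 stands in for the synchronization classes of the previous chapter. The four basic functions are supplied by a traits class, so the same pool can rent out allocators, threads, buffers, or connections.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical "renting bureau". Traits must provide:
//   T* create();  void destroy(T*);  void activate(T&);  void deactivate(T&);
// The traits object is called outside the lock, so it must be safe to use
// from several threads (the stateless traits below are).
template <class T, class Traits>
class resource_pool {
public:
    explicit resource_pool(const Traits& traits = Traits()) : traits_(traits) {}
    resource_pool(const resource_pool&) = delete;
    resource_pool& operator=(const resource_pool&) = delete;

    // Resources still rented out are not tracked; return them before this runs.
    ~resource_pool() {
        for (T* r : idle_) traits_.destroy(r);
    }

    // Hand out an idle resource if there is one, otherwise create a new one.
    T* rent() {
        T* r = nullptr;
        {
            std::lock_guard<std::mutex> lock(m_);
            if (!idle_.empty()) {
                r = idle_.back();
                idle_.pop_back();
            }
        }
        if (!r) r = traits_.create();  // expensive step, done once per resource
        traits_.activate(*r);          // cheap step, done on every rental
        return r;
    }

    // Take the resource back and keep it for the next customer.
    void give_back(T* r) {
        assert(r != nullptr);
        traits_.deactivate(*r);
        std::lock_guard<std::mutex> lock(m_);
        idle_.push_back(r);
    }

    std::size_t idle_count() const {
        std::lock_guard<std::mutex> lock(m_);
        return idle_.size();
    }

private:
    Traits traits_;
    mutable std::mutex m_;
    std::vector<T*> idle_;
};

// Example traits: rent out reusable string buffers.
struct string_buffer_traits {
    std::string* create() { return new std::string(); }
    void destroy(std::string* s) { delete s; }
    void activate(std::string& s) { s.reserve(256); }
    void deactivate(std::string& s) { s.clear(); }
};
```

The split between create and activate is visible here: a buffer is created once, while reserve and clear run on every rental and return.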
In this chapter we will try to create a multiplatform and multithreaded
“socket pool”. We will try to use everything that we covered in the previous
chapters. Such an approach to programming, in which code that has been written
once is used more than once, is very productive. First of all, we don’t waste
time on developing basic blocks such as allocators, threads, etc. Second, the
code of these basic blocks is automatically tested in different environments,
which helps the programmer find errors early, while the code is still being
tested.
What does the term “socket pool” mean to us? It should be a multiplatform,
multithreaded class that gives us an asynchronous interface to a number of
socket commands: create, listen, connect, send, recv, close.
A reader who has experience with sockets will notice that we left out accept.
No, we didn’t forget about it. We will delegate this function to the “socket
pool”, because the command listen implies that we want to receive incoming
connections. At any time we will be able to call the method close for any
socket handle to turn off whatever was initiated by the command listen. We
will also need the method create to create a socket. You might ask yourself
what type of sockets we will be able to create. Let’s limit ourselves to TCP
sockets, even though this is not essential. Since all of the commands complete
asynchronously, we will need callbacks to inform the user of the completion of
an asynchronous action, or of an error. These callback methods are convenient
to declare in an abstract class.
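One possible shape of such an abstract class is sketched below. All names here (socket_handle, socket_pool_callback, the on_... methods, and recording_callback) are hypothetical and chosen for illustration; the author's actual interface is not shown in this text. There is one callback per asynchronous command, plus one for errors, and on_accept reports the connections that arrive after listen.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef int socket_handle;  // hypothetical opaque handle type

// Hypothetical callback interface: the socket pool calls these methods when
// an asynchronous command completes or fails.
class socket_pool_callback {
public:
    virtual ~socket_pool_callback() {}
    virtual void on_accept(socket_handle listener, socket_handle accepted) = 0;
    virtual void on_connect(socket_handle s) = 0;
    virtual void on_send(socket_handle s, std::size_t bytes_sent) = 0;
    virtual void on_recv(socket_handle s, const char* data, std::size_t size) = 0;
    virtual void on_close(socket_handle s) = 0;
    virtual void on_error(socket_handle s, int error_code) = 0;
};

// A user-side implementation that only records what happened.
class recording_callback : public socket_pool_callback {
public:
    std::vector<std::string> log;

    void on_accept(socket_handle, socket_handle) override { log.push_back("accept"); }
    void on_connect(socket_handle) override { log.push_back("connect"); }
    void on_send(socket_handle, std::size_t) override { log.push_back("send"); }
    void on_recv(socket_handle, const char* data, std::size_t size) override {
        log.push_back("recv:" + std::string(data, size));
    }
    void on_close(socket_handle) override { log.push_back("close"); }
    void on_error(socket_handle, int) override { log.push_back("error"); }
};
```

The pool would hold only a reference to socket_pool_callback, so the user can supply any implementation without the pool knowing its concrete type.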
Socket pool
How will the “socket pool” work?
One of the threads will initiate asynchronous actions and will need to exist
for as long as the “socket pool” exists. For example, if we close this thread
in Windows, all asynchronous operations will automatically be cancelled.
The second thread is needed only to await the completion of the asynchronous
operations. This thread must not invoke user callbacks, because that can take
a significant and uncontrolled amount of time. This is why the second thread
will take the information about each completed asynchronous operation and
place it in the queue of outgoing events, and other threads will call the user
callbacks.
The maximum number of threads that process the queue of outgoing events will
be set by the user of the socket pool. We would also like to give the user
control over the timeout functionality. That is, if the “socket pool” cannot
complete an asynchronous action within the amount of time set by the user, the
socket pool will need to invoke the error callback and, if possible, cancel
the asynchronous action.
To control the timeouts of the asynchronous actions we will need another
thread, which will be taken from the manager of threads when it is needed.
Working threads that process already completed asynchronous actions can also
be returned to the manager of threads when they have not been used for a
while.
When building multithreaded applications like the socket pool, we are forced
to ask ourselves “uncomfortable” questions. For instance, what will happen if
the user calls the method close when we have already initiated an asynchronous
action? We need to be ready for any action of the user: no matter how absurd
it looks at first, it is possible if the public interface lets her do it. The
code that we are developing will need to answer these “uncomfortable”
questions definitively if we are to develop “protected” code.
The STL library, de jure and de facto, is part of the standard C++ language.
Will we be able to use the STL library in highly efficient applications? First
of all, we need to be careful when choosing among different implementations of
the STL library. The very fact that the library has more than one
implementation tells us that it is very hard to develop a library that would
be optimal in all cases. That is why we are going to write a couple of STL
containers that differ from the existing ones while still satisfying the
standard, so that we can use these containers in our programs.
Let’s imagine that we need to create a container of a million little pieces
organized in a list, perform some operations while looking through the list,
and then destroy the list. Most implementations of the STL list container will
call the destructor a million times, once per element, and this will take a
significant and sometimes critical amount of time. What if all of the elements
of the container were created in external memory provided by an allocator, and
all we are responsible for is making sure that any additional memory needed by
an element of the container is also allocated on the external allocator? In
this case calling the destructors is not necessary. All we need to do is clear
the container, which is one command, and reset the allocator, which is another
command, and we are ready to reuse the container and the allocator. We use
only two commands, whereas with the classical STL implementation we would have
had to call the destructor a million times.
Let’s look at the allocation of memory more carefully. Every time, we allocate
memory of the same size. It would be great if our allocator could reuse freed
blocks of memory multiple times. We have finally come to the idea of an
extended version of the basic allocator. The basic allocator is only able to
allocate blocks of memory of different sizes and to free the memory using the
principle “all or nothing”. The block allocator is able to allocate blocks of
memory of a certain size and, more importantly, to free these blocks of memory
and reuse them.
Great! We now have a new type, the block allocator, which we will use in
places where allocating and freeing memory blocks of a certain length is
needed, such as in the STL containers list<>, stack<>, and map<>.
We will probably also want similar containers with simpler interfaces which,
more importantly, do call the destructors of the container elements and
release memory to the block allocator. That way we could combine an effective
allocator with the more complete functionality of STL containers.
To be published...
Memory Heap.
Let us, dear reader, try something more complex than an STL container. Of
course, the author does not think that the memory heap should be regarded as
something more complex than the block allocator technique already described.
Let’s imagine that we can sacrifice some memory in order to make allocation
more efficient, and make the blocks a bit bigger than we need them to be. For
instance, if the user requests a block of 1-16 bytes, we will allocate a block
of 16 bytes; if the user requests a block of 17-32 bytes, we will allocate a
block of 32 bytes, and so on. Then we will be able to develop a memory heap
that supports the allocation and deallocation of blocks of different sizes.
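The rounding rule can be turned into a small sketch of such a heap. The class name size_class_heap, the 256-byte limit for "small" blocks, and the requirement that the caller passes the requested size back to deallocate are all assumptions of this example; it is single-threaded and uses malloc as the underlying source of memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical heap with 16-byte size classes and one free list per class.
class size_class_heap {
public:
    size_class_heap() {
        for (block*& head : free_) head = nullptr;
    }
    size_class_heap(const size_class_heap&) = delete;
    size_class_heap& operator=(const size_class_heap&) = delete;

    // Cached free blocks go back to the system heap.
    ~size_class_heap() {
        for (block*& head : free_) {
            while (head) {
                block* next = head->next;
                std::free(head);
                head = next;
            }
        }
    }

    // 1-16 bytes -> 16, 17-32 bytes -> 32, and so on.
    static std::size_t rounded_size(std::size_t n) {
        if (n == 0) n = 1;
        return (n + granularity - 1) / granularity * granularity;
    }

    void* allocate(std::size_t n) {
        const std::size_t size = rounded_size(n);
        if (size > max_small) return checked_malloc(size);  // large: no caching
        block*& head = free_[size / granularity - 1];
        if (head) {  // reuse a freed block of the same size class
            block* b = head;
            head = b->next;
            return b;
        }
        return checked_malloc(size);
    }

    // `n` must be the size that was passed to allocate() for this block.
    void deallocate(void* p, std::size_t n) {
        if (!p) return;
        const std::size_t size = rounded_size(n);
        if (size > max_small) {
            std::free(p);
            return;
        }
        block* b = static_cast<block*>(p);
        b->next = free_[size / granularity - 1];
        free_[size / granularity - 1] = b;
    }

private:
    struct block { block* next; };
    static constexpr std::size_t granularity = 16;
    static constexpr std::size_t max_small = 256;

    static void* checked_malloc(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        return p;
    }

    block* free_[max_small / granularity];  // one free list per size class
};
```

Because a request for 20 bytes and a later request for 30 bytes fall into the same 32-byte class, the second one can reuse the block freed by the first, at the price of a few unused bytes per block.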
Where would this type of memory heap be most effective? For instance, during
the allocation and deallocation of memory for strings of different lengths.
Asynchronous IO gate as the next layer above the socket pool.
Asynchronous IO gate
For the socket pool we developed to be fully functional, it is necessary to
allocate memory for the incoming messages and to keep the already allocated
memory alive until the asynchronous “send” operation completes. In this
chapter, we will describe code that greatly eases the process of working with
the socket pool and demonstrates the power of the C++ language. As we do not
know the length of the buffer for outgoing bytes on a socket connection, we
can break all messages into blocks of a fixed size (or less), for instance,
64 KB. The fact that we break outgoing messages into multiple blocks does not
contradict the logic of sending messages through sockets, as at a lower level
the messages are broken into smaller packets anyway.
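The arithmetic of the split is simple enough to show in a few lines. This is only a sketch: the names block_range and split_into_blocks are hypothetical, and the function computes offsets and lengths without copying any data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One block of an outgoing message: where it starts and how long it is.
struct block_range {
    std::size_t offset;
    std::size_t length;
};

// Break a message of `message_size` bytes into blocks of at most
// `block_size` bytes; only the last block can be shorter.
inline std::vector<block_range> split_into_blocks(
        std::size_t message_size, std::size_t block_size = 64 * 1024) {
    assert(block_size > 0);
    std::vector<block_range> blocks;
    for (std::size_t offset = 0; offset < message_size; offset += block_size) {
        const std::size_t remaining = message_size - offset;
        blocks.push_back(
            block_range{offset, remaining < block_size ? remaining : block_size});
    }
    return blocks;
}
```

For example, a message of 150000 bytes yields two full blocks of 65536 bytes and a final block of 18928 bytes.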
Can we apply the same logic to incoming messages? If we knew the size of the
incoming message ahead of time, it would be possible. However, in general the
incoming data is a stream of bytes, which cannot be broken into separate
messages at this level. So the answer to this question is no: we cannot treat
incoming messages as complete packages. That is why we will just use a buffer
with a fixed size.
The above code lets us initiate asynchronous send commands from any thread,
and stargate guarantees that messages are sent in the same order in which they
were initiated. The incoming messages of one socket connection will be
processed by only one thread at a time. Otherwise, the following could happen:
one thread calls the callback for an incoming message, and another thread
calls the callback for the next incoming message on the same connection.
Depending on the current CPU usage, the later message might be processed
first.
That is why we take the job of synchronizing the threads that process incoming
messages upon ourselves.
Database.
Many of our respected readers have already encountered, in their work, the
need to develop programs that communicate with databases. Even though
developers of database servers offer portable client libraries, most of the
time they are C libraries with a number of API functions. Developing C++
programs that use such APIs is not one of the most enjoyable tasks. Moreover,
a transition from one database server to another (for example among MySQL,
Oracle, and MSSQL) will require changes to the already developed modules. Of
course, we can just use the ODBC API for access to all databases. The
drawbacks of this approach are that ODBC does not offer its full functionality
on every platform and that the ODBC API is written in C style.
The idea of having one universal C++ interface that is as independent as
possible from any particular database server is very attractive. One of the
important parts of such an interface is the ability to perform SQL requests
asynchronously.
ODBC DB access
ORACLE DB access
MYSQL DB access