GiantSoft

August 18, 2009

Interrupts and Context Switching

Filed under: Linux, Software Programming — minitia @ 11:39 pm

Interrupts and Context Switching
To drastically simplify how computers work, you could say that computers do nothing more than shuffle bits (i.e. 1s and 0s) around. All computer data is based on these binary digits, which are represented in computers as voltages (for example, 5 V for a 1 and 0 V for a 0), and these voltages are physically manipulated through transistors, circuits and so on. When you get into the guts of a computer and start looking at how it works, it seems amazing how many operations it takes to do something simple, like addition or multiplication. Of course, computers have gotten a lot smaller and thus a lot faster, to the point where they can perform millions of these operations per second, so it still feels fast. The processor performs these operations in a serial fashion – basically a single-file line of operations.


August 12, 2009

High-Performance Server Architecture

Filed under: Software Programming — minitia @ 12:18 am

Introduction

The purpose of this document is to share some ideas that I’ve
developed over the years about how to develop a certain kind of
application for which the term "server" is only a weak approximation.
More accurately, I’ll be writing about a broad class of programs that
are designed to handle very large numbers of discrete messages or
requests per second. Network servers most commonly fit this definition,
but not all programs that do are really servers in any sense of the
word. For the sake of simplicity, though, and because "High-Performance
Request-Handling Programs" is a really lousy title, we’ll just say
"server" and be done with it.

I will not be writing about "mildly parallel" applications, even
though multitasking within a single program is now commonplace. The
browser you’re using to read this probably does some things in
parallel, but such low levels of parallelism really don’t introduce
many interesting challenges. The interesting challenges occur when the
request-handling infrastructure itself is the limiting factor on
overall performance, so that improving the infrastructure actually
improves performance. That’s not often the case for a browser running
on a gigahertz processor with a gigabyte of memory doing six
simultaneous downloads over a DSL line. The focus here is not on
applications that sip through a straw but on those that drink from a
firehose, on the very edge of hardware capabilities where how you do it
really does matter.

Some people will inevitably take issue with some of my comments and
suggestions, or think they have an even better way. Fine. I’m not
trying to be the Voice of God here; these are just methods that I’ve
found to work for me, not only in terms of their effects on performance
but also in terms of their effects on the difficulty of debugging or
extending code later. Your mileage may vary. If something else works
better for you that’s great, but be warned that almost everything I
suggest here exists as an alternative to something else that I tried
once only to be disgusted or horrified by the results. Your pet idea
might very well feature prominently in one of these stories, and
innocent readers might be bored to death if you encourage me to start
telling them. You wouldn’t want to hurt them, would you?

The rest of this article is going to be centered around what I’ll call the Four Horsemen of Poor Performance:

Data copies
Context switches
Memory allocation
Lock contention

There will also be a catch-all section at the end, but these are the
biggest performance-killers. If you can handle most requests without
copying data, without a context switch, without going through the
memory allocator and without contending for locks, you’ll have a server
that performs well even if it gets some of the minor parts wrong.

Data Copies

This could be a very short section, for one very simple reason: most
people have learned this lesson already. Everybody knows data copies
are bad; it’s obvious, right? Well, actually, it probably only seems
obvious because you learned it very early in your computing career, and
that only happened because somebody started putting out the word
decades ago. I know that’s true for me, but I digress. Nowadays it’s
covered in every school curriculum and in every informal how-to. Even
the marketing types have figured out that "zero copy" is a good
buzzword.

Despite the after-the-fact obviousness of copies being bad, though,
there still seem to be nuances that people miss. The most important of
these is that data copies are often hidden and disguised. Do you really
know whether any code you call in drivers or libraries does data
copies? It’s probably more than you think. Guess what "Programmed I/O"
on a PC refers to. An example of a copy that’s disguised rather than
hidden is a hash function, which has all the memory-access cost of a
copy and also involves more computation. Once it’s pointed out that
hashing is effectively "copying plus" it seems obvious that it should
be avoided, but I know at least one group of brilliant people who had
to figure it out the hard way. If you really want to get rid of data
copies, either because they really are hurting performance or because
you want to put "zero-copy operation" on your hacker-conference slides,
you’ll need to track down a lot of things that really are data copies
but don’t advertise themselves as such.

The tried and true method for avoiding data copies is to use
indirection, and pass buffer descriptors (or chains of buffer
descriptors) around instead of mere buffer pointers. Each descriptor
typically consists of the following:

A pointer and length for the whole buffer.
A pointer and length, or offset and length, for the part of the buffer that’s actually filled.
Forward and back pointers to other buffer descriptors in a list.
A reference count.

Now, instead of copying a piece of data to make sure it stays in
memory, code can simply increment a reference count on the appropriate
buffer descriptor. This can work extremely well under some conditions,
including the way that a typical network protocol stack operates, but
it can also become a really big headache. Generally speaking, it’s easy
to add buffers at the beginning or end of a chain, to add references to
whole buffers, and to deallocate a whole chain at once. Adding in the
middle, deallocating piece by piece, or referring to partial buffers
will each make life increasingly difficult. Trying to split or combine
buffers will simply drive you insane.
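
As a rough illustration of the descriptor idea, here is a minimal C++11 sketch. The names (BufDesc, buf_alloc, buf_append, buf_ref, buf_unref) are made up for this example and are not from any particular library; the layout simply follows the four fields listed above, and chain management is left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical descriptor: whole buffer, filled part, chain links, refcount.
struct BufDesc {
    char*            base     = nullptr; // start of the whole buffer
    std::size_t      capacity = 0;       // size of the whole buffer
    std::size_t      offset   = 0;       // where the filled part begins
    std::size_t      length   = 0;       // size of the filled part
    BufDesc*         next     = nullptr; // forward link in a chain
    BufDesc*         prev     = nullptr; // back link in a chain
    std::atomic<int> refs{1};            // the creator holds the first reference
};

BufDesc* buf_alloc(std::size_t capacity) {
    BufDesc* d = new BufDesc;
    d->base = new char[capacity];
    d->capacity = capacity;
    return d;
}

// Append bytes to the filled part; returns how many bytes actually fit.
std::size_t buf_append(BufDesc* d, const char* src, std::size_t n) {
    std::size_t room = d->capacity - (d->offset + d->length);
    if (n > room) n = room;
    std::memcpy(d->base + d->offset + d->length, src, n);
    d->length += n;
    return n;
}

// Keep the data alive for another holder without copying it.
BufDesc* buf_ref(BufDesc* d) {
    d->refs.fetch_add(1, std::memory_order_relaxed);
    return d;
}

// Drop one reference; the last release frees the buffer. Returns true if freed.
bool buf_unref(BufDesc* d) {
    int before = d->refs.fetch_sub(1, std::memory_order_acq_rel);
    assert(before > 0);
    if (before != 1) return false;
    delete[] d->base;
    delete d;
    return true;
}
```

A second holder calls buf_ref() where it would otherwise have copied the bytes, and every holder calls buf_unref() when it is finished.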

I don’t actually recommend using this approach for everything, though. Why not? Because it gets to be a huge
pain when you have to walk through descriptor chains every time you
want to look at a header field. There really are worse things than data
copies. I find that the best thing to do is to identify the large
objects in a program, such as data blocks, make sure those get allocated separately as described above so that they don’t need to be copied, and not sweat too much about the other stuff.

This brings me to my last point about data copies: don’t go
overboard avoiding them. I’ve seen way too much code that avoids data
copies by doing something even worse, like forcing a context switch or
breaking up a large I/O request. Data copies are expensive, and when
you’re looking for places to avoid redundant operations they’re one of
the first things you should look at, but there is a point of
diminishing returns. Combing through code and then making it twice as
complicated just to get rid of that last few data copies is usually a
waste of time that could be better spent in other ways.

Context Switches

Whereas everyone thinks it’s obvious that data copies are bad, I’m
often surprised by how many people totally ignore the effect of context
switches on performance. In my experience, context switches are
actually behind more total "meltdowns" at high load than data copies;
the system starts spending more time going from one thread to another
than it actually spends within any thread doing useful work. The
amazing thing is that, at one level, it’s totally obvious what causes
excessive context switching. The #1 cause of context switches is having
more active threads than you have processors. As the ratio of active
threads to processors increases, the number of context switches also
increases – linearly if you’re lucky, but often exponentially. This
very simple fact explains why multi-threaded designs that have one
thread per connection scale very poorly. The only realistic alternative
for a scalable system is to limit the number of active threads so it’s
(usually) less than or equal to the number of processors. One popular
variant of this approach is to use only one thread, ever; while such an
approach does avoid context thrashing, and avoids the need for locking
as well, it is also incapable of achieving more than one processor’s
worth of total throughput and thus remains beneath contempt unless the
program will be non-CPU-bound (usually network-I/O-bound) anyway.

The first thing that a "thread-frugal" program has to do is figure
out how it’s going to make one thread handle multiple connections at
once. This usually implies a front end that uses select/poll,
asynchronous I/O, signals or completion ports, with an event-driven
structure behind that. Many "religious wars" have been fought, and
continue to be fought, over which of the various front-end APIs is
best. Dan Kegel’s C10K paper
is a good resource in this area. Personally, I think all flavors of
select/poll and signals are ugly hacks, and therefore favor either AIO
or completion ports, but it actually doesn’t matter that much. They all
– except maybe select() – work reasonably well, and don’t really do
much to address the matter of what happens past the very outermost
layer of your program’s front end.

The simplest conceptual model of a multi-threaded event-driven
server has a queue at its center; requests are read by one or more
"listener" threads and put on queues, from which one or more "worker"
threads will remove and process them. Conceptually, this is a good
model, but all too often people actually implement their code this way.
Why is this wrong? Because the #2 cause of context switches is
transferring work from one thread to another. Some people even compound
the error by requiring that the response to a request be sent by the
original thread – guaranteeing not one but two context switches per
request. It’s very important to use a "symmetric" approach in which a
given thread can go from being a listener to a worker to a listener
again without ever changing context. Whether this involves partitioning
connections between threads or having all threads take turns being
listener for the entire set of connections seems to matter a lot less.

Usually, it’s not possible to know how many threads will be active
even one instant into the future. After all, requests can come in on
any connection at any moment, or "background" threads dedicated to
various maintenance tasks could pick that moment to wake up. If you
don’t know how many threads are active, how can you limit
how many are active? In my experience, one of the most effective
approaches is also one of the simplest: use an old-fashioned counting
semaphore which each thread must hold whenever it’s doing "real work".
If the thread limit has already been reached then each listen-mode
thread might incur one extra context switch as it wakes up and then
blocks on the semaphore, but once all listen-mode threads have blocked
in this way they won’t continue contending for resources until one of
the existing threads "retires" so the system effect is negligible. More
importantly, this method handles maintenance threads – which sleep most
of the time and therefore don’t count against the active thread count –
more gracefully than most alternatives.
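
To make the semaphore idea concrete, here is a small sketch built from C++11 primitives (C++20 also offers std::counting_semaphore). ActiveLimit and ActiveSlot are names invented for this example. A thread would create an ActiveSlot just before it starts "real work" and let it go out of scope before it goes back to listening.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical counting semaphore that caps how many threads do real work.
class ActiveLimit {
public:
    explicit ActiveLimit(int slots) : free_(slots) { assert(slots > 0); }

    // Block until a slot is free, then take it.
    void acquire() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return free_ > 0; });
        --free_;
    }

    // Take a slot only if one is free right now.
    bool try_acquire() {
        std::lock_guard<std::mutex> lk(m_);
        if (free_ == 0) return false;
        --free_;
        return true;
    }

    // Give the slot back and wake one waiting thread, if any.
    void release() {
        {
            std::lock_guard<std::mutex> lk(m_);
            ++free_;
        }
        cv_.notify_one();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int free_;
};

// RAII guard: holds one slot for as long as the object lives.
class ActiveSlot {
public:
    explicit ActiveSlot(ActiveLimit& limit) : limit_(limit) { limit_.acquire(); }
    ~ActiveSlot() { limit_.release(); }
    ActiveSlot(const ActiveSlot&) = delete;
    ActiveSlot& operator=(const ActiveSlot&) = delete;

private:
    ActiveLimit& limit_;
};
```

Sizing the limit to the number of processors is one reasonable starting point; maintenance threads that never take a slot are simply not counted.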

Once the processing of requests has been broken up into two stages
(listener and worker) with multiple threads to service the stages, it’s
natural to break up the processing even further into more than two
stages. In its simplest form, processing a request thus becomes a
matter of invoking stages successively in one direction, and then in
the other (for replies). However, things can get more complicated; a
stage might represent a "fork" between two processing paths which
involve different stages, or it might generate a reply (e.g. a cached
value) itself without invoking further stages. Therefore, each stage
needs to be able to specify "what should happen next" for a request.
There are three possibilities, represented by return values from the
stage’s dispatch function:

The request needs to be passed on to another stage (an ID or pointer in the return value).
The request has been completed (a special "request done" return value).
The request was blocked (a special "request blocked" return value). This is equivalent to the previous case, except that the request is not freed and will be continued later from another thread.
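
The three return values can be sketched in a few lines of C++11. Everything here (Request, Stage, kDone, kBlocked, run_request, make_demo_stages) is a hypothetical name chosen for illustration, and the two demo stages are toy stand-ins for real processing.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

struct Request {
    std::string key;
    std::string reply;
};

// Special return values; any non-negative value is the index of the next stage.
constexpr int kDone    = -1;
constexpr int kBlocked = -2;

using Stage = std::function<int(Request&)>;

// Drive one request through successive stages on the calling thread,
// stopping when a stage reports "done" or "blocked".
int run_request(const std::vector<Stage>& stages, Request& req, int first = 0) {
    int next = first;
    while (next >= 0) {
        assert(static_cast<std::size_t>(next) < stages.size());
        next = stages[next](req);
    }
    return next;
}

// Toy pipeline: stage 0 answers from a "cache" or forwards to stage 1;
// stage 1 either computes a reply or reports that the request is blocked.
std::vector<Stage> make_demo_stages() {
    std::vector<Stage> stages;
    stages.push_back([](Request& r) -> int {
        if (r.key == "cached") { r.reply = "from-cache"; return kDone; }
        return 1;
    });
    stages.push_back([](Request& r) -> int {
        if (r.key == "slow") return kBlocked;
        r.reply = "computed:" + r.key;
        return kDone;
    });
    return stages;
}
```

Note that the same thread carries the request from stage to stage; nothing is queued between stages in this sketch.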

Note that, in this model, queuing of requests is done within
stages, not between stages. This avoids the common silliness of
constantly putting a request on a successor stage’s queue, then
immediately invoking that successor stage and dequeuing the request
again; I call that lots of queue activity – and locking – for nothing.

If this idea of separating a complex task into multiple smaller
communicating parts seems familiar, that’s because it’s actually very
old. My approach has its roots in the Communicating Sequential Processes
concept elucidated by C.A.R. Hoare in 1978, based in turn on ideas from
Per Brinch Hansen and Melvin Conway going back to 1963 – before I was
born! However, when Hoare coined the term CSP he meant "process" in the
abstract mathematical sense, and a CSP process need bear no relation to
the operating-system entities of the same name. In my opinion, the
common approach of implementing CSP via thread-like coroutines within a
single OS thread gives the user all of the headaches of concurrency
with none of the scalability.

A contemporary example of the staged-execution idea evolved in a saner direction is Matt Welsh’s SEDA.
In fact, SEDA is such a good example of "server architecture done
right" that it’s worth commenting on some of its specific
characteristics (especially where those differ from what I’ve outlined
above).

SEDA’s "batching" tends to emphasize processing multiple requests
through a stage at once, while my approach tends to emphasize
processing a single request through multiple stages at once.

SEDA’s one significant flaw, in my opinion, is that it allocates a
separate thread pool to each stage with only "background" reallocation
of threads between stages in response to load. As a result, the #1 and
#2 causes of context switches noted above are still very much present.

In the context of an academic research project, implementing SEDA
in Java might make sense. In the real world, though, I think the choice
can be characterized as unfortunate.

Memory Allocation

Allocating and freeing memory is one of the most common operations
in many applications. Accordingly, many clever tricks have been
developed to make general-purpose memory allocators more efficient.
However, no amount of cleverness can make up for the fact that the very
generality of such allocators inevitably makes them far less efficient
than the alternatives in many cases. I therefore have three suggestions
for how to avoid the system memory allocator altogether.

Suggestion #1 is simple preallocation. We all know that static
allocation is bad when it imposes artificial limits on program
functionality, but there are many other forms of preallocation that can
be quite beneficial. Usually the reason comes down to the fact that one
trip through the system memory allocator is better than several, even
when some memory is "wasted" in the process. Thus, if it’s possible to
assert that no more than N items could ever be in use at once,
preallocation at program startup might be a valid choice. Even when
that’s not the case, preallocating everything that a request handler
might need right at the beginning might be preferable to allocating
each piece as it’s needed; aside from the possibility of allocating
multiple items contiguously in one trip through the system allocator,
this often greatly simplifies error-recovery code. If memory is very
tight then preallocation might not be an option, but in all but the
most extreme circumstances it generally turns out to be a net win.

Suggestion #2 is to use lookaside lists for objects that are
allocated and freed frequently. The basic idea is to put recently-freed
objects onto a list instead of actually freeing them, in the hope that
if they’re needed again soon they need merely be taken off the list
instead of being allocated from system memory. As an additional
benefit, transitions to/from a lookaside list can often be implemented
to skip complex object initialization/finalization.

It’s generally undesirable to have lookaside lists grow without
bound, never actually freeing anything even when your program is idle.
Therefore, it’s usually necessary to have some sort of periodic
"sweeper" task to free inactive objects, but it would also be
undesirable if the sweeper introduced undue locking complexity or
contention. A good compromise is therefore a system in which a
lookaside list actually consists of separately locked "old" and "new"
lists. Allocation is done preferentially from the new list, then from
the old list, and from the system only as a last resort; objects are
always freed onto the new list. The sweeper thread operates as follows:

1. Lock both lists.
2. Save the head for the old list.
3. Make the (previously) new list into the old list by assigning list heads.
4. Unlock.
5. Free everything on the saved old list at leisure.

Objects in this sort of system are actually freed only after they
have gone unused for at least one full sweeper interval, and always
before two have passed. Most importantly, the sweeper does most of its work
without holding any locks to contend with regular threads. In theory,
the same approach can be generalized to more than two stages, but I
have yet to find that useful.
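
Here is one way the old/new scheme and its sweeper might look in C++11. Node and Lookaside are invented names, the 64-byte payload is arbitrary, and the periodic timer that would call sweep() is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

struct Node {
    Node* next = nullptr;
    char  payload[64];
};

// Hypothetical two-generation lookaside list with separately locked lists.
class Lookaside {
public:
    ~Lookaside() { free_chain(new_head_); free_chain(old_head_); }

    // Prefer the new list, then the old list, then the system allocator.
    Node* alloc() {
        if (Node* n = pop(new_lock_, new_head_)) return n;
        if (Node* n = pop(old_lock_, old_head_)) return n;
        return new Node;
    }

    // Objects are always freed onto the new list.
    void release(Node* n) {
        assert(n != nullptr);
        std::lock_guard<std::mutex> g(new_lock_);
        n->next = new_head_;
        new_head_ = n;
    }

    // One sweeper pass; returns how many objects went back to the system.
    std::size_t sweep() {
        Node* doomed = nullptr;
        {
            std::lock(new_lock_, old_lock_);  // take both locks without deadlock
            std::lock_guard<std::mutex> a(new_lock_, std::adopt_lock);
            std::lock_guard<std::mutex> b(old_lock_, std::adopt_lock);
            doomed    = old_head_;            // save the old list
            old_head_ = new_head_;            // the new list becomes the old list
            new_head_ = nullptr;
        }
        return free_chain(doomed);            // freeing happens with no lock held
    }

private:
    static Node* pop(std::mutex& m, Node*& head) {
        std::lock_guard<std::mutex> g(m);
        Node* n = head;
        if (n) { head = n->next; n->next = nullptr; }
        return n;
    }

    static std::size_t free_chain(Node* n) {
        std::size_t freed = 0;
        while (n) { Node* next = n->next; delete n; n = next; ++freed; }
        return freed;
    }

    std::mutex new_lock_, old_lock_;
    Node* new_head_ = nullptr;
    Node* old_head_ = nullptr;
};
```

An object released just before one sweep survives that sweep on the old list and is freed by the next one, provided nobody allocates it in between.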

One concern with using lookaside lists is that the list pointers
might increase object size. In my experience, most of the objects that
I’d use lookaside lists for already contain list pointers anyway, so
it’s kind of a moot point. Even if the pointers were only needed for
the lookaside lists, though, the savings in terms of avoided trips
through the system memory allocator (and object initialization) would
more than make up for the extra memory.

Suggestion #3 actually has to do with locking, which we haven’t
discussed yet, but I’ll toss it in anyway. Lock contention is often the
biggest cost in allocating memory, even when lookaside lists are in
use. One solution is to maintain multiple private lookaside lists, such
that there’s absolutely no possibility of contention for any one list.
For example, you could have a separate lookaside list for each thread.
One list per processor can be even better, due to cache-warmth
considerations, but only works if threads cannot be preempted. The
private lookaside lists can even be combined with a shared list if
necessary, to create a system with extremely low allocation overhead.
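
A per-thread private list is easy to sketch with C++11 thread_local storage. Obj, LocalCache, obj_alloc and obj_free are hypothetical names; this version has no shared fallback list and no size cap, both of which a real program would probably want.

```cpp
#include <cassert>
#include <cstddef>

struct Obj {
    Obj* next  = nullptr;
    int  value = 0;
};

// One private free list per thread: no lock is needed because no other
// thread can reach it. Leftovers are freed when the thread exits.
struct LocalCache {
    Obj*        head  = nullptr;
    std::size_t count = 0;
    ~LocalCache() {
        while (head) { Obj* n = head; head = n->next; delete n; }
    }
};

inline LocalCache& local_cache() {
    thread_local LocalCache cache;
    return cache;
}

// Reuse a recently freed object from this thread's list if there is one.
Obj* obj_alloc() {
    LocalCache& c = local_cache();
    if (c.head) {
        Obj* o = c.head;
        c.head = o->next;
        --c.count;
        o->next = nullptr;
        return o;
    }
    return new Obj;
}

// Put the object on the calling thread's list instead of freeing it.
void obj_free(Obj* o) {
    assert(o != nullptr);
    LocalCache& c = local_cache();
    o->next = c.head;
    c.head = o;
    ++c.count;
}
```

One thing to watch: an object freed by a different thread than the one that allocated it lands on the freeing thread's list, so lists can become unbalanced if work is handed between threads.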

Lock Contention

Efficient locking schemes are notoriously hard to design, because of
what I call Scylla and Charybdis after the monsters in the Odyssey.
Scylla is locking that’s too simplistic and/or coarse-grained,
serializing activities that can or should proceed in parallel and thus
sacrificing performance and scalability; Charybdis is overly complex or
fine-grained locking, with space for locks and time for lock operations
again sapping performance. Near Scylla are shoals representing deadlock
and livelock conditions; near Charybdis are shoals representing race
conditions. In between, there’s a narrow channel that represents
locking which is both efficient and correct…or is there? Since
locking tends to be deeply tied to program logic, it’s often impossible
to design a good locking scheme without fundamentally changing how the
program works. This is why people hate locking, and try to rationalize
their use of non-scalable single-threaded approaches.

Almost every locking scheme starts off as "one big lock around
everything" and a vague hope that performance won’t suck. When that
hope is dashed, and it almost always is, the big lock is broken up into
smaller ones and the prayer is repeated, and then the whole process is
repeated, presumably until performance is adequate. Often, though, each
iteration increases complexity and locking overhead by 20-50% in return
for a 5-10% decrease in lock contention. With luck, the net result is
still a modest increase in performance, but actual decreases are not
uncommon. The designer is left scratching his head (I use "his" because
I’m a guy myself; get over it). "I made the locks finer grained like
all the textbooks said I should," he thinks, "so why did performance
get worse?"

In my opinion, things got worse because the aforementioned
approach is fundamentally misguided. Imagine the "solution space" as a
mountain range, with high points representing good solutions and low
points representing bad ones. The problem is that the "one big lock"
starting point is almost always separated from the higher peaks by all
manner of valleys, saddles, lesser peaks and dead ends. It’s a classic
hill-climbing problem; trying to get from such a starting point to the
higher peaks only by taking small steps and never going downhill almost
never works. What’s needed is a fundamentally different way of approaching the peaks.

The first thing you have to do is form a mental map of your program’s locking. This map has two axes:

The vertical axis represents code. If you’re using a staged
architecture with non-branching stages, you probably already have a
diagram showing these divisions, like the ones everybody uses for
OSI-model network protocol stacks.

The horizontal axis represents data. In every stage, each request
should be assigned to a data set with its own resources separate from
any other set.

You now have a grid, where each cell represents a particular data
set in a particular processing stage. What’s most important is the
following rule: two requests should not be in contention unless they
are in the same data set and the same processing stage. If you can manage that, you’ve already won half the battle.

Once you’ve defined the grid, every type of locking your program
does can be plotted, and your next goal is to ensure that the resulting
dots are as evenly distributed along both axes as possible.
Unfortunately, this part is very application-specific. You have to
think like a diamond-cutter, using your knowledge of what the program
does to find the natural "cleavage lines" between stages and data sets.
Sometimes they’re obvious to start with. Sometimes they’re harder to
find, but seem more obvious in retrospect. Dividing code into stages is
a complicated matter of program design, so there’s not much I can offer
there, but here are some suggestions for how to define data sets:

If you have some sort of a block number or hash or transaction ID
associated with requests, you can rarely do better than to divide that
value by the number of data sets and use the remainder to pick the set.

Sometimes, it’s better to assign requests to data sets dynamically,
based on which data set has the most resources available rather than
some intrinsic property of the request. Think of it like multiple
integer units in a modern CPU; those guys know a thing or two about
making discrete requests flow through a system.

It’s often helpful to make sure that the data-set assignment is
different for each stage, so that requests which would contend at one
stage are guaranteed not to do so at another stage.
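
Here is a small sketch of the grid and of a per-stage data-set assignment. The sizes (3 stages, 8 data sets) and the names data_set_for, g_cell_locks and lock_for are assumptions made for the example. The per-stage trick used here is just one possibility: each stage looks at a different base-8 "digit" of the request ID, so two IDs that share a set at one stage are separated at the next whenever their next digits differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

constexpr std::size_t kStages   = 3;  // rows of the grid (code)
constexpr std::size_t kDataSets = 8;  // columns of the grid (data)

// Stage 0 uses id % 8, stage 1 uses (id / 8) % 8, stage 2 uses (id / 64) % 8.
std::size_t data_set_for(std::uint64_t id, std::size_t stage) {
    assert(stage < kStages);
    for (std::size_t i = 0; i < stage; ++i) id /= kDataSets;
    return static_cast<std::size_t>(id % kDataSets);
}

// One lock per grid cell: a request only ever contends with requests
// in the same stage and the same data set.
std::mutex g_cell_locks[kStages][kDataSets];

std::mutex& lock_for(std::uint64_t id, std::size_t stage) {
    return g_cell_locks[stage][data_set_for(id, stage)];
}
```

For instance, IDs 0 and 8 share a cell at stage 0 but use different cells at stage 1. IDs that agree in every digit a stage examines will still share cells, so this spreads contention rather than eliminating it.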

If you’ve divided your "locking space" both vertically and
horizontally, and made sure that lock activity is spread evenly across
the resulting cells, you can be pretty sure that your locking is in
pretty good shape. There’s one more step, though. Do you remember the
"small steps" approach I derided a few paragraphs ago? It still has its
place, because now you’re at a good starting point instead of a
terrible one. In metaphorical terms you’re probably well up the slope
on one of the mountain range’s highest peaks, but you’re probably not
at the top of one. Now is the time to collect contention statistics and
see what you need to do to improve, splitting stages and data sets in
different ways and then collecting more statistics until you’re
satisfied. If you do all that, you’re sure to have a fine view from the
mountaintop.

Other Stuff

As promised, I’ve covered the four biggest performance problems in
server design. There are still some important issues that any
particular server will need to address, though. Mostly, these come down
to knowing your platform/environment:

How does your storage subsystem perform with larger vs. smaller
requests? With sequential vs. random? How well do read-ahead and
write-behind work?

How efficient is the network protocol you’re using? Are there
parameters or flags you can set to make it perform better? Are there
facilities like TCP_CORK, MSG_PUSH, or the Nagle-toggling trick that
you can use to avoid tiny messages?

Does your system support scatter/gather I/O (e.g. readv/writev)?
Using these can improve performance and also take much of the pain out
of using buffer chains.

What’s your page size? What’s your cache-line size? Is it worth it
to align stuff on these boundaries? How expensive are system calls or
context switches, relative to other things?

Are your reader/writer lock primitives subject to starvation? Of
whom? Do your events have "thundering herd" problems? Does your
sleep/wakeup have the nasty (but very common) behavior that when X
wakes Y a context switch to Y happens immediately even if X still has
things to do?
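
As an example of the scatter/gather point, this sketch sends a header and a body that live in separate buffers with a single POSIX writev() call, so there is no need to copy them into one contiguous buffer first. The function name send_two_parts is made up, and the code assumes a POSIX system.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/uio.h>   // writev, struct iovec (POSIX)
#include <unistd.h>    // ssize_t, pipe, read, close

// Write two separate buffers with one system call and no intermediate copy.
// Like write(), writev() may transfer fewer bytes than requested, so a real
// caller would check the result and retry with the remainder.
ssize_t send_two_parts(int fd, const std::string& header, const std::string& body) {
    assert(fd >= 0);
    iovec parts[2];
    parts[0].iov_base = const_cast<char*>(header.data());
    parts[0].iov_len  = header.size();
    parts[1].iov_base = const_cast<char*>(body.data());
    parts[1].iov_len  = body.size();
    return writev(fd, parts, 2);
}
```

With a chain of buffer descriptors, the same idea applies: fill one iovec entry per descriptor and hand the whole array to the kernel.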

I’m sure I could think of many more questions in this vein. I’m sure
you could too. In any particular situation it might not be worthwhile
to do anything about any one of these issues, but it’s usually worth at
least thinking about them. If you don’t know the answers – many of
which you will not find in the system documentation – find out.
Write a test program or micro-benchmark to find the answers
empirically; writing such code is a useful skill in and of itself
anyway. If you’re writing code to run on multiple platforms, many of
these questions correlate with points where you should probably be
abstracting functionality into per-platform libraries so you can
realize a performance gain on that one platform that supports a
particular feature.
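
A micro-benchmark does not need to be elaborate. The helper below (ns_per_call is a name invented here) times an arbitrary operation with std::chrono::steady_clock and reports the average cost per call. It is only a starting point: it ignores warm-up, variance and compiler optimizations that may remove an operation with no visible effect.

```cpp
#include <cassert>
#include <chrono>

// Run op() the given number of times and return the mean time per call
// in nanoseconds.
template <class Op>
double ns_per_call(Op op, int iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) op();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count() / iterations;
}
```

The operation could be a system call, a lock/unlock pair, or a trip through the memory allocator; comparing the numbers across request sizes or platforms is where the answers come from.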

The "know the answers" theory applies to your own code, too. Figure
out what the important high-level operations in your code are, and time
them under different conditions. This is not quite the same as
traditional profiling; it’s about measuring design elements,
not actual implementations. Low-level optimization is generally the
last resort of someone who screwed up the design.

August 10, 2009

Anatomy of the Linux kernel

Filed under: Linux — minitia @ 10:42 pm

From: http://www.ibm.com/developerworks/linux/library/l-linux-slab-allocator/

August 6, 2009

Basic Input/Output

Filed under: C and C++ — minitia @ 2:40 am

Until now, the example programs of previous sections provided very little interaction with the user, if any at all. Using the standard input and output library, we will be able to interact with the user by printing messages on the screen and getting the user’s input from the keyboard.

C++ uses a convenient abstraction called streams to perform input and output operations in sequential media such as the screen or the keyboard. A stream is an object where a program can either insert or extract characters to/from it. We do not really need to care about many specifications about the physical media associated with the stream – we only need to know it will accept or provide characters sequentially.

The standard C++ library includes the header file iostream, where the standard input and output stream objects are declared.

Variables. Data Types.

Filed under: C and C++ — minitia @ 12:42 am

The usefulness of the "Hello World" programs shown in the previous section is quite questionable. We had to write several lines of code, compile them, and then execute the resulting program just to obtain a simple sentence written on the screen as result. It certainly would have been much faster to type the output sentence by ourselves. However, programming is not limited only to printing simple texts on the screen. In order to go a little further on and to become able to write programs that perform useful tasks that really save us work we need to introduce the concept of variable.

Let us think that I ask you to retain the number 5 in your mental memory, and then I ask you to memorize also the number 2 at the same time. You have just stored two different values in your memory. Now, if I ask you to add 1 to the first number I said, you should be retaining the numbers 6 (that is 5+1) and 2 in your memory. Values that we could now for example subtract and obtain 4 as result.

The whole process that you have just done with your mental memory is a simile of what a computer can do with two variables. The same process can be expressed in C++ with the following instruction set:

a = 5;
b = 2;
a = a + 1;
result = a - b;

Obviously, this is a very simple example since we have only used two small integer values, but consider that your computer can store millions of numbers like these at the same time and conduct sophisticated mathematical operations with them.
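
The four statements above are not a complete program on their own, because the variables have not been declared. Here is a small sketch that wraps them in a function. Declaring the variables as int and the function name compute_result are assumptions made for this example.

```cpp
#include <cassert>
#include <iostream>

// Performs the same steps as the mental exercise and returns the result.
int compute_result() {
    int a, b;      // declare the variables before using them
    int result;

    a = 5;
    b = 2;
    a = a + 1;     // a now holds 6
    result = a - b;

    assert(result == 4);  // the value worked out by hand in the text
    return result;
}

// Prints the result followed by a newline.
void print_result() {
    std::cout << compute_result() << std::endl;
}
```

Calling print_result() from main prints the number 4.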

Therefore, we can define a variable as a portion of memory to store a determined value.

Each variable needs an identifier that distinguishes it from the others, for example, in the previous code the variable identifiers were a, b and result, but we could have called the variables any names we wanted to invent, as long as they were valid identifiers.

Identifiers
A valid identifier is a sequence of one or more letters, digits or underscore characters (_). Neither spaces nor punctuation marks or symbols can be part of an identifier. Only letters, digits and single underscore characters are valid. In addition, variable identifiers should begin with a letter. They can also begin with an underscore character (_), but in some cases these may be reserved for compiler-specific keywords or external identifiers, as are identifiers containing two successive underscore characters anywhere. In no case can they begin with a digit.

Another rule that you have to consider when inventing your own identifiers is that they cannot match any keyword of the C++ language nor your compiler’s specific ones, which are reserved keywords. The standard reserved keywords are:

asm, auto, bool, break, case, catch, char, class, const, const_cast, continue, default, delete, do, double, dynamic_cast, else, enum, explicit, export, extern, false, float, for, friend, goto, if, inline, int, long, mutable, namespace, new, operator, private, protected, public, register, reinterpret_cast, return, short, signed, sizeof, static, static_cast, struct, switch, template, this, throw, true, try, typedef, typeid, typename, union, unsigned, using, virtual, void, volatile, wchar_t, while

Additionally, alternative representations for some operators cannot be used as identifiers since they are reserved words under some circumstances:

and, and_eq, bitand, bitor, compl, not, not_eq, or, or_eq, xor, xor_eq

Your compiler may also include some additional specific reserved keywords.

Very important: The C++ language is a "case sensitive" language. That means that an identifier written in capital letters is not equivalent to another one with the same name but written in small letters. Thus, for example, the RESULT variable is not the same as the result variable or the Result variable. These are three different variable identifiers.

Fundamental data types
When programming, we store the variables in our computer’s memory, but the computer has to know what kind of data we want to store in them. Storing a simple number does not take the same amount of memory as storing a single letter or a large number, and these values are not interpreted in the same way.

The memory in our computers is organized in bytes. A byte is the minimum amount of memory that we can manage in C++. A byte can store a relatively small amount of data: one single character or a small integer (generally an integer between 0 and 255). In addition, the computer can manipulate more complex data types that come from grouping several bytes, such as long numbers or non-integer numbers.

Next you have a summary of the basic fundamental data types in C++, as well as the range of values that can be represented with each one:

Name                Description                                    Size*           Range*
char                Character or small integer.                    1 byte          signed: -128 to 127; unsigned: 0 to 255
short int (short)   Short integer.                                 2 bytes         signed: -32768 to 32767; unsigned: 0 to 65535
int                 Integer.                                       4 bytes         signed: -2147483648 to 2147483647; unsigned: 0 to 4294967295
long int (long)     Long integer.                                  4 bytes         signed: -2147483648 to 2147483647; unsigned: 0 to 4294967295
bool                Boolean value. It can take one of two          1 byte          true or false
                    values: true or false.
float               Floating point number.                         4 bytes         +/- 3.4e +/- 38 (~7 digits)
double              Double precision floating point number.        8 bytes         +/- 1.7e +/- 308 (~15 digits)
long double         Long double precision floating point number.   8 bytes         +/- 1.7e +/- 308 (~15 digits)
wchar_t             Wide character.                                2 or 4 bytes    1 wide character

* The values of the columns Size and Range depend on the system the program is compiled for. The values shown above are those found on most 32-bit systems. For other systems, the general specification is that int has the natural size suggested by the system architecture (one "word") and that each of the four integer types char, short, int and long must be at least as large as the one preceding it, with char always being 1 byte in size. The same applies to the floating point types float, double and long double, where each one must provide at least as much precision as the preceding one.

Declaration of variables
In order to use a variable in C++, we must first declare it specifying which data type we want it to be. The syntax to declare a new variable is to write the specifier of the desired data type (like int, bool, float…) followed by a valid variable identifier. For example:

int a;
float mynumber;

These are two valid declarations of variables. The first one declares a variable of type int with the identifier a. The second one declares a variable of type float with the identifier mynumber. Once declared, the variables a and mynumber can be used within the rest of their scope in the program.

If you are going to declare more than one variable of the same type, you can declare all of them in a single statement by separating their identifiers with commas. For example:

int a, b, c;

This declares three variables (a, b and c), all of them of type int, and has exactly the same meaning as:

int a;
int b;
int c;

The integer data types char, short, long and int can be either signed or unsigned depending on the range of numbers needed to be represented. Signed types can represent both positive and negative values, whereas unsigned types can only represent positive values (and zero). This can be specified by using either the specifier signed or the specifier unsigned before the type name. For example:

unsigned short int NumberOfSisters;
signed int MyAccountBalance;

By default, if we do not specify either signed or unsigned most compiler settings will assume the type to be signed, therefore instead of the second declaration above we could have written:

int MyAccountBalance;

with exactly the same meaning (with or without the keyword signed).

An exception to this general rule is the char type, which exists by itself and is considered a different fundamental data type from signed char and unsigned char. Plain char is intended to store characters. You should use either signed or unsigned if you intend to store numerical values in a char-sized variable.

short and long can be used alone as type specifiers. In this case, they refer to their respective integer fundamental types: short is equivalent to short int and long is equivalent to long int. The following two variable declarations are equivalent:

short Year;
short int Year;

Finally, signed and unsigned may also be used as standalone type specifiers, meaning the same as signed int and unsigned int respectively. The following two declarations are equivalent:

unsigned NextYear;
unsigned int NextYear;

To see what variable declarations look like in action within a program, we are going to see the C++ code of the example about your mental memory proposed at the beginning of this section:

// operating with variables

#include <iostream>
using namespace std;

int main ()
{
  // declaring variables:
  int a, b;
  int result;

  // process:
  a = 5;
  b = 2;
  a = a + 1;
  result = a - b;

  // print out the result:
  cout << result;

  // terminate the program:
  return 0;
}

4

Do not worry if anything other than the variable declarations themselves looks a bit strange to you. You will see the rest in detail in coming sections.

Scope of variables
All the variables that we intend to use in a program must have been declared with their type specifier at an earlier point in the code, as we did in the previous code at the beginning of the body of the function main when we declared that a, b, and result were of type int.

A variable can be either of global or local scope. A global variable is a variable declared in the main body of the source code, outside all functions, while a local variable is one declared within the body of a function or a block.

Global variables can be referred to from anywhere in the code after their declaration, even inside functions.

The scope of local variables is limited to the block enclosed in braces ({}) where they are declared. For example, if they are declared at the beginning of the body of a function (like in function main), their scope runs from their declaration point to the end of that function. In the example above, this means that if another function existed in addition to main, the local variables declared in main could not be accessed from the other function and vice versa.

Initialization of variables
When declaring a regular local variable, its value is by default undetermined. But you may want a variable to store a concrete value at the same moment that it is declared. In order to do that, you can initialize the variable. There are two ways to do this in C++:

The first one, known as c-like, is done by appending an equal sign followed by the value to which the variable will be initialized:

type identifier = initial_value ;

For example, if we want to declare an int variable called a initialized with a value of 0 at the moment in which it is declared, we could write:

int a = 0;

The other way to initialize variables, known as constructor initialization, is done by enclosing the initial value between parentheses (()):

type identifier (initial_value) ;

For example:

int a (0);

Both ways of initializing variables are valid and equivalent in C++.

// initialization of variables

#include <iostream>
using namespace std;

int main ()
{
  int a=5;      // initial value = 5
  int b(2);     // initial value = 2
  int result;   // initial value undetermined

  a = a + 3;
  result = a - b;
  cout << result;

  return 0;
}

6

Introduction to strings
Variables that can store non-numerical values that are longer than one single character are known as strings.

The C++ language library provides support for strings through the standard string class. This is not a fundamental type, but in its most basic usage it behaves in a similar way to fundamental types.

A first difference with fundamental data types is that in order to declare and use objects (variables) of this type we need to include an additional header file in our source code: <string> and have access to the std namespace (which we already had in all our previous programs thanks to the using namespace statement).

// my first string
#include <iostream>
#include <string>
using namespace std;

int main ()
{
string mystring = "This is a string";
cout << mystring;
return 0;
}

This is a string

As you may see in the previous example, strings can be initialized with any valid string literal just like numerical type variables can be initialized to any valid numerical literal. Both initialization formats are valid with strings:

string mystring = "This is a string";
string mystring ("This is a string");

Strings can also perform all the other basic operations that fundamental data types can, like being declared without an initial value and being assigned values during execution:

// my first string
#include <iostream>
#include <string>
using namespace std;

int main ()
{
string mystring;
mystring = "This is the initial string content";
cout << mystring << endl;
mystring = "This is a different string content";
cout << mystring << endl;
return 0;
}

This is the initial string content
This is a different string content

For more details on C++ strings, you can have a look at the string class reference.

 


 

August 5, 2009

Using Squid Reverse Proxying to Improve Website Performance

Filed under: C and C++ — minitia @ 11:42 pm

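This article introduces how a Squid reverse proxy works and shows that reverse proxying is well suited to improving a website's access speed, availability, and security. In a concrete test environment, the author uses DNS round-robin and Squid reverse proxying to load-balance a website, which improves its availability and reliability.

Many large portal sites, such as SINA, now use Squid reverse proxying to speed up access. It can distribute different URL requests to different back-end web servers, and because Internet users see only the address of the reverse proxy server, it also makes access to the site more secure.

The concept of a reverse proxy

A reverse proxy server, also called a web accelerator, sits in front of the web servers and acts as their content cache. The system structure is shown in Figure 1.

Figure 1. System structure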

   

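A reverse proxy server is set up on behalf of the web servers. The back-end web servers are transparent to Internet users: users see only the address of the reverse proxy server and do not know how the back-end web servers are organized. When an Internet user requests a web service, DNS resolves the requested domain name to the IP address of the reverse proxy server, so the URL request is sent to the reverse proxy, which handles the user's requests and responses and talks to the back-end web servers. The reverse proxy reduces the load on the back-end web servers and improves access speed, and it avoids the security risks of users communicating directly with the web servers.

How Squid reverse proxying works

Many reverse proxy programs exist; Nginx and Squid are among the best known. Nginx was developed by Igor Sysoev for Rambler.ru, the second most visited site in Russia. It is a high-performance HTTP and reverse proxy server, and also an IMAP/POP3/SMTP proxy server.

Squid grew out of a research program heavily funded by the U.S. government, with the goal of solving the problem of insufficient network bandwidth. It supports HTTP, HTTPS, FTP, and other protocols, and is currently the most widely used and most full-featured software of its kind on Unix systems. The rest of this article focuses on how Squid reverse proxying works and how it is applied to improve website performance.

The Squid reverse proxy server sits between the local web servers and the Internet, organized as shown in Figure 2:

Figure 2. Organization

When a client requests a web service, DNS resolves the domain name to the IP address of the Squid reverse proxy server, so the client's URL request is sent to the reverse proxy. If the Squid reverse proxy has the requested resource in its cache, it returns the resource directly to the client. Otherwise the reverse proxy requests the resource from a back-end web server, returns the response to the client, and also caches the response locally for the next requester.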

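A Squid reverse proxy generally caches only cacheable data (such as HTML pages and images); CGI scripts and dynamic programs such as ASP and JSP are not cached by default. It caches static pages according to the HTTP headers returned by the web server. The four most important HTTP headers are:

●Last-Modified: tells the reverse proxy when the page was last modified
●Expires: tells the reverse proxy when the page should be removed from the cache
●Cache-Control: tells the reverse proxy whether the page should be cached
●Pragma: carries implementation-specific directives; the most common is Pragma: no-cache

Example: accelerating a website with Squid reverse proxying

The domain name in this example is wenjin.cache.ibm.com.cn. DNS round-robin distributes client requests to one of the Squid reverse proxy servers. If that Squid has the requested resource in its cache, it returns the resource directly to the user. Otherwise it forwards the uncached request, according to the configured rules, to its neighbor Squids and the back-end web servers. This reduces the load on the back-end web servers and improves the performance and security of the whole site. The system structure is shown in Figure 3:

Figure 3. System structure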

   

System environment:

●One DNS server: operating system FreeBSD, software BIND 9.5, IP 192.168.76.222;
●Three Squid servers: operating system Linux AS 4, software Squid 3.0, with the following IP addresses:

Squid1: 192.168.76.223
Squid2: 192.168.76.224
Squid3: 192.168.76.225

●Three web servers: operating system Linux AS 4, application software Tomcat 5.0 + MySQL, with the following IP addresses:

webServer1: 210.82.118.195
webServer2: 192.168.76.226
webServer3: 192.168.76.227

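Installing and configuring the software

Configuring the DNS server

The BIND 9.5 that ships with FreeBSD is used. To configure BIND for this system, first edit the BIND configuration file /etc/namedb/named.conf and add:

zone "cache.ibm.com.cn"{
type master;
file "master/cache.ibm.com.cn";
};

Then add a file named cache.ibm.com.cn in the /etc/namedb/master directory with the following contents: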

   

$TTL 3600
@ IN SOA search.ibm.com.cn. root.ibm.com.cn. (
20080807 ; Serial
3600 ; Refresh
900 ; Retry
3600000 ; Expire
3600 ) ; Minimum
IN NS search.ibm.com.cn.
1 IN PTR localhost.ibm.com.cn.
wenjin IN A 192.168.76.223
wenjin IN A 192.168.76.224
wenjin IN A 192.168.76.225

With this in place, when a user makes a request, DNS round-robin resolves the domain name wenjin.cache.ibm.com.cn to one of 192.168.76.223, 192.168.76.224, and 192.168.76.225.

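When configuration is complete, run rndc start to start the BIND service. Setting named_enable="YES" in /etc/rc.conf makes the service start at boot.

Use ps -A | grep named to check whether the BIND service is up;

Use nslookup wenjin.cache.ibm.com.cn to test whether the BIND service is working correctly.

Configuring the Squid1 server

Download the squid-3.0.STABLE8.tar.gz source package and place it in the /home directory.
1. Unpack it: tar -zxvf squid-3.0.STABLE8.tar.gz
2. Set the configure options: cd squid-3.0.STABLE8

./configure --prefix=/usr/local/squid

This installs Squid under the /usr/local directory.
3. Compile and install: make && make install. After installation, a squid directory appears under /usr/local.
4. Configure Squid's configuration file.
Edit the squid.conf file: vi /usr/local/squid/etc/squid.conf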

  

cache_effective_user squid
cache_effective_group squid
######### Set Squid's host name; without this entry Squid will not start
visible_hostname squid1.nlc.gov.cn
############# Configure Squid in accelerator mode #################
http_port 80 accel vhost vport
icp_port 3130
##### Configure squid2 and squid3 as neighbors: when squid1 does not find the requested
##### resource in its own cache, it queries its neighbors' caches through ICP
cache_peer squid2.ibm.com.cn sibling 80 3130
cache_peer squid3.ibm.com.cn sibling 80 3130
##### The three parents of squid1. The originserver option marks a peer as an origin server,
##### and the round-robin option makes Squid distribute requests among the parents in turn.
##### Squid also checks the health of these parents; if a parent goes down,
##### Squid fetches data from the remaining origin servers
cache_peer 210.82.118.195 parent 8080 0 no-query originserver round-robin \
name=webServer1
cache_peer 192.168.76.226 parent 8080 0 no-query originserver round-robin \
name=webServer2
cache_peer 192.168.76.227 parent 8080 0 no-query originserver round-robin \
name=webServer3
#### Forward requests for the wenjin.cache.ibm.com.cn domain round-robin to one of the three parents
cache_peer_domain webServer1 webServer2 webServer3 wenjin.cache.ibm.com.cn
##### Access control, log, and cache directory settings follow
acl localnet src 192.168.76.223 192.168.76.224 192.168.76.225
acl all src 0.0.0.0/0.0.0.0
http_access allow all
icp_access allow localnet
cache_log /usr/local/squid/var/logs/cache.log
access_log /usr/local/squid/var/logs/access.log squid
cache_dir ufs /usr/local/squid/var/cache/ 1000 16 256
####### Some Squid tuning ###############
maximum_object_size 10240 KB ### largest object that can be cached: 10 MB
maximum_object_size_in_memory 512 KB ### largest object cached in memory: 512 KB
cache_mem 256 MB ### amount of memory Squid uses for caching

Save and quit with :wq.

Add the following to the /etc/hosts file:

   

192.168.76.223 squid1.ibm.com.cn
192.168.76.224 squid2.ibm.com.cn
192.168.76.225 squid3.ibm.com.cn

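Save and quit with :wq.

Check that the Squid configuration file is correct: /usr/local/squid/bin/squid -k parse

Create the cache directories: /usr/local/squid/bin/squid -z

Start Squid: /usr/local/squid/bin/squid

Configuring the squid2 and squid3 servers

The squid2 and squid3 servers are configured in the same way and with the same parameters as squid1. When configuration is complete, start the Squid service on each of the two servers.

If the following messages appear in Squid's cache.log log file, the three Squids have been configured as siblings of one another and the three parent proxies are configured.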

   

2008/11/17 10:08:47| Configuring Sibling squid1.ibm.com.cn/80/3130
2008/11/17 10:08:47| Configuring Sibling squid3.ibm.com.cn/80/3130
2008/11/17 10:08:47| Configuring Parent 210.82.118.195/8080/0
2008/11/17 10:08:47| Configuring Parent 192.168.76.226/8080/0
2008/11/17 10:08:47| Configuring Parent 192.168.76.227/8080/0
2008/11/17 10:08:47| Ready to serve requests.

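Testing

Before testing, make sure that the DNS service, the three Squid services, and the three web services are all running. Entering http://wenjin.cache.ibm.com.cn in a client then displays the page correctly. The server-side handling is transparent to the client, which does not know which web server processed the request. If one of the Squid servers or web servers fails, the service keeps running normally.

Summary

Squid is open-source software, and its reverse proxy feature can speed up access to a website. In a real network environment, this article used three Squid reverse proxy servers to accelerate a website and combined them with DNS round-robin to balance the load. After a period of testing and trial operation, the site's access speed and availability improved considerably, and no service interruption occurred.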

Linux kernel coding style

Filed under: C and C++ — minitia @ 11:41 pm

kernel/scripts/Lindent is an elegant tool to tidy up code into Linux kernel style.

        Linux kernel coding style

This is a short document describing the preferred coding style for the
linux kernel.  Coding style is very personal, and I won’t _force_ my
views on anybody, but this is what goes for anything that I have to be
able to maintain, and I’d prefer it for most other things too.  Please
at least consider the points made here.

First off, I’d suggest printing out a copy of the GNU coding standards,
and NOT read it.  Burn them, it’s a great symbolic gesture.

Anyway, here goes:

         Chapter 1: Indentation

Tabs are 8 characters, and thus indentations are also 8 characters.
There are heretic movements that try to make indentations 4 (or even 2!)
characters deep, and that is akin to trying to define the value of PI to
be 3.

Rationale: The whole idea behind indentation is to clearly define where
a block of control starts and ends.  Especially when you’ve been looking
at your screen for 20 straight hours, you’ll find it a lot easier to see
how the indentation works if you have large indentations.

Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on a
80-character terminal screen.  The answer to that is that if you need
more than 3 levels of indentation, you’re screwed anyway, and should fix
your program.

In short, 8-char indents make things easier to read, and have the added
benefit of warning you when you’re nesting your functions too deep.
Heed that warning.

The preferred way to ease multiple indentation levels in a switch statement is
to align the "switch" and its subordinate "case" labels in the same column
instead of "double-indenting" the "case" labels.  E.g.:

    switch (suffix) {
    case 'G':
    case 'g':
        mem <<= 30;
        break;
    case 'M':
    case 'm':
        mem <<= 20;
        break;
    case 'K':
    case 'k':
        mem <<= 10;
        /* fall through */
    default:
        break;
    }

Don’t put multiple statements on a single line unless you have
something to hide:

    if (condition) do_this;
      do_something_everytime;

Don’t put multiple assignments on a single line either.  Kernel coding style
is super simple.  Avoid tricky expressions.

Outside of comments, documentation and except in Kconfig, spaces are never
used for indentation, and the above example is deliberately broken.

Get a decent editor and don’t leave whitespace at the end of lines.

        Chapter 2: Breaking long lines and strings

Coding style is all about readability and maintainability using commonly
available tools.

The limit on the length of lines is 80 columns and this is a strongly
preferred limit.

Statements longer than 80 columns will be broken into sensible chunks.
Descendants are always substantially shorter than the parent and are placed
substantially to the right. The same applies to function headers with a long
argument list. Long strings are as well broken into shorter strings. The
only exception to this is where exceeding 80 columns significantly increases
readability and does not hide information.

void fun(int a, int b, int c)
{
    if (condition)
        printk(KERN_WARNING "Warning this is a long printk with "
                        "3 parameters a: %u b: %u "
                        "c: %u \n", a, b, c);
    else
        next_statement;
}

        Chapter 3: Placing Braces and Spaces

The other issue that always comes up in C styling is the placement of
braces.  Unlike the indent size, there are few technical reasons to
choose one placement strategy over the other, but the preferred way, as
shown to us by the prophets Kernighan and Ritchie, is to put the opening
brace last on the line, and put the closing brace first, thusly:

    if (x is true) {
        we do y
    }

This applies to all non-function statement blocks (if, switch, for,
while, do).  E.g.:

    switch (action) {
    case KOBJ_ADD:
        return "add";
    case KOBJ_REMOVE:
        return "remove";
    case KOBJ_CHANGE:
        return "change";
    default:
        return NULL;
    }

However, there is one special case, namely functions: they have the
opening brace at the beginning of the next line, thus:

    int function(int x)
    {
        body of function
    }

Heretic people all over the world have claimed that this inconsistency
is …  well …  inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right.  Besides, functions are
special anyway (you can’t nest them in C).

Note that the closing brace is empty on a line of its own, _except_ in
the cases where it is followed by a continuation of the same statement,
ie a "while" in a do-statement or an "else" in an if-statement, like
this:

    do {
        body of do-loop
    } while (condition);

and

    if (x == y) {
        ..
    } else if (x > y) {
        ...
    } else {
        ....
    }

Rationale: K&R.

Also, note that this brace-placement also minimizes the number of empty
(or almost empty) lines, without any loss of readability.  Thus, as the
supply of new-lines on your screen is not a renewable resource (think
25-line terminal screens here), you have more empty lines to put
comments on.

Do not unnecessarily use braces where a single statement will do.

if (condition)
    action();

This does not apply if one branch of a conditional statement is a single
statement. Use braces in both branches.

if (condition) {
    do_this();
    do_that();
} else {
    otherwise();
}

        3.1:  Spaces

Linux kernel style for use of spaces depends (mostly) on
function-versus-keyword usage.  Use a space after (most) keywords.  The
notable exceptions are sizeof, typeof, alignof, and __attribute__, which look
somewhat like functions (and are usually used with parentheses in Linux,
although they are not required in the language, as in: "sizeof info" after
"struct fileinfo info;" is declared).

So use a space after these keywords:
    if, switch, case, for, do, while
but not with sizeof, typeof, alignof, or __attribute__.  E.g.,
    s = sizeof(struct file);

Do not add spaces around (inside) parenthesized expressions.  This example is
*bad*:

    s = sizeof( struct file );

When declaring pointer data or a function that returns a pointer type, the
preferred use of ‘*’ is adjacent to the data name or function name and not
adjacent to the type name.  Examples:

    char *linux_banner;
    unsigned long long memparse(char *ptr, char **retptr);
    char *match_strdup(substring_t *s);

Use one space around (on each side of) most binary and ternary operators,
such as any of these:

    =  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :

but no space after unary operators:
    &  *  +  -  ~  !  sizeof  typeof  alignof  __attribute__  defined

no space before the postfix increment & decrement unary operators:
    ++  --

no space after the prefix increment & decrement unary operators:
    ++  --

and no space around the ‘.’ and "->" structure member operators.

Do not leave trailing whitespace at the ends of lines.  Some editors with
"smart" indentation will insert whitespace at the beginning of new lines as
appropriate, so you can start typing the next line of code right away.
However, some such editors do not remove the whitespace if you end up not
putting a line of code there, such as if you leave a blank line.  As a result,
you end up with lines containing trailing whitespace.

Git will warn you about patches that introduce trailing whitespace, and can
optionally strip the trailing whitespace for you; however, if applying a series
of patches, this may make later patches in the series fail by changing their
context lines.

        Chapter 4: Naming

C is a Spartan language, and so should your naming be.  Unlike Modula-2
and Pascal programmers, C programmers do not use cute names like
ThisVariableIsATemporaryCounter.  A C programmer would call that
variable "tmp", which is much easier to write, and not the least more
difficult to understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must.  To call a global function "foo" is a
shooting offense.

GLOBAL variables (to be used only if you _really_ need them) need to
have descriptive names, as do global functions.  If you have a function
that counts the number of active users, you should call that
"count_active_users()" or similar, you should _not_ call it "cntusr()".

Encoding the type of a function into the name (so-called Hungarian
notation) is brain damaged – the compiler knows the types anyway and can
check those, and it only confuses the programmer.  No wonder MicroSoft
makes buggy programs.

LOCAL variable names should be short, and to the point.  If you have
some random integer loop counter, it should probably be called "i".
Calling it "loop_counter" is non-productive, if there is no chance of it
being mis-understood.  Similarly, "tmp" can be just about any type of
variable that is used to hold a temporary value.

If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See chapter 6 (Functions).

        Chapter 5: Typedefs

Please don’t use things like "vps_t".

It’s a _mistake_ to use typedef for structures and pointers. When you see a

    vps_t a;

in the source, what does it mean?

In contrast, if it says

    struct virtual_container *a;

you can actually tell what "a" is.

Lots of people think that typedefs "help readability". Not so. They are
useful only for:

 (a) totally opaque objects (where the typedef is actively used to _hide_
     what the object is).

     Example: "pte_t" etc. opaque objects that you can only access using
     the proper accessor functions.

     NOTE! Opaqueness and "accessor functions" are not good in themselves.
     The reason we have them for things like pte_t etc. is that there
     really is absolutely _zero_ portably accessible information there.

 (b) Clear integer types, where the abstraction _helps_ avoid confusion
     whether it is "int" or "long".

     u8/u16/u32 are perfectly fine typedefs, although they fit into
     category (d) better than here.

     NOTE! Again – there needs to be a _reason_ for this. If something is
     "unsigned long", then there’s no reason to do

    typedef unsigned long myflags_t;

     but if there is a clear reason for why it under certain circumstances
     might be an "unsigned int" and under other configurations might be
     "unsigned long", then by all means go ahead and use a typedef.

 (c) when you use sparse to literally create a _new_ type for
     type-checking.

 (d) New types which are identical to standard C99 types, in certain
     exceptional circumstances.

     Although it would only take a short amount of time for the eyes and
     brain to become accustomed to the standard types like ‘uint32_t’,
     some people object to their use anyway.

     Therefore, the Linux-specific ‘u8/u16/u32/u64’ types and their
     signed equivalents which are identical to standard types are
     permitted — although they are not mandatory in new code of your
     own.

     When editing existing code which already uses one or the other set
     of types, you should conform to the existing choices in that code.

 (e) Types safe for use in userspace.

     In certain structures which are visible to userspace, we cannot
     require C99 types and cannot use the ‘u32’ form above. Thus, we
     use __u32 and similar types in all structures which are shared
     with userspace.

Maybe there are other cases too, but the rule should basically be to NEVER
EVER use a typedef unless you can clearly match one of those rules.

In general, a pointer, or a struct that has elements that can reasonably
be directly accessed should _never_ be a typedef.

        Chapter 6: Functions

Functions should be short and sweet, and do just one thing.  They should
fit on one or two screenfuls of text (the ISO/ANSI screen size is 80×24,
as we all know), and do one thing and do that well.

The maximum length of a function is inversely proportional to the
complexity and indentation level of that function.  So, if you have a
conceptually simple function that is just one long (but simple)
case-statement, where you have to do lots of small things for a lot of
different cases, it’s OK to have a longer function.

However, if you have a complex function, and you suspect that a
less-than-gifted first-year high-school student might not even
understand what the function is all about, you should adhere to the
maximum limits all the more closely.  Use helper functions with
descriptive names (you can ask the compiler to in-line them if you think
it’s performance-critical, and it will probably do a better job of it
than you would have done).

Another measure of the function is the number of local variables.  They
shouldn’t exceed 5-10, or you’re doing something wrong.  Re-think the
function, and split it into smaller pieces.  A human brain can
generally easily keep track of about 7 different things, anything more
and it gets confused.  You know you’re brilliant, but maybe you’d like
to understand what you did 2 weeks from now.

In source files, separate functions with one blank line.  If the function is
exported, the EXPORT* macro for it should follow immediately after the closing
function brace line.  E.g.:

int system_is_up(void)
{
    return system_state == SYSTEM_RUNNING;
}
EXPORT_SYMBOL(system_is_up);

In function prototypes, include parameter names with their data types.
Although this is not required by the C language, it is preferred in Linux
because it is a simple way to add valuable information for the reader.

        Chapter 7: Centralized exiting of functions

Albeit deprecated by some people, the equivalent of the goto statement is
used frequently by compilers in form of the unconditional jump instruction.

The goto statement comes in handy when a function exits from multiple
locations and some common work such as cleanup has to be done.

The rationale is:

- unconditional statements are easier to understand and follow
- nesting is reduced
- errors by not updating individual exit points when making
    modifications are prevented
- saves the compiler work to optimize redundant code away ;)

int fun(int a)
{
    int result = 0;
    char *buffer = kmalloc(SIZE, GFP_KERNEL);

    if (buffer == NULL)
        return -ENOMEM;

    if (condition1) {
        while (loop1) {
            ...
        }
        result = 1;
        goto out;
    }
    ...
out:
    kfree(buffer);
    return result;
}

        Chapter 8: Commenting

Comments are good, but there is also a danger of over-commenting.  NEVER
try to explain HOW your code works in a comment: it’s much better to
write the code so that the _working_ is obvious, and it’s a waste of
time to explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW.
Also, try to avoid putting comments inside a function body: if the
function is so complex that you need to separately comment parts of it,
you should probably go back to chapter 6 for a while.  You can make
small comments to note or warn about something particularly clever (or
ugly), but try to avoid excess.  Instead, put the comments at the head
of the function, telling people what it does, and possibly WHY it does
it.

When commenting the kernel API functions, please use the kernel-doc format.
See the files Documentation/kernel-doc-nano-HOWTO.txt and scripts/kernel-doc
for details.

Linux style for comments is the C89 "/* ... */" style.
Don’t use C99-style "// ..." comments.

The preferred style for long (multi-line) comments is:

    /*
     * This is the preferred style for multi-line
     * comments in the Linux kernel source code.
     * Please use it consistently.
     *
     * Description:  A column of asterisks on the left side,
     * with beginning and ending almost-blank lines.
     */

It’s also important to comment data, whether they are basic types or derived
types.  To this end, use just one data declaration per line (no commas for
multiple data declarations).  This leaves you room for a small comment on each
item, explaining its use.

        Chapter 9: You’ve made a mess of it

That’s OK, we all do.  You’ve probably been told by your long-time Unix
user helper that "GNU emacs" automatically formats the C sources for
you, and you’ve noticed that yes, it does do that, but the defaults it
uses are less than desirable (in fact, they are worse than random
typing – an infinite number of monkeys typing into GNU emacs would never
make a good program).

So, you can either get rid of GNU emacs, or change it to use saner
values.  To do the latter, you can stick the following in your .emacs file:

(defun c-lineup-arglist-tabs-only (ignored)
  "Line up argument lists by tabs, not spaces"
  (let* ((anchor (c-langelem-pos c-syntactic-element))
     (column (c-langelem-2nd-pos c-syntactic-element))
     (offset (- (1+ column) anchor))
     (steps (floor offset c-basic-offset)))
    (* (max steps 1)
       c-basic-offset)))

(add-hook 'c-mode-common-hook
          (lambda ()
            ;; Add kernel style
            (c-add-style
             "linux-tabs-only"
             '("linux" (c-offsets-alist
                        (arglist-cont-nonempty
                         c-lineup-gcc-asm-reg
                         c-lineup-arglist-tabs-only))))))

(add-hook 'c-mode-hook
          (lambda ()
            (let ((filename (buffer-file-name)))
              ;; Enable kernel mode for the appropriate files
              (when (and filename
                         (string-match (expand-file-name "~/src/linux-trees")
                                       filename))
                (setq indent-tabs-mode t)
                (c-set-style "linux-tabs-only")))))

This will make emacs go better with the kernel coding style for C
files below ~/src/linux-trees.

But even if you fail in getting emacs to do sane formatting, not
everything is lost: use "indent".

Now, again, GNU indent has the same brain-dead settings that GNU emacs
has, which is why you need to give it a few command line options.
However, that’s not too bad, because even the makers of GNU indent
recognize the authority of K&R (the GNU people aren’t evil, they are
just severely misguided in this matter), so you just give indent the
options "-kr -i8" (stands for "K&R, 8 character indents"), or use
"scripts/Lindent", which indents in the latest style.

"indent" has a lot of options, and especially when it comes to comment
re-formatting you may want to take a look at the man page.  But
remember: "indent" is not a fix for bad programming.

        Chapter 10: Kconfig configuration files

For all of the Kconfig* configuration files throughout the source tree,
the indentation is somewhat different.  Lines under a "config" definition
are indented with one tab, while help text is indented an additional two
spaces.  Example:

config AUDIT
    bool "Auditing support"
    depends on NET
    help
      Enable auditing infrastructure that can be used with another
      kernel subsystem, such as SELinux (which requires this for
      logging of avc messages output).  Does not do system-call
      auditing without CONFIG_AUDITSYSCALL.

Features that might still be considered unstable should be defined as
dependent on "EXPERIMENTAL":

config SLUB
    depends on EXPERIMENTAL && !ARCH_USES_SLAB_PAGE_STRUCT
    bool "SLUB (Unqueued Allocator)"
    …

while seriously dangerous features (such as write support for certain
filesystems) should advertise this prominently in their prompt string:

config ADFS_FS_RW
    bool "ADFS write support (DANGEROUS)"
    depends on ADFS_FS
    …

For full documentation on the configuration files, see the file
Documentation/kbuild/kconfig-language.txt.

        Chapter 11: Data structures

Data structures that have visibility outside the single-threaded
environment they are created and destroyed in should always have
reference counts.  In the kernel, garbage collection doesn’t exist (and
outside the kernel garbage collection is slow and inefficient), which
means that you absolutely _have_ to reference count all your uses.

Reference counting means that you can avoid locking, and allows multiple
users to have access to the data structure in parallel – and not having
to worry about the structure suddenly going away from under them just
because they slept or did something else for a while.

Note that locking is _not_ a replacement for reference counting.
Locking is used to keep data structures coherent, while reference
counting is a memory management technique.  Usually both are needed, and
they are not to be confused with each other.

Many data structures can indeed have two levels of reference counting,
when there are users of different "classes".  The subclass count counts
the number of subclass users, and decrements the global count just once
when the subclass count goes to zero.

Examples of this kind of "multi-level-reference-counting" can be found in
memory management ("struct mm_struct": mm_users and mm_count), and in
filesystem code ("struct super_block": s_count and s_active).

Remember: if another thread can find your data structure, and you don’t
have a reference count on it, you almost certainly have a bug.
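
To make the idea concrete, here is a minimal userspace sketch of a reference-counted object. It uses C11 atomics and malloc()/free() as stand-ins for the kernel's own primitives, and every name in it (struct session, session_get, session_put, sessions_freed) is hypothetical. It only shows the memory management side; any locking needed to keep the contents coherent is a separate matter, as noted above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* A shared object; every holder owns one reference (hypothetical type). */
struct session {
    atomic_int refcount;    /* number of live references */
    int id;                 /* payload, for illustration only */
};

/* Number of sessions freed so far, so the demo can observe the release. */
int sessions_freed;

/* Allocate a session holding one reference for the caller. */
struct session *session_create(int id)
{
    struct session *s = malloc(sizeof(*s));

    if (s == NULL)
        return NULL;

    atomic_init(&s->refcount, 1);
    s->id = id;
    return s;
}

/* Take an additional reference; the caller must already hold one. */
struct session *session_get(struct session *s)
{
    atomic_fetch_add(&s->refcount, 1);
    return s;
}

/* Drop a reference and free the session when the last one goes away. */
void session_put(struct session *s)
{
    /* atomic_fetch_sub() returns the value before the decrement. */
    if (atomic_fetch_sub(&s->refcount, 1) == 1) {
        free(s);
        sessions_freed++;
    }
}

/* Two users share one session; only the last put releases it. */
void session_demo(void)
{
    int freed_before = sessions_freed;
    struct session *s = session_create(7);
    struct session *other;

    assert(s != NULL);
    other = session_get(s);     /* second user, count is now 2 */
    session_put(s);             /* first user done, object must survive */
    assert(sessions_freed == freed_before);
    assert(other->id == 7);
    session_put(other);         /* last user done, object is freed */
    assert(sessions_freed == freed_before + 1);
}
```

The second user can keep working with the object after the first one is done, which is exactly the "not going away from under them" property described above.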

        Chapter 12: Macros, Enums and RTL

Names of macros defining constants and labels in enums are capitalized.

#define CONSTANT 0x12345

Enums are preferred when defining several related constants.

CAPITALIZED macro names are appreciated but macros resembling functions
may be named in lower case.

Generally, inline functions are preferable to macros resembling functions.

Macros with multiple statements should be enclosed in a do-while block:

#define macrofun(a, b, c)             \
    do {                    \
        if (a == 5)            \
            do_this(b, c);        \
    } while (0)
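
Why the do-while wrapper matters is easiest to see when the macro is used as the body of an unbraced if that has an else. The sketch below is a userspace toy: do_this() is a stand-in that just records its arguments, and took_else_branch() and macrofun_demo() are names invented for the example. The macro arguments are also parenthesized here, which the short form above leaves out.

```c
#include <assert.h>

int do_this_total;  /* running sum of everything passed to do_this() */

/* Stand-in for real work: record that it ran and with what. */
void do_this(int b, int c)
{
    do_this_total += b + c;
}

#define macrofun(a, b, c)           \
    do {                            \
        if ((a) == 5)               \
            do_this((b), (c));      \
    } while (0)

/*
 * Returns 1 when the else branch ran, 0 otherwise.  Because the macro
 * expands to a single statement, "macrofun(...);" sits between if and
 * else without braces and the else still pairs with "if (a > 0)".
 */
int took_else_branch(int a)
{
    if (a > 0)
        macrofun(a, 1, 2);
    else
        return 1;

    return 0;
}

void macrofun_demo(void)
{
    int before = do_this_total;

    assert(took_else_branch(5) == 0);       /* macro body ran do_this() */
    assert(do_this_total == before + 3);
    assert(took_else_branch(3) == 0);       /* a != 5, nothing called */
    assert(do_this_total == before + 3);
    assert(took_else_branch(-1) == 1);      /* else belongs to outer if */
    assert(do_this_total == before + 3);
}
```

Had the macro been a bare "if (a == 5) do_this(b, c)", the else would silently attach to the macro's own if; had it been a plain "{ ... }" block, the semicolon after the call would make the else a syntax error.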

Things to avoid when using macros:

1) macros that affect control flow:

#define FOO(x)                    \
    do {                    \
        if (blah(x) < 0)        \
            return -EBUGGERED;    \
    } while(0)

is a _very_ bad idea.  It looks like a function call but exits the "calling"
function; don’t break the internal parsers of those who will read the code.

2) macros that depend on having a local variable with a magic name:

#define FOO(val) bar(index, val)

might look like a good thing, but it’s confusing as hell when one reads the
code and it’s prone to breakage from seemingly innocent changes.

3) macros with arguments that are used as l-values: FOO(x) = y; will
bite you if somebody e.g. turns FOO into an inline function.

4) forgetting about precedence: macros defining constants using expressions
must enclose the expression in parentheses. Beware of similar issues with
macros using parameters.

#define CONSTANT 0x4000
#define CONSTEXP (CONSTANT | 3)
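
A small sketch of what goes wrong without the parentheses. BADEXP, DOUBLE_OK and DOUBLE_BAD are hypothetical counter-examples written for this illustration only; CONSTANT and CONSTEXP are the definitions from above.

```c
#include <assert.h>

#define CONSTANT 0x4000
#define CONSTEXP (CONSTANT | 3)     /* parenthesized: safe in any expression */
#define BADEXP   CONSTANT | 3       /* hypothetical counter-example */

/* Parameters need the same care: parenthesize each use and the whole body. */
#define DOUBLE_OK(x)  ((x) * 2)
#define DOUBLE_BAD(x) x * 2         /* hypothetical counter-example */

void precedence_demo(void)
{
    /* * binds tighter than |, so the unparenthesized form changes meaning. */
    assert(CONSTEXP * 2 == 0x8006); /* (0x4000 | 3) * 2 */
    assert(BADEXP * 2 == 0x4006);   /* 0x4000 | (3 * 2) */

    assert(DOUBLE_OK(1 + 2) == 6);  /* (1 + 2) * 2 */
    assert(DOUBLE_BAD(1 + 2) == 5); /* 1 + 2 * 2 */
}
```

Both bad forms compile without complaint, which is what makes this class of bug so unpleasant to track down.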

The cpp manual deals with macros exhaustively. The gcc internals manual also
covers RTL which is used frequently with assembly language in the kernel.

        Chapter 13: Printing kernel messages

Kernel developers like to be seen as literate. Do mind the spelling
of kernel messages to make a good impression. Do not use crippled
words like "dont"; use "do not" or "don’t" instead.  Make the messages
concise, clear, and unambiguous.

Kernel messages do not have to be terminated with a period.

Printing numbers in parentheses (%d) adds no value and should be avoided.

There are a number of driver model diagnostic macros in <linux/device.h>
which you should use to make sure messages are matched to the right device
and driver, and are tagged with the right level:  dev_err(), dev_warn(),
dev_info(), and so forth.  For messages that aren’t associated with a
particular device, <linux/kernel.h> defines pr_debug() and pr_info().

Coming up with good debugging messages can be quite a challenge; and once
you have them, they can be a huge help for remote troubleshooting.  Such
messages should be compiled out when the DEBUG symbol is not defined (that
is, by default they are not included).  When you use dev_dbg() or pr_debug(),
that’s automatic.  Many subsystems have Kconfig options to turn on -DDEBUG.
A related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to the
ones already enabled by DEBUG.

        Chapter 14: Allocating memory

The kernel provides the following general purpose memory allocators:
kmalloc(), kzalloc(), kcalloc(), and vmalloc().  Please refer to the API
documentation for further information about them.

The preferred form for passing a size of a struct is the following:

    p = kmalloc(sizeof(*p), …);

The alternative form where struct name is spelled out hurts readability and
introduces an opportunity for a bug when the pointer variable type is changed
but the corresponding sizeof that is passed to a memory allocator is not.

Casting the return value which is a void pointer is redundant. The conversion
from void pointer to any other pointer type is guaranteed by the C programming
language.
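
The same idiom carries over to userspace unchanged. The sketch below uses malloc() as a stand-in for kmalloc(), since the kernel allocators are not available outside the kernel; struct point and the function names are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

struct point {
    long x;     /* horizontal coordinate */
    long y;     /* vertical coordinate */
};

/* Allocate one point at the origin; returns NULL on failure. */
struct point *point_alloc(void)
{
    struct point *p;

    /* sizeof(*p) follows the pointer's type, and no cast is needed. */
    p = malloc(sizeof(*p));
    if (p == NULL)
        return NULL;

    p->x = 0;
    p->y = 0;
    return p;
}

void point_demo(void)
{
    struct point *p = point_alloc();

    assert(p != NULL);
    assert(sizeof(*p) == sizeof(struct point));
    assert(p->x == 0 && p->y == 0);
    free(p);
}
```

If the declaration of p is later changed to point at a different struct, the allocation size follows along automatically; a spelled-out sizeof(struct point) would not.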

        Chapter 15: The inline disease

There appears to be a common misperception that gcc has a magic "make me
faster" speedup option called "inline". While the use of inlines can be
appropriate (for example as a means of replacing macros, see Chapter 12), it
very often is not. Abundant use of the inline keyword leads to a much bigger
kernel, which in turn slows the system as a whole down, due to a bigger
icache footprint for the CPU and simply because there is less memory
available for the pagecache. Just think about it; a pagecache miss causes a
disk seek, which easily takes 5 milliseconds. There are a LOT of cpu cycles
that can go into these 5 milliseconds.

A reasonable rule of thumb is to not put inline at functions that have more
than 3 lines of code in them. An exception to this rule is the case where
a parameter is known to be a compile-time constant, and as a result of this
constantness you *know* the compiler will be able to optimize most of your
function away at compile time. For a good example of this latter case, see
the kmalloc() inline function.

Often people argue that adding inline to functions that are static and used
only once is always a win since there is no space tradeoff. While this is
technically correct, gcc is capable of inlining these automatically without
help, and the maintenance issue of removing the inline when a second user
appears outweighs the potential value of the hint that tells gcc to do
something it would have done anyway.

        Chapter 16: Function return values and names

Functions can return values of many different kinds, and one of the
most common is a value indicating whether the function succeeded or
failed.  Such a value can be represented as an error-code integer
(-Exxx = failure, 0 = success) or a "succeeded" boolean (0 = failure,
non-zero = success).

Mixing up these two sorts of representations is a fertile source of
difficult-to-find bugs.  If the C language included a strong distinction
between integers and booleans then the compiler would find these mistakes
for us… but it doesn’t.  To help prevent such bugs, always follow this
convention:

    If the name of a function is an action or an imperative command,
    the function should return an error-code integer.  If the name
    is a predicate, the function should return a "succeeded" boolean.

For example, "add work" is a command, and the add_work() function returns 0
for success or -EBUSY for failure.  In the same way, "PCI device present" is
a predicate, and the pci_dev_present() function returns 1 if it succeeds in
finding a matching device or 0 if it doesn’t.

All EXPORTed functions must respect this convention, and so should all
public functions.  Private (static) functions need not, but it is
recommended that they do.

Functions whose return value is the actual result of a computation, rather
than an indication of whether the computation succeeded, are not subject to
this rule.  Generally they indicate failure by returning some out-of-range
result.  Typical examples would be functions that return pointers; they use
NULL or the ERR_PTR mechanism to report failure.
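
Here is a toy userspace sketch of the two conventions side by side. The add_work() below is NOT the kernel function of that name, just a stand-in with the same shape, and struct work_queue and work_present() are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define QUEUE_SLOTS 2

/* A tiny fixed-size work queue (hypothetical example type). */
struct work_queue {
    int pending[QUEUE_SLOTS];   /* queued work item ids */
    int count;                  /* number of slots in use */
};

/* Command: returns 0 on success or -EBUSY when the queue is full. */
int add_work(struct work_queue *q, int id)
{
    if (q->count == QUEUE_SLOTS)
        return -EBUSY;

    q->pending[q->count++] = id;
    return 0;
}

/* Predicate: returns true if the work item is queued, false otherwise. */
bool work_present(const struct work_queue *q, int id)
{
    int i;

    for (i = 0; i < q->count; i++) {
        if (q->pending[i] == id)
            return true;
    }
    return false;
}

void return_value_demo(void)
{
    struct work_queue q = { { 0 }, 0 };

    assert(add_work(&q, 10) == 0);
    assert(add_work(&q, 11) == 0);
    assert(add_work(&q, 12) == -EBUSY);     /* queue is full */
    assert(work_present(&q, 11));
    assert(!work_present(&q, 12));
}
```

A caller can then write "if (add_work(...))" to catch failure and "if (work_present(...))" to test for success, and the name alone says which reading applies.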

        Chapter 17:  Don’t re-invent the kernel macros

The header file include/linux/kernel.h contains a number of macros that
you should use, rather than explicitly coding some variant of them yourself.
For example, if you need to calculate the length of an array, take advantage
of the macro

  #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

Similarly, if you need to calculate the size of some structure member, use

  #define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))

There are also min() and max() macros that do strict type checking if you
need them.  Feel free to peruse that header file to see what else is already
defined that you shouldn’t reproduce in your code.
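
A short userspace sketch of the two macros in use. The definitions are copied from the text above so the example compiles outside the kernel; struct packet, primes and sum_primes() are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace copies of the two definitions quoted above. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))

/* Hypothetical example type. */
struct packet {
    unsigned char header[12];   /* fixed-size header bytes */
    unsigned short length;      /* payload length in bytes */
};

const int primes[] = { 2, 3, 5, 7, 11 };

/* Sum every element without hard-coding the element count. */
int sum_primes(void)
{
    size_t i;
    int sum = 0;

    for (i = 0; i < ARRAY_SIZE(primes); i++)
        sum += primes[i];

    return sum;
}

void macros_demo(void)
{
    assert(ARRAY_SIZE(primes) == 5);
    assert(sum_primes() == 28);
    assert(FIELD_SIZEOF(struct packet, header) == 12);
}
```

One caveat with this plain form of ARRAY_SIZE: it only works on real arrays. Handed a pointer, it quietly computes the wrong number.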

        Chapter 18:  Editor modelines and other cruft

Some editors can interpret configuration information embedded in source files,
indicated with special markers.  For example, emacs interprets lines marked
like this:

-*- mode: c -*-

Or like this:

/*
Local Variables:
compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
End:
*/

Vim interprets markers that look like this:

/* vim:set sw=8 noet */

Do not include any of these in source files.  People have their own personal
editor configurations, and your source files should not override them.  This
includes markers for indentation and mode configuration.  People may use their
own custom mode, or may have some other magic method for making indentation
work correctly.

        Appendix I: References

The C Programming Language, Second Edition
by Brian W. Kernighan and Dennis M. Ritchie.
Prentice Hall, Inc., 1988.
ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
URL: http://cm.bell-labs.com/cm/cs/cbook/

The Practice of Programming
by Brian W. Kernighan and Rob Pike.
Addison-Wesley, Inc., 1999.
ISBN 0-201-61586-X.
URL: http://cm.bell-labs.com/cm/cs/tpop/

GNU manuals – where in compliance with K&R and this text – for cpp, gcc,
gcc internals and indent, all available from http://www.gnu.org/manual/

WG14 is the international standardization working group for the programming
language C, URL: http://www.open-std.org/JTC1/SC22/WG14/

Kernel CodingStyle, by greg@kroah.com at OLS 2002:
http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/


Last updated on 2007-July-13.

August 4, 2009

100 Vim commands every programmer should know

Filed under: C and C++ — minitia @ 11:47 pm

Search
/word Search “word” from top to bottom
?word Search “word” from bottom to top
/jo[ha]n Search “john” or “joan”
/\<the Search “the”, “theatre” or “then”
/the\> Search “the” or “breathe”
/\<the\> Search “the”
/\<....\> Search all words of 4 letters
/\<fred\> Search “fred” but not “alfred” or “frederick”
/fred\|joe Search “fred” or “joe”
/\<\d\d\d\d\> Search exactly 4 digits
/^\n\{3} Find 3 empty lines
:bufdo /searchstr/ Search in all open files
Replace
:%s/old/new/g Replace all occurrences of “old” by “new” in file
:%s/old/new/gc Replace all occurrences with confirmation
:2,35s/old/new/g Replace all occurrences between lines 2 and 35
:5,$s/old/new/g Replace all occurrences from line 5 to EOF
:%s/^/hello/g Insert “hello” at the beginning of each line
:%s/$/Harry/g Append “Harry” to the end of each line
:%s/onward/forward/gi Replace “onward” by “forward”, case-insensitive
:%s/ *$//g Delete trailing white space on every line
:g/string/d Delete all lines containing “string”
:v/string/d Delete all lines that do not contain “string”
:s/Bill/Steve/ Replace the first occurrence of “Bill” by “Steve” in current line
:s/Bill/Steve/g Replace “Bill” by “Steve” in current line
:%s/Bill/Steve/g Replace “Bill” by “Steve” in the whole file
:%s/\r//g Delete DOS carriage returns (^M)
:%s/\r/\r/g Transform DOS carriage returns into line breaks
:%s#<[^>]\+>##g Delete HTML tags but keep text
:%s/^\(.*\)\n\1$/\1/ Collapse a line that appears twice in a row into one
Ctrl+a Increment number under the cursor
Ctrl+x Decrement number under cursor
ggVGg? Change text to Rot13
Case
Vu Lowercase line
VU Uppercase line
g~~ Invert case
vEU Switch word to uppercase
vE~ Modify word case
ggguG Set all text to lowercase
:set ignorecase Ignore case in searches
:set smartcase Ignore case in searches unless an uppercase letter is used
:%s/\<./\u&/g Sets first letter of each word to uppercase
:%s/\<./\l&/g Sets first letter of each word to lowercase
:%s/.*/\u& Sets first letter of each line to uppercase
:%s/.*/\l& Sets first letter of each line to lowercase
Read/Write files
:1,10 w outfile Saves lines 1 to 10 in outfile
:1,10 w >> outfile Appends lines 1 to 10 to outfile
:r infile Insert the content of infile
:23r infile Insert the content of infile under line 23
File explorer
:e . Open integrated file explorer
:Sex Split window and open integrated file explorer
:browse e Graphical file explorer
:ls List buffers
:cd .. Move to parent directory
:args List files
:args *.php Open file list
:grep expression *.php Returns a list of .php files containing expression
gf Open file name under cursor
Interact with Unix
:!pwd Execute the “pwd” unix command, then return to Vi
!!pwd Execute the “pwd” unix command and insert output in file
:sh Temporarily return to Unix
$exit Return to Vi
Alignment
:%!fmt Align all lines
!}fmt Align all lines at the current position
5!!fmt Align the next 5 lines
Tabs
:tabnew Creates a new tab
gt Show next tab
:tabfirst Show first tab
:tablast Show last tab
:tabm n(position) Rearrange tabs
:tabdo %s/foo/bar/g Execute a command in all tabs
:tab ball Puts all open files in tabs
Window splitting
:e filename Edit filename in current window
:split filename Split the window and open filename
ctrl-w up arrow Puts cursor in top window
ctrl-w ctrl-w Puts cursor in next window
ctrl-w_ Maximise current window
ctrl-w= Gives the same size to all windows
10 ctrl-w+ Add 10 lines to current window
:vsplit file Split window vertically
:sview file Same as :split in readonly mode
:hide Close current window
:only Close all windows except the current one
:b 2 Open #2 in this window
Auto-completion
Ctrl+n Ctrl+p (in insert mode) Complete word
Ctrl+x Ctrl+l Complete line
:set dictionary=dict Define dict as a dictionary
Ctrl+x Ctrl+k Complete with dictionary
Marks
mk Marks current position as k
'k Moves cursor to mark k
d'k Delete all until mark k
Abbreviations
:ab mail mail@provider.org Define mail as abbreviation of mail@provider.org
Text indent
:set autoindent Turn on auto-indent
:set smartindent Turn on intelligent auto-indent
:set shiftwidth=4 Defines 4 spaces as indent size
ctrl-t, ctrl-d Indent/un-indent in insert mode
>> Indent
<< Un-indent
Syntax highlighting
:syntax on Turn on syntax highlighting
:syntax off Turn off syntax highlighting
:set syntax=perl Force syntax highlighting

 

August 3, 2009

Mapping between VARIANT internal data members and types

Filed under: C and C++ — minitia @ 1:21 am

 

Platform SDK: Automation 
VARIANT and VARIANTARG

Use VARIANTARG to describe arguments passed within DISPPARAMS, and VARIANT to
specify variant data that cannot be passed by reference. When a variant refers
to another variant by using the VT_VARIANT | VT_BYREF vartype, the variant being
referred to cannot also be of type VT_VARIANT | VT_BYREF. VARIANTs can be passed
by value, even if VARIANTARGs cannot. The following definition of VARIANT is
described in the OAIDL.H Automation header file:

typedef struct tagVARIANT VARIANT;
typedef struct tagVARIANT VARIANTARG;
struct tagVARIANT
{
    union
    {
        struct __tagVARIANT
        {
            VARTYPE vt;
            WORD wReserved1;
            WORD wReserved2;
            WORD wReserved3;
            union
            {
                LONGLONG llVal; // VT_I8.
                LONG lVal; // VT_I4.
                BYTE bVal; // VT_UI1.
                SHORT iVal; // VT_I2.
                FLOAT fltVal; // VT_R4.
                DOUBLE dblVal; // VT_R8.
                VARIANT_BOOL boolVal; // VT_BOOL.
                _VARIANT_BOOL bool;
                SCODE scode; // VT_ERROR.
                CY cyVal; // VT_CY.
                DATE date; // VT_DATE.
                BSTR bstrVal; // VT_BSTR.
                IUnknown *punkVal; // VT_UNKNOWN.
                IDispatch *pdispVal; // VT_DISPATCH.
                SAFEARRAY *parray; // VT_ARRAY|*.
                BYTE *pbVal; // VT_BYREF|VT_UI1.
                SHORT *piVal; // VT_BYREF|VT_I2.
                LONG *plVal; // VT_BYREF|VT_I4.
                LONGLONG *pllVal; // VT_BYREF|VT_I8.
                FLOAT *pfltVal; // VT_BYREF|VT_R4.
                DOUBLE *pdblVal; // VT_BYREF|VT_R8.
                VARIANT_BOOL *pboolVal; // VT_BYREF|VT_BOOL.
                _VARIANT_BOOL *pbool;
                SCODE *pscode; // VT_BYREF|VT_ERROR.
                CY *pcyVal; // VT_BYREF|VT_CY.
                DATE *pdate; // VT_BYREF|VT_DATE.
                BSTR *pbstrVal; // VT_BYREF|VT_BSTR.
                IUnknown **ppunkVal; // VT_BYREF|VT_UNKNOWN.
                IDispatch **ppdispVal; // VT_BYREF|VT_DISPATCH.
                SAFEARRAY **pparray; // VT_BYREF|VT_ARRAY|*.
                VARIANT *pvarVal; // VT_BYREF|VT_VARIANT.
                PVOID byref; // Generic ByRef.
                CHAR cVal; // VT_I1.
                USHORT uiVal; // VT_UI2.
                ULONG ulVal; // VT_UI4.
                ULONGLONG ullVal; // VT_UI8.
                INT intVal; // VT_INT.
                UINT uintVal; // VT_UINT.
                DECIMAL *pdecVal; // VT_BYREF|VT_DECIMAL.
                CHAR *pcVal; // VT_BYREF|VT_I1.
                USHORT *puiVal; // VT_BYREF|VT_UI2.
                ULONG *pulVal; // VT_BYREF|VT_UI4.
                ULONGLONG *pullVal; // VT_BYREF|VT_UI8.
                INT *pintVal; // VT_BYREF|VT_INT.
                UINT *puintVal; // VT_BYREF|VT_UINT.
                struct __tagBRECORD
                {
                    PVOID pvRecord;
                    IRecordInfo *pRecInfo;
                } __VARIANT_NAME_4;
            } __VARIANT_NAME_3;
        } __VARIANT_NAME_2;
        DECIMAL decVal;
    } __VARIANT_NAME_1;
};

To simplify extracting values from VARIANTARGs, Automation provides a set of
functions for manipulating this type. Use of these functions is strongly
recommended to ensure that applications apply consistent coercion rules.

The vt value governs the interpretation of the union as follows:

Value
Description

VT_EMPTY
No value was specified. If an optional argument to an Automation
method is left blank, do not pass a VARIANT of type VT_EMPTY. Instead, pass a
VARIANT of type VT_ERROR with a value of DISP_E_PARAMNOTFOUND.

VT_EMPTY | VT_BYREF
Not valid.

VT_UI1
An unsigned 1-byte character is stored in bVal.

VT_UI1 | VT_BYREF
A reference to an unsigned 1-byte character was passed. A
pointer to the value is in pbVal.

VT_UI2
An unsigned 2-byte integer value is stored in
uiVal.

VT_UI2 | VT_BYREF
A reference to an unsigned 2-byte integer was passed. A pointer
to the value is in puiVal.

VT_UI4
An unsigned 4-byte integer value is stored in
ulVal.

VT_UI4 | VT_BYREF
A reference to an unsigned 4-byte integer was passed. A pointer
to the value is in pulVal.

VT_UI8
An unsigned 8-byte integer value is stored in
ullVal.

VT_UI8 | VT_BYREF
A reference to an unsigned 8-byte integer was passed. A pointer
to the value is in pullVal.

VT_UINT
An unsigned integer value is stored in uintVal.

VT_UINT | VT_BYREF
A reference to an unsigned integer value was passed. A pointer
to the value is in puintVal.

VT_INT
An integer value is stored in intVal.

VT_INT | VT_BYREF
A reference to an integer value was passed. A pointer to the
value is in pintVal.

VT_I1
A 1-byte character value is stored in cVal.

VT_I1 | VT_BYREF
A reference to a 1-byte character was passed. A pointer to the
value is in pcVal.

VT_I2
A 2-byte integer value is stored in iVal.

VT_I2 | VT_BYREF
A reference to a 2-byte integer was passed. A pointer to the
value is in piVal.

VT_I4
A 4-byte integer value is stored in lVal.

VT_I4 | VT_BYREF
A reference to a 4-byte integer was passed. A pointer to the
value is in plVal.

VT_I8
An 8-byte integer value is stored in llVal.

VT_I8 | VT_BYREF
A reference to an 8-byte integer was passed. A pointer to the
value is in pllVal.

VT_R4
An IEEE 4-byte real value is stored in fltVal.

VT_R4 | VT_BYREF
A reference to an IEEE 4-byte real value was passed. A pointer
to the value is in pfltVal.

VT_R8
An 8-byte IEEE real value is stored in dblVal.

VT_R8 | VT_BYREF
A reference to an 8-byte IEEE real value was passed. A pointer
to its value is in pdblVal.

VT_CY
A currency value was specified. A currency number is stored as
64-bit (8-byte), two’s complement integer, scaled by 10,000 to give a
fixed-point number with 15 digits to the left of the decimal point and 4 digits
to the right. The value is in cyVal.

VT_CY | VT_BYREF
A reference to a currency value was passed. A pointer to the
value is in pcyVal.

VT_BSTR
A string was passed; it is stored in bstrVal. This
pointer must be obtained and freed by the BSTR functions, which are described in
Conversion and Manipulation Functions.

VT_BSTR | VT_BYREF
A reference to a string was passed. A BSTR* that points
to a BSTR is in pbstrVal. The referenced pointer must be obtained
or freed by the BSTR functions.

VT_DECIMAL
Decimal variables are stored as 96-bit (12-byte) unsigned
integers scaled by a variable power of 10. VT_DECIMAL uses the entire 16 bytes
of the Variant.

VT_DECIMAL | VT_BYREF
A reference to a decimal value was passed. A pointer to the
value is in pdecVal.

VT_NULL
A propagating null value was specified. (This should not be
confused with the null pointer.) The null value is used for tri-state logic, as
with SQL.

VT_NULL | VT_BYREF
Not valid.

VT_ERROR
An SCODE was specified. The type of the error is specified in
scode. Generally, operations on error values should raise an exception
or propagate the error to the return value, as appropriate.

VT_ERROR | VT_BYREF
A reference to an SCODE was passed. A pointer to the
value is in pscode.

VT_BOOL
A 16-bit Boolean (True/False) value was specified. A value of
0xFFFF (all bits 1) indicates True; a value of 0 (all bits 0) indicates False.
No other values are valid.

VT_BOOL | VT_BYREF
A reference to a Boolean value. A pointer to the Boolean value
is in pbool.

VT_DATE
A value denoting a date and time was specified. Dates are
represented as double-precision numbers, where midnight, January 1, 1900 is 2.0,
January 2, 1900 is 3.0, and so on. The value is passed in date.

This is the same numbering system used by most spreadsheet programs, although
some specify incorrectly that February 29, 1900 existed, and thus set January 1,
1900 to 1.0. The date can be converted to and from an MS-DOS representation
using VariantTimeToDosDateTime, which is discussed
in Conversion and Manipulation Functions.

VT_DATE | VT_BYREF
A reference to a date was passed. A pointer to the value
is in pdate.

VT_DISPATCH
A pointer to an object was specified. The pointer is in
pdispVal. This object is known only to implement IDispatch.
The object can be queried as to whether it supports any other desired interface
by calling QueryInterface on the object. Objects that do not
implement IDispatch should be passed using VT_UNKNOWN.

VT_DISPATCH | VT_BYREF
A pointer to a pointer to an object was specified. The pointer
to the object is stored in the location referred to by
ppdispVal.

VT_VARIANT
Invalid. VARIANTARGs must be passed by reference.

VT_VARIANT | VT_BYREF
A pointer to another VARIANTARG is passed in pvarVal.
This referenced VARIANTARG, pvarVal, cannot be another
VT_VARIANT|VT_BYREF. This value can be used to support languages that allow
functions to change the types of variables passed by reference.

VT_UNKNOWN
A pointer to an object that implements the IUnknown
interface is passed in punkVal.

VT_UNKNOWN | VT_BYREF
A pointer to the IUnknown interface is passed in
ppunkVal. The pointer to the interface is stored in the location referred
to by ppunkVal.

VT_ARRAY | <anything>
An array of data type <anything> was passed. (VT_EMPTY and
VT_NULL are invalid types to combine with VT_ARRAY.) The pointer in pparray
points to an array descriptor, which describes the dimensions, size, and
in-memory location of the array. The array descriptor is never accessed
directly, but instead is read and modified using the functions described in Conversion and Manipulation Functions.
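
The pattern behind the table, a type tag that says which union member is valid, can be sketched in portable C. The code below is a deliberately simplified stand-in, not the real VARIANT: the toy_vt values, struct toy_variant and toy_variant_to_double() are hypothetical names chosen for this illustration, and the real VT_* constants and coercion functions come from the Automation headers and libraries.

```c
#include <assert.h>
#include <stdint.h>

/* Toy discriminants; the real VT_* values come from the Windows headers. */
enum toy_vt {
    TOY_VT_EMPTY,
    TOY_VT_I4,
    TOY_VT_R8,
    TOY_VT_CY,
    TOY_VT_BYREF_I4
};

struct toy_variant {
    enum toy_vt vt;         /* says which union member is valid */
    union {
        int32_t lVal;       /* TOY_VT_I4 */
        double dblVal;      /* TOY_VT_R8 */
        int64_t cyVal;      /* TOY_VT_CY: currency scaled by 10,000 */
        int32_t *plVal;     /* TOY_VT_BYREF_I4 */
    } u;
};

/*
 * Coerce a toy variant to double.  Returns 0 on success or -1 when the
 * type cannot be converted.
 */
int toy_variant_to_double(const struct toy_variant *v, double *out)
{
    switch (v->vt) {
    case TOY_VT_I4:
        *out = (double)v->u.lVal;
        return 0;
    case TOY_VT_R8:
        *out = v->u.dblVal;
        return 0;
    case TOY_VT_CY:
        *out = (double)v->u.cyVal / 10000.0;    /* undo the fixed-point scale */
        return 0;
    case TOY_VT_BYREF_I4:
        *out = (double)*v->u.plVal;             /* follow the reference */
        return 0;
    default:
        return -1;
    }
}

void toy_variant_demo(void)
{
    int32_t target = 42;
    struct toy_variant v;
    double d = 0.0;

    v.vt = TOY_VT_CY;
    v.u.cyVal = 25000;              /* 2.5000 in currency units */
    assert(toy_variant_to_double(&v, &d) == 0 && d == 2.5);

    v.vt = TOY_VT_BYREF_I4;
    v.u.plVal = &target;
    assert(toy_variant_to_double(&v, &d) == 0 && d == 42.0);

    v.vt = TOY_VT_EMPTY;
    assert(toy_variant_to_double(&v, &d) == -1);
}
```

Reading any member other than the one named by vt gives meaningless data, which is why going through a single coercion routine, rather than touching the union directly, keeps the rules consistent.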


Platform SDK Release: October 2002



 

Mapping between VARIANT internal data members and types

Filed under: C and C++ — minitia @ 1:20 am

 

Platform SDK: Automation 
VARIANT and VARIANTARG

Use VARIANTARG to describe arguments passed within DISPPARAMS, and VARIANT to
specify variant data that cannot be passed by reference. When a variant refers
to another variant by using the VT_VARIANT | VT_BYREF vartype, the variant being
referred to cannot also be of type VT_VARIANT | VT_BYREF. VARIANTs can be passed
by value, even if VARIANTARGs cannot. The following definition of VARIANT is
described in the OAIDL.H Automation header file:

typedef struct tagVARIANT VARIANT;
typedef struct tagVARIANT VARIANTARG;
struct tagVARIANT
{
    union
    {
        struct __tagVARIANT
        {
            VARTYPE vt;
            WORD wReserved1;
            WORD wReserved2;
            WORD wReserved3;
            union
            {
                LONGLONG llVal; // VT_I8.
                LONG lVal; // VT_I4.
                BYTE bVal; // VT_UI1.
                SHORT iVal; // VT_I2.
                FLOAT fltVal; // VT_R4.
                DOUBLE dblVal; // VT_R8.
                VARIANT_BOOL boolVal; // VT_BOOL.
                _VARIANT_BOOL bool;
                SCODE scode; // VT_ERROR.
                CY cyVal; // VT_CY.
                DATE date; // VT_DATE.
                BSTR bstrVal; // VT_BSTR.
                IUnknown *punkVal; // VT_UNKNOWN.
                IDispatch *pdispVal; // VT_DISPATCH.
                SAFEARRAY *parray; // VT_ARRAY|*.
                BYTE *pbVal; // VT_BYREF|VT_UI1.
                SHORT *piVal; // VT_BYREF|VT_I2.
                LONG *plVal; // VT_BYREF|VT_I4.
                LONGLONG *pllVal; // VT_BYREF|VT_I8.
                FLOAT *pfltVal; // VT_BYREF|VT_R4.
                DOUBLE *pdblVal; // VT_BYREF|VT_R8.
                VARIANT_BOOL *pboolVal; // VT_BYREF|VT_BOOL.
                _VARIANT_BOOL *pbool;
                SCODE *pscode; // VT_BYREF|VT_ERROR.
                CY *pcyVal; // VT_BYREF|VT_CY.
                DATE *pdate; // VT_BYREF|VT_DATE.
                BSTR *pbstrVal; // VT_BYREF|VT_BSTR.
                IUnknown **ppunkVal; // VT_BYREF|VT_UNKNOWN.
                IDispatch **ppdispVal; // VT_BYREF|VT_DISPATCH.
                SAFEARRAY **pparray; // VT_BYREF|VT_ARRAY|*.
                VARIANT *pvarVal; // VT_BYREF|VT_VARIANT.
                PVOID byref; // Generic ByRef.
                CHAR cVal; // VT_I1.
                USHORT uiVal; // VT_UI2.
                ULONG ulVal; // VT_UI4.
                ULONGLONG ullVal; // VT_UI8.
                INT intVal; // VT_INT.
                UINT uintVal; // VT_UINT.
                DECIMAL *pdecVal; // VT_BYREF|VT_DECIMAL.
                CHAR *pcVal; // VT_BYREF|VT_I1.
                USHORT *puiVal; // VT_BYREF|VT_UI2.
                ULONG *pulVal; // VT_BYREF|VT_UI4.
                ULONGLONG *pullVal; // VT_BYREF|VT_UI8.
                INT *pintVal; // VT_BYREF|VT_INT.
                UINT *puintVal; // VT_BYREF|VT_UINT.
                struct __tagBRECORD
                {
                    PVOID pvRecord;
                    IRecordInfo *pRecInfo;
                } __VARIANT_NAME_4;
            } __VARIANT_NAME_3;
        } __VARIANT_NAME_2;
        DECIMAL decVal;
    } __VARIANT_NAME_1;
};

To simplify extracting values from VARIANTARGs, Automation provides a set of
functions for manipulating this type. Use of these functions is strongly
recommended to ensure that applications apply consistent coercion rules.
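To make the idea of consistent coercion concrete, here is a minimal sketch of a tagged union in plain C. It is not the real VARIANT and does not call any Automation function: MiniVariant, the MV_* tags, and mini_to_double are hypothetical names invented for this illustration, and the rule that True converts to -1.0 is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for the Automation types.
   The MV_* tags are local to this sketch, not the real VT_* constants. */
typedef enum { MV_EMPTY, MV_I2, MV_I4, MV_R8, MV_BOOL } MiniType;

typedef struct {
    MiniType vt;            /* says which union member is valid */
    union {
        int16_t iVal;       /* MV_I2 */
        int32_t lVal;       /* MV_I4 */
        double  dblVal;     /* MV_R8 */
        int16_t boolVal;    /* MV_BOOL: all bits 1 (-1) is True, 0 is False */
    } u;
} MiniVariant;

/* Coerce to double using one fixed rule per tag.
   Returns 1 on success, 0 if the tag has no numeric meaning. */
int mini_to_double(const MiniVariant *v, double *out)
{
    switch (v->vt) {
    case MV_I2:   *out = (double)v->u.iVal;         return 1;
    case MV_I4:   *out = (double)v->u.lVal;         return 1;
    case MV_R8:   *out = v->u.dblVal;               return 1;
    case MV_BOOL: *out = v->u.boolVal ? -1.0 : 0.0; return 1;
    default:      return 0;   /* MV_EMPTY or an unknown tag */
    }
}
```

Because every caller goes through the same switch, a given tag is always converted the same way, which is the property the library functions are meant to guarantee for the real type.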

The vt value governs the interpretation of the union as follows:

Value
Description

VT_EMPTY
No value was specified. If an optional argument to an Automation
method is left blank, do not pass a VARIANT of type VT_EMPTY. Instead, pass a
VARIANT of type VT_ERROR with a value of DISP_E_PARAMNOTFOUND.

VT_EMPTY | VT_BYREF
Not valid.

VT_UI1
An unsigned 1-byte character is stored in bVal.

VT_UI1 | VT_BYREF
A reference to an unsigned 1-byte character was passed. A
pointer to the value is in pbVal.

VT_UI2
An unsigned 2-byte integer value is stored in
uiVal.

VT_UI2 | VT_BYREF
A reference to an unsigned 2-byte integer was passed. A pointer
to the value is in puiVal.

VT_UI4
An unsigned 4-byte integer value is stored in
ulVal.

VT_UI4 | VT_BYREF
A reference to an unsigned 4-byte integer was passed. A pointer
to the value is in pulVal.

VT_UI8
An unsigned 8-byte integer value is stored in
ullVal.

VT_UI8 | VT_BYREF
A reference to an unsigned 8-byte integer was passed. A pointer
to the value is in pullVal.

VT_UINT
An unsigned integer value is stored in uintVal.

VT_UINT | VT_BYREF
A reference to an unsigned integer value was passed. A pointer
to the value is in puintVal.

VT_INT
An integer value is stored in intVal.

VT_INT | VT_BYREF
A reference to an integer value was passed. A pointer to the
value is in pintVal.

VT_I1
A 1-byte character value is stored in cVal.

VT_I1 | VT_BYREF
A reference to a 1-byte character was passed. A pointer to the
value is in pcVal.

VT_I2
A 2-byte integer value is stored in iVal.

VT_I2 | VT_BYREF
A reference to a 2-byte integer was passed. A pointer to the
value is in piVal.

VT_I4
A 4-byte integer value is stored in lVal.

VT_I4 | VT_BYREF
A reference to a 4-byte integer was passed. A pointer to the
value is in plVal.

VT_I8
An 8-byte integer value is stored in llVal.

VT_I8 | VT_BYREF
A reference to an 8-byte integer was passed. A pointer to the
value is in pllVal.

VT_R4
An IEEE 4-byte real value is stored in fltVal.

VT_R4 | VT_BYREF
A reference to an IEEE 4-byte real value was passed. A pointer
to the value is in pfltVal.

VT_R8
An 8-byte IEEE real value is stored in dblVal.

VT_R8 | VT_BYREF
A reference to an 8-byte IEEE real value was passed. A pointer
to its value is in pdblVal.

VT_CY
A currency value was specified. A currency number is stored as a
64-bit (8-byte), two’s complement integer, scaled by 10,000 to give a
fixed-point number with 15 digits to the left of the decimal point and 4 digits
to the right. The value is in cyVal.

VT_CY | VT_BYREF
A reference to a currency value was passed. A pointer to the
value is in pcyVal.

VT_BSTR
A string was passed; it is stored in bstrVal. This
pointer must be obtained and freed by the BSTR functions, which are described in
Conversion and Manipulation Functions.

VT_BSTR | VT_BYREF
A reference to a string was passed. A BSTR* that points
to a BSTR is in pbstrVal. The referenced pointer must be obtained
or freed by the BSTR functions.

VT_DECIMAL
Decimal variables are stored as 96-bit (12-byte) unsigned
integers scaled by a variable power of 10. VT_DECIMAL uses the entire 16 bytes
of the Variant.

VT_DECIMAL | VT_BYREF
A reference to a decimal value was passed. A pointer to the
value is in pdecVal.

VT_NULL
A propagating null value was specified. (This should not be
confused with the null pointer.) The null value is used for tri-state logic, as
with SQL.

VT_NULL | VT_BYREF
Not valid.

VT_ERROR
An SCODE was specified. The type of the error is specified in
scode. Generally, operations on error values should raise an exception
or propagate the error to the return value, as appropriate.

VT_ERROR | VT_BYREF
A reference to an SCODE was passed. A pointer to the
value is in pscode.

VT_BOOL
A 16-bit Boolean (True/False) value was specified. A value of
0xFFFF (all bits 1) indicates True; a value of 0 (all bits 0) indicates False.
No other values are valid.

VT_BOOL | VT_BYREF
A reference to a Boolean value. A pointer to the Boolean value
is in pbool.

VT_DATE
A value denoting a date and time was specified. Dates are
represented as double-precision numbers, where midnight, January 1, 1900 is 2.0,
January 2, 1900 is 3.0, and so on. The value is passed in date.

This is the same numbering system used by most spreadsheet programs, although
some specify incorrectly that February 29, 1900 existed, and thus set January 1,
1900 to 1.0. The date can be converted to and from an MS-DOS representation
using VariantTimeToDosDateTime, which is discussed
in Conversion and Manipulation Functions.

VT_DATE | VT_BYREF
A reference to a date was passed. A pointer to the value
is in pdate.

VT_DISPATCH
A pointer to an object was specified. The pointer is in
pdispVal. This object is known only to implement IDispatch.
The object can be queried as to whether it supports any other desired interface
by calling QueryInterface on the object. Objects that do not
implement IDispatch should be passed using VT_UNKNOWN.

VT_DISPATCH | VT_BYREF
A pointer to a pointer to an object was specified. The pointer
to the object is stored in the location referred to by
ppdispVal.

VT_VARIANT
Invalid. VARIANTARGs must be passed by reference.

VT_VARIANT | VT_BYREF
A pointer to another VARIANTARG is passed in pvarVal.
This referenced VARIANTARG, pvarVal, cannot be another
VT_VARIANT|VT_BYREF. This value can be used to support languages that allow
functions to change the types of variables passed by reference.

VT_UNKNOWN
A pointer to an object that implements the IUnknown
interface is passed in punkVal.

VT_UNKNOWN | VT_BYREF
A pointer to a pointer to the IUnknown interface is passed in
ppunkVal. The interface pointer itself is stored in the location referred
to by ppunkVal.

VT_ARRAY | <anything>
An array of data type <anything> was passed. (VT_EMPTY and
VT_NULL are invalid types to combine with VT_ARRAY.) The pointer in parray
points to an array descriptor, which describes the dimensions, size, and
in-memory location of the array. The array descriptor is never accessed
directly, but instead is read and modified using the functions described in
Conversion and Manipulation Functions.
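
Two of the encodings in the table above can be checked with a little arithmetic: the VT_CY scaling by 10,000 and the VT_DATE day numbering in which January 1, 1900 is 2.0. The sketch below is plain C and does not use the Automation conversion functions. All helper names (cy_from_parts, cy_units, cy_fraction, days_from_civil, variant_date) are hypothetical, the currency helpers assume non-negative amounts, and treating the fractional part of the date as the time of day is an assumption that the table does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* VT_CY: a 64-bit integer holding the amount multiplied by 10,000. */
#define CY_SCALE 10000

/* Build the scaled integer from whole units and ten-thousandths. */
int64_t cy_from_parts(int64_t units, int32_t ten_thousandths)
{
    return units * CY_SCALE + ten_thousandths;
}

/* Whole-unit part of a scaled currency value. */
int64_t cy_units(int64_t cy) { return cy / CY_SCALE; }

/* Fractional part of a scaled currency value, in ten-thousandths. */
int32_t cy_fraction(int64_t cy) { return (int32_t)(cy % CY_SCALE); }

/* Days between 1970-01-01 and a proleptic Gregorian date
   (a well-known civil-calendar algorithm; m is 1..12, d is 1..31). */
int64_t days_from_civil(int64_t y, int m, int d)
{
    y -= (m <= 2);                                   /* year starts in March */
    int64_t era = (y >= 0 ? y : y - 399) / 400;      /* 400-year cycle */
    int64_t yoe = y - era * 400;                     /* year of era, 0..399 */
    int64_t doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* VT_DATE-style number: January 1, 1900 maps to 2.0, so 1970-01-01
   maps to 25569.0. The hour is added as a fraction of a day. */
double variant_date(int y, int m, int d, int hour)
{
    return (double)(days_from_civil(y, m, d) + 25569) + hour / 24.0;
}
```

With these helpers, 12.3456 currency units become the integer 123456, and March 1, 1900 comes out as 61.0 because no February 29 is counted for 1900.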


Platform SDK Release: October 2002

