CLI definition language

Hey,

I’m sure you’ve had the occasion where you developed an interesting application and then needed the ability to control it from the command line. The standard C/C++ facilities for implementing the various switches and inputs are no fun to work with, and the various template- and composition-based solutions are too complex most of the time.

Check out this interesting project:
http://www.codesynthesis.com/projects/cli/

You define the options you want in a very, very simple definition language, compile it into a C++ class file and voila: you can use it in your main() function.

Nice. 

map/set iterators incompatible

So, you’ve been working on some piece of code that employs STL containers, specifically a map. Perhaps you’ve also added some multi-threaded flavoring to pack a punch. Great.

Suddenly, you see this:

Debug Assertion Failed!

Program: C:\….\Test.exe
File: C:\Program Files\Microsoft Visual Studio 8\VC\include\xtree
Line: 293

Expression: map/set iterators incompatible

Strange. What just happened? Did you just mix the iterators of different classes together? No, you’re certain you used map iterators, not set iterators. Everything SEEMS well…

Ah, that’s right. You have an iterator that points to the map’s items, and are indeed iterating over them. The map is protected by a mutex, but – so as to not block for too long – you release and reacquire the mutex every iteration. Another thread then executes and clear()’s the map, rendering your iterator useless. You now compare it to the end() iterator, and voila – crash!
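
Here is a minimal sketch of that scenario. The container contents are illustrative, and std::mutex/std::thread are used for brevity in place of whatever threading primitives the real code used:

#include <map>
#include <mutex>
#include <thread>

std::map<int, int> g_map;
std::mutex g_mutex;

void iterating_thread()
{
    g_mutex.lock();
    std::map<int, int>::iterator it = g_map.begin();
    while (it != g_map.end())   // comparing a stale 'it' here fires "map/set iterators incompatible"
    {
        // ... work with it->second ...
        ++it;

        g_mutex.unlock();       // released so as to not block for too long...
        g_mutex.lock();         // ...and another thread may clear() the map in that window
    }
    g_mutex.unlock();
}

void clearing_thread()
{
    g_mutex.lock();
    g_map.clear();              // invalidates every outstanding iterator into g_map
    g_mutex.unlock();
}

int main()
{
    for (int i = 0; i < 1000; ++i)
        g_map[i] = i;

    std::thread t1(iterating_thread);
    std::thread t2(clearing_thread);
    t1.join();
    t2.join();
    return 0;
}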

The solution? Make sure you don’t invalidate your iterators while using them. That means: don’t modify the STL container from another thread (even if you are using synchronization mechanisms) unless you’re absolutely certain that you can, and that you will not invalidate existing iterators. Operations like insert, erase and clear can invalidate iterators – read the details in each container’s documentation. 

Winsock error 10049 with no apparent reason

Hey all,

So, it’s been a while since I last updated. I assume I am experiencing the all-too-common “early enthusiasm leads to early laziness” syndrome.

Without further ado – a lovely issue I had to face.

On certain machines – all Vista SP1 machines, we thought – certain outgoing TCP connections to well-known HTTP servers (www.google.com, www.facebook.com, www.microsoft.com, etc.) simply failed.

The failure returned a last error code 10049 (0x2741) which means “The requested address is not valid in its context”. This usually happens when someone tries to use a non-existent local IP address as the socket address, or when someone tries to connect to an invalid remote computer.

I wrote a test program and reproduced the behavior. I discovered that, oddly, using ADDR_ANY works perfectly – but binding to a specific interface doesn’t work.

The test program seemed to work everywhere, including Vista SP2 machines. I found no mention of this behavior anywhere on-line. I disabled the Windows Firewall and the on-board ESET NOD32 3.0 Business anti-virus, and double checked that everything was turned off. It still failed. I was stumped.

I posted on alt.programming.winsock asking about this:


Hello all,

I have been witnessing some very strange behavior of Winsock on
Windows Vista SP1, and would like to share my findings to see if
anyone could help me figure out the answer to what’s wrong.

I’ve been running a simple program which tests the behavior of Winsock
when it comes to binding a UDP socket, binding a TCP socket and
binding and then connecting a TCP socket with a wildcard interface and
a specific interface.

On my Windows XP SP3, running as an administrator, I get the following
results:

—————————————————————————————-

Testing bind() UDP with 0.0.0.0:0…
+++ Success!!!
Testing bind() UDP with 0.0.0.0:1024…
+++ Success!!!
Testing bind() UDP with 0.0.0.0:32033…
+++ Success!!!
Testing bind() UDP with 0.0.0.0:55301…
+++ Success!!!
Testing bind() UDP with 192.168.2.110:0…
+++ Success!!!
Testing bind() UDP with 192.168.2.110:1024…
+++ Success!!!
Testing bind() UDP with 192.168.2.110:32033…
+++ Success!!!
Testing bind() UDP with 192.168.2.110:55301…
+++ Success!!!
Testing bind() TCP with 0.0.0.0:0…
+++ Success!!!
Testing bind() TCP with 0.0.0.0:1024…
+++ Success!!!
Testing bind() TCP with 0.0.0.0:32033…
+++ Success!!!
Testing bind() TCP with 0.0.0.0:55301…
+++ Success!!!
Testing bind() TCP with 192.168.2.110:0…
+++ Success!!!
Testing bind() TCP with 192.168.2.110:1024…
+++ Success!!!
Testing bind() TCP with 192.168.2.110:32033…
+++ Success!!!
Testing bind() TCP with 192.168.2.110:55301…
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:0…
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:1024…
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:32033…
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:55301…
+++ Success!!!
Testing bind() AND connect() TCP with 192.168.2.110:0…
+++ Success!!!
Testing bind() AND connect() TCP with 192.168.2.110:1024…
+++ Success!!!
Testing bind() AND connect() TCP with 192.168.2.110:32033…
+++ Success!!!
Testing bind() AND connect() TCP with 192.168.2.110:55301…
+++ Success!!!
Press any key to continue . . .

—————————————————————————————-

The same results appear in Windows Vista SP2, being run as an
Administrator and as a standard user. Windows firewall is disabled, or
an exception is added, in all cases.

Windows Vista SP1, with firewall disabled, eSET NOD32 antivirus
disabled and no other special software I can think of gives me this
result, however:

—————————————————————————————-

Testing bind() UDP with 0.0.0.0:0…
+++ Success!!!
Testing bind() UDP with 0.0.0.0:1024…
+++ Success!!!
Testing bind() UDP with 0.0.0.0:32033…
+++ Success!!!
Testing bind() UDP with 0.0.0.0:55301…
+++ Success!!!
Testing bind() UDP with 192.168.2.24:0…
+++ Success!!!
Testing bind() UDP with 192.168.2.24:1024…
+++ Success!!!
Testing bind() UDP with 192.168.2.24:32033…
+++ Success!!!
Testing bind() UDP with 192.168.2.24:55301…
+++ Success!!!
Testing bind() TCP with 0.0.0.0:0…
+++ Success!!!
Testing bind() TCP with 0.0.0.0:1024…
+++ Success!!!
Testing bind() TCP with 0.0.0.0:32033…
+++ Success!!!
Testing bind() TCP with 0.0.0.0:55301…
+++ Success!!!
Testing bind() TCP with 192.168.2.24:0…
+++ Success!!!
Testing bind() TCP with 192.168.2.24:1024…
+++ Success!!!
Testing bind() TCP with 192.168.2.24:32033…
+++ Success!!!
Testing bind() TCP with 192.168.2.24:55301…
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:0…
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:1024…
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:32033…
+++ Success!!!
Testing bind() AND connect() TCP with 0.0.0.0:55301…
+++ Success!!!
Testing bind() AND connect() TCP with 192.168.2.24:0…
— connect() failed: 10049 (The requested address is not valid in its
context.).
Testing bind() AND connect() TCP with 192.168.2.24:1024…
— connect() failed: 10049 (The requested address is not valid in its
context.).
Testing bind() AND connect() TCP with 192.168.2.24:32033…
— connect() failed: 10049 (The requested address is not valid in its
context.).
Testing bind() AND connect() TCP with 192.168.2.24:55301…
— connect() failed: 10049 (The requested address is not valid in its
context.).

—————————————————————————————-

I can’t seem to understand why this happens only on Windows Vista SP1,
with a specific *CORRECT* interface.

The program code is attached after this post.

Does anyone have any clue as to what is causing this?

Thanks,

Alon


I wrote an ICMP ping client and discovered that ping worked properly on all machines – including Vista SP1. Strange – could this have something to do with specific ports?

I received a response in the newsgroups – something about Vista SP1’s anti-malware logic. I was so confused at this point that I actually started looking this up, but naturally – I found nothing.

Then I discovered that a clean Vista SP1 does not exhibit this issue at all – it worked properly.

I started searching for TCP/IP stack hooks, or for NDIS filter drivers – I was looking everywhere for an answer, really.

I found nothing!

My final move was to recheck the firewall and anti-virus. I was amazed to discover that shutting down the ESET NOD32 anti-virus and disabling all active protection doesn’t disable the HTTP filtering module.

This module, which basically blocks all outgoing connections from or to ports 80, 3128 and 8080 from all non-approved “browser” applications, was the cause of all my problems!

I discovered this issue in only one other place on-line, at yoics.

ESET themselves explain this on their own page, but they don’t say anything about what behavior a programmer should expect to see if their HTTP filter is active.

Basically – you can create a socket, and you can bind(), but if you try to connect() to an address on port 80/3128/8080 using a socket bound to a specific interface – Winsock error 10049 is returned, as if the address were invalid. Confusing!
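
To make the failing sequence concrete, here is a minimal, self-contained sketch of it (the local and remote addresses are illustrative placeholders, not the actual machines involved):

#include <winsock2.h>
#include <stdio.h>

#pragma comment(lib, "ws2_32.lib")

int main()
{
    WSADATA wsaData;
    if (WSAStartup(MAKEWORD(2, 2), &wsaData) != 0)
        return 1;

    SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

    // bind() to a specific local interface (not INADDR_ANY) -- this succeeds.
    sockaddr_in local = {0};
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = inet_addr("192.168.2.24");  // a valid local address
    local.sin_port = 0;                                 // any local port
    if (bind(s, (sockaddr*)&local, sizeof(local)) == SOCKET_ERROR)
        printf("bind() failed: %d\n", WSAGetLastError());

    // connect() to a remote web server on port 80 -- with the HTTP filter active,
    // this fails with 10049 even though both addresses are perfectly valid.
    sockaddr_in remote = {0};
    remote.sin_family = AF_INET;
    remote.sin_addr.s_addr = inet_addr("203.0.113.80"); // placeholder for the server's address
    remote.sin_port = htons(80);
    if (connect(s, (sockaddr*)&remote, sizeof(remote)) == SOCKET_ERROR)
        printf("connect() failed: %d\n", WSAGetLastError());

    closesocket(s);
    WSACleanup();
    return 0;
}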

Right now I’m looking at anti-virus/firewall detection possibilities – including WMI. This might help us notify our users as to what actions they must take to unblock our application with their anti-virus “flavor”.

I hope this post will help others, somewhere, somehow. 

Proper usage of shared_ptr

shared_ptr is a very helpful class. It allows a developer to basically forget about dynamic memory deallocation. Everything is taken care of by the C++0x (or, right now, C++ TR1) library. shared_ptr objects can be used inside STL containers, as opposed to other smart pointer classes such as auto_ptr. Deallocation occurs automatically when the dynamically allocated resource is no longer referenced – this is a great answer to the “caller/callee deallocation responsibility” issue.

In my short amount of experience with shared_ptr objects, I’ve found some common pitfalls and best practices, so here goes.

  1. Pass shared_ptr objects as parameters by reference whenever possible. Why? Because passing a shared_ptr by value creates a copy of the object via its copy constructor. This is not a particularly light operation: it constructs a new shared_ptr object, sets its internal pointer to the copied shared_ptr’s internal pointer, and increments the reference count in the control block shared by all shared_ptr objects that point to that pointer. This copy construction takes 10-20 times longer than simply passing a raw pointer – and, worst of all, it has no added benefit: the copy is destroyed as soon as the scope is exited. In an application that makes many function calls with shared_ptr parameters, this can seriously hamper performance. Passing a shared_ptr by reference, however, is as cheap as passing a raw pointer (see the sketch after this list). The downside is that implicit casting is not supported when passing a reference.
  2. Don’t use shared_ptr objects in performance-oriented algorithms. Don’t use shared_ptr objects in code that encodes or decodes a buffer, for example – it makes no sense there. If the buffer being encoded/decoded is passed in as a shared_ptr – great. Take the internal pointer using the get() method and implement the encoding/decoding with raw pointers. Since the shared_ptr is still in scope, the dynamically allocated object will not be deallocated. Using raw pointers instead of shared_ptrs as the basis for such encoding/decoding avoids many, many occurrences of dereferencing the shared_ptr. In our code, an experienced coder made exactly this mistake of using shared_ptr throughout an encoder/decoder, and discovered that it ran over 20 times slower!
  3. Don’t use shared_ptr objects for private data members, unless they are passed in via the constructor or setter methods. I don’t think there is any need to explain why this is pointless – if a data member is constructed internally and used internally, it can be deleted internally.
  4. Use enable_shared_from_this when necessary, but remember that shared_from_this() only works when a shared_ptr somewhere already owns the object. If you call shared_from_this() on an object that was dynamically created somewhere, but does not yet have any shared_ptr pointing to it – you’ll see the following exception:

    std::tr1::bad_weak_ptr at memory location 0xDEADBEEF

    or perhaps:

    std::bad_weak_ptr at memory location 0xDEADBEEF

    This is because shared_from_this() actually looks for a shared_ptr control block (which, as stated above, is shared by all shared_ptr objects that own the same pointer and contains the reference counter) inside its internal data structures – and if it does not find one (because no shared_ptr owns the object at all), it throws this exception. It goes without saying that creating a shared_ptr to an automatic (stack-allocated) variable is as bad as deleting such a variable.
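
To make points 1 and 4 concrete, here is a minimal sketch. It uses the plain std:: names for brevity (the same applies to the std::tr1:: flavor discussed above), and the class and function names are purely illustrative:

#include <iostream>
#include <memory>

class Widget : public std::enable_shared_from_this<Widget>
{
public:
    std::shared_ptr<Widget> self() { return shared_from_this(); }
};

// Point 1: take the shared_ptr by (const) reference – no copy, no reference count churn.
void process(const std::shared_ptr<Widget>& w)
{
    std::cout << "use_count inside process(): " << w.use_count() << std::endl; // still 1
}

int main()
{
    std::shared_ptr<Widget> w(new Widget);
    process(w);                              // passed by reference, no copy is constructed

    std::shared_ptr<Widget> w2 = w->self();  // fine: a shared_ptr already owns *w

    Widget stackWidget;
    // stackWidget.self();                   // point 4: no shared_ptr owns stackWidget,
                                             // so shared_from_this() throws bad_weak_ptr
    return 0;
}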

ACE_High_Res_Timer runs too slow – loss of precision

This post is based on an ACE PRF (problem report form), so it is formatted like one. Sorry :)

ACE VERSION: 5.6.9

HOST MACHINE and OPERATING SYSTEM:

x86 architecture (32-bit), Windows XP Service Pack 3, Winsock 2.2 or newer (WS2_32.DLL version is 5.1.2600.5512)

TARGET MACHINE and OPERATING SYSTEM, if different from HOST:

Windows 2000/XP/Vista/7

THE $ACE_ROOT/ace/config.h FILE [if you use a link to a platform-
specific file, simply state which one]:

config-win32.h

THE $ACE_ROOT/include/makeinclude/platform_macros.GNU FILE [if you
use a link to a platform-specific file, simply state which one
(unless this isn’t used in this case, e.g., with Microsoft Visual
C++)]:

Not used, MSVC8 and MSVC9.

CONTENTS OF $ACE_ROOT/bin/MakeProjectCreator/config/default.features
(used by MPC when you generate your own makefiles):

Not used.

AREA/CLASS/EXAMPLE AFFECTED:

Any code reliant on both ACE_High_Res_Timer for timing precision, and
the system clock for standard timestamping and comparisons.

SYNOPSIS:

ACE_High_Res_Timer runs too slow, when compared to the standard system
time function: ACE_OS::gettimeofday().

About every 4 hours, my high performance timer seems to lose 1 second,
when compared to the standard system time. This can be clearly seen via timer
printouts every 15 seconds.

Unfortunately, this means that my code – which uses an
ACE_Timer_Queue_Thread_Adapter object – gets notified of events too early,
and since my code expects to be run *after* the event time, it doesn’t work
properly.

DESCRIPTION:

ACE_High_Res_Timer gets its global scale factor from the host operating system. This
scale factor, as explained in the documentation, is used to convert high-resolution
timer ticks into time units (microseconds). The factor is stored in an ACE_UINT32 member
of the timer class, called global_scale_factor_.

Under Windows (and on most operating systems in general), the number of high-resolution
timer ticks per second is reported as a 64-bit value – in my case, 2341160000. It is
divided by ACE_HR_SCALE_CONVERSION, which is defined as ACE_ONE_SECOND_IN_USECS,
i.e. 1000000, and the resulting quotient is stored in global_scale_factor_.

Note that 2341160000 / 1000000 = 2341.16, and so 2341 is set as the
global_scale_factor_. Thus – we lose 16 ticks of precision every 234116 ticks.

16 / 234116 = 0.0000683421893, which means that we lose 0.98412752592 seconds
in our high resolution timer every 4 hours. After 10 days of operation – we are a whole
minute earlier than the system clock.
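
Here is a small stand-alone sketch of that arithmetic – it is not ACE code, just the truncation effect in isolation, using the frequency value quoted above:

#include <iostream>

int main()
{
    const unsigned long long frequency     = 2341160000ULL; // QueryPerformanceFrequency(): ticks per second
    const unsigned long long usecs_per_sec = 1000000ULL;    // ACE_HR_SCALE_CONVERSION

    // ACE stores ticks-per-microsecond in a 32-bit integer, truncating 2341.16 down to 2341.
    const unsigned long scale_factor =
        static_cast<unsigned long>(frequency / usecs_per_sec);

    // Ticks elapsed after exactly 4 hours of real time:
    const unsigned long long ticks_4h = frequency * 4 * 3600;

    const unsigned long long converted_usec = ticks_4h / scale_factor;     // via the truncated factor
    const unsigned long long actual_usec    = 4ULL * 3600 * usecs_per_sec; // the real elapsed time

    // Prints a drift of roughly 0.98 seconds per 4 hours, matching the figure above.
    std::cout << "converted: " << converted_usec << " us, actual: " << actual_usec
              << " us, drift: " << (converted_usec - actual_usec) / 1e6 << " s" << std::endl;
    return 0;
}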

REPEAT BY:

Use an ACE_Timer_Queue_Thread_Adapter, and set it to use ACE_High_Res_Timer instead of the standard system time. Schedule a recurring timer event every 15 seconds, for at least 4 hours. Use ACE_DEBUG((LM_DEBUG, "Timestamp: %T\n")) to print out the current system time when the timer event is triggered.

Sample output:

After 0 hours: Standard time: 18:19:41.093000, HiRes time: 04:56:08.082000
After 4 hours: Standard time: 22:19:39.437000, HiRes time: 08:56:08.082000
After 8 hours: Standard time: 02:19:38.406000, HiRes time: 12:56:08.082000

SAMPLE FIX/WORKAROUND:

A workaround would be to add the time difference (about a second every 4 hours)
manually by having a timer event that causes this value to be calculated. It is
quite trivial to calculate this, but changing the actual timers based on it can be
a hassle.

Another workaround would be to simply use the system time as the basis for the
ACE_Timer_Queue_Thread_Adapter, at the cost of losing precision – but at least the
timer events would be executed *after* their scheduled system time, not *before* it.

A better fix would be to add an ACE_USE_64BIT_HIGH_RES_TIMER_CALCULATIONS
definition that – if set – would let the timer code use 64-bit arithmetic, and would
not require truncating the frequency / ACE_HR_SCALE_CONVERSION quotient into the
32-bit global_scale_factor_. Instead, the calculation would be carried out with 64-bit
integers, and the scale conversion by ACE_HR_SCALE_CONVERSION would be applied
later – to the final result.

Instead of calculating the elapsed time in microseconds as

QueryPerformanceCounter() / (QueryPerformanceFrequency() / ACE_HR_SCALE_CONVERSION)

where the parenthesized scale factor is truncated to a 32-bit integer, we could do

(QueryPerformanceCounter() * ACE_HR_SCALE_CONVERSION) / QueryPerformanceFrequency()

entirely in 64-bit integers.

There’d be far less loss of precision.
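
For illustration only, here is a hedged stand-alone sketch of the two conversions using the raw Win32 calls directly (again, this is not the actual ACE code, just the idea behind the suggested fix):

#include <windows.h>
#include <iostream>

int main()
{
    LARGE_INTEGER freq, start, now;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&start);

    Sleep(1500);   // stand-in for "some work"

    QueryPerformanceCounter(&now);
    const unsigned long long ticks = now.QuadPart - start.QuadPart;

    // What the current code effectively does: a truncated 32-bit scale factor.
    const unsigned long scale = (unsigned long)(freq.QuadPart / 1000000);
    const unsigned long long usec_truncated = ticks / scale;

    // The suggested alternative: keep everything 64-bit and convert on the final result.
    // (For very long intervals the multiplication needs care to avoid 64-bit overflow,
    // e.g. by splitting the ticks into whole seconds and a remainder.)
    const unsigned long long usec_64bit =
        (ticks * 1000000) / (unsigned long long)freq.QuadPart;

    std::cout << "truncated: " << usec_truncated
              << " us, 64-bit: " << usec_64bit << " us" << std::endl;
    return 0;
}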

Does anyone have any other suggestions on how I should deal with this? 

Speeding up build times in Visual Studio 2008

Build times in my current project have been slowly ramping up.

Recently, it took me about 120 seconds to do a complete build of the project from scratch – which is quite annoying and counter-productive, and this is still an early stage of development. I want to cut down on build times as soon as I can.

My first instinct was distributed building. I know of an Israeli solution called IncrediBuild which greatly speeds up build times in organizations with many machines. Since my organization has many machines relative to the number of developers, and since most of the machines are idle most of the time – this seemed like a good way to speed things up. However, the solution is very expensive.

I could not find any good free or open source solutions that support Visual Studio 2008. I did discover distcc, which is a good solution for non-Windows development. There is a project under development that aims to add support for Visual Studio 2008 to distcc – this might be a good solution in the future.

In the meantime, I decided to explore other ways to optimize the build time. I discovered this lovely article, which pointed me in several interesting directions.

Here are my conclusions.

  1. Use the full capabilities of your CPU! There is a new option in the Visual Studio 2008 C++ compiler named /MP (“Build with Multiple Processes”). Adding /MP to the compiler options cut down the build time on my machine by 60%! Note that it seems that this is only worthwhile on systems with multiple processors or multiple cores. To do this, go to the following property page:

    Project > “Project” Properties… > Configuration Properties > C/C++ > Command Line

    Under “Additional options”, add:

    /MP

    Make sure you also go to

    Project > “Project” Properties… > Configuration Properties > C/C++ > Code Generation

    and disable the “Enable Minimal Rebuild” option, which is incompatible with the “Build with Multiple Processes” option. This might seem like a strange thing to do – disabling the option that is described as “Detect changes to C++ class definitions and recompile only affected source files” – but based on my experience, “Build with Multiple Processes” gives a much bigger performance boost.

  2. Use precompiled headers! They seem like a hassle, but using them correctly, as explained in this lovely article by Bruce Dawson, cut compilation time from 53 seconds to 13 seconds in my current project! Obviously, you will benefit much more from precompiled headers if you use a “whale” include file that is static (non-changing), like “Windows.h”, various STL headers and – in my case – the ACE headers (see the sketch after this list). More about this issue, in a more user-friendly way, can be found here.
  3. Employ smarter physical code design. This is based on ideas documented by John Lakos in his wonderful book Large Scale C++ Software Design, which – while somewhat outdated nowadays – is still a worthy read. You can learn a lot about this in these two articles in Games from Within. Remember, however, that using these techniques can sometimes impair the clarity and brevity of your code. I made some small changes in my code – and cut down the build time by another several seconds.

    All in all, after employing all of these techniques – I managed to cut down the build time from ~120 seconds to ~10 seconds. Pretty cool for 30 minutes of research and coding.
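
As a purely illustrative example for point 2 above, a precompiled header could look something like the sketch below – the specific headers are an assumption about a project like mine. Every .cpp then includes it as its first line, one stdafx.cpp is compiled with /Yc (“Create Precompiled Header”) and all the rest with /Yu (“Use Precompiled Header”):

// stdafx.h – illustrative precompiled header
#pragma once

#define WIN32_LEAN_AND_MEAN
#include <windows.h>      // the classic "whale" include

#include <string>         // stable STL headers used throughout the project
#include <vector>
#include <map>
#include <algorithm>

// #include <ace/ACE.h>   // in my case, the (rarely changing) ACE headers would go here too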

"If" conditional does not work properly

While debugging, I noticed that an “if” conditional displayed some pretty strange behavior.

I had code similar to this:

if (SomeFunctionReturningBoolean() == true)
{
    // do something...
    printf("1\n");
}
else
{
    // do something else.
    printf("2\n");
}

Strangely, even though SomeFunctionReturningBoolean() returned true (as the debugger helpfully noted) – the “else” clause was executed, rather than the “if” clause.

I started investigating this.

First, I suspected a mismatch between the code and the executable. I rebuilt from scratch, but no dice. I manually cleaned all of the intermediate files, nope – same strange behavior.

I then changed the code to the following:

bool fResult = SomeFunctionReturningBoolean();

if (fResult == true)
{
    // do something...
    printf("1\n");
}
else
{
    // do something else.
    printf("2\n");
}

fResult clearly held “true”, but “2” was still printed.

Next, I changed the code to:

bool fResult = SomeFunctionReturningBoolean();

if (fResult == true)
{
    // do something...
    printf("1\n");
}

if (fResult == false)
{
    // do something else.
    printf("2\n");
}

Neither “1” nor “2” was printed. Strangely, this seemed to work:

bool fResult = SomeFunctionReturningBoolean();

if (fResult)
{
    // do something...
    printf("1\n");
}
else
{
    // do something else.
    printf("2\n");
}

It printed “1”. I started wondering whether the function was indeed returning true. I switched to the disassembly view and, to my surprise, saw that the function was returning 0x6E, and that the equality check was:

cmp eax, 1

Going deeper into the code, I discovered that a function used by SomeFunctionReturningBoolean() as the basis for its return code simply did not return a value on every path. The programmer had been annoyed by some warnings coming from an external include file and disabled all warnings using a pragma – so the “not all control paths return a value” warning simply did not appear when I compiled the code.
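
Here is a minimal sketch of that root cause – the function names echo the ones above, but the code is illustrative, not the original:

#include <cstdio>

#pragma warning(push, 0)             // "disable all warnings" -- in the original, done to
                                     // silence a noisy external header and never restored

bool SomeHelperFunction(int x)
{
    if (x > 0)
        return true;
    // No return on this path. Warning C4715 ("not all control paths return a value")
    // would normally point this out, but it has been silenced; the caller simply gets
    // whatever garbage happens to be in EAX -- e.g. 0x6E.
}

#pragma warning(pop)

bool SomeFunctionReturningBoolean()
{
    return SomeHelperFunction(-1);   // propagates the garbage "bool"
}

int main()
{
    if (SomeFunctionReturningBoolean() == true)   // compiles to cmp eax, 1 -- false for 0x6E
        printf("1\n");
    else
        printf("2\n");                            // ...so this is the branch that runs
    return 0;
}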

The Visual Studio 2008 debugger treats a bool variable with a value of 0 as false, and displays “false” as its value in the variable windows. If the bool variable contains anything other than 0, the value displayed is “true”. This is generally reasonable, since a plain boolean test like if (fResult) compiles down to a check for non-zero – which is exactly why the last version above worked. However, a comparison like

if (variable == true)

is compiled to

cmp eax, 1

which does not check for “true”-ness (i.e. non-zero), but rather for equality with 1 – the value that true is defined to be in Microsoft’s C++ implementation.

So – the actual source code compared the bool to a specific value – 1, and since 0x6E is not equal to 1 – the check failed and we executed the “else” clause.

So, what did we learn from this issue?

  1. Never disable all of the warnings using a pragma – disable specific warnings, otherwise you’ll lose important warnings that might appear.
  2. The VS2008 debugger displays “true”/”false” in the variables window based on a check for non-zero (“true”-ness), not on equality to 1 (the actual value of true).
  3. The “Go To Disassembly” view can be very, very useful at times. You should familiarize yourself with it.