|
Here is the test code:
http://codepad.org/zKTatneG[^]
#include <iostream>
#include <string>
using namespace std;

struct Empty
{
};

struct JustLong
{
    long data;
};

struct LongAndEmpty : public JustLong, private Empty
{
};

struct StringAndEmpty : public std::string, private Empty
{
};

int main()
{
    cout << "Empty:" << sizeof(Empty) << endl;
    cout << "JustLong:" << sizeof(JustLong) << endl;
    cout << "String:" << sizeof(std::string) << endl;
    cout << "LongAndEmpty:" << sizeof(LongAndEmpty) << endl;
    cout << "StringAndEmpty:" << sizeof(StringAndEmpty) << endl;
    return 0;
}
When you try it in other compilers (for example, see the codepad.org link above), the results are as expected, i.e. the output looks like this:
Empty:1
JustLong:4
String:4
LongAndEmpty:4
StringAndEmpty:4
Now, when you try this in VC++ (2010), the size of a struct derived from std::string is always (in both debug and release) 4 bytes more than the size of std::string itself. All other structs/classes come out at the correct size; only those deriving from std::string are bigger than they should be. Am I missing something, or is this an MS bug?
|
First, it's not technically a bug unless it either doesn't work or doesn't meet the C++ spec, and I don't think either of those is the case.
Second, it looks much more likely that StringAndEmpty is larger because it's derived from an already complex class whose v-table is compiled into an external module. It's a guess, but I reckon you've got an extra pointer in there to look up the pre-existing std::string vtable.
I have read a full explanation of the various sizes and layouts that MSVC and other compilers use for minimal/empty/simple/complex structs and classes, with and without virtual inheritance. I can neither remember all the details nor where I read it, but I do remember that there were far more variations of 4-, 8-, 12-, 16- and 20-byte 'headers' than I would ever have expected, and much of the variation was within one compiler rather than between them. If I remember where that detailed research is, I'll post a link.
"The secret of happiness is freedom, and the secret of freedom, courage."
Thucydides (B.C. 460-400)
|
Well, consider the following struct:
struct JustString: public std::string
{
};
It has the same size as plain std::string, so no, there is no extra pointer to a VFT. My question is about the Empty Base Class Optimization - the size of an empty base class should be reduced to zero in a descendant if possible. Yes, it is not a requirement, only something the standard allows. The problem is, I don't understand this: it works in non-MS compilers, and in the MS compiler it works for all cases EXCEPT std::string. I will check later with VC++ 2012 to see if they made it work, but right now I have no explanation other than some problem with the MS compiler. Or is there a compiler option specifically for this case that I don't know about?
|
The example you give would not need an 'extra' pointer because it would only need the one to the external vtable; everything internal can be optimised away. So far you have only found that the size increases when you have both an internal and an external base. It's likely that even in that case the internal base could be optimised away; that it isn't may be a flaw in the Microsoft compiler, or there may be some practical reason why it shouldn't be done. As I mentioned, there are quite a number of combinations: single vs multiple inheritance, virtual and non-virtual bases, local and external bases, exported and non-exported, complete and incomplete classes at the point of first declaration. Some optimisations, such as empty base class removal, will be available under some combinations and not others. For a comparison you could try the Clang compiler, which is more C++11 compliant than MSVC and, being open source, has support forums where you can ask exactly why they did it a particular way. Some odd behaviour in this area in MSVC is more about backward compatibility of the resulting binaries with older versions of the compiler and of Windows than about mistakes in the current version.
"The secret of happiness is freedom, and the secret of freedom, courage."
Thucydides (B.C. 460-400)
|
Kosta Cherry wrote: ...the size of an empty base class should be reduced to zero in a descendant if possible. The size of an empty class must not be zero, to ensure that the addresses of two different objects will be different.
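A minimal sketch of that point for standalone objects (nothing compiler-specific assumed here):
#include <iostream>

struct Empty {};

int main()
{
    Empty a[2];
    std::cout << "sizeof(Empty): " << sizeof(Empty) << '\n';          // at least 1
    std::cout << "&a[0]: " << &a[0] << ", &a[1]: " << &a[1] << '\n';  // distinct addresses
    return 0;
}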
"One man's wage rise is another man's price increase." - Harold Wilson
"Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons
"Show me a community that obeys the Ten Commandments and I'll show you a less crowded prison system." - Anonymous
|
A standalone one - yes. But we are talking here about descendants, where the empty base class/struct can be eliminated. This is called the Empty Base Class Optimization (you can google it).
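A minimal sketch of that optimization in the simple single-inheritance case (assuming a compiler that applies EBCO here): the empty base contributes no storage to the derived object.
#include <iostream>

struct Empty {};

struct WithEmptyBase : Empty
{
    long data;
};

int main()
{
    std::cout << "sizeof(Empty): " << sizeof(Empty) << '\n';                  // 1
    std::cout << "sizeof(long): " << sizeof(long) << '\n';
    std::cout << "sizeof(WithEmptyBase): " << sizeof(WithEmptyBase) << '\n';  // same as sizeof(long) when EBCO applies
    return 0;
}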
|
Kosta Cherry wrote: Am I missing something, or is this an MS bug?
The former.
As you have already googled this you understand that it is an optimization. Failure to optimize is not a "bug".
|
Hello Friends
I am creating an MFC application in which I save an image from a DC into a BMP. I apply different effects to that image using the Gaussian filter formula, following this link:
http://lodev.org/cgtutor/filtering.html[^]
I apply blur, emboss, etc. by getting the image pixels and then setting them back.
Now I want to add a noise effect to the image based on a range (0-100): if 10, fewer particles in the image; if 20, more particles; and so on.
Using the same Gaussian approach, I couldn't work out how to use any filter to give a noise effect.
I also tried my own formula, but it is not adding random particles to the image.
Here is my own way to set noise: I get the image pixels and randomly change some of them to black, but that was not producing noise in a consistent manner. I want it to be applied in some controlled manner.
int n = 0;
int randomFactor = 0;   // declared here so the snippet compiles; grows inside the loop
for(int x = 0; x < nWidth; ++x)
{
    for(int y = 0; y < nHeight; ++y)
    {
        n++;
        if(n > 9 + randomFactor)
        {
            n = 0;
            randomFactor += x;
            memDC.SetPixel(x, y, RGB(0, 0, 0));   // blacken this pixel
        }
    }
}
I want to add noise randomly, but applied in some controlled manner.
Any Ideas??
Regards
Y
|
Well... I have to admit I have no idea about the Gaussian formula, but I found a page with formulas which may help you:
IA State[^]
|
Two things. First, you are not really adding noise, but just setting a pixel to black here and there.
The idea of noise is to modify all colour components of all pixels by adding a random value (plus or minus). So start off with a random number generator like rand(), scale its return value to the amount of noise you want, and add the result pixel by pixel. You may think of it as computing a noise image and adding that image to your original image.
If you want to refine that, don't use uniformly distributed random numbers, but a Gaussian distribution or others. And you may apply filtering (low-pass, high-pass, etc.) to the noise image before adding it, to modify the spectral distribution of the noise.
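A rough sketch of that additive-noise idea, written in the same CDC/SetPixel style as the question (illustrative only: memDC, nWidth, nHeight and the 0-100 amount are assumed to exist as in the original post, and for real images you would work on the DIB bits rather than call GetPixel/SetPixel per pixel):
// Assumes an MFC build (CDC from <afxwin.h>, usually via stdafx.h).
#include <cstdlib>   // rand()

static int Clamp255(int v) { return v < 0 ? 0 : (v > 255 ? 255 : v); }

void AddNoise(CDC& memDC, int nWidth, int nHeight, int amount /* 0-100 */)
{
    for(int x = 0; x < nWidth; ++x)
    {
        for(int y = 0; y < nHeight; ++y)
        {
            COLORREF c = memDC.GetPixel(x, y);

            // Random offset in [-amount, +amount] for each colour channel.
            int dr = (rand() % (2 * amount + 1)) - amount;
            int dg = (rand() % (2 * amount + 1)) - amount;
            int db = (rand() % (2 * amount + 1)) - amount;

            memDC.SetPixel(x, y, RGB(Clamp255(GetRValue(c) + dr),
                                     Clamp255(GetGValue(c) + dg),
                                     Clamp255(GetBValue(c) + db)));
        }
    }
}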
|
Hope you don't mind me posting this as a second question, but I wanted to catch the attention of anyone who is especially knowledgeable about timeGetTime as well.
Here is the simple code I'm using:
DWORD dwt = timeGetTime();
static DWORD dwt_was;          // zero on first call; holds the previous reading
if(dwt_was != 0)
    if(dwt < dwt_was)
        dwt = dwt;             // deliberate no-op, used only as a breakpoint anchor
dwt_was = dwt;
Also - it's not timer wrap-around.
I have just caught it here: set a breakpoint on the dwt = dwt line and it shows:
dwt_was 54493247
dwt 54493246
How can timeGetTime() go backwards in time?
In case it is relevant this is on a Lenovo Thinkpad Edge e520
ALL FIVE TIMERS ARE INACCURATE - even when they are constrained to a single core:
GetSystemTimeAsFileTime(..)
timeGetTime()
GetTickCount()
QueryPerformanceCounter()
RDTSC()
They all jump back in time frequently on this machine. I have tested them all constrained to a single core.
The only thing I can think of is to try a best-guess averaging process as described in this article, but I'm not convinced it would be much of an improvement or worth the work involved:
http://www.mindcontrol.org/~hplus/pc-timers.html[^]
Discussion here
http://devmaster.net/forums/topic/4670-need-help-with-queryperformancecounter-and-dual-processors/[^]
On the whole it works reasonably well most of the time. Only now and again do I get these glitches: the timers go back in time perhaps every second or two, by a ms or two - and with the metronome typically playing only a few notes per second at most, the chance of hitting a major glitch is fairly small.
I think that's the only reason it works as a functional metronome on my computer. It's actually pretty good, but I wanted to make it more accurate than it already is.
Also this discussion:
http://stackoverflow.com/questions/2904887/sub-millisecond-precision-timing-in-c-or-c[^]
There's also KeQueryPerformanceCounter, but I think that might just be for driver writers - and with all the other ones not working, I don't know if it would fix the issue anyway.
There's the Windows Timestamp Project here:
http://windowstimestamp.com/description[^]
but it seems to be a work in progress - I don't think you can actually use it in your apps yet?
If it can't be fixed, I'm also interested in any setting I can ask users to change on their computer if they want the highest possible timing accuracy, e.g. switching off some power-saving frequency-scaling feature or something.
The only other thing I can think of is to write to MIDI, convert the MIDI to audio, and play the audio - that is well timed, but not possible in real time.
(Sorry, I forgot you can edit these posts - hence all those deleted extra messages below.)
modified 15-Mar-13 3:39am.
|
This is now solved.
First - it was a core-switching issue. The code to confine the thread to a single core has to be called in advance - I think possibly the thread has to sleep first before the system can assign it to a new core; at any rate it doesn't work to just set it immediately before every time check and release it again afterwards. Also there was a bug in my code: when checking whether the time was going backwards I forgot to check whether the previous time was recorded for the same thread - dTimeWas should have been a thread-local variable.
But - even after fixing all that, with the thread tied to a single core, the timing was still inaccurate. It was monotonic but mistimed: the reported time in ms sometimes passed more quickly than real time and sometimes more slowly.
I could check this by making a real-time audio recording of the notes played by my app - which according to the high performance counter were played at equally spaced 100 ms intervals - but when you looked at the recording, the actual recorded times were sometimes offset by as much as 30 ms from the previously recorded note; one recording had an 80 ms note followed by a 110 ms note when the high precision timer said they were all exactly 100 ms apart to within sub-millisecond precision.
Finally I fixed it by looking up the interrupt timer, which is available to every user-mode process as a volatile area of shared memory in a structure called KUSER_SHARED_DATA, mapped at the same location in the address space of every process.
This timer is highly accurate - not just sub-millisecond but well sub-microsecond, on my laptop anyway. It also records the time correctly. And there is almost no overhead involved in reading it, as it is only a memory lookup, just like accessing any other area of memory.
Details here:
QueryPerformanceCounter-inaccurate-timing - SOLVED[^]
|
Hi there,
I'm using the high performance counter for timing musical notes, and have run into a problem: it doesn't time them accurately. What should be a regular rhythm comes out irregular.
When I debug to find out what is happening, the numbers returned by QueryPerformanceCounter(..) sometimes change direction - in the code below, if I put a breakpoint on the
if(ddwt<ddwt_was)
ddwt=ddwt;
then it gets triggered.
As you can see, I have requested that the thread runs on core 1 of a dual-core machine. I'm testing it on a laptop with two processors, with two cores in each.
Is SetThreadAffinityMask(..) perhaps not enough to ensure that the thread runs on a single core? Or is there some other reason for it?
Is there any way to fix it so I can get sub-millisecond timing working accurately on any computer?
Thanks for your help.
LARGE_INTEGER HPT_PerformanceCount, HPT_PerformanceFrequency;
int found_HighPerformanceTimer_capabilities;
BOOL bHasHighPerformanceTimer;

double HighPerformanceTimer(void)
{
    // Pin to the first core just for the duration of the time check.
    DWORD_PTR threadAffMask = SetThreadAffinityMask(GetCurrentThread(), 1);
    DWORD_PTR threadAffMaskNew = 0;

    if(found_HighPerformanceTimer_capabilities == 0)
    {
        bHasHighPerformanceTimer = QueryPerformanceFrequency(&HPT_PerformanceFrequency);
        found_HighPerformanceTimer_capabilities = 1;
    }

    if(HPT_PerformanceFrequency.QuadPart == 0)
    {
        // No high performance counter available - fall back to timeGetTime().
        SetThreadAffinityMask(GetCurrentThread(), threadAffMask);
        return timeGetTime();
    }

    QueryPerformanceCounter(&HPT_PerformanceCount);
    threadAffMaskNew = SetThreadAffinityMask(GetCurrentThread(), threadAffMask);

    {
        __int64 count = (__int64)(HPT_PerformanceCount.QuadPart);
        __int64 freq  = (__int64)(HPT_PerformanceFrequency.QuadPart);
        double dcount = (double)count;
        double dfreq  = (double)freq;
        double ddwt   = dcount * 1000.0 / dfreq;   // counter converted to milliseconds

#ifdef _DEBUG
        // Debug check: does the reported time ever go backwards?
        static double ddwt_was;
        if(ddwt_was != 0)
            if(ddwt < ddwt_was)
                ddwt = ddwt;                       // no-op, breakpoint anchor
        ddwt_was = ddwt;
#endif

        return ddwt;
    }
}
|
Just a hunch, but I believe it might be processor related.
Out-of-Order eXecution (OOX)
Code entering the pipeline isn't always executed sequentially. Most modern CPUs do it.
"It's true that hard work never killed anyone. But I figure, why take the chance." - Ronald Reagan
That's what machines are for.
Got a problem?
Sleep on it.
|
Hi, sorry I didn't reply sooner. I've got involved in a long discussion with some friends on Facebook and have been doing many tests, but nothing is clear yet - it is quite puzzling.
But - some things I have found out.
First - you sometimes get really big errors; the largest I got was as much as 15 ms, which seems to rule out things like OOX?
Also - it seems to be something to do with which cores the thread runs on, and it seems that SetThreadAffinityMask is not quite doing what I expect it to do.
I've tested:
RDTSC
QueryPerformanceCounter
GetSystemTimeAsFileTime
timeGetTime
GetTickCount
All of them run backwards sometimes; typically you get glitches like that every second or so on the higher-resolution timers (HPC and RDTSC), and somewhat less often, but still frequently, on the other timers.
If I use SetThreadAffinityMask before every single time check, that doesn't fix the issue.
I think that maybe after you use SetThreadAffinityMask(..) the thread or process has to sleep before the OS will move it to the desired core.
SOLUTION ATTEMPT - TIE RHYTHM THREAD TO A SINGLE CORE
So - I set the thread affinity for the entire rhythm playing thread to core 1.
When I do that, all the timers become monotonic again.
So it rather looks as if it is a CPU core issue.
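For reference, a minimal sketch of what tying the whole rhythm thread to a single core looks like (the Sleep(0) yield reflects the guess above that the scheduler needs a chance to migrate the thread - whether that is strictly required is exactly the open question here):
#include <windows.h>

DWORD WINAPI RhythmThread(LPVOID /*param*/)
{
    // Pin this thread to the first logical processor for its whole lifetime.
    SetThreadAffinityMask(GetCurrentThread(), 1);
    Sleep(0);   // yield so the scheduler can move us before any timing is done

    // ... all QueryPerformanceCounter / timeGetTime calls then run on one core ...
    return 0;
}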
BTW my laptop has Turbo Boost, so I wondered if it was something to do with that. Running the Turbo Boost monitor I can see the clock going up and down a lot, from say 2.7 to 2.8 GHz. I don't know whether that is related.
Anyway, this still doesn't seem to have solved the issue.
NOTES ARE STILL NOT PLAYED TO SUB MILLISECOND PRECISION - ERRORS OF up to 20 MS
I tested this by recording to audio as the notes are played.
One test I did just now has notes that, according to my program and the Windows timers, were sent at 100 ms intervals with a maximum error of 27.576 µs (this is with the program running at real-time base priority and the rhythm-playing thread at time-critical on top of that, so nothing at all should be interrupting it).
But it was audibly well out - and when I look at the audio recording, I had for instance one note of 80 ms immediately followed by another of 110 ms, or a variation in timing of about 140% from one note to the next, for 100 ms notes.
That's using QueryPerformanceCounter locked to a single core - it is monotonic now, but it doesn't seem to be accurate.
It might alternatively be some delay in the synth, but it seems unlikely that any synth could be so badly programmed that it causes delays of as much as 30 ms - though I can do more tests with different synths to check that.
It is unlikely to be the MIDI relaying causing these delays, as I am using a virtual cable to do the relaying, and I can relay about 10 MIDI notes per ms around a loopback and back to my program again.
I also don't see how my app could be introducing these sorts of errors - I have simplified the code so that the note-playing path is very lightweight.
Basically it calculates the notes to play well before they are needed, then sits in a real-time busy wait for a few ms until the note is ready to be sent out via MIDI out. So there are just a few lines of code at the moment the note is played: it exits the busy wait, checks the time (which it reports as the time the note was sent) and then, in the next line of code, sends it using midiOutShortMsg.
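A rough sketch of that note-scheduling scheme (illustrative only: HighPerformanceTimer() is the routine from the earlier post, the 2 ms margin is an arbitrary choice, and hMidiOut plus the packed MIDI message are assumed to be set up elsewhere):
#include <windows.h>
#include <mmsystem.h>   // midiOutShortMsg - link with winmm.lib

double HighPerformanceTimer(void);   // from the earlier post; returns milliseconds

void PlayNoteAt(HMIDIOUT hMidiOut, DWORD midiMsg, double targetMs)
{
    // Coarse wait: sleep until roughly 2 ms before the target time.
    double remaining = targetMs - HighPerformanceTimer();
    if(remaining > 2.0)
        Sleep((DWORD)(remaining - 2.0));

    // Fine wait: busy-wait on the timer until the target time is reached.
    double now;
    do {
        now = HighPerformanceTimer();
    } while(now < targetMs);

    // 'now' is what gets reported as the send time; the note goes out immediately after.
    midiOutShortMsg(hMidiOut, midiMsg);
}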
HARDWARE CLOCKS
Anyway, I have also researched and found out that there are actually two hardware clocks on the machine: the real-time clock (over 1 MHz, so it would achieve microsecond precision) and the HPET timer. By default, however, Windows doesn't seem to use either of them for its timers, which might be the reason for the performance issues.
You can force Windows to use the HPET timer by running
bcdedit /set useplatformclock true
and rebooting (and also enabling HPET in the BIOS).
(Some people find this causes other performance issues, so I'm not sure it is a good solution.)
But I don't seem to have any way to enable HPET in the BIOS on this computer, so it might not have it.
I haven't tested this yet to see if it makes a difference.
OTHER WAYS TO ACCESS THE RTC
I think you can access the real-time clock in kernel mode when writing a driver. I'm not sure, but KeQueryPerformanceCounter might access the RTC (or the HPET). In Windows CE there is OEMGetRealTime, but that doesn't seem to be available in Windows 7.
In Windows 8 there is GetSystemTimePreciseAsFileTime, which at least according to its description seems to use one of the hardware timers, because it claims better than microsecond precision - it could be using the HPET, or it could be using the RTC, which would (just) give sufficient precision too.
I only have Windows 8 inside a virtual machine, however (it seems unlikely it would work well there), and I haven't tested that.
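For completeness, this is roughly how that Windows 8 call would be used to get a millisecond value (untested here, as noted above; it needs a Windows 8 / _WIN32_WINNT >= 0x0602 build):
#include <windows.h>

// Returns the precise system time in milliseconds (FILETIME is in 100 ns units).
double PreciseSystemTimeMs(void)
{
    FILETIME ft;
    GetSystemTimePreciseAsFileTime(&ft);   // Windows 8 and later only

    ULARGE_INTEGER t;
    t.LowPart  = ft.dwLowDateTime;
    t.HighPart = ft.dwHighDateTime;

    return t.QuadPart / 10000.0;
}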
THE WAY AHEAD
If I can't solve this, the only solution I can think of is to create my own sample player - and that would work because it would use audio streaming, and you can simply count the number of samples played to work out the time, which is guaranteed to be accurate so long as the audio plays without breaking up. It's for a metronome, so a reasonable sample player for that purpose doesn't seem too tricky to do, playing non-melodic percussion.
Then presumably it will get cleared up for anyone running it on Windows 8 on native hardware with GetSystemTimePreciseAsFileTime(), though I haven't tested that yet.
ANY THOUGHTS ANYONE?
So anyway, that's where it is at now. Once more, I'm really interested to know if anyone has other thoughts about this or other avenues to explore.
There are so many tests you can do - and I can't seem to find any detailed online account of this, just some intriguing but incomplete forum discussions.
Thanks!
|
If you're seeing 15 ms then I'm pretty sure that's the thread 'quantum' of Windows scheduling. It used to be 10 ms on Windows NT 4.0.
I've encountered this problem too. The best you can do is not to bother with the high perf counters, as they're still at the mercy of the scheduler. All I do is take the average of about 3-5 readings, but the trick seems to be to put the thread to sleep until the next exact second (1000 - current_millisecond) and to repeat with timing durations less than the scheduling quantum. Which means it will take 3-4 seconds to get your average. Put the thread into realtime priority, which should reduce preemption (if you're confident that your timing code is free of bugs).
Sleep until msec :000, timing loop < 15 ms.
Repeat as many times as you need to; the CPU scaling of some processors creates problems with the initial result, so be prepared to discard the first and second results, or ramp the CPU up to 100%.
If it's any consolation, it's the same problem with any pre-emptive OS, but it's also disappointing to know that multi-GHz machines can't give you a reliable reading from within a modern OS. Linux calculates its timing before it boots.
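A rough sketch of that averaging scheme as I read it (purely illustrative - the spin-loop length, the number of readings and the "discard the first two" rule are all assumptions based on the description above):
#include <windows.h>

// Average the duration of a short busy loop, timed just after each second boundary.
double AverageShortIntervalMs(int readings)
{
    LARGE_INTEGER freq, start, stop;
    QueryPerformanceFrequency(&freq);

    double totalMs = 0.0;
    int kept = 0;

    for(int i = 0; i < readings + 2; ++i)          // +2: first two readings are discarded
    {
        SYSTEMTIME st;
        GetSystemTime(&st);
        Sleep(1000 - st.wMilliseconds);             // wake close to msec :000

        QueryPerformanceCounter(&start);
        for(volatile int spin = 0; spin < 100000; ++spin) { }   // well under the ~15 ms quantum
        QueryPerformanceCounter(&stop);

        if(i >= 2)                                  // discard results taken while the CPU ramps up
        {
            totalMs += (stop.QuadPart - start.QuadPart) * 1000.0 / freq.QuadPart;
            ++kept;
        }
    }

    return kept ? totalMs / kept : 0.0;
}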
"It's true that hard work never killed anyone. But I figure, why take the chance." - Ronald Reagan
That's what machines are for.
Got a problem?
Sleep on it.
modified 19-Mar-13 7:12am.
|
Thanks, that does explain one glitch I got with GetTickCount and timeGetTime.
With the HPC, though, after dealing with the core-changing issues it wasn't making any sudden big jumps - just measuring the time wrong.
I'm doing this to measure musical notes. But I've found a solution at last.
The key was the discovery of yet another MS Windows timing routine, KeQueryUnbiasedInterruptTime, which looked like exactly the type of precise timing I need, but is unfortunately only accessible in kernel mode:
KeQueryUnbiasedInterruptTime[^]
However, I then found that there is a little-known structure called KUSER_SHARED_DATA which is mapped into the address space of every user-mode process at the address 0x7FFE0000 - a system-wide shared memory location. It is volatile and is continuously updated with the kernel interrupt time. This means that to check the kernel interrupt time, all you need to do is read the correct location in your own process address space.
This is apparently a hardware timer, not affected by changes in the cycle rate of the processor cores. It might be using either the RTC or the HPET - I don't know which. And since it is just a memory read, there is far less overhead involved than with any of the other timers used in Windows.
Combining this with the code I already have - which does a sleep that stops slightly short of the desired time, followed by a busy-wait time-checking loop at realtime/time-critical priority until the time is reached - and you get sub-microsecond, 'sample perfect' timing of MIDI in Windows.
Here is the code:
// Layout of the kernel-maintained time values (100 ns units).
typedef struct _KSYSTEM_TIME
{
    UINT32 LowPart;
    INT32  High1Time;
    INT32  High2Time;
} KSYSTEM_TIME, *PKSYSTEM_TIME;

// Only the leading fields of the real KUSER_SHARED_DATA are declared here -
// just enough to reach InterruptTime at its correct offset.
typedef struct _KUSER_SHARED_DATA
{
    volatile ULONG        TickCountLow;
    UINT32                TickCountMultiplier;
    volatile KSYSTEM_TIME InterruptTime;
    volatile KSYSTEM_TIME SystemTime;
    volatile KSYSTEM_TIME TimeZoneBias;
} KUSER_SHARED_DATA, *PKUSER_SHARED_DATA;

// KUSER_SHARED_DATA is mapped at this fixed address in every process.
#define MM_SHARED_USER_DATA_VA 0x7FFE0000
#define USER_SHARED_DATA ((KUSER_SHARED_DATA * const)MM_SHARED_USER_DATA_VA)

// Returns the kernel interrupt time in milliseconds.
// (See the follow-up post below for the High1Time/High2Time consistency check.)
double dInterruptTimer(void)
{
    union
    {
        KSYSTEM_TIME SysTime;
        __int64      CurrTime;
    } ts;

    ts.SysTime.High1Time = USER_SHARED_DATA->InterruptTime.High1Time;
    ts.SysTime.LowPart   = USER_SHARED_DATA->InterruptTime.LowPart;

    return ts.CurrTime / (double)10000;    // 100 ns units -> ms
}
I got this by modifying source code available here
_glue.c[^]
which is part of Microsoft's "Invisible computing" real time operating system also known as MMLite. Microsoft Invisible Computing[^]
So it is documented Microsoft code and not just a hack using undocumented structures. It requires Windows NT or later.
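As an illustration of how the routine above slots into the sleep-then-busy-wait scheme described earlier (the 2 ms margin is an arbitrary choice, and <windows.h> is assumed to be included as for the rest of the code):
// Wait until the given interrupt-timer time (in ms) has been reached.
void WaitUntilMs(double targetMs)
{
    double remaining = targetMs - dInterruptTimer();
    if(remaining > 2.0)
        Sleep((DWORD)(remaining - 2.0));    // coarse wait, stops slightly short

    while(dInterruptTimer() < targetMs)     // fine wait: very cheap busy loop
    {
        // spin
    }
}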
This is an actual recording made in real time using my program to play the notes via midi on the Microsoft GS Wavetable synth:
This is a test of the use of the Windows interrupt timer to time notes in Bounce Metronome.[^]
Here is a screen shot of the recording where you can see the sample-precise alignment of the notes. You can tell it is sample precise because all the details of the waveform are exactly the same - the small irregularities, which would look slightly different if the waveform were shifted by even a sample or so, line up exactly when zoomed out like this.
screen shot of recording with sample precise timing[^]
This also seems like a great way to do performance testing, without all the averaging we are used to, and with almost no overhead.
|
Glad you've finally solved it.
I was going to suggest the multimedia timers as another option. Kernel mode stuff is a bit beyond me to be quite honest.
I'd still like to find out why the timing routines sometimes appear to go backwards. It's not just Windows; I've had similar problems with the Java VM.
Thanks for the invisible computing links, I'm always interested in Microsoft Research projects with accompanying source code.
"It's true that hard work never killed anyone. But I figure, why take the chance." - Ronald Reagan
That's what machines are for.
Got a problem?
Sleep on it.
|
Yes, I've tried the multimedia timers. The problem is that on Windows they aren't quite good enough for musicians, frankly - they don't quite give you the 1 ms precision that musicians require. There is a Microsoft article about that here too:
Guidelines For Providing Multimedia Timer Support[^]
So anyway, it turns out it is only partly solved. I also had the HPET enabled - the HPET described in that article, which I think is a standard feature of modern chipsets.
To enable it, you open an admin-level command prompt and type:
bcdedit /set useplatformclock true
then you need to reboot.
This requires Vista or later.
You know it is enabled if QueryPerformanceFrequency gives you a frequency in the range of 14+ MHz. You might possibly need to enable it in the BIOS as well.
To disable it, you do the same and type:
bcdedit /deletevalue useplatformclock
and reboot.
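A quick way to apply that frequency check from code (the 10-25 MHz window is just an illustrative threshold; the HPET usually reports about 14.3 MHz, while the default counter typically reports either a few MHz or something up in the GHz range):
#include <windows.h>
#include <stdio.h>

// Print the QueryPerformanceFrequency value and a guess at the timer source.
void ReportTimerSource(void)
{
    LARGE_INTEGER freq;
    QueryPerformanceFrequency(&freq);
    printf("QueryPerformanceFrequency: %I64d Hz\n", freq.QuadPart);

    if(freq.QuadPart > 10000000 && freq.QuadPart < 25000000)
        printf("Looks like useplatformclock/HPET is in use (~14.3 MHz).\n");
    else
        printf("Looks like the default counter is in use.\n");
}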
It's disabled by default, which seems surprising since it improves multimedia performance - or should - but from the forums it seems some users have issues with it enabled, including reduced performance in some games, mouse pointer "ghosting", and occasional freezes. So they may have decided it is more trouble than it is worth to enable by default.
So - with HPET enabled and using the interrupt timer, I get this perfect timing.
Actually, it turns out the interrupt timer only changes about once every ms on my computer, and stays steady between those updates. I suppose on some other computers it might change less often.
So although it is a very fast lookup, you probably wouldn't use it for code performance testing. I'm planning to use it most of the time, but to use QueryPerformanceCounter(..) for the last couple of ms of the wait loop, just to time fractional-ms increments if required (or for longer if it turns out to have larger increments than 1 ms).
Also, a bug fix in the previous code - it seems you have to do this every time you read it, retrying until the two high parts match so you don't catch the value in the middle of an update:
int ntests = 0;   // counts retries (diagnostic only)

for(;;)
{
    ts.SysTime.High1Time = USER_SHARED_DATA->InterruptTime.High1Time;
    ts.SysTime.LowPart   = USER_SHARED_DATA->InterruptTime.LowPart;

    // High2Time equal to High1Time means the 64-bit value was read consistently.
    if(ts.SysTime.High1Time == USER_SHARED_DATA->InterruptTime.High2Time)
        break;

    ntests++;
}
On the backwards-running timers, there is one thing to be careful about - this caught me out - if you have a multi-threaded app with different threads reading the time simultaneously, it can appear to run backwards simply because you are comparing the previously recorded time from one thread with the current time of the current thread. I fixed that by using thread-local variables to hold the previously recorded time.
It turns out that that was the reason I thought I had the time going backwards - not multiple cores or anything, just a bug in the code that tests whether the time is monotonic in a multi-threaded app.
Sorry about that!
Anyway - I still have these timing issues for users who run my program on a computer where the OS is set not to use the HPET, as it is by default. So what I'd really like to know is whether there is any way to access the HPET when Windows isn't using it for its own timing. Is there, for instance, some assembly-language way to get at its registers even though Windows itself isn't using it?
I don't really think there is, or surely someone would have posted a way to do it by now and everyone would be doing it - but you never know.
|
Oh - I'm still getting those reversed times:
static Spec_Thread double ddwt_was;   // Spec_Thread: the thread-local specifier mentioned above
if(ddwt_was != 0)
    if(ddwt < ddwt_was)
        ddwt = ddwt;                  // breakpoint anchor - still gets hit:
ddwt_was = ddwt;
ddwt 22589372.894629
ddwt_was 22589372.895075
And this is fixed by using:
{
    DWORD_PTR threadAffMask = SetThreadAffinityMask(GetCurrentThread(), 1);
    QueryPerformanceCounter(&HPT_PerformanceCount);
    SetThreadAffinityMask(GetCurrentThread(), threadAffMask);
}
So it seems that code actually does work after all - you don't need to pin the entire thread, just wrap every call like that, as in the example code.
Sorry for the confusion.
So it seems - here, anyway - that you can get time reversals if you let the time be measured on just any core.
You avoid them if you read the interrupt timer via KUSER_SHARED_DATA, although on my computer that doesn't have the same resolution as QueryPerformanceCounter (it only changes every 1 ms - though it seems likely that it reports the exact time at the moment it changes, so you could time an exact ms just by waiting in a busy loop for it to change).
But to get the clocks running at a constant rate you need to force the OS to use the HPET, and there doesn't seem to be any way for a program to access the HPET if Windows isn't using it, as far as I can see.
The HPET is guaranteed, on Intel machines at least, to be accurate to 0.05% over 1 ms - see the specification:
http://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/software-developers-hpet-spec-1-0a.pdf[^]
|