C / C++ / MFC

Bram van Kampen 12-Nov-07 16:33

Alignment :((

I thought this was quite a basic, simple and well-understood topic, but I must admit I cannot find any articles about it. I must write a short one myself to fill this void. Watch this space; I will get something together soon.

Regards,

Bram van Kampen

George_George 12-Nov-07 19:54

Hi Bram van Kampen,

I am not saying that I do not understand what memory alignment is. :)

My question is why you said memory alignment matters to the performance of the equality comparison in my original question. Could you provide more information, please?

regards,
George

JudyL_MD 13-Nov-07 2:00

If the item being compared is smaller than the CPU's natural word size, it may need to be shifted before the comparison takes place. That takes more time than when the item falls on a natural word boundary.
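As a rough software model of that shift (not what any particular CPU literally does), the sketch below pulls one byte out of the aligned 32-bit word that contains it. The helper name `byte_from_word` is hypothetical, and a little-endian layout is assumed.

```cpp
#include <cstdint>

// Hypothetical helper: models extracting one byte from the aligned 32-bit
// word that contains it. Assumes little-endian layout, so the byte at
// offset 0 occupies the least significant 8 bits of the word.
constexpr std::uint8_t byte_from_word(std::uint32_t word, unsigned offset)
{
    // Shift the wanted byte down to bit 0, then mask off the other bytes.
    return static_cast<std::uint8_t>((word >> (8u * (offset & 3u))) & 0xFFu);
}
```

A byte at offset 0 needs no shift at all in this model, while bytes at offsets 1 to 3 need a shift of 8, 16 or 24 bits before they can be compared.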

Judy

George_George 14-Nov-07 1:01

Thanks Judy,

I do not understand why loading one byte, rather than 4 bytes (32 bits), requires additional effort for alignment. For one byte, the system can load the contiguous 4 bytes of memory that contain it, which meets the 32-bit alignment requirement. That way only one CPU cycle is needed, and no additional cycle is needed for alignment, just as for aligned 4-byte (32-bit) data, right?

Please correct me if there is anything wrong with my statements above. :)

regards,
George

JudyL_MD 14-Nov-07 3:32

Read Bram's post again more carefully. The difference comes in if the item to be compared is not one byte long and is not stored on a 4-byte boundary. If the item is 4 bytes long but is not aligned on the boundary, the CPU must do two fetches to get the item in its entirety before it can compare.

Judy

Bram van Kampen 14-Nov-07 15:45

Thanks. At least you understand what I'm trying to explain. You confirmed I'm not writing gibberish after all. I have been trying to close the thread, without success so far.

We were all learners once.

Regards and thanks

Bram van Kampen

George_George 14-Nov-07 18:11

Hi Judy,

In my original question, I am comparing one byte with another byte, so I do not think there are any alignment issues. Bram is talking about a WORD or something larger placed across an alignment boundary, where two CPU cycles are needed for the fetch; that is a different case. I am talking about a byte, not a WORD.

Please feel free to correct me if I am wrong. :)

If we need any additional alignment operations for one byte (not one WORD or DWORD), please also correct me; I would be willing to learn.

regards,
George

JudyL_MD 15-Nov-07 2:57

You are correct for a byte. Your original post, way back when, also mentioned an int, which is not one byte, so you got the long discussion on alignment. This statement concerns me:

George_George wrote:
If we need any additional alignment operations for one byte (not one WORD or DWORD), please also correct me; I would be willing to learn.

You do not do anything to deal with alignment with respect to the CPU; it handles that itself. You asked a pretty low-level performance question about the comparison of two one-byte numbers versus the comparison of two four-byte numbers. You got a low-level answer on how the CPU handles these comparisons, which is where the answer to your original performance question lies.

The alignment Bram and I have been talking about is not the same as the "structure member alignment" option you can specify in the compiler options and override with #pragma pack. Two completely different beasts.
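For contrast, here is a minimal sketch of the compiler-side beast, structure member alignment. The struct names are hypothetical, and the sizes mentioned afterwards are what mainstream x86 compilers typically produce rather than a guarantee.

```cpp
#include <cstddef>
#include <cstdint>

// Default layout: the compiler may insert padding after `tag` so that
// `value` starts on a boundary suitable for a 32-bit integer.
struct DefaultLayout {
    char tag;
    std::int32_t value;
};

// Packed layout: #pragma pack(1) asks the compiler to omit that padding,
// so `value` immediately follows `tag` at an odd offset.
#pragma pack(push, 1)
struct PackedLayout {
    char tag;
    std::int32_t value;
};
#pragma pack(pop)
```

With these definitions, sizeof(DefaultLayout) is typically 8 and sizeof(PackedLayout) is 5. The pragma only changes where the compiler places members; how the CPU then fetches a misaligned member is the separate, hardware-level issue discussed in this thread.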

The first answer when dealing with a question about low-level performance should always be: code it in a sane and reasonable manner without trying to optimize performance and see how it actually performs before tinkering with the code. Nine times out of ten, it performs fine. In the one case where it doesn't, do some profiling and see where the bottlenecks actually are. They are usually not where you were worrying about in the first place.

Judy

George_George 18-Nov-07 4:32

Thanks Judy!

I think my question is answered. I appreciate your help and patience all the time. :)

regards,
George

Bram van Kampen 24-Nov-07 15:49

Thanks for your support. You understood what I tried to explain.

Sent the following to George_George to close the subject:

May the thread continue in virtual heaven. May those humans who contributed and have expired since the thread started go to their respective heavens, as may those who are still in the land of the living, for all those who contributed did not break new boundaries, but merely covered points which first-year university courses should have covered.

Claim bonus points for your respective heavens, whatever your religion.

Bottom line: education should not concentrate only on the virtual experience of how a compiler compiles. It should keep the new bucks down a bit by also teaching programming in assembly language, basic principles, and a basic understanding of how an x86, or whatever chip, works!

Bram van Kampen

Bram van Kampen 13-Nov-07 13:00

The processor only loads data at 4-byte boundaries. If the alignment is off, it transparently carries out internal shifts and loads the 32 bits in two goes.

If you compare BYTEs, WORDs or DWORDs aligned on a 4-byte boundary, there is in all cases one 32-bit-wide fetch cycle for each operand. The compare cycle is also identical; it generates all three possible results in one go. The difference between them is which result gets stored in the flag register.

Now if you have a DWORD stored on a 2-byte boundary that is not also a 4-byte boundary, that takes two fetch cycles, whereas a WORD stored on a 2-byte boundary takes only one fetch cycle. That means that a DWORD comparison can be slower than a WORD comparison, depending on alignment. You can fill in the rest yourself for the situation with bytes.
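The fetch counts above can be checked with a small model. This is only a sketch under the simplified assumption used in this post, namely that memory is read in aligned 4-byte words; the function name `words_touched` is hypothetical, and real hardware with caches and wider buses behaves differently.

```cpp
#include <cstddef>

// Hypothetical model: the number of aligned 4-byte words that an access of
// `size` bytes starting at byte address `addr` touches. `size` must be >= 1.
constexpr std::size_t words_touched(std::size_t addr, std::size_t size)
{
    // Index of the word holding the last byte, minus the index of the word
    // holding the first byte, plus one.
    return (addr + size - 1) / 4 - addr / 4 + 1;
}
```

In this model words_touched(2, 4) is 2 (a DWORD at address 2 straddles two words) and words_touched(2, 2) is 1 (a WORD at address 2 fits in one), while a single byte always gives 1 whatever its address.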

BTW, this is a more significant issue when you do things like RF digital signal processing. If I were to do something like that, I would definitely not start with a Pentium chip. Horses for courses, as they say.
It has never cropped up anywhere in my experience as an issue of major importance when writing C++ Windows/MFC code, which is what this forum is about.

Then again, there's nothing wrong with being curious.

Regards, :)

Bram van Kampen

George_George 14-Nov-07 1:03

Thanks Bram,

I agree that memory is aligned on 32 bits if the machine is 32-bit. But I do not understand why alignment requires additional effort.

For one byte, the system can load the contiguous 4 bytes of memory that contain it, which meets the 32-bit alignment requirement. That way only one CPU cycle is needed, and no additional cycle is needed for alignment, just as for aligned 4-byte (32-bit) data, right?

Please correct me if there is anything wrong with my statements above. :)

regards,
George

Bram van Kampen 14-Nov-07 15:32

It appears you missed my point, explained over several pages. I suggest you go to the Intel site and investigate, but also go to the AD site and all the other hardware sites. The problem is that your code has to run on your customers' PCs. My suggestion: your original question was daft after all. Resubmit it with more details whenever you have made sense of this.

Regards :((

Bram van Kampen

George_George 14-Nov-07 18:19

Hi Bram,

Sorry for any inconvenience. In my original question, I am comparing one byte with another byte (and comparing the performance with that of one 32-bit integer against another 32-bit integer). So I do not think there are any alignment issues for a byte. I think you are talking about a WORD or something larger placed across an alignment boundary, where two CPU cycles are needed for the fetch; that is a different case. I am talking about a byte, not a WORD or a DWORD.

Please feel free to correct me if I am wrong. :)

If we need any additional alignment operations for one byte (not one WORD or DWORD), please also correct me; I would be willing to learn.

regards,
George

Bram van Kampen 17-Nov-07 16:47

Please re-read what I wrote in the past. You seem to have some mental block between what you write in your source file and how it is used after being compiled. Please explain further why these timing issues are so important. As I tried to explain before, but I'll now spell it out:

The idea of writing for Windows and MFC is that, whatever platform you write for, in essence your code will work. Believe it or not, if written prudently, your code will work on a Mac, on Windows 2000, or on Windows NT. Things work because the type of question you ask here (how long it takes to perform a core operation like a compare, or how that differs with the size of the operand) does not come into the equation. It is largely insignificant in most cases because of the nature of the user interface.

Please let me know WHY it is so important to know these timing differences.

Regards :)

Bram van Kampen

George_George 18-Nov-07 1:24

Thanks Bram,

It is my pure technical interest to learn how internal things work, like compare. I appreciate your help all the time. :)

regards,
George

Bram van Kampen 22-Nov-07 16:48

George_George wrote:
It is my pure technical interest to learn how internal things work, like compare. I appreciate your help all the time

I thought that all along; otherwise I might have dismissed you with a smart remark (not my style, though). I hope my comments were helpful to you and the community. You ask many basic questions, and that's GOOD!

:)

Bram van Kampen

George_George 22-Nov-07 20:04

Thanks for your encouragement, Bram van Kampen!

regards,
George

Bram van Kampen 24-Nov-07 15:43

May the thread continue in virtual heaven. May those humans who contributed and have expired since the thread started go to their respective heavens, as may those who are still in the land of the living, for all those who contributed did not break new boundaries, but merely covered points which first-year university courses should have covered.

Claim bonus points for your respective heavens, whatever your religion.

Bottom line: education should not concentrate only on the virtual experience of how a compiler compiles. It should keep the new bucks down a bit by also teaching programming in assembly language, basic principles, and a basic understanding of how an x86, or whatever chip, works! :doh:

Bram van Kampen

heap

George_George 10-Nov-07 0:08

Hello everyone,

I have three questions about heaps on Windows after reading the MSDN documentation on the heap functions.

1. Default heap. Each process has a default heap, but the default heaps of different processes are different, right? For example, if process 1 has default heap A and process 2 has default heap B, then A and B should be different heaps, right?

2. Why would a process need to create a private heap? Is there any practical use?

3. Is there any default global heap that different processes can share?

thanks in advance,
George

cmk 10-Nov-07 12:20

1. A and B should be different heaps - Yes.
2. Practical uses - Yes.
3. Global heap - No.

...cmk

George_George 10-Nov-07 23:38

Thanks cmk,

Could you show some practical uses of creating a private heap, please?

regards,
George

Luc Pattyn 10-Nov-07 15:31

2. A process needs a heap to allocate dynamic things (e.g. via the malloc function or the new keyword).

1. When a process exits, it must release its memory. How could it release memory that contains its heap intertwined with some other process's heap?

3. No, see 1.

:)

Luc Pattyn

George_George 10-Nov-07 23:40

Thanks Luc,

Could you show some practical uses of creating a private heap, please?

regards,
George