In ASM there is no distinction between functions and procedures.I haven't been programming languages that made a syntactically explicit distinction between functions and procedures since I used Pascal last time (and that is quite a few years ago). For a short period, I found it difficult to merge the two into one concept, but soon I started asking myself 'Why?'. A function with a void (/null) result is as good a procedure as any!
Regarding passing parameters to procedures in ASM.Again, this is not specific to ASM. Some platforms, such as ARM, define a binary call and parameter interface independent of programming language. If you follow that standard, you can call functions in any other language, and any other language can call your ASM functions. If you do not, then you are misbehaving
You didn't mention one parameter passing method that was the only viable one on machines with extremely small stacks (like the 8051): The accumulator holds the address of a 'struct'-like block of values, allocated anywhere, possibly statically. The call conventions say that the accumulator is volatile; you never expect it to retain its value when other code is executed, so you do not save/restore it for a function call.
Btw: In the Win32 API, this convention is used for a share of the function calls: (The address of) a single composite struct is passed by the caller. The first word in the struct indicates its size, so when a new, extended version of the function is published, taking more parameters, the name of the function is unchanged, and the extra parameters are added at the end of the struct. The function can see whether the the caller wants the old or the new extended functionality from the size of the struct. And, it reduces the risk of overflow.
The alternative, used by another share of the Win32 functions, is to extend the function name with an 'Ex' (and a an extended parameter specification. Later comes the 'FuncExEx', and 'FuncExExEx' and ... there are cases of function names with five 'Ex' suffixes in a row. I think that is extremely messy. I much prefer the 'parameter struct' alternative (using that philosophy in my own code).
The c/C++ compilers use of course the "C" approach.By default, that is. I have never used a C/C++ compiler that could not be directed to use Pascal conventions (that is a requirement for calling Win32 functions!). Note that 64 bit Windows has different calling conventions.
discarding the stack by the caller is more naturalSo every caller must have code to do the cleanup for every call ... Well, for the simple cases where nothing more is required than an SP update, it is fair enough. In more complex cases (e.g. a non-linear stack), the question is more debatable.
One issue regarding stacks: In recent years, use of threads has become far more common. Often a software system may be implemented by several hundred or even thousands of threads, which are usually preemptable. Each requires its own stack space, which must be large enough to handle the very deepest call sequence that this thread might make. So you could end up tying up quite large amounts of RAM for thread stacks. In theory, every thread might be preempted at its deepest call level, all at the same time. That never happens in practice, so you really occupy a lot more RAM than really needed.
There are machines supporting non-linear stacks. No stack space is initially allocated to the threads; when a call is executed, a stack frame is allocated from the heap. Upon return, the frame is released to the heap. Then no more RAM is occupied than what is in actual, active use at any time. Especially if you implement (possibly parts of) the system as non-preemptible, the compiler can make optimizations to collapse multiple heap allocations/frees into one, to reduce overhead. However, this requires the allocation / release to be handled by the called routine; the caller does not have enough information to handle it.
The "C" convention works also with procedures that have a variable number of parameters.Note that passing a 'parameter struct' (headed by its size) would also handle this.
That's why is best to avoid methods that return objects in C++/C# etc.Eeeh ... In C#, objects are always addressed through a reference. They are always allocated on the heap. You do not see the reference as such, the way you do in C/C++, but at the binary level, returning a MyObject* in C++ or a MyObject in C# is practically identical.
In any case, any compiler documents (or should document) very exactly how it transfers parameter to procedures, and hw results are returned by functions.I beg to disagree. This is not to be defined by each compiler (/language), but by the machine architecture. All compilers should follow the same conventions, so that you can mix languages freely. One good thing about dotNet is that high level language compilers do not generate binary code; they generate an architecture independent Common Intermediate Language (CIL), which is not transformed to 'real' machine code until the assembly is loaded into one specific machine, at which time native code for that architecture is generated, regardless of programming language.
Unfortunately, ASM is not so much taught in universities nowadays (more just as an addendum to digital electronics) and this is really a pity.I agree only halfway (or less). Sure, students should learn what the compiler does, with registers and stacks and such, but not for coding ASM themselves.
Much more than ASM mnemonics, programmers need to understand concepts like paging and other aspects memory management. You do not teach memory mapped files through assembly code! Actually, you do not see the MMS at all from ASM code (unless you teach OS kernel programming, which is not for the average application programmer). Interrupts are similarly 'invisible' - and equally important, both with regard to execution time costs, and for synchronization / protection issues. Note that as early as the mid-1970s, Per Brinch Hansen developed a complete set of synchronization concepts, from simple semaphores, through critical regions and monitors, in a high level language, Concurrent Pascal.
Students make a mess of ASM, abusing it in the worst way possible. Generally, they believe that they can make really, really super-fast ASM code, which is simply not true with any modern CPU, using prefetch and pipelining and speculative execution and hyperthreading and whathaveyou of hardware tricks affecting real execution speed.
An extreme/funny example: I was teaching CPU architecture 25-30 years ago, with a few ASM coding exercises on the x86 (which is a terrible architecture for teaching good principles!). I tried to stress that ASM is hard to read; we must code for the best possible readability. To zero AX, you move zero into it: MOV AX, 0. A few students insisted that the right way of doing it is XOR AX, AX - it is faster. No, it is not! I had to dig up timing tables for various x86 CPUs, showing that for the original 8086, you sure would save one whole clock cycle using XOR, but since 286, the alternatives where equally fast. (We were using 386.) They kept insisting on using XOR, because they 'wanted the code to be optimal for the slowest CPUs'. For the next hand-in, they delivered a code file headed by a comment: 'This is the style our lecturer forces us to code:' - and a readable, clean solution - followed by a large comment block headed by 'This is how REAL programmers would do it:', and the messiest, most unreadable ASM code I ever saw!
ASM serves no function in code optimization. Long ago, I read the proceedings from the first Conference on the History of Programming Languages (or something like that), where the developers of the first optimizing compiler, Fortran II, told that they had spent days to understand how the h* that compiler had found out that the code would run faster if it did so-and-so. Note: These were the people who had developed the optimizing techniques! Modern compilers go much further; there is no way that you could do any similar optimizing 'by hand' in ASM. Actually, the same goes for heap management: There are still lots of programmers that believe they can do a better job than a modern GC system. They can not. (Possible exception: Extremely small heaps e.g. in tiny embedded systems - but in most such cases the right alternative is to abandon dynamic allocation at all!)
ASM serves a single purpose today: To get access to facilities that cannot be addressed directly through high level languages, such as special registers or peripherals with strange interfaces to the CPU. Commonly, providing such access to an HLL requires less than a dozen instructions. Usually, there are no loops, no jumps - that is handled at the HLL level.
Sometimes, you come across architectures where interrupt handlers are activated in special ways so they cannot be defined as plain functions, but usually, C compilers for those architectures offer modifiers for those 'calling conventions'. Last time I needed ASM was when I had to write a couple dozen instruction to handle a full CPU reset, to set up stack areas etc. before high level code could take over, but that is like OS programming - not something that every application programmer need to relate to.
I'd prefer to teach 'memory allocation, constructors, destructors etc.' using a high level language (if you consider C 'high level' ) to manage the data structures etc. I always thought that Donald Knuth made a serious mistake when choosing to illustrate large families of algorithms using (a hypothetical) ASM language rather than a high level language. Conceptually, his The Art of Computer Programming is great, but for all practical purposes, the code examples have about zero value today, and even 30 years ago. The textual description is not a sufficient good reason to use this series as a reference work for basic algorithms; you read it for historical purposes only.