|
Sorry Richard, indeed I didn't notice that. It's been a rather busy weekend.
Mircea
|
|
|
|
|
trønderen wrote: If you just send it the entire string, leaving to the display unit to discard what won't fit, for one: You may upset the display device.
To be fair, however, your example suggests that the developer has no idea what the business space even is.
For example, if I expect the users to use a PC with a larger display and they use a phone, then it is not going to work very well. But if I expect them to also use a phone, then I should account for that and test/develop for that too.
trønderen wrote: This software may put restrictions on the lengths of both prompt strings and data values.
That isn't true. Such data is always limited; nothing in computing is unlimited, no matter how it is used. If a developer is not considering that from initial creation, it will cause problems, and display problems are just one case. It is similar to a very naive developer who designs a database and makes every text field a blob just on the chance that the extra space might be needed.
|
|
|
|
|
I read your article on the use of UTF-8, but I have a point of (very slight) disagreement. The majority of Windows API calls have both UNICODE and ASCII versions, and the C and C++ runtimes are very much UTF-8 biased. So there is no need to make your projects UNICODE based, as there are only a few instances when you are forced to use Unicode strings. I used to write my applications so that I could easily build them either way, but gave that up and now manage the conversions for the odd functions that force me to use Unicode text.
|
|
|
|
|
Richard MacCutchan wrote: The majority of Windows API calls have both UNICODE and ASCII versions, Indeed, but the question is: what happens when you call an ANSI function, let's say MessageBoxA, with text that contains characters outside the 0-127 range? The result depends on the active code page setting. Until recently (2019) the programmer had no control over what ACP the user had selected; he could only query this setting using the GetACP function.
In newer versions of Windows, you can declare the code page you want to use in the application manifest. If you declare the UTF-8 code page, you can use UTF-8 with the ANSI functions.
For more details see Use UTF-8 code pages in Windows apps - Windows apps | Microsoft Learn[^]
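For reference, the manifest fragment that opts an application into the UTF-8 code page is documented on the Microsoft Learn page above; it looks roughly like this (the `activeCodePage` element and its namespace are taken from that documentation):

```xml
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <application>
    <windowsSettings>
      <!-- Force the process's ANSI code page (ACP) to UTF-8 on
           Windows 10 1903 and later, so the *A functions accept UTF-8. -->
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
  </application>
</assembly>
```

With this in place, GetACP() reports 65001 (UTF-8) and calls like MessageBoxA work with UTF-8 strings.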
Mircea
|
|
|
|
|
Mircea Neacsu wrote: what happens when ... Assuming the developers understand the customers' requirements* this will have been catered for.
*which is probably on the toss of a coin
|
|
|
|
|
Mircea Neacsu wrote: Indeed, but the question is: what happens when you call an ANSI function, let's say MessageBoxA, with text that contains characters outside the 0-127 range? ANSI always was 8-bit, 0-255, ISO 8859-x, with 'x' varying with the region. Almost all the Western world had their needs covered by 8859-1. (The Sami were one exception; they needed 8859-10, if my memory is correct.)
Religious freedom is the freedom to say that two plus two make five.
|
|
|
|
|
trønderen wrote: Almost all the Western world had their needs covered by 8859-1 Funny you say that: I cannot even properly write my last name in 8859-1. It is written "Neacşu"; I have to go to Windows-1252 for that. The Wikipedia page for ISO/IEC 8859-1[^] also lists other languages that are not fully covered.
Anyway, at this point, I think we should agree to disagree as I'll continue to keep everything in UTF-8 inside my programs.
Mircea
|
|
|
|
|
trønderen wrote: But a new situation has arisen: Emojis have procreated to a number far exceeding the number of WinDings. They do not all fit in BMP
I did not even consider that as a possibility.
So I'll make sure to add something that says "no emojis!".
|
|
|
|
|
|
I understand. Thank you guys
|
|
|
|
|
You are welcome.
"In testa che avete, Signor di Ceprano?"
-- Rigoletto
|
|
|
|
|
Did *nix, or C under any other OS, ever run on a machine with a non-binary word size, like 12, 18 or 36 bits? Such as DEC-10 / DEC-20 or Univac 1100 series. I never heard of any, but I'd expect that it was done. Univac could either operate with 6 bit FIELDATA bytes or 9 bit bytes. DEC put five 7 bit bytes into each word, with one bit to spare.
If C was run on such machines, how was char handled? Enforcing 8 bits, making the strings incompatible with other programming languages? Or did C surrender to 6 or 7 bit bytes? If they stuck to 8 bits, did each word on a 36 bit machine have 4 bits to spare? (2 bits on an 18 bit machine) Or did they fit 4.5 char a word - 9 char per two words? (On Univac, you could address 6 or 9 bit bytes semi-directly, using string instructions, without the need for shifting and masking.)
By old definition, both architectures had a char size of one byte. The old understanding of 'byte' was the space required to store a single character; it could vary from 5 bits (Baudot code) to at least 9 (Univac and others) bits. Lots of international standards (developed outside of internet environments!) use the term 'octet' for 8 bits to avoid confusion with non-binary word/byte sizes.
Religious freedom is the freedom to say that two plus two make five.
|
|
|
|
|
|
As best as I can remember, C only requires that type char be large enough to hold any character of the basic execution character set, with the additional constraint that it be at least 8 bits wide (CHAR_BIT >= 8). So if the CPU has 9-bit "bytes", a C char would also be 9 bits; a 7-bit char, however, is ruled out by the standard.
"A little song, a little dance, a little seltzer down your pants"
Chuckles the clown
|
|
|
|
|
|
Voluntarily removed
modified 5 days ago.
|
|
|
|
|
|
As I'm sure we all know, there are basically three ways to handle currency values in code.
1) Store the value as cents in an integer, so $100.00 would be 10,000. Pro: you don't have to worry about floating-point precision issues. Con: harder to glance at without converting it in your head. Con: depending on the size of the integer, it may significantly reduce the range of values that can be stored.
2) Use a float with large enough precision. Pro: easy to read. Con: rounding issues, precision issues, etc.
3) Use a struct with dollars and cents fields. Pro: same benefit as the integer with no number loss. Pro: easy to read mentally. Con: you have to convert back and forth or go out of your way for calculations.
Historically, I've always gone with either 1 or 2 with just enough precision to get by. However, I'm working on a financial app and figured... why not live a little.
So, I'm thinking about using 128-bit ints and shift its "offset" by 6, so I can store up to 6 decimal places in the int. For a signed value, this would effectively max me out at ~170 nonillion (170,141,183,460,469,231,731,687,303,715,884.105727). Now, last I checked there's not that much money in the world. But, this will be the only app running on a dedicated machine and using 1GB of RAM is completely ok. So, it's got me thinking...
Here's the question... any of y'all ever use 128-bit ints and did you find them to be incredibly slow compared to 64-bit or is the speed acceptable?
Jeremy Falcon
modified 2-Sep-24 14:37pm.
|
|
|
|
|
If you're not concerned about speed, then the std::decimal::decimal[32/64/128] numeric types might be of interest. You'd need to do some research on them, though; it's not clear how you go about printing them, for example. The latest Fedora rawhide still chokes on
#include <iostream>
#include <decimal/decimal>
int main()
{
    std::decimal::decimal32 x = 1;
    std::cout << x << '\n';
}
where the compiler produces a shed load of errors at std::cout << x, so the usefulness is doubtful. An alternative might be an arbitrary-precision library like GMP.
A quick test of a loop adding 1,000,000 random numbers showed very little difference between unsigned long and __uint128_t. For unsigned long the loop took 0.0022 seconds, and for __uint128_t it took 0.0026 seconds. Slower, but not by enough to rule them out as a viable data type. But as with the decimal::decimal types, you would probably have to convert to long long for anything other than basic math.
"A little song, a little dance, a little seltzer down your pants"
Chuckles the clown
|
|
|
|
|
Well, just so you know, I'm not using C++ for this. But the ideas are transferable; for instance, a decimal type is just a fixed-point number. Which, in theory, sounds great, but as you mentioned it's slow given that there's no FPU-type hardware support for it.
I was more interested in peeps using 128-bit integers in practice rather than simply looping. I mean ya know, I can write a loop.
While I realize 128-bit ints still have to be broken apart for just about every CPU to work with, I was curious to know if peeps have noticed any huge performance bottlenecks with doing heavy maths with them in a real project.
Not against learning a lib like GMP if I have to, but I think for my purposes I'll stick with ints, in a base 10 fake fixed-point fashion, as they are fast enough. It's only during conversions in and out of my fake fixed-point I'll need to worry about the hit if so.
So the question was just how much slower is 128-bit compared to 64-bit... preferably in practice.
Jeremy Falcon
|
|
|
|
|
You mean 64-bit CPUs can't deal natively with 128-bit integers?
You had me at the beginning thinking that it was a real possibility.
The difficult we do right away...
...the impossible takes slightly longer.
|
|
|
|
|
I'm too tired to know if this is a joke or not. My brain is pooped for the day.
Richard Andrew x64 wrote: You had me at the beginning thinking that it was a real possibility. Any time I can crush your dreams. You just let me know man. I got you.
Jeremy Falcon
|
|
|
|
|
FYI I wasn't joking.
The difficult we do right away...
...the impossible takes slightly longer.
|
|
|
|
|
Ah, I haven't played with ASM since the 16-bit days and it was only a tiny bit back then to help me debug C code really. So, this may be old and crusty info...
But, yeah, typically in a 64-bit CPU the general-purpose registers don't go any wider than 64 bits. Now, there are extended instruction sets (SSE, SSE2, etc.), but those usually deal more with capabilities per instruction than data/bus width.
One notable exception is that all CPUs have FPUs these days, and the x87 FPU can process 80-bit wide floats natively, even on a 64-bit CPU. And while SSE's XMM registers are 128 bits wide, those are SIMD lanes; AFAIK there's no native 128-bit integer ALU on 64-bit CPUs.
Which means, if I've got a 128-bit number, any programming language that compiles it will have to treat it as two 64-bit values in the binary. Good news is, it's a loooooooooot easier to do with integers than floats. For instance, a quadruple-precision float that's 128 bits wide can be over 100 times slower than an 80-bit float. With an integer, you're just one bit shift away from getting the high word.
Stuff like the C compiler may give you a native-looking 128-bit type, but the binary will still have to break it down into two 64-bit halves under the hood.
Jeremy Falcon
|
|
|
|
|
I would use 64-bit integers, representing cents.
My 0x0000000000000002.
"In testa che avete, Signor di Ceprano?"
-- Rigoletto
|
|
|
|
|