|
Quote: Compileonline.com will show you the buggy char in String str2. Interesting, though.
I was talking about the above, BTW.
Well, strange compilation.
The same line twice, with different results.
We should be building great things that don't exist. - Larry Page
|
|
|
|
|
I once read a rather ironic post about what you could do to obscure your code (and in this way make yourself irreplaceable).
One of the topics was using similar letters from different alphabets in variable names. They used the example of the Cyrillic 'a', which looks just like the Latin 'a' but is seen as different by the compiler. I assume you could have achieved a similar effect by using the Cyrillic letter for 'r', which looks like the Latin 'p', in place of the Latin 'p' in the URL.
The good thing about pessimism is that you are always either right or pleasantly surprised.
|
|
|
|
|
I did not know that; this is even more evil than the invisible character. I'm taking notes.
You can spot the invisible char by checking str1.Length, but a Cyrillic 'a'... huhu.
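A minimal sketch of how both tricks could be caught in code rather than by eye: list every char above the ASCII range together with its index and numeric value. The names `NonAsciiScan` and `FindNonAscii` are made up for illustration, and putting U+200F at the end of the URL is an assumption about where the invisible character sits.

```csharp
using System;
using System.Collections.Generic;

static class NonAsciiScan
{
    // Returns the index and numeric value of every char above the ASCII range.
    public static List<(int Index, int Code)> FindNonAscii(string text)
    {
        var hits = new List<(int Index, int Code)>();
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] > 127)
            {
                hits.Add((i, (int)text[i]));
            }
        }
        return hits;
    }

    static void Main()
    {
        // Escapes keep the suspicious characters visible in the source.
        // \u0430 is the Cyrillic 'a'; \u200F is an invisible right-to-left mark.
        string[] samples = { "arnold", "\u0430rnold", "http://toto.com/\u200F" };
        foreach (string sample in samples)
        {
            foreach (var (index, code) in FindNonAscii(sample))
            {
                Console.WriteLine($"index={index}, code=U+{code:X4} ({code})");
            }
        }
    }
}
```

This catches the look-alike letter as well as the invisible one, because it checks the values instead of the lengths.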
|
|
|
|
|
Well, some phishers used that in web addresses. Then, browsers were changed to show some encoded values in the address bar for such characters. www.dеutsсhеbаnk.соm looks so nice at first view, but Firefox changes it into www.xn--dutshbnk-66g8be6l.xn--m-0tbi nowadays.
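For anyone who wants to see that kind of encoding from code, here is a small sketch using the .NET `System.Globalization.IdnMapping` class. The host name is a made-up example with a single non-ASCII letter, and `PunycodeDemo` and `ToAsciiHost` are hypothetical names; this is not a claim about how any browser implements its address bar.

```csharp
using System;
using System.Globalization;

static class PunycodeDemo
{
    // Converts a Unicode host name to its ASCII (Punycode) form.
    public static string ToAsciiHost(string host)
    {
        return new IdnMapping().GetAscii(host);
    }

    static void Main()
    {
        // Made-up host: \u00FC is a 'u' with an umlaut.
        Console.WriteLine(ToAsciiHost("b\u00FCcher.example")); // xn--bcher-kva.example
        // A pure ASCII host comes back unchanged.
        Console.WriteLine(ToAsciiHost("toto.com"));             // toto.com
    }
}
```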
|
|
|
|
|
|
Would the Terminal font trick find the bug?
|
|
|
|
|
I would never, ever, ever stoop so low as to unleash this on my coworkers...
but I can think of some suppliers who may benefit from this (see my previous posts on coding horrors in The Weird and The Wonderful).
bwa ha ha ha
bwa ha ha ha ha
bwa ha ha ha ha ha
bwa ha ha ha ha ha ha
"If you don't fail at least 90 percent of the time, you're not aiming high enough."
Alan Kay.
|
|
|
|
|
So evil... Heh...
Of course, the VS theme I'm using automatically underlines hyperlinks. And it doesn't underline that second one. Gee, I wonder why.
|
|
|
|
|
I refused to copy because I also wanted to find out where.
static void Main(string[] args)
{
String str1 = "http://toto.com/";
String str2 = "http://toto.com/";
bool eq = str1 == str2;
int j = str1.Length;
Console.WriteLine(string.Format("Evaluates to {0}, Length = {1},{2}", eq, j, str2.Length));
for (int i = 0; i < j; i++)
{
if (str1[i] != str2[i])
Console.WriteLine(string.Format("Mismatch found index={0}, char(2),int(2) = {1}-{2},{3}-{4}"
, i, str1[i], str2[i], (int)str1[i], (int)str2[i]));
}
str1 = "http://toto.com/";
str2 = "http://toto.com/";
eq = str1 == str2;
Console.WriteLine(eq);
Console.Read();
}
PS my "find-out" code has a bug in it that I only realized after it ran successfully through pure luck. (Do you see it?)
|
|
|
|
|
Have you tried
String str1 = "аrnold";
String str2 = "arnold";
This is not the same problem 
|
|
|
|
|
Looks like the same problem to me. One character is ASCII and the other is Unicode. (Well, both are Unicode and one isn't ASCII. You could, of course, have both be non-ASCII.)
static void Main(string[] args)
{
String str1 = "http://toto.com/";
String str2 = "http://toto.com/";
TestStrs(str1, str2);
str1 = "аrnold";
str2 = "arnold";
TestStrs(str1, str2);
str1 = "http://toto.com/";
str2 = "http://toto.com/";
TestStrs(str1, str2);
Console.Read();
}
static bool TestStrs(string str1, string str2)
{
bool eq = str1 == str2;
if (eq)
{
Console.WriteLine(string.Format("Two Strings ({0}) are the same", str1));
return eq;
}
Console.WriteLine(string.Format("Mismatch, two Strings ({0}) ({1}) are not the same", str1, str2));
int j = str1.Length, i = str2.Length;
if (j > i)
{
j = i;
}
for (i = 0; i < j; i++)
{
if (str1[i] != str2[i])
Console.WriteLine(string.Format("Mismatch found index={0}, char(2),int(2) = {1}-{2},{3}-{4}", i, str1[i], str2[i], (int)str1[i], (int)str2[i]));
}
return eq;
}
Of course, this has a bug in it too. Strings with multiple true Unicode characters would blow up with an index-out-of-range error. Help says exactly what I thought it said, which is patently wrong:
String.Length Property (System) - MSDN – the Microsoft ...
The Length property returns the number of Char objects in this instance, not the number of Unicode characters. The reason is that a Unicode character might be ...
http://msdn.microsoft.com/en-us/library/system.string.length
If it did what it said it would do, I wouldn't have spotted the bug in the first place.
No, I'm wrong. I was under the impression that a char could hold one or two bytes. I guess that, instead, when a true Unicode character is indexed in the string, the next index location points to the start of the next Unicode character; otherwise the string would produce a bunch of mismatches in the loop. So strings with multiple true Unicode characters should overindex in the loop and blow up. Oh well, every time you're wrong, you get a chance to learn something. I didn't test my theory by concatenating so that both strings have one Unicode character each and proving that it blows up with an overindex.
Well, one more thing to read: "Use the System.Globalization.StringInfo class to work with Unicode characters instead of Char objects." I hope that has a true Length property based on the number of true Unicode characters. Hmmm, is there a Unicode type similar to char?
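A short sketch comparing the three ways of counting that come up here. It assumes .NET Core 3.0 or later, where `System.Text.Rune` (one Unicode scalar value) and `string.EnumerateRunes()` are available; the sample string and the `LengthDemo` helper names are made up for illustration.

```csharp
using System;
using System.Globalization;
using System.Text;

static class LengthDemo
{
    // Number of 16-bit char values (UTF-16 code units).
    public static int CharCount(string s) => s.Length;

    // Number of text elements as seen by StringInfo.
    public static int TextElementCount(string s) => new StringInfo(s).LengthInTextElements;

    // Number of Unicode scalar values.
    public static int RuneCount(string s)
    {
        int count = 0;
        foreach (Rune rune in s.EnumerateRunes())
        {
            count++;
        }
        return count;
    }

    static void Main()
    {
        // 'a' followed by U+1F600, which needs two chars (a surrogate pair).
        string sample = "a\U0001F600";
        Console.WriteLine(CharCount(sample));        // 3
        Console.WriteLine(TextElementCount(sample)); // 2
        Console.WriteLine(RuneCount(sample));        // 2
    }
}
```

Indexing such a string does not throw by itself: every index from 0 to Length - 1 is valid and returns one 16-bit char, so a character outside the 16-bit range simply occupies two index positions. A comparison loop like the one above only overindexes when the second string is shorter than the first.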
|
|
|
|
|
The difference is that the http://toto.com/ example contains a char that is hidden.
However, the two "arnold" strings use two different "a" characters.
This is why the two arnold strings have the same length, but the two http://toto.com/ strings do not.
I was not aware of the StringInfo class, or that a Unicode character could take two chars.
Very interesting stuff. I have no idea whether a "Unicode char" type exists.
|
|
|
|
|
I had been taught that char supported UNICODE format and I accepted it, then I ran into the documentation that said the Length field only represented the char length. Therefore, I figured all I'd learned was wrong. Then you mentioned the hidden character and I remembered something else. I cast the char into an int. The only way it could be larger than 256 is if it was a Unicode char.
I thought for sure I could get this to blow up with more Unicode characters. By the way, you are wrong. There isn't a hidden character in the original problem. The last character in the string is an ASCII "47" character and a UNICODE "8207" character.
Your "a" code is also an ASCII and UNICODE difference, though I can't prove it without going back to your post. The last version evaluates as true.
|
|
|
|
|
There is a hidden character in the first version.
Look, I just moved it.
"http://toto.com/"
Unicode 8207 was not the '/'; that is why the two lengths were not the same.
However, for the arnold example, you get the same length.
|
|
|
|
|
Nicolas Dorier wrote: Unicode 8207 was not the '/'; that is why the two lengths were not the same.
The Unicode character system is a set of bits that, cobbled together, stands for the symbol of a character. In the same way, a Unicode character can be represented as a number. However, that number can reach into the millions.
In order to represent a character or number on the screen, a graphical image must be displayed. That means there has to be software that, given the number, displays the right graphical image. An ASCII character MUST be one byte long, no more, no less, i.e. it requires 8 bits to represent a character. In ASCII, a zero is equivalent to SQL's NULL and isn't a character, so ASCII can only have 255 characters. I don't believe the defined ASCII set even has 255 characters in it. So, depending on the software, the image displayed can change between devices. A Unicode character uses at least 2 bytes and up to 4 bytes, which means the character set could be 4 billion characters long.
We're nowhere near that level; like I said, I believe the count is around a million. As far as "a" is concerned, one is an ASCII character and one is a Unicode character; there isn't a hidden character involved. One uses one byte and one uses two bytes. The application software has enough graphical support to display both versions of "a", and I believe that is why it only counts the length as 1 even though it is using two bytes: it is accounted for in software.
The 8207 is the character that has a graphical image of "/" in some software, but when it runs in code it looks exactly like the special 'a' character using Console.WriteLine. It isn't a hidden character; both times it re-reads 8207, so both times it reads the exact same location twice. It's a screw-up in how it counts, so even the left/right arrows read the same location twice. In that respect, yes, it is a hidden character.
|
|
|
|
|
The 8207 is the character that has a graphical image of "/"
You are wrong; this has nothing to do with "/".
Here is the same bug without "/" but with 8207:
String str1 = "test";
String str2 = "test";
Moreover, a char in C# is not ASCII but a 16-bit Unicode character (a UTF-16 code unit), unlike char in C/C++.
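A quick sketch that checks the numbers quoted in this thread. The escapes stand in for the pasted characters, the position of U+200F inside "test" is an assumption, and `CharWidthDemo` and `CodeOf` are hypothetical names.

```csharp
using System;

static class CharWidthDemo
{
    // Numeric value of a single 16-bit char.
    public static int CodeOf(char c) => c;

    static void Main()
    {
        Console.WriteLine(sizeof(char));      // 2 (bytes per char)
        Console.WriteLine(CodeOf('a'));       // 97   (Latin a)
        Console.WriteLine(CodeOf('\u0430'));  // 1072 (Cyrillic a)
        Console.WriteLine(CodeOf('\u200F'));  // 8207 (invisible right-to-left mark)

        string plain = "test";
        string marked = "te\u200Fst";
        Console.WriteLine(plain == marked);                      // False
        Console.WriteLine($"{plain.Length} vs {marked.Length}"); // 4 vs 5
    }
}
```

All three characters fit in one char each, which is why the two "arnold" strings have the same length while the string with the extra mark is one char longer.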
|
|
|
|
|
Yep. Your "a"s are ASCII 97 and Unicode 1072. Interesting: the second one also exceeds 1 byte, but still counts as one character. (2^10 is 1024, so 1072 takes 11 bits; 2^13 is 8192, so the longer 8207 character takes 14 bits.) You are absolutely right that debug makes it look like the / character is too long to display the interpretation.
|
|
|
|
|
It is not related to the "/" but to the Unicode char 8207, which you can put wherever you want (see my previous comment).
|
|
|
|
|
var scheduleContacts = Engine.APIProxy.GetAllCompanyInfoContactScheduleContactsByCompanyInfoContactScheduleHeaderId(schedule.Id);
That is a REALLY long method name.
If it's not broken, fix it until it is
|
|
|
|
|
Good one.
This is taking the suggested C# naming conventions a bit far, but I do think it is better than
var scheduleContacts = Engine.APIProxy.GetAllCmpnInfCntctSchdlCntctsByCmpnInfCntctSchdlHdrId(schedule.Id);
Soren Madsen
"When you don't know what you're doing it's best to do it quickly" - Jase #DuckDynasty
|
|
|
|
|
Probably someone taking the Aggregate Root concept to the extreme.
Also, a horrible use of var.
|
|
|
|
|
Vark111 wrote: Also, a horrible use of var.
I know what you mean!
List<CompanyInfoContactScheduleContactsByCompanyInfoContactScheduleHeader> scheduleContacts = Engine.APIProxy.GetAllCompanyInfoContactScheduleContactsByCompanyInfoContactScheduleHeaderId(schedule.Id); would have been so much clearer!!!
|
|
|
|
|
How about naming it as below:
GetAllCompanyScheduledContactInfo(schedule.Id) or
GetAllCompanyScheduledContactInfoByScheduleID(schedule.Id)
Thanks,
|
|
|
|
|
Steady on there... it should simply be:
var results = Engine.APIProxy.GetResults(...);
|
|
|
|
|
Ranjan.D wrote: ...GetAllCompanyScheduledContactInfo(schedule.Id) or...
The problem with that is that you don't get to get even for outrageously long names by making them even longer and more confusing.
|
|
|
|