|
It seems to me that this problem boils down to selecting the best projection of your problem space onto two axes. Do you already know how to create such a map? Are you just looking for cluster analysis algorithms? A search should turn up some 2D cluster analysis algorithms, it must be a solved problem.
|
|
|
|
|
You sent a shiver down my spine when I read your post. I've just being doing this as part of my Uni AI course, although we were doing it with analysis of birds contaminated by an oil spill, and had 17 inputs going to a 8x8 SOM grid.
We have been using JavaNNS for all the course work.
What you basically need to do is set up your network with each of the 3 input nodes (RGB) connected to every single node in your SOM.
You haven't said what language you are trying to create this with, but searching google there are heaps of results with libraries to use.
Here is one example article;
Implementing the Kohonen Neural Network[^]
|
|
|
|
|
I will try to formulate my question about Self Organizing Map (SOM) that I have. My research problem is that I want to classify the entries of a large dataset so that similar entries end up in clusters where the the variation within clusters are less than between the clusters. I know this can be accomplished with the SOM algorithm. Now to the point:
If the SOM is started with a large map lets say 10 by 10 nodes (100 nodes in total). So I map my data onto the 100 node map. Then I visually inspect the result and find that there is lets say 7 distinct clusters. Is it then approperiate to reduce the number of nodes to a map 3 by 3 nodes (9 nodes in total) and again run the SOM onto this much smaller map. And then use the nodes of this smaller map as my clusters?
|
|
|
|
|
Hi,
Interesting the way that one intellectual quest can lead to another: my connecting-nodes-in-hierachies quest led to thinking about property change notifications in WinForms, which took a detour into WPF where so much work has been done with INotifyPropertyChanged automation, which led to experimenting with auto-code generation with T4 and PostSharp.
Which led to: I wrote a (non-recursive) code-generator ... which I will write-up for CP as a Tip or something if there's any interest ... that will generate a complete C# class for evaluating all possible logical comparisons of #n boolean variables. Here's the 'guts' of what it produces if I call it with an integer argument of #3:
if (Is_A && Is_B && Is_C)
{
Console.WriteLine("Test0 passed");
return;
}
else if (!Is_A && Is_B && Is_C)
{
Console.WriteLine("Test1 passed");
return;
}
else if (Is_A && !Is_B && Is_C)
{
Console.WriteLine("Test2 passed");
return;
}
else if (!Is_A && !Is_B && Is_C)
{
Console.WriteLine("Test3 passed");
return;
}
else if (Is_A && Is_B && !Is_C)
{
Console.WriteLine("Test4 passed");
return;
}
else if (!Is_A && Is_B && !Is_C)
{
Console.WriteLine("Test5 passed");
return;
}
else if (Is_A && !Is_B && !Is_C)
{
Console.WriteLine("Test6 passed");
return;
}
else if (!Is_A && !Is_B && !Is_C)
{
Console.WriteLine("Test7 passed");
return;
} It has been many, many years since I once studied 'truth-tables' and 'decision-tables, etc. And, given a language ... C# ... that optimizes boolean expressions so that in "A && B && C" if 'A is not true, there is no evaluation of 'B and 'C ...
Assuming that any variable in the set of variables is as likely to be true or false in any given use case, it, of course, would not make any difference in what order you wrote out the tests. But, what interests me is how, when you have a sense of the probability distribution of evaluations, you might optimize.
So if you think that a test of 'A' is modal in your use-case, you move the evaluation of 'A to the front of the test; if you think evaluation of 'A is an edge-case infrequently evaluated, you write it last.
Where I am 'blocked' is in coming up with an algorithm to make a pass over the complete matrix of possibilities and optimize the tests based on some user supplied 'weighting.'
Perhaps the user of the code-generator should supply a string like "C,B,A" as its input parameter, and then interpret that as the order to write out the terms ?
Appreciate your opinions, and advice.
best, Bill
p.s. While the current code-generator is not Linquified, I'll probably end up going there. And I am also considering it might be better to generate a switch statement of the form:
switch (theSelector)
{
case "TTT:
break;
case "TFT:
break;
... more cases ...
case default;
break;
}
"In the River of Delights, Panic has not failed me." Jorge Luis Borges
modified on Wednesday, August 17, 2011 1:49 AM
|
|
|
|
|
Hey Bill, why not generate a pattern like:
if (Is_A)
{
if (Is_B)
{
if (Is_C)
{
Console.WriteLine("Test 0 passed");
}
else
{
Console.WriteLine("Test 1 passed");
}
}
else
{
if (Is_C)
{
Console.WriteLine("Test 2 passed");
}
else
{
Console.WriteLine("Test 3 passed");
}
}
}
else
{
if (Is_B)
{
if (Is_C)
{
Console.WriteLine("Test 4 passed");
}
else
{
Console.WriteLine("Test 5 passed");
}
}
else
{
if (Is_C)
{
Console.WriteLine("Test 6 passed");
}
else
{
Console.WriteLine("Test 7 passed");
}
}
}
I didn't check whether they have the same test numbers, but the idea should be clear - instead of evaluating O(2^k) things, evaluate only O(k)
The switch should be even faster, provided you use integers.
|
|
|
|
|
Hi David, excellent feedback, thanks !
I did consider writing out the nested form as you present here, but, given I was just getting into the arcane world of *.tt file editing in VS Studio, and T4 syntax (coming to it without a background in ASP.NET content/code escape syntax conventions), decided it was easier to write the more 'linear' output first. Nested 'style' is coming.
Yes, I like your idea of using integers for the conditions in a switch statement: I'm already using those values in the bit-shifting I'm doing in the code-generator, anyway. See any downside in nested switch statements ?
If the code-generator writes out a comment in each clause documenting implicit meaning of the integer in terms of the boolean variables, that's enough for code maintenance ?
I kind of like the idea of dual approach: perhaps during development and testing using the 'linear form' expression might be productive, while using the 'compiled-to-switch denser' form for possible greater execution speed ?
best, Bill
"In the River of Delights, Panic has not failed me." Jorge Luis Borges
|
|
|
|
|
Hi David,
Nothing like writing out by hand an example of what you want, and testing it against all possible inputs, to make you aware of issues you'll face in generating it auto-magically ...
Wish I had the IL skills to look at how the compiler renders this and evaluate for efficiency. Is falling through to the 'default case' of a switch statement exactly the same as skipping over an 'if clause that has a 'return inside it ? I'd guess so.
Some obvious refactoring 'cries out' to be done in the 'brute force' example included here: mmm ... dare I utter the forbidden name: 'G*T*' ?
Would you really want to have beasts like this at large in your applications:
public static class NestedSwitch
{
public static string Test4(bool A, bool B, bool C, bool D)
{
int intSelectorA = A ? 8 : 0;
int intSelectorB = B ? 4 : 0;
int intSelectorC = C ? 2 : 0;
int intSelectorD = D ? 1 : 0;
switch (intSelectorA)
{
case 8:
switch (intSelectorB)
{
case 4:
switch (intSelectorC)
{
case 2:
switch (intSelectorD)
{
case 1:
return "ABCD";
default:
return "ABC!D";
}
default:
switch (intSelectorD)
{
case 1:
return "AB!CD";
default:
return "AB!C!D";
}
}
default:
switch (intSelectorC)
{
case 2:
switch (intSelectorD)
{
case 1:
return "A!BCD";
default:
return "A!BC!D";
}
default:
switch (intSelectorD)
{
case 1:
return "A!B!CD";
default:
return "A!B!C!D";
}
}
}
default:
switch (intSelectorB)
{
case 4:
switch (intSelectorC)
{
case 2:
switch (intSelectorD)
{
case 1:
return "!ABCD";
default:
return "!ABC!D";
}
default:
switch (intSelectorD)
{
case 1:
return "!AB!CD";
default:
return "!AB!C!D";
}
}
default:
switch (intSelectorC)
{
case 2:
switch (intSelectorD)
{
case 1:
return "!A!BCD";
default:
return "!A!BC!D";
}
default:
switch (intSelectorD)
{
case 1:
return "!A!B!CD";
default:
return "!A!B!C!D";
}
}
}
}
}
}
"In the River of Delights, Panic has not failed me." Jorge Luis Borges
|
|
|
|
|
I'm pretty sure the compiler is going to be silly about it and not optimize it (edit: that is to say, the nested switches) at all, but I would suggest this:
int all = intSelectorA | intSelectorB | intSelectorC | intSelectorD;
switch (all)
{
case 0:
return "!A!B!C!D";
....
case 15:
return "ABCD";
}
|
|
|
|
|
Thanks, David, I'm going to take a break, and enjoy what's left of the day here at GMT + 7, and then think about your comments, and Luc's comments, tonight, or early am tomorrow.
I feel pretty clear that I want to implement something with both a 'verbose' form for development and testing, and a 'mean-lean' form for final use. So any further thoughts about what would be the form(s) you'd like to see would be welcome.
I am a devotee of the cult of making generic dictionaries with keyvalue pairs whose keys are whatever and whose values are executable code: I'll have to see if I can bootleg that in here
best, Bill
"In the River of Delights, Panic has not failed me." Jorge Luis Borges
|
|
|
|
|
For mean and lean I'd definitely go for the Single Big Switch, it will be compiled to a jump table so it's constant time and without the often-mispredicted branches that the Tree Of If's would have.
As for a dictionary with executable values, yes that'll work.. you could just make it an array though, because the key would just be an integer in the range 0 - 2^k without holes.
I don't expect that to be any faster than the Single Big Switch though because it's essentially doing the same thing, only the switch has less overhead because the jump table would contain the address directly rather than references to objects that contain the address (as in an array of delegates)
|
|
|
|
|
Hi David,
... edit #1 ...
An hour of reading on the web about what happens when switch statements get turned into IL turned out to be fascinating: in some cases if-then-else statements are generated ... in other cases jump-tables of the form you might expect when a continuous series of values are used as selector ... in other cases generic dictionaries may be generated and used as a look-up based solution, and even hashtables !
... end edit #1 ...
I took a look at the IL code generated by a switch statement with an integer selector, and I see code like this:
IL_003e: switch (IL_00c4, IL_00d8, IL_00ec, IL_0100, IL_0114, IL_0128, IL_0309, IL_013c, IL_0150, IL_0165, IL_017a, IL_018f, IL_01a4, IL_01b9, IL_01ce, IL_01e3, IL_01f8, IL_020d, IL_0222, IL_0237, IL_024c, IL_0261, IL_0276, IL_0309, IL_028b, IL_029d, IL_02af, IL_02c1, IL_02d3, IL_02e5, IL_02f7)
IL_00bf: br IL_0309
IL_00c4: ldloc.1
IL_00c5: ldloc.1
IL_00c6: ldc.i4.1
IL_00c7: ldc.i4.1
IL_00c8: ldstr "!DCBAE"
IL_00cd: callvirt instance class Test5Parameters/BigSwitchBoolIntString Test5Parameters/BigSwitchBoolIntString::SetValue(class Test5Parameters/BigSwitchBoolIntString, bool, int32, string)
IL_00d2: stloc.2
IL_00d3: br IL_0314
...
all the other cases
...
Seems reasonable to assume the IL instruction 'switch' is turned into a jump-table by the JIT-compiler, but are you certain of this ? It's beyond my abilities to view/grok what's produced by JIT.
thanks, Bill
"In the River of Delights, Panic has not failed me." Jorge Luis Borges
modified on Thursday, August 25, 2011 9:27 AM
|
|
|
|
|
assuming your number of bools is limited to 32 (or 64) I would pack them in an uint (or ulong), then work with that. So it could look like:
uint v=0;
if (Is_A) v|=0x00000001;
if (Is_B) v|=0x00000002;
...
followed by your decision tree, possibly a sequence of ifs:
if (v==someValue1) {
} else if (v==someValue2) {
}...
BTW1: the first code block could possibly be avoided altogether, either by using v everywhere, or by constructing an union/overlap structure (which is slightly harder in C# than it is in C/C++).
BTW2: the second code block could possibly be avoided also: when all your cases are so similar, you could simply write something like Console.WriteLine("Test"+v+" passed");
|
|
|
|
|
Thanks, Luc, good to see your 'light' shining in/on CP again ! Will look forward to thinking about your ideas.
I like the idea of a dual 'logic compiler' which can create either verbose or succinct output as desired.
best, Bill
"In the River of Delights, Panic has not failed me." Jorge Luis Borges
|
|
|
|
|
Turns out creating an auto-generator for the 'big-switch' form of the logic-test class, as suggested by David1987, was really pretty easy.
Here's a link to a usable sample of its output[^], which I'd appreciate suggestions on in terms of code, format, style, etc.
In comments in the file is code which will test the complete matrix of logical possibilities for a three condition case.
thanks, Bill
"In the River of Delights, Panic has not failed me." Jorge Luis Borges
|
|
|
|
|
Hi Bill,
I had a play with this tonight and thought I'd post it for what it's worth. If the code generator needs to support a various number of boolean values, it may be of some use.
void Start()
{
bool[] bools = new bool[] { true, false, true};
string testCase = "Test case " + FindTestCase(bools);
}
int FindTestCase(bool[] bools)
{
string arrayCode = BoolArrayToBinaryString(bools);
foreach (string code in BoolArrayToBinaryStringPermuations(bools.Length))
if (code == arrayCode)
return Convert.ToInt32(code, 2);
return -1;
}
string BoolArrayToBinaryString(bool[] bools)
{
StringBuilder sb = new StringBuilder();
foreach (bool b in bools)
sb.Append(Convert.ToInt32(b));
return sb.ToString();
}
IEnumerable BoolArrayToBinaryStringPermuations(int count)
{
double max = Math.Pow(2, count);
for (int i = 0; i < max; i++)
yield return Convert.ToString(i, 2).PadLeft(count, '0');
}
[modified] took into account David's comments but decided against << due to readablity. Changed code and assigned a max variable so I was only killing the kitten once instead of at every iteration
"You get that on the big jobs."
modified on Friday, August 19, 2011 1:37 PM
|
|
|
|
|
RobCroll wrote: Math.Pow(2, count)
Please no. Every time you do that, a kitten dies.
Use this instead: 1 << count
|
|
|
|
|
I've never liked cat's anyway but wow I never knew about <<, I thought it was c syntax. Can you provide a link, google isn't play ball with "<<".
"You get that on the big jobs."
|
|
|
|
|
|
Hi Rob,
thanks, I'll look forward to studying your code tomorrow (GMT + 7 here).
Right now the prototype is coming along nicely, and here's what the current calling convention looks like for generating a 5 element matrix of boolean combinations
List<string> ExcludedCases = new List<string> {"DCBAE", "!D!C!BAE", "DCB!A!E"};
List<string> t = PermutationsManager.MakeBigSwitch
(
IsVerboseOutput: true,
IsExcludesAsComments: true,
PriorityStr: "DCBAE",
ClassTitle: "Test5Params",
Excludes: ExcludedCases
);
0. ExcludedCases List<string>: cases to be either skipped or written out as comments: here defined separately for debugging, but could just as well have been written 'in-line' as the last argument to 'MakeBigSwitch.
1. IsVerboseOutput bool: controls whether output is minimal or fully commented.
2. IsExcludesAsComments bool: controls whether excluded cases are written as comments or skipped.
3. PriorityStr string: contains the order in which to write out 'switch 'case statements.
4. ClassTitle string: will be written into the generated class
5. Excludes is the List<string> of excluded cases
Given the input above, the output now looks something like this (excerpt):
using System;
using BigSwitch_0_0;
public static class Test5Params
{
public static BigSwitchBoolIntString BigSwitch3(bool D, bool C, bool B, bool A, bool E)
{
int intSelector = (D ? 0 : 16) + (C ? 0 : 8) + (B ? 0 : 4) + (A ? 0 : 2) + (E ? 0 : 1);
BigSwitchBoolIntString resultData = new BigSwitchBoolIntString();
resultData.IsSuccess = false;
switch (intSelector)
{
case 1:
return resultData.SetValue(true, 1, "!DCBAE");
break;
case 31:
return resultData.SetValue(true, 31, "!D!C!B!A!E : All Values False");
break;
default:
throw new BigSwitchFailException("Error In: Test5 parameters : DCBAE : DateTime: ");
}
}
}
The Type 'BigSwitchBoolIntString' is a helper class ... not shown here ... designed to 'package' the result of an evaluation. And 'BigSwitchFailException' ... not shown here ... is a custom Exception ... just in case demons strike.
In about another half-day, I should have all you see above fully implemented.
Appreciate any thoughts, ideas, you may have.
best, Bill
"In the River of Delights, Panic has not failed me." Jorge Luis Borges
modified on Friday, August 19, 2011 12:57 PM
|
|
|
|
|
I may have missed the point of what you are trying to achieve but:
taking 0 = false and 1 = true, all true/false combinations for n variables would be the numbers 0 to (2^n)-1 in binary read as an array of binary flags
e.g.
n = 2
2^n = 4
0 = 00 = false, false
1 = 01 = false, true
2 = 10 = true, false
3 = 11 = true, true
it should be, therefore, possible to devise a simple algorithm that runs a for loop of i = 0 to 2^n-1 and, at each iteration, deriving an array of bools from the binary representation of i
Pedis ex oris
|
|
|
|
|
Hi,
I am looking for an algorithm that given:
1. a bounding box 'bb => [x, y, width, height]
2. List<rectangle> 'ra => a set of rectangles r#1 ... r#n, all of which are guaranteed to be fully contained in bounding box 'bb, and all of which are guaranteed not to overlap any other rectangle
3. List
|
|
|
|
|
I don't know of any good way to solve it.
The way I do know, you're not going to like.
Branch & bound, where every "step" represents "extending a line in one of 3 directions (except back) by 1 unit" and the "cost" is the number of intersections. Finished when all lines have reached their targets (or when the previous best cost is exceeded, obviously).
It only works if the lines can only change direction on integer coordinates.
Bonus: expand lines in "likely good" directions first (ie. not immediately intersecting, roughly towards their goal)
Pro: it will give the right result.
Con: it will take all eternity to run.
Bonus 2: first estimate an upper bound for the cost this way:
For every pair of things that need to be connected, use A* to find the shortest path among the possible ways to connect them (A* is trivially extended that way - put all starting points on the Open list, test for "done" at all end points). Add extra cost at intersections (since the cost is positive, that will not affect the optimality of A*).
The result will be bad, but it will give a better initial bound at very little cost and thus save time.
It will still take all eternity to run.
|
|
|
|
|
Thanks, David, for the response.
I think you are right to (wittily) imply that this problem leads to runaway-recursion ! But I do see a parallel to this problem in algorithms used in the photo-typesetting business that arrange a variable-sized number of page-rectangles in a way that optimizes use of the space in each printing-plate generated, and I think it must be similar to some spreadsheet financial solutions that use multiple recursive "solvers."
Here's my working pseudo-code for a more practical approach:
1. for all the pairs of rectangles-to-be-connected, generate the set of the shortest rectangle-connections.
2. generate the set of intersections of the possible connections, ordered by rectangle-pairs with the highest number of intersections.
3. evaluate the set of intersections for each rectangle-pair with intersections > #someLimit, and then re-evaluate the complete set of connection points, compile the new set of intersections, and see if the number of intersections has increased or decreased: if increased return to the original set, if decreased use the new set of intersections.
4. and so on ...
I notice my eternitys are getting shorter these days.
best, Bill
p.s. Jonathan Swift: "A flea hath fleas that on him prey, and so on 'til infinity."
"In the River of Delights, Panic has not failed me." Jorge Luis Borges
|
|
|
|
|
Hm.. is that optimal? It doesn't seem optimal, unless I'm missing something..
|
|
|
|
|
Hi,
Yes, it's not optimal: both the hypothetical plan, and the shortening of eternitys
If I clearly knew what was optimal in this case, I would be busy implementing the algorithm, instead of here, humbly seeking your advice, and opinion.
best, Bill
"In the River of Delights, Panic has not failed me." Jorge Luis Borges
|
|
|
|
|