|
This is not a C# question. You should have asked in the General Database forum.
What you are looking for are called aliases
SELECT empno AS eno, emp AS employee, etc.
only two letters away from being an asset
|
|
|
|
|
I have to do it using C#, because there is no direct interaction with the database here. No, I don't want to alias the column names and table names. I have to replace/create a new SQL select statement by replacing the column names and table names with the new ones (according to the conditions I described above).
Suppose I write an SQL select statement in a text box. After a button click I want the column and table names replaced, in the same text box...
|
|
|
|
|
so, use some of the methods in the string class; what is holding you up?
Luc Pattyn
I only read code that is properly indented, and rendered in a non-proportional font; hint: use PRE tags in forum messages
Local announcement (Antwerp region): Lange Wapper? 59.24% were sensible enough to vote NO; bye bye viaduct.
|
|
|
|
|
Suppose I got emp as a table, but if I replace every emp in the SQL with employee, then empname also becomes employeename, which is a wrong column name.
|
|
|
|
|
so you must replace words, not parts of words. That requires a parser, something that chops your text (SQL or other) into words based on whitespace and/or delimiters.
Luc Pattyn
|
|
|
|
|
so you must replace words, not parts of words. That requires a parser, something that chops your text (SQL or other) into words based on whitespace and/or delimiters.
How can I replace the words if I can't find them? Look at the following string:
select empno,empname,deptname,sal from emp, dept where emp.did= dept.did
See here: empno,empname,deptname,sal looks like one word, because there are no spaces. There may also be column aliases. It gets even more difficult with GROUP BY, ORDER BY and HAVING clauses.
OK, it is easy to find the different column names and table names (using a parser). But the problem is how to replace just the old table name with the new one, and the old column names with the new ones.
Again, the problem is:
Suppose I got emp as a table, but if I replace every emp in the SQL with employee, then empname also becomes employeename, which is a wrong column name.
|
|
|
|
|
dokhinahaoa wrote: empno,empname,deptname,sal is one word
not in my world.
as I said: a parser, something that chops your text (SQL or other) into words based on whitespace and/or delimiters.
Luc Pattyn
|
|
|
|
|
I got some ideas. But do you know the name of a good parser where I can easily find the column names, table names, joined table names, columns in join conditions, columns in the WHERE clause, and columns in the GROUP BY and HAVING clauses? And with which I can replace the table and column names accordingly.
|
|
|
|
|
dokhinahaoa wrote: do you know any good parsers name
I don't name the parsers I create; all it takes is some ten lines of code: scan the string for the begin and end of each identifier, then look it up in a replacement dictionary.
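A minimal sketch of that idea, using a regex to isolate whole identifiers (the entries in the rename dictionary are just examples taken from the SQL shown earlier in this thread):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class SqlIdentifierRenamer
{
    // Example mapping; fill in your own old -> new names.
    static readonly Dictionary<string, string> renames =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
    {
        { "empno", "eno" },
        { "emp",   "employee" },
        { "dept",  "department" },
        { "did",   "deptid" },
    };

    public static string Rename(string sql)
    {
        // \w+ matches a complete identifier; delimiters (spaces, commas,
        // dots, operators) are never part of a match, so only whole words
        // are looked up -- "emp" can never match inside "empname".
        return Regex.Replace(sql, @"\w+",
            m => renames.TryGetValue(m.Value, out string s) ? s : m.Value);
    }
}
```

`Rename("select empno from emp")` then yields `select eno from employee`, while `empname` passes through untouched.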
Luc Pattyn
|
|
|
|
|
OOPs sorry...
Can you share your parser code and how to use it??
I will be grateful to you...
|
|
|
|
|
dokhinahaoa wrote: see here empno,empname,deptname,sal is one word. Because there is no space here
Then separate it by commas, not spaces.
From all of your posts it's clear you really have no clue what you are doing. I would suggest you turn this project over to someone who does.
|
|
|
|
|
dokhinahaoa wrote: That means after operation my the sql should be as follows:
select eno, empname, deptname, sal from employee,department where employee.deptid=department.deptid
dokhinahaoa wrote: I have to do it using c#.
dokhinahaoa wrote: I have the replace/ create a new sql select statement by...
These statements contradict each other. What do you actually want?
|
|
|
|
|
You can think of the SQL as just a string here; only the format of the string is like an SQL select statement, because I am not sending the SQL to the database before formatting it, as I mentioned. And I have to do the formatting using C#.
|
|
|
|
|
Good day
This is a relatively advanced problem...
I am writing training software for image recognition.
When the training program starts, it loads a set of a few thousand (roughly 5000) images of size 24x24 pixels (greyscale).
I decided to load all images into my own class type, using a list. In other words, if my class is called "class Image", I have a List&lt;Image&gt;.
The image data is stored as a 2D double array in the class.
On each training round I need to process all images, and I am finding this is taking very long. Is there a way to make processing faster?
This is what I have already tried without much reduction in time:
1. Using unsafe code and pointers to access each image object.
2. Processing the images in parallel (but I ended up with errors).
3. Using a "for loop" instead of "foreach" to iterate through the list (not much improvement).
I am also considering using structs instead of classes to store the images, but I'm not sure whether this would overload the stack.
Please ask if any more information needed (I didn't give more since there is too much to say).
Any help would be appreciated please
tvb
|
|
|
|
|
Do you really need all those images in memory to be processed all at once? Even with just one byte per pixel, 24 x 24 x 5000 is almost 3MB
modified on Sunday, October 18, 2009 9:22 AM
|
|
|
|
|
Thanks for the reply.
I think it would be 2.8MB: 24 x 24 x 5000 = 2.88E+6 bytes.
Memory isn't my biggest problem, since I have 4GB of RAM.
The reason I load them all into memory is that I assumed it would be faster. Do you have any other suggestions?
tvb
|
|
|
|
|
That's what happens when I try to think on a Sunday!
|
|
|
|
|
tvbarnard wrote: The image data is stored as a 2D double array in the class.
Something like this?

List<MyImage> myImages;

class MyImage
{
    Double[,] data;
}

How are these created, and how does the Image get into this array? Have you tried timing the code, to see where the bottlenecks are?
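For example, wrapping one stage in a Stopwatch (ProcessAllImages here is a made-up stand-in for whatever stage you suspect):

```csharp
using System;
using System.Diagnostics;

class TimingDemo
{
    // Time one stage in isolation to see whether it is the bottleneck.
    public static long TimeIt()
    {
        var sw = Stopwatch.StartNew();
        ProcessAllImages();
        sw.Stop();
        Console.WriteLine("ProcessAllImages: " + sw.ElapsedMilliseconds + " ms");
        return sw.ElapsedMilliseconds;
    }

    // Stand-in for the real per-image work.
    static void ProcessAllImages()
    {
        double sum = 0;
        for (int i = 0; i < 1000000; i++) sum += Math.Sqrt(i);
    }
}
```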
I are Troll
|
|
|
|
|
Yes, that would be how it's done.
I have used a profiler, and it shows that my bottleneck is when I am processing each image array.
The code is very abstract, but I'll post it in any case.
unsafe
{
    fixed (int* pPosPoints = allFeatures[f].pointsPos, pNegPoints = allFeatures[f].pointsNeg)
    {
        for (int iP = 0; iP < boundPosImages; iP++)
        {
            featureValue = 0;
            fixed (double* pIntVec = posTrainingSet[iP].imageIntegralVectorDouble)
            {
                for (int i = 0; i < boundPointsPos; i++)
                {
                    featureValue += *( pIntVec + *( pPosPoints + i ) );
                }
                for (int i = 0; i < boundPointsNeg; i++)
                {
                    featureValue -= *( pIntVec + *( pNegPoints + i ) );
                }
                if (allFeatures[f].polarity)
                {
                    if (featureValue < allFeatures[f].threshold)
                        allFeatures[f].weightedError += posTrainingSet[iP].weight;
                }
                else if (allFeatures[f].threshold < featureValue)
                {
                    allFeatures[f].weightedError += posTrainingSet[iP].weight;
                }
            }
        }
    }
}
To explain: "imageIntegralVectorDouble" is my image array reshaped to a vector (so the vector has size 24x24 = 576).
What is happening here is that I am using a classifier that contains two arrays of points ("allFeatures[f].pointsPos" & "allFeatures[f].pointsNeg") whose values need to be either added or subtracted in that image. In other words, each point is a location reference.
In this specific code, I decided to use pointers to reference the images and points (as can hopefully be seen).
What I left out of my original question (to prevent confusion) is that the system uses at least 200,000 classifiers, meaning each image has to be evaluated by all of those classifiers, which makes the reason for trying to optimize the code fairly obvious.
Most of the processing time is spent on the first two for-loops. I suppose this is the area to try and optimize.
tvb
|
|
|
|
|
tvbarnard wrote: Most of the processing time is spent on the first two for-loops. I suppose this is the area to try and optimize.
I have no idea how you'd optimize the code within those loops. Have you tried doing them in parallel[^]?
I are Troll
|
|
|
|
|
I actually did, but somehow my results are different when I use parallel, which means they are incorrect. I don't know how one would go about debugging this...
(Btw, the parallel significantly improves performance).
Maybe you can tell me which would be better:
1. for each image (5000 loops), compute all classifiers (200,000 loops), or
2. for each classifier (200,000 loops), compute all images (5000 loops)
I don't see how it would make a difference, unless one of the two is stored on the heap instead of the stack.
tvb
|
|
|
|
|
Hi,
yes it can make a big difference, it is all about cache efficiency.
assume you have two arrays a[] and b[], and you are asked to count how many elements appear in both.
count = 0;
foreach (int ia in a)
{
    foreach (int ib in b)
    {
        if (ia == ib)
        {
            count++;
            break;
        }
    }
}
Let's first assume there aren't many matches, a[] is small (say 1KB) and b[] is large (say 100MB).
So the inner loop will read most of the large array on each outer iteration, hence all of b gets loaded from memory to cache over and over, the cache isn't really working.
Now assume a[] is large (100MB) and b[] is small (1KB); the outer loop works its way through the large array, but does so only once, while the inner loop gets all of b[] from the cache. This will be faster by several orders of magnitude.
BTW: if you run the above code in a parallelized way without taking any precautions, the result may be wrong, as the threads all operate on the single variable "count". You would have to remedy this by:
- locking the variable (bad idea, takes resources and will slow down);
- using Interlocked.Increment()
- or better yet have a counter for each thread, then accumulate them when all have finished
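That last option could look like this with Parallel.For's thread-local overload, using the a[]/b[] example above (each thread accumulates into its own counter, then adds it to the total exactly once):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ParallelCount
{
    // Counts how many elements of a[] also appear in b[], without the
    // threads ever sharing a single counter inside the loop.
    public static int CountMatches(int[] a, int[] b)
    {
        int count = 0;
        Parallel.For(0, a.Length,
            () => 0,                                  // per-thread counter, starts at 0
            (i, loopState, local) =>
            {
                if (Array.IndexOf(b, a[i]) >= 0) local++;
                return local;
            },
            local => Interlocked.Add(ref count, local)); // one atomic add per thread
        return count;
    }
}
```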
Luc Pattyn
|
|
|
|
|
Thanks, that's the best insight yet.
This is the equivalent of the code I'm using (hopefully more clear):
class Classifier
{
    int[] posPoints;
    int[] negPoints;
}

class Image
{
    double[] imageData;   // the 24x24 image reshaped to a 576-element vector
}

void doCalculations(List<Image> images, List<Classifier> classifiers)
{
    foreach (Image image in images)
    {
        foreach (Classifier classifier in classifiers)
        {
            double value = 0;
            foreach (int p in classifier.posPoints)
            {
                value += image.imageData[p];
            }
            foreach (int p in classifier.negPoints)
            {
                value -= image.imageData[p];
            }
        }
    }
}
This method "doCalculations" would typically be called 500 times during the entire process (so that's where the 500 came from )
In case you were wondering, the reason I need classes is because extra data associated with each image or classifier is stored in the class (which I have omitted here).
tvb
|
|
|
|
|
OK
The calculations are simple enough, it is all about getting the data in.
in its current form this is a performance challenge, as there is a lot of data involved: the images (5000 x 576 x 8B, roughly 23MB) and the classifiers (200,000 x 20 ints x 4B, roughly 16MB) both far exceed the level-2 cache of a modern CPU. And neither is size-dominant right now.
you may not like it much, but for maximum performance you need to reduce the amount of data bytes involved.
Hence:
1. why store pixels as double (i.e. 8B)? I trust no physical system can discern more than 2^16 shades of gray, so use (u)shorts for storage and int for calculation (if that is unacceptable, consider int or float).
2. when images are 576 pixels, why use int (i.e. 4B) indexes in posPoints/negPoints? short could do.
1. reduces the total image size to about 5.8MB, making the images the prime candidate for your inner loop;
2. reduces the classifier data, making the outer loop faster.
even when classifiers would be somewhat smaller than total images, I would keep them in the outer loop as they have less "locality of reference", since they point to arrays (posPoints/negPoints) that could be anywhere in memory, so better load those only once.
Further ideas:
- drop all the 2D stuff; linearize your images once and for all (probably even when using pointers).
- merge posPoints/negPoints into a single array; maybe use negative indexes for negPoints. Slightly more calculations, better locality. May win, may lose, depends on your typical data.
- this you will hate: merge posPoints/negPoints from several/all classifiers; e.g. put them all in one big array, and let the classifier only hold a begin and end index into that array.
- you haven't revealed what is common among the 500 calculations; if a lot, reorganize to take advantage of that!
Conclusion: give up on some of the simple and clean design and gain a lot in performance.
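To illustrate the merged-points idea from the list above (all names here are made up; every classifier's positive and negative points live in one big array, addressed per classifier by begin/split/end indices):

```csharp
class CompactData
{
    public ushort[][] Images;    // each image: 576 pixels, linearized, 2B each
    public short[] AllPoints;    // merged posPoints/negPoints of every classifier
    public int[] Begin;          // per classifier: first positive point
    public int[] SplitAt;        // per classifier: first negative point
    public int[] End;            // per classifier: one past the last negative point

    // Evaluate one classifier against one image:
    // add pixels at [Begin..SplitAt), subtract pixels at [SplitAt..End).
    public long Evaluate(int classifier, ushort[] image)
    {
        long value = 0;
        for (int i = Begin[classifier]; i < SplitAt[classifier]; i++)
            value += image[AllPoints[i]];
        for (int i = SplitAt[classifier]; i < End[classifier]; i++)
            value -= image[AllPoints[i]];
        return value;
    }
}
```

The per-classifier extras (threshold, polarity, errorRate, ...) can stay in a separate parallel array, so merging the points does not lose them.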
Luc Pattyn
|
|
|
|
|
Wow, that's a big help!
Unfortunately, since I've made so many versions of my code, some of the things you've suggested I've already done (like using ints instead of doubles).
What you've really helped with is using shorts instead of ints (I hope this makes a significant difference).
Questions:
1. The problem I have with merging the pos/neg points is that those are index references into the image data (in other words, to the pixels). I can't have a negative index into an array, e.g. image.data[-4] would be wrong. Unless you meant something else?
2. Lol, you're right about that second-to-last point. I can't merge my data, since I have, for example, an errorRate (double) associated with each classifier. Would an array[] of classifiers be faster than a List<> of classifiers?
3. Do you see any safe way of implementing parallelism? I've tried Parallel.For(0, bound, x=>{}) instead of the foreach statements for each of the respective loops, but I always get corrupted results. Could you expand on "or better yet have a counter for each thread, then accumulate them when all have finished"?
Oh, and a last point, each of the 500 iterations are called at independent times, so there's nothing in common between them.
tvb
|
|
|
|
|