|
This is not a C# question. You should have asked in the General Database forum.
What you are looking for are called aliases
SELECT empno AS eno, emp AS employee, etc.
only two letters away from being an asset
|
|
|
|
|
I have to do it using C#, because there is no direct interaction with the database here. No, I don't want to alias the column names and table names. I have to replace/create a new SQL select statement by replacing the column names and table names with the new ones (according to the conditions I described above).
Suppose I write an SQL select statement in a text box. After a button click I want the column and table names replaced, in the same text box...
|
|
|
|
|
so, use some of the methods in the string class; what is holding you up?
Luc Pattyn
I only read code that is properly indented, and rendered in a non-proportional font; hint: use PRE tags in forum messages
Local announcement (Antwerp region): Lange Wapper? 59.24% were sensible enough to vote NO; bye bye viaduct.
|
|
|
|
|
Suppose I got emp as a table, but if I replace every emp in the SQL with employee, then empname also becomes employeename, which is a wrong column name.
|
|
|
|
|
so you must replace words, not parts of words. That requires a parser, something that chops your text (SQL or other) into words based on whitespace and/or delimiters.
Luc Pattyn
|
|
|
|
|
so you must replace words, not parts of words. That requires a parser, something that chops your text (SQL or other) into words based on whitespace and/or delimiters.
How can I replace the words if I can't find them? Look at the following string:
select empno,empname,deptname,sal from emp, dept where emp.did= dept.did
See here: empno,empname,deptname,sal looks like one word, because there are no spaces. There may also be column aliases. It gets even more difficult with GROUP BY, ORDER BY and HAVING clauses.
OK, it is easy to find the different column names and table names (using a parser). But the problem is how to replace just the old table name with the new one, and the old column names with the new ones.
Again, the problem is:
Suppose I got emp as a table, but if I replace every emp in the SQL with employee, then empname also becomes employeename, which is a wrong column name.
|
|
|
|
|
dokhinahaoa wrote: empno,empname,deptname,sal is one word
not in my world.
as I said: a parser, something that chops your text (SQL or other) into words based on whitespace and/or delimiters.
Luc Pattyn
|
|
|
|
|
I got some ideas. But do you know the name of a good parser where I can easily find the column names, table names, joined table names, columns in join conditions, columns in the WHERE clause, and columns in the GROUP BY and HAVING clauses? And with which I can replace the table and column names accordingly.
|
|
|
|
|
dokhinahaoa wrote: do you know any good parsers name
I don't name the parsers I create; all it takes is some ten lines of code: scan the string for the begin and end of each identifier, then look it up in a replacement dictionary.
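A minimal sketch of that idea, using a regex to isolate whole identifiers (the entries in the rename dictionary are just examples taken from the SQL shown earlier in this thread):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class SqlIdentifierRenamer
{
    // Example mapping; fill in your own old -> new names.
    static readonly Dictionary<string, string> renames =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
    {
        { "empno", "eno" },
        { "emp",   "employee" },
        { "dept",  "department" },
        { "did",   "deptid" },
    };

    public static string Rename(string sql)
    {
        // \w+ matches a complete identifier; delimiters (spaces, commas,
        // dots, operators) are never part of a match, so only whole words
        // are looked up -- "emp" can never match inside "empname".
        return Regex.Replace(sql, @"\w+",
            m => renames.TryGetValue(m.Value, out string s) ? s : m.Value);
    }
}
```

`Rename("select empno from emp")` then yields `select eno from employee`, while `empname` passes through untouched.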
Luc Pattyn
|
|
|
|
|
OOPs sorry...
Can you share your parser code and how to use it??
I will be grateful to you...
|
|
|
|
|
dokhinahaoa wrote: see here empno,empname,deptname,sal is one word. Because there is no space here
Then separate it by commas, not spaces.
From all of your posts it's clear you really have no clue what you are doing. I would suggest you turn this project over to someone who does.
|
|
|
|
|
dokhinahaoa wrote: That means after operation my the sql should be as follows:
select eno, empname, deptname, sal from employee,department where employee.deptid=department.deptid
dokhinahaoa wrote: I have to do it using c#.
dokhinahaoa wrote: I have the replace/ create a new sql select statement by...
These statements contradict each other. What do you actually want?
|
|
|
|
|
You can think of the SQL as just a string here; only the format of the string is like an SQL select statement, because I am not sending the SQL to the database before formatting it, as I mentioned. And I have to do the formatting using C#.
|
|
|
|
|
Good day
This is a relatively advanced problem...
I am writing training software for image recognition.
When the training program starts, it loads a set of a few thousand (roughly 5000) images of size 24x24 pixels (greyscale).
I decided to load all images into my own class type, using a list. In other words, if my class is called "class Image", I have a List&lt;Image&gt;.
The image data is stored as a 2D double array in the class.
On each training round I need to process all images, and I am finding this is taking very long. Is there a way to make processing faster?
This is what I have already tried without much reduction in time:
1. Using unsafe code and pointers to access each image object.
2. Processing the images in parallel (but I ended up with errors).
3. Using a "for loop" instead of "foreach" to iterate through the list (not much improvement).
I am also considering using structs instead of classes to store the images, but I'm not sure whether this would overload the stack.
Please ask if any more information needed (I didn't give more since there is too much to say).
Any help would be appreciated please
tvb
|
|
|
|
|
Do you really need all those images in memory to be processed all at once? Even with just one byte per pixel, 24 x 24 x 5000 is almost 3MB
modified on Sunday, October 18, 2009 9:22 AM
|
|
|
|
|
Thanks for the reply.
I think it would be 2.8MB: 24 x 24 x 5000 = 2.88E+6 bytes.
Memory isn't my biggest problem, since I have 4GB of RAM.
The reason I load them all into memory is that I assumed it would be faster. Do you have any other suggestions?
tvb
|
|
|
|
|
That's what happens when I try to think on a Sunday!
|
|
|
|
|
tvbarnard wrote: The image data is stored as a 2D double array in the class.
Something like this?

List<MyImage> myImages;

class MyImage
{
    Double[,] data;
}

How are these created, and how does the Image get into this array? Have you tried timing the code, to see where the bottlenecks are?
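For example, wrapping one stage in a Stopwatch (ProcessAllImages here is a made-up stand-in for whatever stage you suspect):

```csharp
using System;
using System.Diagnostics;

class TimingDemo
{
    // Time one stage in isolation to see whether it is the bottleneck.
    public static long TimeIt()
    {
        var sw = Stopwatch.StartNew();
        ProcessAllImages();
        sw.Stop();
        Console.WriteLine("ProcessAllImages: " + sw.ElapsedMilliseconds + " ms");
        return sw.ElapsedMilliseconds;
    }

    // Stand-in for the real per-image work.
    static void ProcessAllImages()
    {
        double sum = 0;
        for (int i = 0; i < 1000000; i++) sum += Math.Sqrt(i);
    }
}
```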
I are Troll
|
|
|
|
|
Yes, that would be how it's done.
I have used a profiler, and it shows that my bottleneck is when I am processing each image array.
The code is very abstract, but I'll post it in any case.
unsafe
{
    fixed (int* pPosPoints = allFeatures[f].pointsPos, pNegPoints = allFeatures[f].pointsNeg)
    {
        for (int iP = 0; iP < boundPosImages; iP++)
        {
            featureValue = 0;
            fixed (double* pIntVec = posTrainingSet[iP].imageIntegralVectorDouble)
            {
                for (int i = 0; i < boundPointsPos; i++)
                {
                    featureValue += *( pIntVec + *( pPosPoints + i ) );
                }
                for (int i = 0; i < boundPointsNeg; i++)
                {
                    featureValue -= *( pIntVec + *( pNegPoints + i ) );
                }
                if (allFeatures[f].polarity)
                {
                    if (featureValue < allFeatures[f].threshold)
                        allFeatures[f].weightedError += posTrainingSet[iP].weight;
                }
                else if (allFeatures[f].threshold < featureValue)
                {
                    allFeatures[f].weightedError += posTrainingSet[iP].weight;
                }
            }
        }
    }
}
To explain: "imageIntegralVectorDouble" is my image array reshaped to a vector (so the vector has size 24x24 = 576).
What is happening here is that I am using a classifier that contains two arrays of points ("allFeatures[f].pointsPos" & "allFeatures[f].pointsNeg") whose values need to be either added or subtracted in that image. In other words, each point is a location reference.
In this specific code, I decided to use pointers to reference the images and points (as can hopefully be seen).
What I left out of my original question (to prevent confusion) is that the system uses at least 200,000 classifiers, meaning each image has to be evaluated by all of those classifiers, which makes the reason for trying to optimize the code fairly obvious.
Most of the processing time is spent on the first two for-loops. I suppose this is the area to try and optimize.
tvb
|
|
|
|
|
tvbarnard wrote: Most of the processing time is spent on the first two for-loops. I suppose this is the area to try and optimize.
I have no idea how you'd optimize the code within those loops. Have you tried doing them in parallel[^]?
I are Troll
|
|
|
|
|
I actually did, but somehow my results are different when I use parallel, which means they are incorrect. I don't know how one would go about debugging this...
(Btw, the parallel significantly improves performance).
Maybe you can tell me which would be better:
1. for each image (5000 loops), compute all classifiers (200,000 loops), or
2. for each classifier (200,000 loops), compute all images (5000 loops)
I don't see how it would make a difference, unless one of the two is stored on the heap instead of the stack.
tvb
|
|
|
|
|
Hi,
yes it can make a big difference, it is all about cache efficiency.
assume you have two arrays a[] and b[], and you are asked to count how many elements appear in both.
count = 0;
foreach (int ia in a)
{
    foreach (int ib in b)
    {
        if (ia == ib)
        {
            count++;
            break;
        }
    }
}
Let's first assume there aren't many matches, a[] is small (say 1KB) and b[] is large (say 100MB).
So the inner loop will read most of the large array on each outer iteration, hence all of b gets loaded from memory to cache over and over, the cache isn't really working.
Now assume a[] is large (100MB) and b[] is small (1KB); the outer loop works its way through the large array, but does so only once, while the inner loop gets all of b[] from the cache. This will be faster by several orders of magnitude.
BTW: if you run the above code in a parallelized way without taking any precautions, the result may be wrong, as the threads all operate on the single variable "count". You would have to remedy this by:
- locking the variable (bad idea, takes resources and will slow down);
- using Interlocked.Increment()
- or better yet have a counter for each thread, then accumulate them when all have finished
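That last option could look like this with Parallel.For's thread-local overload, using the a[]/b[] example above (each thread accumulates into its own counter, then adds it to the total exactly once):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ParallelCount
{
    // Counts how many elements of a[] also appear in b[], without the
    // threads ever sharing a single counter inside the loop.
    public static int CountMatches(int[] a, int[] b)
    {
        int count = 0;
        Parallel.For(0, a.Length,
            () => 0,                                  // per-thread counter, starts at 0
            (i, loopState, local) =>
            {
                if (Array.IndexOf(b, a[i]) >= 0) local++;
                return local;
            },
            local => Interlocked.Add(ref count, local)); // one atomic add per thread
        return count;
    }
}
```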
Luc Pattyn
|
|
|
|
|
Thanks, that's the best insight yet.
This is the equivalent of the code I'm using (hopefully more clear):
class Classifier
{
    int[] posPoints;
    int[] negPoints;
}

class Image
{
    double[] imageData;   // the 24x24 image reshaped to a 576-element vector
}

void doCalculations(List<Image> images, List<Classifier> classifiers)
{
    foreach (Image image in images)
    {
        foreach (Classifier classifier in classifiers)
        {
            double value = 0;
            foreach (int p in classifier.posPoints)
            {
                value += image.imageData[p];
            }
            foreach (int p in classifier.negPoints)
            {
                value -= image.imageData[p];
            }
        }
    }
}
This method "doCalculations" would typically be called 500 times during the entire process (so that's where the 500 came from )
In case you were wondering, the reason I need classes is because extra data associated with each image or classifier is stored in the class (which I have omitted here).
tvb
|
|
|
|
|
OK
The calculations are simple enough, it is all about getting the data in.
in its current form this is a performance challenge, as there is a lot of data involved: the images (5000 x 576 x 8B, roughly 23MB) and the classifiers (200,000 x 20 ints x 4B, roughly 16MB) both far exceed the level-2 cache of a modern CPU. And neither is size-dominant right now.
you may not like it much, but for maximum performance you need to reduce the amount of data bytes involved.
Hence:
1. why store pixels as double (i.e. 8B)? I trust no physical system can discern more than 2^16 shades of gray, so use (u)shorts for storage and int for calculation (if that is unacceptable, consider int or float).
2. when images are 576 pixels, why use int (i.e. 4B) indexes in posPoints/negPoints? short could do.
1. reduces the total image size to about 5.8MB, making the images the prime candidate for your inner loop;
2. reduces the classifier data, making the outer loop faster.
even when classifiers would be somewhat smaller than total images, I would keep them in the outer loop as they have less "locality of reference", since they point to arrays (posPoints/negPoints) that could be anywhere in memory, so better load those only once.
Further ideas:
- drop all the 2D stuff; linearize your images once and for all (probably even when using pointers).
- merge posPoints/negPoints into a single array; maybe use negative indexes for negPoints. Slightly more calculations, better locality. May win, may lose, depends on your typical data.
- this you will hate: merge posPoints/negPoints from several/all classifiers; e.g. put them all in one big array, and let the classifier only hold a begin and end index into that array.
- you haven't revealed what is common among the 500 calculations; if a lot, reorganize to take advantage of that!
Conclusion: give up on some of the simple and clean design and gain a lot in performance.
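To illustrate the merged-points idea from the list above (all names here are made up; every classifier's positive and negative points live in one big array, addressed per classifier by begin/split/end indices):

```csharp
class CompactData
{
    public ushort[][] Images;    // each image: 576 pixels, linearized, 2B each
    public short[] AllPoints;    // merged posPoints/negPoints of every classifier
    public int[] Begin;          // per classifier: first positive point
    public int[] SplitAt;        // per classifier: first negative point
    public int[] End;            // per classifier: one past the last negative point

    // Evaluate one classifier against one image:
    // add pixels at [Begin..SplitAt), subtract pixels at [SplitAt..End).
    public long Evaluate(int classifier, ushort[] image)
    {
        long value = 0;
        for (int i = Begin[classifier]; i < SplitAt[classifier]; i++)
            value += image[AllPoints[i]];
        for (int i = SplitAt[classifier]; i < End[classifier]; i++)
            value -= image[AllPoints[i]];
        return value;
    }
}
```

The per-classifier extras (threshold, polarity, errorRate, ...) can stay in a separate parallel array, so merging the points does not lose them.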
Luc Pattyn
|
|
|
|
|
Wow, that's a big help!
Unfortunately, since I've made so many versions of my code, some of the things you've suggested I've already done (like using ints instead of doubles).
What you've really helped with is using shorts instead of ints (I hope this makes a significant difference).
Questions:
1. The problem I have with merging the pos/neg points is that those are index references into the image data (in other words, to the pixels). I can't have a negative index into an array, e.g. image.data[-4] would be wrong. Unless you meant something else?
2. Lol, you're right about that second-to-last point. I can't merge my data, since I have, for example, an errorRate (double) associated with each classifier. Would an array[] of classifiers be faster than a List<> of classifiers?
3. Do you see any safe way of implementing parallelism? I've tried Parallel.For(0, bound, x=>{}) instead of the foreach statements for each of the respective loops, but I always get corrupted results. Could you expand on "or better yet have a counter for each thread, then accumulate them when all have finished"?
Oh, and a last point, each of the 500 iterations are called at independent times, so there's nothing in common between them.
tvb
|
|
|
|
|