Google, the son of AltaVista, with whom you are surely acquainted.
Hi Shannon,
I am curious: given that Anchorage is not exactly a small city (300k people?), have you looked around Anchorage for junior colleges, educational institutions, night courses, or extension campuses? Is there any mobility problem, or anything else, that keeps you from getting around and visiting places like these?
Have you googled "anchorage c#"? : [^] : it looks to me like there are a lot of local resources there; people and companies you might contact.
One resource I have always found helpful in almost any situation is contacting a local librarian, such as may be found at : [^] : librarians these days are usually really up to date on local resources and on-line resources, not just books.
"C#, .NET, PHP, XML, SQL" is quite a whopping plate of technology ! And I think you may want to organize your approach to learning around first trying to clarify some goals about what you want to be able to do in the future. If your long-range goal is to become a fully qualified "computer scientist" with academic credentials, that's one thing; if your goal is to, within a year or so, begin to get paid for creating web-sites that talk to databases, and support e-commerce, that's another.
And by all means, if possible, explore the various computer clubs in your area ... [^]... go the meetings, bounce off people, socialize. See if you can find someone who's willing to sit down with you and discuss your goals who's familiar with local resources.
Don't hesitate to find out the open office hours of an instructor or professor at a local junior college or college in computer science and drop in, introduce yourself; most likely you'll get a friendly greeting and, possibly, some good advice.
good luck, Bill
"Many : not conversant with mathematical studies, imagine that because it [the Analytical Engine] is to give results in numerical notation, its processes must consequently be arithmetical, numerical, rather than algebraical and analytical. This is an error. The engine can arrange and combine numerical quantities as if they were letters or any other general symbols; and it fact it might bring out its results in algebraical notation, were provisions made accordingly." Ada, Countess Lovelace, 1844
I am learning more C# and have come across a special requirement for collections. I wonder whether such an implementation exists.
I need a list with a key and multiple values. After the first entry, subsequent entries can be partial, recording only what changed. Something like this:
index  value1  value2  value3  |  Result: value1  value2  value3
1      100     200     300     |          100     200     300
2      -       -       200     |          100     200     200
3      150     -       300     |          150     200     300
4      -       250     -       |          150     250     300
5      100     200     -       |          100     200     300
This can be achieved with a DataTable using Select/Compute and the like, but is there a simpler option available? Speed is the key here.
----------------------------Question Modified----------------------------------
Please understand that creating a list/dictionary/array/hashtable is not the problem here. Storing the data is easy; it could be done using a numeric key/index (I can also manage to keep indexes of the keys).
The idea is to be able to look up the last fetched value against each key. The problem comes when I want to retrieve the value without having to loop again. Look at the same example with modified index values.
index  value1  value2  value3  |  Result: value1  value2  value3
100    100     200     300     |          100     200     300
200    -       -       200     |          100     200     200
300    150     -       300     |          150     200     300
400    -       250     -       |          150     250     300
500    100     200     -       |          100     200     300
Consider the next step: if I want to find the value for key 250, I would want to look for 200 and then read its results.
Is it possible without going through loops again?
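For what it's worth, here is a minimal sketch of how the "closest smaller key" lookup can be done without a full scan, by keeping the keys in a sorted List&lt;int&gt; and using BinarySearch. The class and method names here are my own invention, not from the question:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: keep a sorted List<int> of keys alongside the row
// storage. List<T>.BinarySearch finds the closest smaller key in O(log n),
// so a lookup for 250 lands on 200 without walking the whole list.
class SparseSeries
{
    private readonly List<int> keys = new List<int>();
    private readonly Dictionary<int, int[]> rows = new Dictionary<int, int[]>();

    public void Add(int key, int[] values)
    {
        int pos = keys.BinarySearch(key);
        if (pos < 0) keys.Insert(~pos, key);   // keep the key list sorted
        rows[key] = values;
    }

    // Returns the largest stored key that is <= the requested key.
    public int FloorKey(int key)
    {
        int pos = keys.BinarySearch(key);
        if (pos >= 0) return keys[pos];        // exact hit
        pos = ~pos - 1;                        // element before the insertion point
        if (pos < 0) throw new ArgumentException("no key at or below " + key);
        return keys[pos];
    }

    // The row stored at the floor key.
    public int[] Get(int key) { return rows[FloorKey(key)]; }
}
```

With the keys 100..500 from the table, FloorKey(250) resolves to 200 in a single binary search, which answers the "no loop at read time" part; reconstructing the unchanged columns is a separate problem.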
Current Solution:
I created a class as:
class DataClass
{
    int Value1; int Value2; int Value3;
    int ValueIndex1; int ValueIndex2; int ValueIndex3;
    public DataClass(int v1, int v2, int v3, int i1, int i2, int i3)
    { Value1 = v1; Value2 = v2; Value3 = v3; ValueIndex1 = i1; ValueIndex2 = i2; ValueIndex3 = i3; }
}
Then, for each entry added, I need to run the loop once to find the last index at which each value was set, and store it.
Dictionary<int, DataClass> entries = new Dictionary<int, DataClass>();
entries.Add(100, new DataClass(100, 200, 300, 100, 100, 100));
entries.Add(200, new DataClass(  0, 150,   0, 100, 200, 100));
entries.Add(300, new DataClass(  0,   0, 250, 100, 200, 300));
entries.Add(400, new DataClass(150,   0,   0, 400, 200, 300));
entries.Add(500, new DataClass(  0, 200,   0, 400, 500, 300));
Now, whenever I need to look for 250, I search for the closest smaller key (here, 200) and read its values. Whatever is available there is taken as-is; for the rest, the stored indexes in the same record tell me which entries hold the current values.
This solution works for first-time entries. However, it creates problems when an entry is inserted in between; extra logic then has to be built to fix up the indexes.
Can there be something faster than this?
----------------------------------
modified on Thursday, December 17, 2009 12:58 AM
Have each object maintain a dictionary of values. In the samples you provided above:
Index 1 would contain
1,100
2,200
3,300
Index 2 would contain
3,200
Index 3 would contain
1,150
3,300
Index 4 would contain
2,250
Index 5 would contain
1,100
2,200
Use Google to learn how to use dictionary collections.
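For illustration only (the names are mine, not the poster's), the per-index layout described above maps each index to a small dictionary of column-to-value pairs, and a read walks back to the nearest index that set the column:

```csharp
using System;
using System.Collections.Generic;

// Per-index dictionaries, as listed above: each row stores only the
// columns that changed at that row (column number -> new value).
class DeltaRows
{
    public Dictionary<int, Dictionary<int, int>> Rows =
        new Dictionary<int, Dictionary<int, int>>
    {
        { 1, new Dictionary<int, int> { { 1, 100 }, { 2, 200 }, { 3, 300 } } },
        { 2, new Dictionary<int, int> { { 3, 200 } } },
        { 3, new Dictionary<int, int> { { 1, 150 }, { 3, 300 } } },
        { 4, new Dictionary<int, int> { { 2, 250 } } },
        { 5, new Dictionary<int, int> { { 1, 100 }, { 2, 200 } } },
    };

    // Current value of a column at a given row: walk back to the nearest
    // row that actually set it.
    public int Lookup(int row, int column)
    {
        for (int r = row; r >= 1; r--)
        {
            Dictionary<int, int> cols;
            int value;
            if (Rows.TryGetValue(r, out cols) && cols.TryGetValue(column, out value))
                return value;
        }
        throw new KeyNotFoundException("column never set");
    }
}
```

Note that the backward walk in Lookup is exactly the read-time loop the question wants to avoid; this layout trades that loop for very compact storage.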
.45 ACP - because shooting twice is just silly ----- "Why don't you tie a kerosene-soaked rag around your ankles so the ants won't climb up and eat your candy ass..." - Dale Earnhardt, 1997 ----- "The staggering layers of obscenity in your statement make it a work of art on so many levels." - J. Jystad, 2001
Thanks for the attempt.
I have modified the question to address all the responses; posting the same answer in every reply was not a good idea.
Please give it a read.
Value1, value2 and value3 are distinct entities that you want to track? A dictionary consists of a key and a value: the key would be your index column, and the value could be a struct that holds the three columns;
struct MyValues
{
    public int value1;
    public int value2;
    public int value3;
}

System.Collections.Generic.Dictionary<int, MyValues> myListOfValues =
    new System.Collections.Generic.Dictionary<int, MyValues>();

You could consider the dictionary as your datatable, and the struct as a single row in that table. Now you can add values to your dictionary;

MyValues oneRow = new MyValues();
oneRow.value1 = 100;
oneRow.value2 = 200;
oneRow.value3 = 300;
myListOfValues.Add(1, oneRow);

Now, if you want to recreate those values for index 3, you would have to search the list for the first non-empty value in the column value2.
If speed is important, then I'd keep one row outside of that list, just to mirror the current values. That wouldn't help if you need the values from the middle of the index, though; you'd still have to walk the list for that.
How many records will be in that list? It might just be worth storing copies of the latest values, instead of searching for them.
I are Troll
I modified the question to answer everyone who replied; please read that.
And yes, thanks for the attempt.
Som Shekhar wrote: Can there be something faster than this?
Yup; don't store the empty values, but do the looping on insert. Meaning that when you insert the values (null, 100, null), you loop over the list *at that moment* to resolve the null values and store the full row at that index.
It's either a lookup on insert (fast, consumes more memory) or a lookup when you're reading (slower, less memory-pressure).
AFAIK, you can't have both (yet)
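A rough sketch of the "loop on insert" trade-off described here (all names are made up for illustration): resolve the nulls at insert time by copying from a mirror of the latest row, so a read becomes a single dictionary hit:

```csharp
using System;
using System.Collections.Generic;

// Sketch of lookup-on-insert: null means "unchanged", and Append resolves
// it immediately by copying from the newest stored row. Reads are then a
// plain dictionary lookup, at the cost of storing every column per row.
class FillForwardStore
{
    private readonly SortedDictionary<int, int[]> rows =
        new SortedDictionary<int, int[]>();
    private int[] latest;                      // mirror of the newest full row

    public void Append(int key, int? v1, int? v2, int? v3)
    {
        int[] row =
        {
            v1 ?? (latest != null ? latest[0] : 0),
            v2 ?? (latest != null ? latest[1] : 0),
            v3 ?? (latest != null ? latest[2] : 0),
        };
        rows[key] = row;                       // store the fully resolved row
        latest = row;
    }

    public int[] Get(int key) { return rows[key]; }
}
```

As the thread notes, this only works cleanly when rows are appended at the end; inserting a row in the middle still forces a fix-up pass over the later rows.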
I are Troll
I prefer anything like this to be done at the insert stage, so that's taken.
Whether null values are stored depends on whether it is an object or a list. With a class, the constructor may demand all values, hence the nulls; they can be skipped where possible.
Doesn't an implementation already exist in this format? A very practical example would be time-bound configuration plans: we may only want to save the planned changes, not the full state at every stage.
Som Shekhar wrote: Doesn't any implementation exist in this format?
I don't know of any in the Framework, but then again, I don't know the entire framework by heart yet.
Som Shekhar wrote: We may only want to save the planned changes and not at every stage.
Like I said, memory is cheap these days - and I doubt that you'll save a lot by inserting null values for an unchanged field. These kinds of constructs are very common in SQL Server, where one would use a query to recreate the full list. The query does the lookups (over a speedy index), and you can read it as if it were the original table.
..but in-memory? Short lists should just store the values redundantly, and long lists shouldn't be kept in memory at all.
I are Troll
Eddy Vluggen wrote: Like I said, memory is cheap these days
Memory is not the trouble; speed is.
That figure is for a single such list. When lists like this exist in the hundreds, and each one has to go through a for loop with multiple calculations in between, the calculation lands in the 0.2-0.3 s range. That is not much on its own, but with drag and drop a 0.2-0.3 s lag is not acceptable. I am trying to get into the 0.05-0.06 s range.
A change like that cannot come from tweaking a line or two; it can only come from changing the whole method of searching records.
By the way, none of this even happens in the server/SQL; it all happens in memory, hence the speed trouble.
Som Shekhar wrote: By the way, this whole thing is not even happening in the server/sql. its happening in the memory... hence the speed trouble.
The good news is that memory is usually faster to access than a server
Som Shekhar wrote: each has to be going through a for loop with multiple calculations in between
The only option I'm aware of is to precalculate the missing values. That way you only have a lookup in a list, which is quite fast. So, instead of inserting nulls for the unchanged values, insert the value itself.
That moves the cost from the "read from the list" part to the "add to the list" part of the code.
I are Troll
If you read the modified question, I am storing an index alongside each value, saying where to look next time.
Saving values instead of nulls may cause problems later, when more values are inserted or removed. The whole purpose of the list is to save the value where it changed.
Som Shekhar wrote: Saving values instead of null may have problems later one when more values are inserted or removed. The whole purpose of the list is to save the value where changed.
It sounded like the main issue was to have a list with values that you can query at a high speed. If you save all the values, then you won't have the problems you mentioned when removing an item;

Col1  Col2  Col3
100   200   300
100   200   100
100   100   100

Remove the second line, and all data will still be correct. On the other hand, if you store the null-values (to indicate a non-change), then you might run into trouble;

Col1  Col2  Col3
100   200   300
?     ?     100
?     100   ?

If you remove the second line now, you'll get this;

Col1  Col2  Col3
100   200   300
?     100   ?

Which decodes to this when you try to track back the changes;

Col1  Col2  Col3
100   200   300
100   100   300

As you can see, the last item has changed. I don't see the advantage of storing only the changes in this particular structure, only more challenges
It seems that I don't understand the question well enough to provide you with an answer.
I are Troll
Hey there!!!
I really appreciate your efforts.
But the question is whether I am missing something fundamental. Does an implementation already exist? We have Hashtables, Lists and Dictionaries for various purposes; is there another tool I missed that can handle such a case?
Or maybe there is a need to develop such a list: one that records only changes, does all the calculations internally, and is fast enough to match dictionary/indexed methods.
In any case, thanks once again.
Som Shekhar wrote: I really appreciate your efforts.
Nice to see someone who's biting into a subject, instead of just asking for code
Som Shekhar wrote: But the question is to know if I am missing something fundamentally? If there is an implementation already present? Like we have Hashtables, List, Dictionaries for various purposes. Is there any other tool that I missed which can handle such a case?
Not that I know of. Yes, we have generic lists that can take all kinds of data, and we have observable lists that notify you if anything changes, but no list that specializes in doing an incremental save.
I think most of us would cache the result, storing redundant values. It's a waste of memory, I know, but we often make these kinds of trades. If you have some spare CPU time, then it might make sense to add this optimization: you'd lose a bit of speed reconstructing the data at a particular index, but in return you'd have some extra memory.
The guys who work with Windows Mobile might have more experience with this, as they have fewer resources and actually need to think about using them effectively. On my desktop, I don't mind wasting a megabyte or so if it means I can spend my time on more critical issues.
Som Shekhar wrote: Or, may be there could be a need to develop such a list which could only record changes, do all the calculations internally, fast enough to match those of dictionary/indexed methods.
Again, recalculating the data will (logically) cost more processor time than just reading it. Then again, the time it takes might be negligible, and it may also be true that you win back a fair amount of memory. That would depend on the amount of data, and on the number of 'holes' one has to move through to get the 'last known values' for the columns at that particular index.
At the start of this thread I would have advised against it, on the assumption that there's not much to gain. I'm not so sure anymore. The only way to get a definite answer is to build a prototype and measure the results. Therein lies another consideration: would it be worth spending the time on building such a prototype?
I are Troll
Eddy Vluggen wrote: Nice to see someone who's biting into a subject, instead of just asking for code
Coding is easy; concepts are difficult to grasp. If you know the direction, you can reach anywhere. If you only know the target, god save you.
Eddy Vluggen wrote: It's a waste of memory
Memory is not really an issue. I am building an application for bigger use and hence throwing all kinds of hardware resources at it. I can tell my clients to use better hardware, meaning a fast CPU and a good amount of RAM, so I really don't mind 1-2 MB extra here.
I am already looping to create a lookup-ready dictionary, so that part is covered.
As I mentioned, the trouble comes when many such calculations happen together. I am currently working on multi-threading the different instances, at least to save some more time.
Let me give you a link to another problem I posted; you will see the use of such a datatype there.
http://www.codeproject.com/Messages/3304858/What-will-be-the-height-of-fluid-columns-in-a-vari.aspx[]
In that problem, the fluid height must be calculated. There are multiple fluid columns and many such tubes, with drag and drop functionality.
Usually, working with already-implemented concepts is better: consider using a dictionary vs. a hand-implemented keyed list.
Eddy Vluggen wrote: would it be worth to spend the time on building such a prototype?
You would be surprised: I have come across this situation more than 4-5 times already while designing my applications. I usually work on disconnected database systems, and speed is a primary concern in loading and saving data.
I initially worked with DataTables, which were fine when my application was young; as it grew, the DataTables became damn slow, so I moved to dictionaries. So far they are fine. Even today I see a maximum lag of 0.5-0.6 s on a drag-drop operation, which isn't too much to worry about.
With multi-threading I hope to reduce that to around 0.1-0.2 s, which should be manageable. But it is good to keep up with the concepts.
Usually a parallel solution does wonders, and that's what I was hoping for here.
Som Shekhar wrote: Coding is easy.
I'm looking at a buglist right now which tells me that it's not as easy as English.
Som Shekhar wrote: As I mentioned, The trouble comes when multiple of such calculations happen together. I am currently working on multi-threading of different instances. Atleast to save some more time.
Have you seen the article[^] on the AForge.Parallel.For class? It might help in building a prototype to measure against
Som Shekhar wrote: In this problem, calculation of fluid height is needed. There are multiple fluid columns and many such tubes. With drag and drop functionality
True, but it would also make an impressive interface
Som Shekhar wrote: Usually working with already implemented concepts is always better. Consider using a dictionary vs. implemented List with key.
I'd try to mirror the concept of a database in-memory: creating a list of the records, plus the equivalent of an index. IQueryable[^] springs to mind.
Som Shekhar wrote: You would be surprised that i have come across such situation more than 4-5 times already while designing my applications. I usually work on disconnected database system and speed is a primary concern in loading and saving data.
I initially worked with datatables which worked fine when my application was young. As it grew older, datatables are damn slow. I moved to dictionary. So far, they are fine. Even today, i experience a max lag of 0.5-0.6 sec on a drag drop operation which isn't too much to worry about.
This post[^] confirms that although databases manipulate data very fast, your results are faster.
Som Shekhar wrote: By multi-threading, i hope to reduce it to around 0.1-0.2 which should be manageable. But it is good to keep up with concepts.
Usually a parallel solutions does wonders and thats what I was hoping here.
One could consider multiple ways to optimize, and I'm sure some creative ways would get posted. Using the Parallel.For class to look up all the elements could be a good start.
Another, perhaps better, implementation would be a read-only list describing a table like the one presented earlier. Instead of writing a null, you could launch a short-lived thread to calculate its distance to the youngest version in the list; that distance gives you the index of the value it actually stands for. This should be done when you load the data: you'd spend a bit of time decoding it up front, but that also shortens the time needed to retrieve data from the list. It is an optimization of the read process, since you can forget about searching for a value at read time. Moving this task to the initialization routine makes lookups faster, and the initialization routine could also be (ab)used to dynamically enrich your data, if that were required.
You could then do parallel lookups, each lookup falling back on its PK - perhaps a Hashtable<key, record> (with the record as a struct!). You would then already be pointing at all the correct values, for all the correct columns, without having to worry about corruption, as long as the list is read-only and easily accessible by the threads.
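If I read the idea right, the decode-on-load step could look roughly like this (purely illustrative names; it assumes the first row is complete): each null is replaced, once, by the key of the row that really holds the value, after which the structure is read-only and safe to probe from several threads.

```csharp
using System;
using System.Collections.Generic;

// Sketch of decode-on-load: each null is replaced by a pointer (the key
// of the row that really holds the value). After this single pass the
// result is read-only, so multiple threads can do lookups without locks.
static class DeltaDecoder
{
    // input:  key -> (v1, v2, v3), where null means "unchanged"
    // output: key -> per-column key of the row holding the live value
    // Assumes the first (lowest-key) row has no nulls.
    public static SortedDictionary<int, int[]> BuildPointers(
        SortedDictionary<int, int?[]> deltas)
    {
        var pointers = new SortedDictionary<int, int[]>();
        int[] lastSetAt = null;
        foreach (var entry in deltas)          // iterates in key order
        {
            var ptr = new int[entry.Value.Length];
            for (int col = 0; col < ptr.Length; col++)
                ptr[col] = entry.Value[col].HasValue
                    ? entry.Key                // the value lives here
                    : lastSetAt[col];          // the value lives in an older row
            pointers[entry.Key] = ptr;
            lastSetAt = ptr;
        }
        return pointers;
    }
}
```

A read is then at most two dictionary hits (row, then the row the pointer names), with no backward walk at read time.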
I'm off to bed; this kept going through my head the whole time. I wonder if I'm going to dream about it?
I are Troll
Parallel.For looks promising; I will dig into it.
Since I have already implemented multi-threading, I guess there's no need to implement that for now.
Eddy Vluggen wrote: True, but it would also make an impressive interface
Oh, you bet. These days looks may not be everything, but they are what sells first.
I guess that is it for now; I've got to be happy with multi-threading, since no other implementation already exists in this area.
It was great having some meaningful conversation.
Som Shekhar wrote: Oh you bet. These days, looks may not be everything but that is what sells the first.
Sad, but true.
Som Shekhar wrote: I guess, that is it for now... I gotta be happy with multithreading for now.
You're dividing your workload over multiple CPUs; there's not much room for improvement there. If you get unhappy in the future, try Brahma[^], which gives you the option to offload some work from the CPU to the GPU, abusing the graphics card.
Som Shekhar wrote: It was great having some meaningful conversations.
Yup, engaging in a conversation is simply more fun than posting an answer. Good luck with your venture
I are Troll
Brahma looks interesting!!!
Now that we are talking about multi-threading, why is it that we need to code for it explicitly?
Any operating system that supports multiple processors automatically distributes work across the cores. Can't there be a framework that does the same, without the need to code differently?
If a user sets a process priority to "High" or "Realtime", it only increases the share of processor time given to the current process; there is no change in the threading.
Am I missing something?
Som Shekhar wrote: Am I missing something?
The perversion to launch a new thread from a Visual Basic 6 application, I hope
Som Shekhar wrote: If a user defines a program priority as "High" or "RealTime" it only increases the share of thread time to current process. But no change in threading...
..and "realtime" isn't really realtime, but just the name of the highest level of priority.
Som Shekhar wrote: If you look at any operating system that can support multiple processors, it automatically distributes work onto different cores.
Though it feels that way, it's an illusion. A program is a logical set of commands/instructions that get executed one after another. That's reflected in our applications: we expect that the second instruction won't be executed before the first. A short example;
10 A$ = "Hello"
20 B$ = "World"
30 PRINT A$ + " " + B$

These three lines of code should be considered atomic, meaning that you don't want to distribute them over two different people to interpret; this is a task that can't be divided. One processor had access to its own cache and its memory. Windows came along and started to fake multitasking: an application would run (to line 20 in our example), get thrown into the deep-freeze, another application would be defrosted and run, ad infinitum. Do that very fast, and it seems to become a fluid movement.
Threads were already there; it was preferred to launch your own thread instead of spawning a new process if you needed to do some additional tasks. A thread cost fewer resources and behaved like an additional process, owned by some other (main) thread. Fibers were introduced too, but those never gained popularity.
You wanted this processing to happen in "some other place" than the thread that ran your interface. Every Windows application has a method called "WndProc", which Windows calls now and then to inform your application of mouse movements that have occurred, or of certain parts of the form that need to be repainted. Let's extend our example application;
10 REM Example :)
20 REM
30 WndProc:
40 MSG = GWBASIC_INTEROP.GETMESSAGE()
50 IF (MSG = WM_QUIT) THEN
60 GOTO THE_END
70 END IF
80 IF (MSG = WM_PAINT) THEN
90 GOSUB SAY_HI
100 END IF
110 GOTO WndProc
120 SAY_HI:
130 FOR X = 1 TO 100
140 PRINT "Hello World, number " + X
150 NEXT X
160 RETURN
170 THE_END:
180

There's a loop that processes the messages, and there's the code. This means that if the processor was executing line 140 when the application got frozen, it would be frozen there, in the middle of the job. As a consequence, the application wouldn't accept a "quit" message until it had finished those 100 iterations!
A thread gets frozen along with its state; that's the reason why it's "illegal" to write into memory that another thread is working with, and the reason why the main thread of any application is reserved for handling the UI.
Parallel.For is an abstraction that creates multiple threads (let's take 4 as an example) to run a loop. One of the pre-requirements is that they shouldn't share variables that could mess up the way they work (because one has X=3, one has X=4, and two have X=5). They should also say hello to the main thread before changing any of its values. This model scales to multiple processors.
SQL Express is limited to a single processor, whereas SQL Server goes as far as making processor affinity a mere setting. Some applications still do their processing on the UI thread, easily recognizable by the white space they show where a form should be. It's not a perfect situation, but it's often hard enough already to make an application run correctly with a single path of execution.
There is indeed a growing need for extra tools. The .NET Framework has a BackgroundWorker, which makes it easy to manage a new line of execution, and you'll often find an async version of a method call.
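As a small, self-contained sketch of the BackgroundWorker just mentioned (console-flavoured here; in a WinForms app the RunWorkerCompleted event is marshalled back to the UI thread, so no manual wait is needed):

```csharp
using System;
using System.ComponentModel;
using System.Threading;

// Minimal BackgroundWorker sketch: the heavy loop runs off the calling
// thread, and RunWorkerCompleted hands the result back when it is done.
class Program
{
    // Stand-in for the heavy calculation.
    public static long SumTo(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++) sum += i;
        return sum;
    }

    static void Main()
    {
        using (var done = new ManualResetEvent(false))
        {
            var worker = new BackgroundWorker();
            long result = 0;

            worker.DoWork += (s, e) => { e.Result = SumTo(100); };
            worker.RunWorkerCompleted += (s, e) =>
            {
                result = (long)e.Result;      // receive the result
                done.Set();
            };

            worker.RunWorkerAsync();          // returns immediately
            done.WaitOne();                   // a console app must wait explicitly
            Console.WriteLine("Sum 1..100 = " + result);
        }
    }
}
```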
Som Shekhar wrote: Brahma looks interesting!!!
Sure does - there's a lot of GPU's onboard of the motherboards in the office without being used very much
I are Troll
The topic is long over; I am just loving the conversation
Eddy Vluggen wrote: and "realtime" isn't really realtime, but just the name of the highest level of priority
Yes, I know that. I guess you know that Microsoft has said one shouldn't use "Realtime" in applications, as it will freeze the OS. I wonder whether they created a check to freeze the system if someone used "Realtime", instead of just letting the processor do the job.
Well, you are right about processes and threading; I fully agree when it comes to the concept of threads vs. processes.
My question was a little different. I know that two processes are resource-heavy and that threads do the job quite well.
Let me try to suggest my concept here.
Let's say we have two classes, "Car" and "Bike". Car has its own methods and so does Bike. We create two objects, "Car1" and "Bike1". If these two exist in an application, all internal methods could be handled through a new thread, and thus they will always be thread-safe. Even two objects "Car1" and "Car2" would always be thread-safe.
Instead of the programmer creating such new threads, and completion events for each of the methods, the framework could automatically run them on new threads.
Is it that I have a plan for a new programming language? Am I talking weird?
Som Shekhar wrote: Topic is long over, I am just loving the conversations
Ditto, but we'd better move to the soapbox, or email
Som Shekhar wrote: Let me try to suggest my concept here.
Lets say we got two classes "Car" and "Bike". Car has its own methods and so does the bike. We create two objects "Car1" and "Bike1" If these two are there in an application, all internal methods could be handled through a new thread and thus they will always be thread safe. Even two objects "Car1" and "Car2" will always be thread safe.
Instead of programmer creating such new threads and their completion events for each of the methods, the framework could automatically run them on new threads.
Is it that I have a plan for a new programming language? Am I talking weird?
Not at all; it sounds like a convenient way to distribute the load. One way to do it would be to instantiate a BackgroundWorker and pass the Bike to the RunWorkerAsync[^] method. You'd only need locks where the objects have to share data.
Those objects still need to be 'invoked' from somewhere, and that somewhere is most likely going to be the main thread. It doesn't make much sense to create an async version of every method or class, since threads still cost performance; creating a thread just to change Form.Visible is going to be rather inefficient.
If the Bike is a webserver kind of class, then yes, it's the correct pattern. If the Bike is a DataGridViewColumn, then it might be wiser to keep the code short and simple anyway. If you need a long-running process in such a place, then move it to a BackgroundWorker and have it signal its status.
There are two other interesting places to visit:
- Rx extensions[^], since collections are another example where easy multithreading makes sense.
- This cheatsheet[^], which might provide some valuable tips on optimizing; it made me think twice about progress bars.
I are Troll