
IIRC Integer Linear Programming can solve it optimally, but everyone seems to use heuristic algorithms because solving an ILP problem is too slow for them. But ILP is not an algorithm but a way of specifying a problem. You can solve it with Branch and Bound (I may give it a try and see how slow it would really be).
It's quite annoying to search for good information about job scheduling with full inadvance knowledge  most papers about job scheduling deal with the online version and/or throw in some extra unknowns and restrictions.





harold aptroot wrote: veryone seems to use heuristic algorithms because solving an ILP problem is too slow for them
This is exactly what i found. Further, all algorithms that are created in this area are more of customized or proprietary.
Job Scheduling is a major and vast topic and solving it by usage of rectangles, is breaking down the problem into smaller problem. I hope to see such algorithms...





Time for some code!
This code is clearly crap, I just wrote it quickly in a console app to see whether it would work. It calculates the optimal solution with a simple branch and bound algorithm (without any tricks that can make it faster)  all cases that I tested worked.
For anything above 16 jobs it becomes really really slow. As a quick guestimate, the time complexity is about O(m^j) where m is the number of machines and j is the number of jobs.
static int[] Schedule(int[] jobs, int machines)
{
int[] targets = new int[jobs.Length];
int[] lengths = new int[machines];
_bestFoundScore = int.MaxValue;
_bestSolution = null;
_schedule(jobs, targets, 0, lengths);
return _bestSolution;
}
static int _bestFoundScore = int.MaxValue;
static int[] _bestSolution = null;
static void _schedule(int[] jobs, int[] targets, int index, int[] lengths)
{
if (index == jobs.Length)
{
int max = 0;
for (int j = 0; j < lengths.Length; j++)
if (lengths[j] > max)
max = lengths[j];
if (max >= _bestFoundScore)
return;
_bestFoundScore = max;
int[] res = new int[targets.Length];
Array.Copy(targets, res, res.Length);
_bestSolution = res;
}
else
{
for (int i = 0; i < lengths.Length; i++)
{
targets[index] = i + 1;
lengths[i] += jobs[index];
int max = 0;
for (int j = 0; j < lengths.Length; j++)
if (lengths[j] > max)
max = lengths[j];
if (max < _bestFoundScore)
{
_schedule(jobs, targets, index + 1, lengths);
}
targets[index] = 0;
lengths[i] = jobs[index];
}
}
}
Edit: but I don't think using rectangles will help at all  if anything, it will just make it harder without having any benefits. The problem would still be the same, but you'd have "the width of a rectangle" instead of just "some integer or float" which is really just the same thing.
modified on Sunday, April 18, 2010 8:44 AM





Hey Harold,
Thanks for the code. This piece of code is basically an implementation of nested recursions. Usually, algorithms have been developed to use some shortcuts in order to achieve the optimum.
harold aptroot wrote: For anything above 16 jobs it becomes really really slow
It will get even slower if you add the type of jobs, type of machines, duration of job, efficiencies and duration of machine operation. I wonder what speed would be then.
The idea of rectangles is not to complicate the problem but to be able to see the problem differently.
Consider solving a maze from within the maze or sitting on the top of it. The perspective makes the whole of the difference. And hence the rectangle example. Another benefit is to represent the scheduling graphically and allow manual placement of pieces into a box of available resources.
Lets share in case we come across something. I have been trying to find something useful for a long time. When I saw your comment, I got optimistic
Cheers.





It has one shortcut  as soon as a partial solution becomes worse than the currentlybestfound solution it bails out and starts on the next partial solution.
I tried adding a greedy approximation so it could discard more partial results in the beginning when it hasn't found a good potential solution yet, but on random problem instances with m=3 and j=20 it only helped slightly more than a third of a percentage on average (which is basically nothing)





buddy. we both know that nested recursions will ultimately be slow and they have one inherent problem "SCALING". you can't scale them. They are only good for small problems.





Yea I know, do you know anything better though?
4.5 seconds (in release mode) on average for m=3 j=20, btw (with that naive implementation)





Oh. I totally appreciate the effort. The fact is that you created a working code in a nick of time.
However it is not an algorithm that we both are looking for. Because it cannot be scaled to real life problems. However fast you make it. With every single increase in job or machine, the time will double up.
I work with a range of around 500+ machines and around 1000 jobs to be allocated. And then there are multiple variations and factors to calculate the duration.
May be we can collaborate and work on an algorithm. I am very interested in working on it with a set objective and deliverables. If we succeed, its gonna be a huge success.
What say?
Som





Wow: Looks like my question is akin to smacking a wasp nest with a cricket bat. Glad to see it sparked some convo.
So with what I found that was suggested by Harold and Luc, it seems like I could get away with just a simple priority list and just do the fat jobs first. There are no dependencies (job x has to be done before job y), so there's nothing stopping me from doing the jobs in any order.
But I'm not sold that this is the way to achieve the quickest completion time. I might be wrong but there might be some cases where this wouldn't work. So I'll have to look around a little morefortunately the links and ideas given were a good step toward what I was looking for.





fishing season ain't over yet.





Here is a simple example where biggest job first does not yield the optimum:
two machines; five jobs: 3,3,2,2,2





What will be the typical size of the instances of the problem that you want to solve?
For up to about 20 jobs solving the problem optimally with branch and bound is still doable..
Otherwise, it seems that in practice it is usual to use a genetic algorithm or simulated annealing  both have a tweakable time/quality tradeoff and without too much trickery you can make it so that the result is guaranteed to be no worse than that of the greedy algorithm.





Eh... maybe 50 jobs. And the number of jobs keeps on growing too...





IMO the more jobs the more "biggest job first" gets nearer to the optimal solution.





Luc Pattyn wrote: IMO the more jobs the more "biggest job first" gets nearer to the optimal solution
I vote for "pays most".
The wonderful thing about the Darwin Awards is that everyone wins, especially the members of the audience.





Ok with the code I gave earlier that would take more than a million years
With a larger number of jobs, the greedy algorithm comes closer to the optimal solution, and there is a semigreedy algorithm that is always* at least as good as the simple greedy algorithm but usually better:
1) Let s be the sum of all joblengths and let bound be (s+m1)/m (where m is the number of machines)
2) For each job in descending order (by length), assign it to the first machine where it wouldn't cause the sum of the lengths of the jobs assigned to that machine so far to exceed bound , if there is no such machine mark the job as "non scheduled".
3) If there are no jobs marked "non scheduled", return the schedule.
4) Otherwise, take the length (mlength) of the schedule of the machine with the shortest schedule, and take the length (jlength) of the smallest job.
5) Change the bound to bound+(jlength(boundmlength)) (IOW, make it so that you could add the smallest job to the smallest schedule without exceeding the bound )
6) Make all jobs nonmarked and go back to step 2.
It is, of course, slower than the usual greedy approach, but (usually, at least) very fast compared to the optimal approach.
It always terminates, because jlength(boundmlength) can not be zero. If it were, that job would have been scheduled in step 2, so it wouldn't be the smallest "non scheduled" job. So the bound will grow until everything fits (potentially in a suboptimal way).
* I have no proof, but after a large number of random tests I still had no disproof. But you could always run the usual greedy algorithm as well and take the best result, it takes essentially no time anyway.
edit: I didn't make this up myself, btw.
modified on Monday, April 19, 2010 7:26 PM





An aerosol of solvent is my first choice for a wasps nest.
No Smoking.
No Electric.
etc
Tadeusz Westawic
An ounce of Clever is worth a pound of Experience.





Hello All,
I was just stoppingby to make a rare post and I poked into this thread.
In my mind I think of these kinds of problems as 'least energy state' problems. These very often correlate to conditions in analog setttings. By thinking one's way forthandback through analogdigital conversions, some unnoticed feature of the problem often presents itself. Your analogue thougthtpicture of blocks in a tall wall is an example.
I imagined myself with a large appliance box and spheres of varying diameter representing the jobs. With spheres in the box I shake the box, stop, note the arrangement of spheres, and repeat.
Anyone else with a thought picture?
Tadeusz Westawic
An ounce of Clever is worth a pound of Experience.





Need C# codez urgentzzzzz!!!!!
Not really, I have code. However, I am after some advice: under embedded .NET, is this better implemented under eventdriven code or in a more linear fashion with the dreaded Application.DoEvents() ? An eventdriven implementation looks a bit of a brainf*** IMO...






I have this 3rd party software that spits out a list of numbers formatted into text. They are integers but of varying number of digits. I usually get around 10,000 numbers in one string.
I'm turning all these numbers into an array of integers in C++ but it's taking me over 200 milliseconds.
I was told that LabView can do the same thing in less than a millisecond.
Then a kind CP patron ran an experiment on C# and told me it only takes up to 20 milliseconds.
I'm using CString::Find() to look for the space that separates each number in the string and CString::Mid() to get a piece of string per number and finally atoi() to turn it into integer. I know CString is slow, but I'm a bit surprised at the difference in the time of execution.
What is LabView doing that is so wonderful?
How can I improve my code in C++? Someone said something about not using Mid() to create new string each time but to put a section of char* directly into atoi() . I'm not sure how to do that except for doing things like memcpy in which case I might as well create a new CString as I do now.
Any ideas appreciated!
Almost, but not quite, entirely unlike... me...





One major observation:
 Get a profiler, and profile your code.
One minor observation:
 it is likely the overallocation and deleting of memory of CString::Mid killing your performance.
Two options:
 modify the string in place, replacing spaces with \x0's, and use atoi to convert values.
 parse the digits by hand. Pseudopseudo code:
set n to 0.
Look at next character
Is it a space? store the integer in the result list.
Is it a number. set n = n * 10 + (int)character  48.
Also:
 presize the resulting list to roughly input string length / estimated digit size, if you happen to know the string length up front.





My C# implementation, as posted in the Lounge... takes 1ms or less for 10000 numbers. This shows you how to avoid Mid() and therefore the overhead of creating strings, as well as a custom stringtonumber function to avoid the overhead of a more generalized function.
static void Main(string[] args)
{
StringBuilder sb = new StringBuilder();
int j = 100;
for (int i = 100; i < 101000; i++)
{
if (j > 999) j = 100;
sb.Append((j*10).ToString() + " ");
j++;
}
string s = sb.ToString();
int idx = 0;
int startidx = 0;
int length = s.Length;
List<int> nrs = new List<int>((int)(length*0.25));
var watch = Stopwatch.StartNew();
do
{
if (s[idx] == ' ')
{
if(idx>startidx)
nrs.Add(atoi(s, startidx, idx));
startidx = idx + 1;
}
idx ++;
}
while (idx < length);
Console.WriteLine(watch.ElapsedMilliseconds);
Console.ReadLine();
}
static int atoi(string c, int start, int len) {
int res = 0;
int i = start;
len += start;
while (i<len && c[i] >= '0' && c[i] <= '9')
res = res * 10 + c[i++]  '0';
return res;
}





PaulowniaK wrote: How can I improve my code in C++?
Next time, show us your code so we can answer that.
PaulowniaK wrote: CString::Find()
Depending on the overload, this could be your problem. You searching from the beginning of the string each time?
Those superficial observations aside, the above post (by J. Dun... something) looks like a great answer. Though it would have been nice to see your code for comparison.





This (plain C ) takes 1 ms circa on my system (Dual Core 1.8 GHz).
int get_all_numbers( const char * txt, int txt_size, int * a, int max_count)
{
const char * p0 = txt;
const char * p1 = NULL;
const char * guard = p0+txt_size1;
int count = 0;
while (p1 < guard && count < max_count)
{
a[count++] = (int) strtol(p0, (char**)&p1, 10);
p0 = p1;
}
return count;
}
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler.
 Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong.
 Iain Clarke
[My articles]



