|
Persisting tirelessly to be fit - and agile, perhaps? (13)
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
AntiTwitter: @DalekDave is now a follower!
|
|
|
|
|
indefatigable ( not sure about the spelling )
Edit I just checked and amended it
"We can't stop here - this is bat country" - Hunter S Thompson - RIP
|
|
|
|
|
You are up tomorrow.
Persisting tirelessly
be fit - and agile BEFITANDAGILE
perhaps? (anag)
INDEFATIGABLE
(Your spelling is fine)
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
AntiTwitter: @DalekDave is now a follower!
|
|
|
|
|
OriginalGriff wrote: INDEFATIGABLE
Just Googled it and apparently we had a ship named that... seems like a strange choice of name for a ship though?
I wonder if the plan was to distract the enemy while they worry about how to try and pronounce it...
Radar-boy: "Sir, I'm picking up the HMS In... In... Ind... Ind... Inde..."
*10 minutes later*
"...Inde... Indefatigable on our scanners!"
Captain: "I know, boy... it sunk us 5 minutes ago..."
|
|
|
|
|
It's a very Victorian name for a ship, isn't it? "An Implacable-class aircraft carrier" is rather Victorian as well, if you know what I mean - kinda makes you wonder how we actually won WWII sometimes ...
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
AntiTwitter: @DalekDave is now a follower!
|
|
|
|
|
kinda makes you wonder how we actually won WWII sometimes
Did we though?
"We can't stop here - this is bat country" - Hunter S Thompson - RIP
|
|
|
|
|
L0001: jmp L0002, L0010, L0021, L0029, L0040, L0050, L0059, L0070, L0082, L0092, L0101, L0109, L0115, L0121, L0133, L0142, L0149, L0157, L0167, L0174, L0185, L0190, L0198, L0209, L0213, L0219, L0223, L0233, L0241, L0247, L0256, L0264, L0272, L0278, L0285, L0292, L0297, L0302, L0306, L0311, L0315, L0319, L0323, L0327, L0331, L0335, L0339, L0343, L0347, L0352, L0357, L0361, L0366, L0371, L0375, L0380, L0384, L0389, L0393, L0398, L0402, L0407, L0412, L0416, L0421, L0426, L0430, L0435, L0440, L0444, L0449, L0453, L0458, L0492, L0499, L0506, L0513, L0520, L0527, L0537, L0546, L0554, L0561, L0568, L0574, L0583, L0591, L0598, L0606, L0616, L0625, L0633, L0640, L0647, L0656, L0665, L0671, L0681, L0690, L0696, L0702, L0708, L0716, L0728, L0753, L0768, L0773
Each JMP operand spawns a fiber (basically a thread). I haven't counted how many are spawned here, but 70-80 or so?
I think maybe this code is a bit heavy handed. This is just to match a (quite complicated) regular expression
Real programmers use butterflies
|
|
|
|
|
honey the codewitch wrote: Each JMP operand spawns a fiber (basically a thread). A light-weight form of a thread.
Given the amount of threads that chrome spawns, I wouldn't mind a bit if there's a few fibers running. Do you have a test-project to test the performance?
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
|
|
|
|
Yes I do and the performance is god awful. This is for a compound regular expression rather than a web browser, so this is more than a little excessive. Normally the machine will spawn like 2 or 3 while it's doing normal character scans, but when it has to split it quickly grows.
The reason it spawns more than one is disjunctions in the regex, like foo|bar - it spawns a fiber to scan each one. In truth it spawns slightly more than 1 fiber on average because save points spawn a fiber. Plus each fiber only lives for the duration of one character.
Real programmers use butterflies
|
|
|
|
|
honey the codewitch wrote: Yes I do and the performance is god awful. I agree, and that's why I run FireFox
honey the codewitch wrote: Plus each fiber only lives for the duration of one character. So, light weight threads that are short-lived?
How would it compare to a threadpool, cutting back on creation cost and feeding the threads as they become available?
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
"If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.
|
|
|
|
|
They're already allocated since they're simple structs sitting inside an array. The only field that gets set are two simple 32 bit fields on the struct =) Since they're allocated this way, at least unless .NET sucks in this arena (i haven't checked the IL) they don't need to be recycled - they're permanent instances.
Furthermore, the fibers get used at maximum - they are never idle, ergo, a threadpool won't benefit me.
Real programmers use butterflies
|
|
|
|
|
Eddy Vluggen wrote: So, light weight threads that are short-lived?
Kinda. Their primary purpose is non-preemptive/cooperative multitasking instead of preemptive multitasking like threads. The best analogy I've seen is co-routines.
|
|
|
|
|
Yep. That's about the long and short of it.
private struct _Fiber
{
public readonly int[][] Program;
public readonly int Index;
public int[] Saved;
public _Fiber(int[][] program, int index,int[] saved)
{
Program = program;
Index = index;
Saved = saved;
}
public _Fiber(_Fiber fiber, int index,int[] saved)
{
Program = fiber.Program;
Index = index;
Saved = saved;
}
}
All it contains is a pointer to the program array which all fibers share, the current instruction pointer, and any saved cursor position (only used in the event of the "save" instruction)
Creating them is cheap since I just use a straight array to hold them all and it basically never gets resized, so all of them are already "live" just waiting to have their fields filled in.
Real programmers use butterflies
|
|
|
|
|
honey the codewitch wrote: the performance is god awful Ah yes, the rarely appropriate Thread Per Request pattern. Almost always better is a work queue served by a single thread, or a pool if blocking is an issue. Threads eat up memory, add context switching overhead, and introduce critical regions. I recently discovered how often Windows schedules a new thread, and I'm still flabbergasted.
Fibers, being lighter weight, shouldn't be as bad, but evidently it's still plenty bad.
honey the codewitch wrote: each fiber only lives for the duration of one character
|
|
|
|
|
The fibers are already allocated since they're simple structs sitting inside an array. The only field that gets set are two simple 32 bit fields on the struct =) Since they're allocated this way, at least unless .NET sucks in this arena (i haven't checked the IL) they don't need to be recycled - they're permanent instances.
So a threadpool doesn't buy me anything. These aren't traditional threads.
Real programmers use butterflies
|
|
|
|
|
These sound like really lightweight fibers, so .NET must suck at handling them.
|
|
|
|
|
No, the issue is most fibers resolve to examination of a single character in the input so if you have 10 of them the same character gets examined as much as 10 times.
This is a byproduct of the design of a Pike VM, itself an artifact of the way NFA expressions work so there's very little to be done about it except convert to a DFA (the optimization process)
Reduce the fibers and it speeds right up:
NFA ran with 10 max fibers and 3.5 average char passes
NFA+DFA (optimized) ran with 6 max fibers and 2.5 average char passes
DFA ran with 2.5 max fibers and 1 average char passes
Pass #1
NFA: Lexed in 1.575287 msec
NFA+DFA (optimized): Lexed in 1.054843 msec
DFA: Lexed in 0.901254 msec
Pass #2
NFA: Lexed in 1.529819 msec
NFA+DFA (optimized): Lexed in 1.100836 msec
DFA: Lexed in 0.830835 msec
Pass #3
NFA: Lexed in 1.523334 msec
NFA+DFA (optimized): Lexed in 1.049213 msec
DFA: Lexed in 0.851737 msec
Pass #4
NFA: Lexed in 1.400265 msec
NFA+DFA (optimized): Lexed in 1.03485 msec
DFA: Lexed in 0.829009 msec
Real programmers use butterflies
|
|
|
|
|
If I have understood it correctly: yield uses fibers, foreach uses yield.
So try to swap a few well chosen foreach loops for classic for loops and see what happens.
|
|
|
|
|
I'm not using any foreach loops. I've already optimized the VM itself to within an inch of its life
Real programmers use butterflies
|
|
|
|
|
|
I thought the goal was to speed this up?
Real programmers use butterflies
|
|
|
|
|
I didn't tell you to use it, I'm just looking for problems.
|
|
|
|
|
Look away. Here's almost all of it. The stuff you don't see is very thin
public static int Run(int[][] prog,LexContext input)
{
input.EnsureStarted();
int i,match=-1;
_Fiber[] currentFibers, nextFibers, tmp;
int currentFiberCount=0, nextFiberCount=0;
int[] pc;
int sp=0;
var sb = new StringBuilder(64);
int[] saved, matched;
saved = new int[2];
currentFibers = new _Fiber[prog.Length];
nextFibers = new _Fiber[prog.Length];
_EnqueueFiber(ref currentFiberCount, ref currentFibers, new _Fiber(prog,0, saved), 0);
matched = null;
var cur = -1;
if (LexContext.EndOfInput != input.Current)
{
var ch1 = unchecked((char)input.Current);
if (char.IsHighSurrogate(ch1))
{
if (-1 == input.Advance())
throw new ExpectingException("Expecting low surrogate in unicode stream. The input source is corrupt or not valid Unicode", input.Line, input.Column, input.Position, input.FileOrUrl);
var ch2 = unchecked((char)input.Current);
cur = char.ConvertToUtf32(ch1, ch2);
}
else
cur = ch1;
}
while(0<currentFiberCount)
{
bool passed = false;
for (i = 0; i < currentFiberCount; ++i)
{
var t = currentFibers[i];
pc = t.Program[t.Index];
saved = t.Saved;
switch (pc[0])
{
case Compiler.Switch:
var idx = 1;
while(idx<pc.Length && -2<pc[idx])
{
if (_InRanges(pc, ref idx, cur))
{
while (-1!=pc[idx])
++idx;
++idx;
passed = true;
_EnqueueFiber(ref nextFiberCount, ref nextFibers, new _Fiber(t, pc[idx], saved), sp + 1);
idx = pc.Length;
break;
}
else
{
while (-1!=pc[idx])
++idx;
++idx;
}
++idx;
}
if(idx<pc.Length&&-2==pc[idx])
{
++idx;
while(idx<pc.Length)
{
_EnqueueFiber(ref currentFiberCount, ref currentFibers, new _Fiber(t, pc[idx], saved), sp);
++idx;
}
}
break;
case Compiler.Char:
if (cur!= pc[1])
{
break;
}
goto case Compiler.Any;
case Compiler.Set:
idx = 1;
if (!_InRanges(pc,ref idx, cur))
{
break;
}
goto case Compiler.Any;
case Compiler.NSet:
idx = 1;
if (_InRanges(pc, ref idx,cur))
{
break;
}
goto case Compiler.Any;
case Compiler.UCode:
var str = char.ConvertFromUtf32(cur);
if (unchecked((int)char.GetUnicodeCategory(str,0) != pc[1]))
{
break;
}
goto case Compiler.Any;
case Compiler.NUCode:
str = char.ConvertFromUtf32(cur);
if (unchecked((int)char.GetUnicodeCategory(str,0)) == pc[1])
{
break;
}
goto case Compiler.Any;
case Compiler.Any:
if (LexContext.EndOfInput==input.Current)
{
break;
}
passed = true;
_EnqueueFiber(ref nextFiberCount, ref nextFibers, new _Fiber(t, t.Index+1, saved), sp+1);
break;
case Compiler.Match:
matched = saved;
match = pc[1];
i = currentFiberCount;
break;
}
}
if (passed)
{
sb.Append(char.ConvertFromUtf32(cur));
input.Advance();
if (LexContext.EndOfInput != input.Current)
{
var ch1 = unchecked((char)input.Current);
if (char.IsHighSurrogate(ch1))
{
input.Advance();
if (-1 == input.Advance())
throw new ExpectingException("Expecting low surrogate in unicode stream. The input source is corrupt or not valid Unicode", input.Line, input.Column, input.Position, input.FileOrUrl);
++sp;
var ch2 = unchecked((char)input.Current);
cur = char.ConvertToUtf32(ch1, ch2);
}
else
cur = ch1;
}
else
cur = -1;
++sp;
}
tmp = currentFibers;
currentFibers = nextFibers;
nextFibers = tmp;
currentFiberCount = nextFiberCount;
nextFiberCount = 0;
}
if (null!=matched)
{
var start = matched[0];
var len = matched[1];
input.CaptureBuffer.Append(sb.ToString(start, len - start));
return match;
};
return -1;
}
static void _EnqueueFiber(ref int lcount,ref _Fiber[] l, _Fiber t, int sp)
{
if(l.Length<=lcount)
{
var newarr = new _Fiber[l.Length * 2];
Array.Copy(l, 0, newarr, 0, l.Length);
l = newarr;
}
l[lcount] = t;
++lcount;
var pc = t.Program[t.Index];
switch (pc[0])
{
case Compiler.Jmp:
for (var j = 1; j < pc.Length; j++)
_EnqueueFiber(ref lcount,ref l, new _Fiber(t.Program, pc[j],t.Saved),sp);
break;
case Compiler.Save:
var slot = pc[1];
var max = slot > t.Saved.Length ? slot : t.Saved.Length;
var saved = new int[max];
for (var i = 0;i<t.Saved.Length;++i)
saved[i]=t.Saved[i];
saved[slot] = sp;
_EnqueueFiber(ref lcount,ref l, new _Fiber(t,t.Index+1, saved), sp);
break;
}
}
private struct _Fiber
{
public readonly int[][] Program;
public readonly int Index;
public int[] Saved;
public _Fiber(int[][] program, int index,int[] saved)
{
Program = program;
Index = index;
Saved = saved;
}
public _Fiber(_Fiber fiber, int index,int[] saved)
{
Program = fiber.Program;
Index = index;
Saved = saved;
}
}
Real programmers use butterflies
|
|
|
|
|
Oh, no wonder then, you're doing it on purpose...
|
|
|
|
|
Doing what on purpose? I'm a little slow this morning.
Real programmers use butterflies
|
|
|
|
|