It's an interesting discussion, and to really get into it, you have to define what random means. Every physical system has various numbers attached to it, which are the things you can "measure" about that system. A flying plane, for example, has numbers like "momentum", "position along the z axis", "acceleration", and so on. Furthermore, these values aren't completely independent. If you know the momentum and acceleration at time 0, you can make a really good guess at the momentum at future times (with guesses for nearer times better than for more distant ones).
The implication is that for a physical system, there is a "purely mathematical" part that describes the abstract system, and a random part added on top. This means the physical system is somewhat random. So if the velocity of a falling block is 0 at t = 0 and its acceleration is -g, you can make a solid guess as to the velocity at t = 1.0 seconds -- it won't be random at all. However, if you factor out the mathematical part, what's left is purely random. If, for instance, you expected the velocity at t = 1 to equal -g meters/second, you could sample the actual velocity at t = 1 many times and the samples would come out Gaussian around that value.
In the real world, this particular example has a very strong physical/mathematical component and a very small random component. On the other hand, for various quantum mechanical systems it has been proven that the "physical/mathematical" component is zero, and the answer you get out is completely random.
So I disagree with you that people are a good source of randomness. Ask someone to give you 100,000 random numbers between 0 and 100 and you'll find that the output is very non-random. In my interpretation, this reflects some "laws of the mind" that are totally non-random (the fact that I really "feel" 17 is random and "1" is not would be one example). That said, if you were able to figure out those laws, you could factor them out and get something purely random. A physical law of the brain might say that I feel "17" is so random that I will pick it 10% of the time instead of 1%, but whether in a particular run I pick it more or less often than exactly 10% of the time will be completely random.
So if you want purely random numbers, pick a physical system. Either pick one that has no relationships between its variables -- these are found mostly in quantum mechanical systems, and are really completely random -- or pick one where you know the physics perfectly and factor out the predictable part to get something random.
But there are truly random entities in math. If you're familiar with the halting problem: the probability that a random program halts (a random program being one that is fed a random 1 or 0 every time it needs input) is a number whose digits are completely random (after the first few). This is Chaitin's constant. But there isn't any way to calculate it, and so it can't be used. Practically, using a formula that isn't random, but which you can prove looks random, is the best way to go. (You can actually buy a physical random number generator, but unless you are worried about cryptographic stuff, it won't do any better for you than a software RNG, and if it breaks you might not be able to tell, which isn't a problem with software RNGs.)
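To illustrate the "formula that looks random" idea, here's a toy xorshift-style generator in C# (the shift constants are Marsaglia's; this is just a sketch, not a vetted RNG):

using System;

class XorShift32
{
    uint state; // internal state; must be non-zero

    public XorShift32(uint seed)
    {
        state = (seed == 0) ? 2463534242u : seed;
    }

    // One xorshift step: a purely deterministic formula whose output
    // nevertheless passes many statistical tests for randomness.
    public uint Next()
    {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return state;
    }
}

class Demo
{
    static void Main()
    {
        XorShift32 rng = new XorShift32(12345);
        for (int i = 0; i < 5; i++)
            Console.WriteLine(rng.Next() % 100u); // pseudo-random 0..99
    }
}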
My reply was rather tongue in cheek - as much of a comment on the original post as anything. Clearly there are potentially major problems with using this sort of thing to generate random numbers - predictable diurnal variations to do with time-zones and culture shifts come to mind. That said, if we use a good hash function - the hashes of the posts are probably pretty random.
Nathan Addy wrote: you have to define what random means
I agree.
Nathan Addy wrote: for a physical system, there is a "purely mathematical" part that describes the abstract system, and a random part added on top
I don't know what you mean by this - are you simply invoking the Heisenberg uncertainty principle, the uncertainty in your measurement, or both?
Nathan Addy wrote: If, for instance, you expected the velocity at t = 1 to equal -g meters/second, you could sample the actual velocity at t = 1 many times and the samples would come out Gaussian around that value
Why, and why Gaussian?
Nathan Addy wrote: So if you want purely random numbers, pick a physical system
I agree - systems based on thermal noise or shot noise are, I believe, good examples of this, and seem to be the best types of randomness generators we have. Interesting question - are they truly random, or is it just that our theories at this stage are not good enough?
Nathan Addy wrote: But there are truly random entities in math. If you're familiar with the halting problem: the probability that a random program halts (a random program being one that is fed a random 1 or 0 every time it needs input) is a number whose digits are completely random
It's been a long time since I studied Turing machines - but your example sounds circular - do you really need random data to generate random output?
Nathan Addy wrote: You can actually buy a physical random number generator, but unless you are worried about cryptographic stuff, it won't do any better for you than a software RNG, and if it breaks you might not be able to tell, which isn't a problem with software RNGs
I agree - I have done a reasonable amount of Monte-Carlo simulation and the pseudo-random generators these days are pretty good; just don't use the basic rand() function. If I were a bank generating an RSA key, I'd probably want something better.
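For what it's worth, here is a tiny Monte-Carlo sketch (estimating pi) using .NET's System.Random, which for this kind of simulation work is already a good deal better than a typical C-library rand():

using System;

class MonteCarloPi
{
    static void Main()
    {
        Random rng = new Random(42); // fixed seed for a reproducible run
        int trials = 1000000;
        int hits = 0;

        for (int i = 0; i < trials; i++)
        {
            double x = rng.NextDouble();
            double y = rng.NextDouble();
            if (x * x + y * y <= 1.0)  // point inside the quarter circle?
                hits++;
        }

        // Quarter circle area / unit square area = pi/4.
        Console.WriteLine(4.0 * hits / trials);
    }
}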
Peter
"Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."
I was aware of random.org providing services that generate random numbers based on atmospheric noise. There again, we see that the folks there have (not so surprisingly) tapped into a natural resource for acquiring true random numbers.
The fact is, a machine (a PC, if you will) in its purest sense (i.e. without temperature or other high-tech sensors for measuring some physical phenomenon) cannot generate true random numbers. PRNGs are good - agreed; they serve the purpose well - agreed; they exhibit properties of true random numbers - agreed, to some extent, from a utilitarian point of view. But at the end of the day, THEY ARE NOT TRNGs.
Regarding the use of people for generating random numbers - if we grab 100 people (who are obviously unaware of each other) and ask each of them to select any number between 1 and 100, those numbers will be truly random.
One might argue that there could be a high degree of correlation in the resulting set, but that too is a part of randomness. If 20 of those hundred people choose 50 as their number, why not? We are not following a series/sequence/pattern, so anything that comes through is acceptable, and truly random. So randomness can bring in high degrees of correlation, because true randomness is free from all bias. If I said, "Oh, my random set will not have any numbers close to each other", I'd have just killed the essence of "true randomness" in it.
A lot of people are probably yelling and frowning by now at my use of the word 'acceptable'. Most systems that make use of random numbers for simulation/testing actually require 'disparate' or 'eclectic' numbers for testing/simulating varied scenarios. 'Random' numbers aren't the exact requirement, because the very definition of 'random' in the purest sense is ambiguous (we don't know what God is doing or how he is doing it). Once again, what most systems these days need is a large (usually), well-spaced-out range of data. PRNGs are superb for that, no doubt. But when we consider the behaviour of such systems, i.e. their responses to different sets of data - say, one PRNG-generated set and one TRNG set - and we compare those responses side by side, we test/simulate the system's ability for real.
In an ideal world, the system's responses should be uniform across all sets of data fed into it.
ASP - AJAX is SEXY. PERIOD.
Bulky Fellow wrote: a natural resource for acquiring true random numbers
I don't think that atmospheric noise has any purer randomness than other physical sources of noise; in fact I suspect it is probably worse. It is impossible to receive purely 'atmospheric' noise: what your radio picks up is a combination of atmospheric, man-made, thermal and cosmic noise. If you choose a frequency band that has limited propagation characteristics, you are listening to noise generated close by - potentially a few sources, in particular possibly a few sources of man-made interference - and these are often far from random (car ignition systems etc). If you choose a frequency band that propagates a long way (generally with the assistance of the ionosphere), then you potentially pick up lots of noise from all around the world - thunderstorms in the tropics are potentially received at the poles. Sounds good - except that there are often multiple propagation paths, so you receive the signal plus multiple echoes (built-in correlations), and possibly also the signal that goes the long way around the earth! These frequency bands also abound with man-made propaganda transmissions (Voice of America, Radio Peking etc), which are depressingly predictable. And what about the dependence on the local weather - say you are getting regular afternoon thunderstorms?
I guess that combining it with some other form of randomness - sampling the signal at times driven by a PRBS, or adding some sort of PRBS interleaving - may remove some of the problems.
Based on my current understanding of physics, I think I'd prefer to use thermal noise from a resistor to the type of atmospheric noise receiver described in the article. Much simpler to build, but probably doesn't have the same marketing flair.
Peter
"Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."
The background is terrible - I gave up reading after a few lines.
Peter
"Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."
Thanks for the comment.
Try kicking up the text size (Ctrl and '+' in Mozilla). I will have to try reading my stuff in IE on a display set well below the 1600x1200 I use on my machine. I use Ctrl-'+' with many sites. High res, small display, 65-year-old eyeballs....
I use IE with 1280 x 1024
Peter
"Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."
Yes, any food bill will have a random total, independent of the items or the cost of each item. Bistromathics is its official name.
In the classic producer/consumer problem using a bounded buffer, is it typical for the producer to signal the consumer immediately after producing an item and adding it to the buffer? Or does the producer wait until the buffer has become full, i.e. after producing an item it checks whether the buffer is full and, only if so, signals the consumer?
I'm trying to wrap my head around this pattern. My multithreaded programming book[^] illustrates the producer/consumer architecture with a picture of a conveyor belt. At one end is the producer putting items on the belt. At the other end is a consumer taking items off the belt.
However, looking at the accompanying code, it appears that the producer signals the consumer after each item is produced (I may be misreading the code). If so, it seems to me that there would be at most one item on the belt [edit] assuming that they work at the same speed [/edit].
I'm basically trying to figure out how there can be multiple objects in the buffer. A typical scenario, I think, would be that the producer places an item in the buffer and signals the consumer. The consumer removes the item from the buffer and, erm, consumes it.
It releases the buffer immediately after removing the item and signals the producer so that the producer can add more items. While the consumer is consuming the item, the producer, if it is faster than the consumer, can add several items to the buffer. Either the buffer fills up and the producer has to wait, or the consumer finishes consuming the item and removes another from the buffer.
Is this correct?
The above is how my DelegateQueue[^] works, except that it uses an unbounded buffer, which is a little dangerous, I know.
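To make the scenario concrete, here is a minimal sketch of the kind of bounded buffer I have in mind, using Monitor.Wait/Pulse (the class and member names are invented for illustration):

using System.Collections;
using System.Threading;

// Bounded buffer sketch: the producer signals after every Enqueue
// (not only when the buffer fills up), and blocks only when there
// is no room; the consumer blocks only when the buffer is empty.
class BoundedBuffer
{
    private Queue queue = new Queue();
    private int capacity;

    public BoundedBuffer(int capacity)
    {
        this.capacity = capacity;
    }

    public void Enqueue(object item)
    {
        lock (queue)
        {
            while (queue.Count == capacity)
                Monitor.Wait(queue);   // full: producer waits

            queue.Enqueue(item);
            Monitor.PulseAll(queue);   // wake any waiting consumer
        }
    }

    public object Dequeue()
    {
        lock (queue)
        {
            while (queue.Count == 0)
                Monitor.Wait(queue);   // empty: consumer waits

            object item = queue.Dequeue();
            Monitor.PulseAll(queue);   // wake any waiting producer
            return item;
        }
    }
}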
Hi,
AFAIK the producer/consumer pattern aims at maximum decoupling of producer and consumer; that means each of them should be given maximum opportunity to work. Hence, the producer should signal its consumer(s) as soon as something becomes available (= queue not empty), and the consumer should signal its producer(s) when more could be produced (= queue not full).
One interesting implementation has two queues: an "idle" one with empty buckets, and a "work" one with full buckets. A producer takes an empty bucket from the idle queue, fills it, and puts it in the work queue; a consumer takes a full bucket from the work queue, executes it, and returns the empty bucket to the idle queue. If both queues have equal capacity, this results in a symmetric situation where a producer gets triggered by "idle queue no longer empty", and a consumer gets triggered by "work queue no longer empty".
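In code, the two-queue scheme might look roughly like this (a bare sketch; the BlockingQueue here is an invented stand-in for any queue whose Put/Take block when full/empty):

using System;
using System.Collections;
using System.Threading;

// Blocking queue whose Put blocks when full and Take blocks when empty.
class BlockingQueue
{
    private Queue queue = new Queue();
    private int capacity;

    public BlockingQueue(int capacity) { this.capacity = capacity; }

    public void Put(object item)
    {
        lock (queue)
        {
            while (queue.Count == capacity) Monitor.Wait(queue);
            queue.Enqueue(item);
            Monitor.PulseAll(queue);
        }
    }

    public object Take()
    {
        lock (queue)
        {
            while (queue.Count == 0) Monitor.Wait(queue);
            object item = queue.Dequeue();
            Monitor.PulseAll(queue);
            return item;
        }
    }
}

class TwoQueueDemo
{
    const int Capacity = 4;
    static BlockingQueue idle = new BlockingQueue(Capacity); // empty buckets
    static BlockingQueue work = new BlockingQueue(Capacity); // full buckets

    static void Producer()
    {
        for (int i = 0; i < 10; i++)
        {
            ArrayList bucket = (ArrayList)idle.Take(); // blocks until a bucket is free
            bucket.Add(i);                             // fill it
            work.Put(bucket);                          // hand it to the consumer
        }
    }

    static void Consumer()
    {
        for (int i = 0; i < 10; i++)
        {
            ArrayList bucket = (ArrayList)work.Take(); // blocks until work arrives
            Console.WriteLine(bucket[0]);              // "execute" the bucket
            bucket.Clear();
            idle.Put(bucket);                          // recycle the empty bucket
        }
    }

    static void Main()
    {
        for (int i = 0; i < Capacity; i++) idle.Put(new ArrayList());

        Thread p = new Thread(new ThreadStart(Producer));
        Thread c = new Thread(new ThreadStart(Consumer));
        p.Start(); c.Start();
        p.Join(); c.Join();
    }
}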
Hope this helps.
Luc Pattyn wrote: One interesting implementation has two queues: an "idle" one with empty buckets, and a "work" one with full buckets. A producer takes an empty bucket from the idle queue, fills it, and puts it in the work queue; a consumer takes a full bucket from the work queue, executes it, and returns the empty bucket to the idle queue. If both queues have equal capacity, this results in a symmetric situation where a producer gets triggered by "idle queue no longer empty", and a consumer gets triggered by "work queue no longer empty".
Very interesting. I had not heard of this approach before.
Luc Pattyn wrote: Hope this helps.
All of your replies to my posts have been helpful, thank you.
Well, this seems like a basic data structure question that most experienced programmers would know the answer to, but I've been kinda going in circles on it.
If you look at the members of the IList[^] interface, you get the impression that some of these operations would best be implemented using a linked list and others an array. It's as if the interface wants to have its cake and eat it, too.
The ArrayList[^] class uses a resizeable array to implement the interface.
Arrays are great for accessing elements by index in O(1) time, while linked lists take O(n). However, insertions into the middle of an array force you to shift all subsequent items down the array. And if you run out of room, the entire array has to be copied into a larger one. Linked lists don't suffer from these problems.
Seems to me that index access is the only advantage arrays have in this context over linked lists. I've been trying to think of a way to combine the two data structures to get the advantages of both. A binary tree or skip list with index access algorithms seems to be the closest I can get.
Leslie Sanford wrote: If you look at the members of the IList[^] interface, you get the impression that some of these operations would best be implemented using a linked list and others an array. It's as if the interface wants to have its cake and eat it, too
Well, that often happens: an interface hides the possible implementations; it is user oriented rather than implementation oriented. So it offers what the user may need, and depending on the circumstances you choose a different implementation (e.g. a real linked list if insertion becomes predominant, a simple array/arraylist if items mainly get replaced/added and/or order is not important). If there were one type of list that made everyone happy, then arrays, arraylists (and the whole IList interface) would have been obsolete for a long time...
Leslie Sanford wrote: I've been trying to think of a way to combine the two data structures to get the advantages of both.
Hashtable. Of course, it has its own disadvantages.
----
It appears that everybody is under the impression that I approve of the documentation. You probably also blame Ken Burns for supporting slavery.
--Raymond Chen on MSDN
Shog9 wrote: Hashtable. Of course, it has its own disadvantages.
I was thinking hashtable in conjunction with a linked list might come close to giving the best of both worlds. If you need in order traversal, you iterate through the linked list. If you need to remove an item, you look it up in the hashtable first (the items serve as keys while the linked list nodes are the values for each entry into the hashtable), get the linked list node, and remove it from the list. This saves having to traverse the list to find the node that needs to be removed.
This has the problem of duplicating the list in the hashtable, which may not be reasonable if the list is large and memory use is a concern. But more important for me: I can't think of how the hashtable/linked-list combo gives you O(1) index access the way an array does. In fact, I can't think of how to access the items in the hashtable by index at all.
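For what it's worth, the O(1)-removal half of that combo might look something like this sketch (class and member names invented; note it still leaves the indexed-access problem unsolved):

using System.Collections;

// Sketch: a doubly linked list plus a hashtable mapping each item to
// its node, giving O(1) Remove(item) with no traversal - at the cost
// of storing every item a second time as a hashtable key.
class IndexedLinkedList
{
    class Node
    {
        public object Value;
        public Node Prev, Next;
    }

    private Node head, tail;
    private Hashtable nodeOf = new Hashtable();

    public void Add(object item) // append; assumes items are unique
    {
        Node node = new Node();
        node.Value = item;
        node.Prev = tail;
        if (tail != null) tail.Next = node; else head = node;
        tail = node;
        nodeOf[item] = node;
    }

    public bool Remove(object item)
    {
        Node node = (Node)nodeOf[item];
        if (node == null) return false;

        // Unlink the node in O(1) - no list traversal needed.
        if (node.Prev != null) node.Prev.Next = node.Next; else head = node.Next;
        if (node.Next != null) node.Next.Prev = node.Prev; else tail = node.Prev;
        nodeOf.Remove(item);
        return true;
    }

    public ArrayList ToArrayList() // in-order traversal via the list
    {
        ArrayList result = new ArrayList();
        for (Node n = head; n != null; n = n.Next) result.Add(n.Value);
        return result;
    }
}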
Leslie Sanford wrote: In fact, I can't think of how to access the items in the hashtable by index at all.
A perfect hash would give you O(1) - the hash would be guaranteed to correspond to only one entry in the table. Whether or not a perfect hash is feasible depends a lot on what you're using for keys, as well as how much time is acceptable for calculating the hash.
But i was thinking more along the lines of chained hash tables[^], which are essentially an array of linked lists - this is the technique used by MFC's various CMap classes. Their performance can be very consistent but iteration tends to be rather poor unless table size and hash functions are chosen carefully.
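A chained hash table is simple enough to sketch in a few lines (ArrayLists stand in for the linked-list chains here, the bucket count is fixed for brevity, and the names are invented):

using System.Collections;

// Minimal chained hash table: an array of chains indexed by
// the key's hash modulo the table size.
class ChainedHashTable
{
    private ArrayList[] buckets;

    public ChainedHashTable(int size)
    {
        buckets = new ArrayList[size];
        for (int i = 0; i < size; i++)
            buckets[i] = new ArrayList();
    }

    private ArrayList BucketFor(object key)
    {
        int h = key.GetHashCode() & 0x7FFFFFFF; // force non-negative
        return buckets[h % buckets.Length];
    }

    public void Set(object key, object value)
    {
        ArrayList bucket = BucketFor(key);
        for (int i = 0; i < bucket.Count; i++)
        {
            DictionaryEntry e = (DictionaryEntry)bucket[i];
            if (e.Key.Equals(key))
            {
                bucket[i] = new DictionaryEntry(key, value); // replace existing
                return;
            }
        }
        bucket.Add(new DictionaryEntry(key, value)); // chain on collision
    }

    public object Get(object key) // O(1) on average, O(n) worst case
    {
        foreach (DictionaryEntry e in BucketFor(key))
            if (e.Key.Equals(key)) return e.Value;
        return null;
    }
}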
----
It appears that everybody is under the impression that I approve of the documentation. You probably also blame Ken Burns for supporting slavery.
--Raymond Chen on MSDN
Shog9 wrote: A perfect hash would give you O(1) - the hash would be guaranteed to correspond to only one entry in the table. Whether or not a perfect hash is feasible depends a lot on what you're using for keys, as well as how much time is acceptable for calculating the hash.
Understood. But just to be clear, what I meant by not knowing how to access items in a hashtable by index was that I don't know how you could treat a hashtable as an array. Say you're using keys of a certain type. To get the value associated with a key in a hashtable, you do this:
Value someValue = hashTable[someKey];
However, say we want the 10th item added to the hashtable. I don't know how we do this:
Value someValue = hashTable[9];
We can't treat the index itself as a key because as items are removed or inserted somewhere at the beginning or middle, we'd have to update all of the subsequent indexes; we're back where we started from.
Shog9 wrote: But i was thinking more along the lines of chained hash tables[^], which are essentially an array of linked lists - this is the technique used by MFC's various CMap classes. Their performance can be very consistent but iteration tends to be rather poor unless table size and hash functions are chosen carefully.
I could see how this would work for simulating an array. Have a linked list in which each node contains another linked list. Limit the number of items in each inner list to some arbitrary number like 16. Then if you need to reach the nth item, you can skip over a large number of items at a time. (I gather this is roughly what's called an unrolled linked list.) I can see problems with this as items are deleted; you'd wind up with some nodes not having very many items in their lists. At any rate, the more I think about this, the more I get the feeling that I'm just reinventing the tree, and my brain starts to short-circuit.
Leslie Sanford wrote: But just to be clear, what I meant by not knowing how to access items in a hashtable by index was that I don't know how you could treat a hashtable as an array.
Of course... i didn't read carefully.
I suppose you could still manage this with a hash table, provided the data was known ahead of time... but that's hardly a general-purpose solution. Realistically, i'd probably go with an array + some sort of associative container (hashtable or tree) and just rebuild the array after adding or removing items. But it'd depend a lot on the situation; if constant-time lookups weren't absolutely essential, i might just fall back on iterating to the nth item; in the case of a sorted hash table, this could be optimized by storing the number of items in each bucket (or even an index offset), reducing the operation to iterating to the correct bucket then through that list to the correct item. Honestly can't say that this has been a requirement for anything i've actually done though.
----
It appears that everybody is under the impression that I approve of the documentation. You probably also blame Ken Burns for supporting slavery.
--Raymond Chen on MSDN
Maybe there is scope for improving things with threading; as time goes on we will have more and more parallel processing available. So why not a list-type structure with an associated indexing array (simply of pointers) which updates in a secondary thread? This would be completely transparent to the user, allowing indexed access in O(1) time; the primary thread would only block if the user tries to access the array by index while the indexing array is being updated. There are clearly some threading issues to be resolved, but it should be possible to come up with a simple, clean system. There are probably other types of structures that could benefit from some behind-the-scenes processing - lists that keep themselves sorted come to mind. Maybe we should patent this - someone probably has already.
Peter
"Until the invention of the computer, the machine gun was the device that enabled humans to make the most mistakes in the smallest amount of time."
I wrote a small class deriving from CollectionBase that uses a hashtable to map keys to indexes into the list, so items can be accessed by key.
The benefit is O(1) average keyed access coupled with an array-like list that can be bound to a combo, list, or grid.
This statement was never false.
using System.Collections;

// List + hashtable hybrid: items live in the CollectionBase list
// (for indexed access and data binding), while the hashtable maps
// each key to its current index in the list.
public class HybridList : CollectionBase
{
    private Hashtable _indicesTable = new Hashtable();

    public int IndexOf(object key)
    {
        if (_indicesTable.ContainsKey(key))
            return (int)_indicesTable[key];
        else
            return -1;
    }

    public int Add(object key, object val)
    {
        int index = List.Add(new DictionaryEntry(key, val));
        _indicesTable[key] = index;
        return index;
    }

    public void Remove(object key)
    {
        if (_indicesTable.ContainsKey(key))
        {
            List.RemoveAt((int)_indicesTable[key]);
            _indicesTable.Remove(key);

            // Removal shifts every later item down one slot,
            // so all stored indices must be rebuilt: O(n).
            int i = 0;
            foreach (DictionaryEntry entry in List)
                _indicesTable[entry.Key] = i++;
        }
    }

    // Indexed access, as with an array.
    public DictionaryEntry this[int index]
    {
        get { return (DictionaryEntry)List[index]; }
    }

    // Keyed access via the hashtable.
    public object this[object key]
    {
        get
        {
            if (_indicesTable.ContainsKey(key))
                return ((DictionaryEntry)List[(int)_indicesTable[key]]).Value;
            else
                return null;
        }
    }

    public ICollection AllKeys
    {
        get { return _indicesTable.Keys; }
    }

    protected override void OnClear()
    {
        base.OnClear();
        _indicesTable.Clear();
    }
}
This statement was never false.
Thanks for the source code. The Remove operation makes me a little nervous in that it appears to be an O(n) operation, and that's what I'm trying to get away from. However, if items are rarely removed from the collection, then this looks like an interesting approach.
Yeah, it's definitely not perfect. That remove operation was a hasty addition too, but I rarely use it. Having the indices in the table, and having them change upon removal of one, demands it. It might even be worse than O(n) - maybe even O(n^2).
This statement was never false.