|
I was tempted to say that but thought I would give OP the benefit of the doubt.
|
|
|
|
|
Yup, image recognition is still half good DSP and half black magic.
|
|
|
|
|
Well, I wanted other views from the CodeProject community on the limitations of computer/machine vision algorithms. I have been researching current developments in vision systems for two years now, and I have been iteratively refining my design based on new and promising vision heuristics.
|
|
|
|
|
Bernhard Hiller wrote: Write a software which will detect that tree in a bitmap. And then, in a picture
of the same tree taken from a different place, and recognize that that's the
same tree...
It was found that neurons called view-tuned units exist in animal/human brains. Each one encodes only a single view of a given object (in this case a tree), and these feed into a view-invariant unit. The principal design criterion for my vision system is based on that same principle, but the secret is to encode those views with a time- and space- (memory-) efficient algorithm, similar to the one by
S. Hinterstoisser, V. Lepetit, S. Ilic, P. Fua, and N. Navab, “Dominant orientation templates for real-time detection of texture-less objects,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
They encoded different views of the same object in a very compact and efficient way. Their method is aimed at texture-less objects, but it stays efficient even for a very large database of objects.
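To make the "view-tuned units feeding a view-invariant unit" idea concrete, here is a minimal sketch. It is not the poster's system and not the DOT algorithm from the cited paper. It assumes each stored view has already been reduced to a small feature vector (the 4-bin histograms below are made up for illustration), and it models the view-invariant unit as a simple maximum over the per-view similarities.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def view_invariant_score(stored_views, query):
    """Each stored view plays the role of one view-tuned unit;
    taking the max over them plays the role of the view-invariant unit."""
    return max(cosine(view, query) for view in stored_views)

def recognize(database, query):
    """Return (label, score) for the object whose best-matching view wins."""
    scored = ((label, view_invariant_score(views, query))
              for label, views in database.items())
    return max(scored, key=lambda item: item[1])

# Hypothetical feature vectors, invented for this example only.
database = {
    "tree": [[8, 1, 1, 0], [6, 3, 1, 0], [5, 1, 4, 0]],  # three learned views
    "car": [[1, 8, 0, 1], [0, 6, 1, 3]],                 # two learned views
}

label, score = recognize(database, [6, 2, 1, 0])
print(label, round(score, 3))
```

The memory cost of this naive version grows linearly with the number of views per object, which is exactly the part the post says needs a compact encoding.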
|
|
|
|
|
This is in response to all your posts.
I gather that you haven't started or are in the early stages of your project. I'm also working on some vision recognition stuff, but I'm pretty far along, and I can tell you you'll find a lot more complications than you realize as you go. That's the reason why a lot of systems are domain specific: it allows them to take advantage of certain known facts and "cheat", so to speak, since no one has created a general-purpose system yet. In addition to the difficulty that one poster already mentioned, here are just a few of the other things you need to consider:
1) Defining the edge of objects: Most objects in the real world will have areas where the edges are blurred rather than sharp color changes. Look up Canny edge detection and it will explain some of this stuff.
2) Recognizing 2 areas are part of the same object: Consider a cat with black and white patches. How is a vision system supposed to know that 2 areas with radically different colors are part of the same object?
3) Depth Perception: If you use 2 cameras similar to our 2 eyes, you can match 2 objects and then compare the parallax shift. However, this only works at certain distances. Our brains probably only use this at short distances; several other methods are used at long distances, where the parallax shift isn't large enough to judge.
Also, why are you worried about patents at this stage? I doubt you are going to get sued for simply experimenting with something. If your system does end up working and you want to commercialize it, then buy/license the rights from the existing patent holders that are in your way. In addition, you may find your idea changes a lot as you work on it and run into difficulties; it did with me.
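On point 3) above, the reason parallax only helps at short range can be shown with the standard pinhole stereo relation Z = f · B / d (depth = focal length × baseline / disparity). The sketch below is only an illustration. The focal length of 800 pixels and the baseline of 0.065 m (roughly the spacing of human eyes) are assumed values, not figures from this thread.

```python
def disparity_at(focal_px, baseline_m, depth_m):
    """Disparity in pixels produced by a point at the given depth."""
    return focal_px * baseline_m / depth_m

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres recovered from a disparity in pixels (Z = f * B / d)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

def depth_error(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """How far the depth estimate moves if the matched disparity is too small
    by disparity_error_px. Returns infinity once the true disparity is no
    larger than the matching error itself."""
    true_disparity = disparity_at(focal_px, baseline_m, depth_m)
    if true_disparity <= disparity_error_px:
        return float("inf")
    wrong = depth_from_disparity(focal_px, baseline_m,
                                 true_disparity - disparity_error_px)
    return wrong - depth_m

FOCAL_PX = 800.0     # assumed focal length in pixels
BASELINE_M = 0.065   # assumed camera spacing in metres

for depth in (1.0, 10.0, 100.0):
    print(depth, depth_error(FOCAL_PX, BASELINE_M, depth))
```

With these assumed numbers a one-pixel matching error costs a couple of centimetres at 1 m, more than two metres at 10 m, and makes the estimate meaningless at 100 m, which is in line with the remark that other cues have to take over at long distances.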
|
|
|
|
|
mikemarquard wrote: 1) Defining the edge of objects: Most objects in the real world will have areas
where the edges are blurred rather than sharp color changes. Look up canny edge
detection and it will explain some of this stuff. 2) Recognizing 2
areas are part of the same object: Consider a cat with black and white patches.
How is a vision system supposed to know that 2 areas with radically different
colors are part of the same object. 3) Depth Perception: If you use
2 cameras similar to our 2 eyes you can match 2 objects and then compare the
parallax shift. However, this only works at certain distances. Our brains
probably only use this at short distances, several other methods are used at
long distances where the parallax shift isn't large enough to judge.
1) I would agree that my ideas will change over time, because they already have, but for the better. At first I started off trying edge detection methods, but later on I realised that edge detection is not necessary: descriptors such as SIFT, SURF, DOT, HOG and many more use orientation and not contours. This is supported by biological vision in simple and complex cells, and my system follows this trend. Orientation is not affected by blurring, which makes it more robust and descriptive.
2) My system uses local image patches and a part-based recognition infrastructure without segmentation. Since segmentation is a by-product of recognition, the vision system is not supposed to segment out scenes or potential objects before recognizing them.
3) My system is not currently designed to use stereo cameras. It uses a single camera and does not need depth or a captured 3D representation to aid recognition.
My project has actually evolved: I am using my own vision library to implement the system, and I have figured out how to encode image data in an efficient and robust manner for building a generic object recognition system. How do I know that it will work? Well, I have been progressively testing simple building blocks of the system, and now I am certain that this will work when the whole system is put together. I am optimizing my vision library for the final implementation, with probably months remaining before completion.
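The claim in 1) that orientation survives blurring can be checked on a toy case. The sketch below is not the poster's library and not a full HOG or SIFT implementation. It builds a synthetic step edge, blurs it twice with a 3x3 box filter, and compares a magnitude-weighted histogram of gradient orientation before and after. The image size, bin count and blur are all arbitrary choices for illustration.

```python
import math

def gradients(img):
    """Central-difference gradients (gx, gy) at every interior pixel."""
    h, w = len(img), len(img[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            yield img[y][x + 1] - img[y][x - 1], img[y + 1][x] - img[y - 1][x]

def orientation_histogram(img, bins=8):
    """Magnitude-weighted histogram of unsigned gradient orientation.
    Bins are centred on 0, 180/bins, 2*180/bins, ... degrees."""
    width = 180.0 / bins
    hist = [0.0] * bins
    for gx, gy in gradients(img):
        mag = math.hypot(gx, gy)
        if mag == 0.0:
            continue
        angle = math.degrees(math.atan2(gy, gx)) % 180.0
        hist[int(((angle + width / 2) % 180.0) / width) % bins] += mag
    return hist

def peak_magnitude(img):
    """Largest gradient magnitude, the quantity an edge threshold looks at."""
    return max(math.hypot(gx, gy) for gx, gy in gradients(img))

def box_blur(img):
    """3x3 box blur; pixels outside the image are clamped to the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out

def dominant_bin(hist):
    """Index of the strongest orientation bin."""
    return max(range(len(hist)), key=hist.__getitem__)

# Synthetic 16x16 image: dark left half, bright right half (vertical edge).
sharp = [[0.0 if x < 8 else 255.0 for x in range(16)] for _ in range(16)]
blurred = box_blur(box_blur(sharp))

print(dominant_bin(orientation_histogram(sharp)), peak_magnitude(sharp))
print(dominant_bin(orientation_histogram(blurred)), peak_magnitude(blurred))
```

In this toy case the strongest gradient gets weaker after blurring, which is what hurts threshold-based edge detectors, while the dominant orientation bin stays the same. Real images with clutter and noise are of course a harder test than one clean edge.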
|
|
|
|
|
It sounds like your ideas and my ideas are a lot different. Actually, my ideas are a lot different from any of the ideas I've read about, and my image segmentation is technically not an edge detection algorithm either. I wish you the best of luck, and if you have some big successes I'd love to hear about them.
BCDXBOX360 wrote: 3) My system is not currently designed to use stereo cameras it uses a single camera and does not need depth or capturing a 3D representation to aid recognition.
You might have more limited aims than I do, but I have to question this one. If you're trying to build something that is capable of doing what a human or animal can do, I don't see how this can work, because clearly humans and animals see in 3D. Also, if you choose this path, keep in mind that objects look radically different from different views. Without some sort of 3D perception it is going to be difficult to get the system to recognize multiple views as being part of the same object.
|
|
|
|
|
mikemarquard wrote: objects look radically different from different views. Without some sort of 3d
perception it is going to be difficult to get the system to recognize multiple
views as being part of the same object.
That's why my system uses a multi-view representation, as I explained earlier. During the learning phase, multiple views of the same object are learned and efficiently encoded for fast retrieval. This is supported by biological vision: neurons called view-tuned units respond only to a single view of a given 3D object, but a collection of them gives view-invariant behaviour. My system also implements a knowledge-transfer technique for one-shot learning (this reduces training sets as the system learns more and more things, just like humans!). Animals and humans can see effectively with a single eye, proving that depth adds very little information (maybe little enough to be ignored for now). We see in what I call false 3D: it is only experience with this world that enables the brain to encode multiple views of various objects and tricks us into thinking that we see in 3D. The truth of the matter is that we see in a 2D representation, especially for recognition purposes. I think depth is used to tell more accurately how far the recognized object is from your eyes, but this information is not used in the actual recognition of the object.
|
|
|
|
|
BCDXBOX360 wrote: my system also implements a knowledge transfer technique for one-short learning (this reduces training sets as the system learns more and more things, just like humans!) and animals/humans can see effectively with a single eye proving that depth adds very little information(maybe little enough to be ignored for now)
Actually it's been found that if a person is born blind in one eye they never develop proper depth perception. The reason you and I can see in 3D if we cover an eye is because as children we learned other cues for judging depth. However, we needed two eyes to learn these cues, because without them we have very little information to accurately gauge where an object is, and thus to know how the other cues correspond to a particular location.
I would say this: I don't know of any animal that has only one eye, so I think depth perception must be important, and parallax shift (I think the proper term is stereopsis) is important.
PS If you were the person who downvoted me I'm not trying to be critical or discourage you, I just enjoy debating these topics with people of similar interest and hearing their opinions.
|
|
|
|
|
mikemarquard wrote: Actually its been found that if a person is born blind in one eye they never
develop proper depth perception
But they can recognize objects effectively, right? My main interest is recognition, so the question is: how much does depth perception affect the recognition of objects? I do know that 3D face recognition is more accurate than its 2D counterparts, but it requires 3D sensing, which puts a strain on CPUs. And what about all the 2D images and videos available; how will your 3D system make use of them?
mikemarquard wrote: The reason you and I can see in 3D if we cover an eye is because as children we
learned other ques for judging depth
Thank you, because that's my solution: my system learns those cues during the learning phase by presenting multi-view training sets, like I wrote earlier.
mikemarquard wrote: PS If you were the person who downvoted me I'm not trying to be critical or
discourage you, I just enjoy debating these topics with people of similar
interest and hearing their opinions.
Don't worry, I'm not like that. I also enjoy discussions with people of similar interests; besides, there aren't many daredevils going down this path, so it's always good to hear from people like you. I wish you luck in your endeavor.
modified 20-Oct-11 20:34pm.
|
|
|
|
|
BCDXBOX360 wrote: But they can recognize objects effectively,right? my main interest is recognition, the question is how much does depth perception affect recognition of objects? well i do know of 3d face recognition being more accurate than the 2d counterparts but this requires 3d sensing putting a strain on cpu's. and what about all the 2d images and videos available, how will your 3d system make use of them?
I would assume yes, because I've never seen anything written on that subject. So you are probably right that it is possible, but I still suspect it will learn faster with 3D.
BCDXBOX360 wrote: mikemarquard wrote: The reason you and I can see in 3D if we cover an eye is because as children we
learned other ques for judging depth
Thank you, because thats my solution, my system learns those ques during the learning phase by presenting multi-view training sets like i wrote earlier.
Yeah, but that's the point I was making earlier. Without some preexisting method for judging distances, you have nothing to use as a measuring stick when your system learns those cues. That's why people born blind in one eye don't learn those cues: they cannot use stereopsis as a measuring stick. They have no way to see how things like, for instance, the size of an object correspond with its distance from the viewer, because they never actually know how far the object is from them.
BCDXBOX360 wrote: Don't worry i'm not like that, i also enjoy discussing with people of similar interest, besides there are'nt many daredevil's to go down this path, it's gud to always hear from people like you. i wish you luck in your endeavor.
Thanks
|
|
|
|
|
mikemarquard wrote: So you are probably right that it is possible, but I still suspected it will
learn faster with 3D.
According to Wikipedia: "Stereopsis appears to be processed in the visual cortex in binocular cells having receptive fields in different horizontal positions in the two eyes. Such a cell is active only when its preferred stimulus is in the correct position in the left eye and in the correct position in the right eye, making it a disparity detector." You are right, 3D vision is useful. I will consider using two cameras, but I will start with a single camera and then move to a 3D implementation; this will enable my system to take advantage of both worlds. Thanks for the advice. I thought this through and realised that I had left stereopsis out, but now that I have considered it, I have found that there is room for it in my vision system. I don't have to modify the whole library, just add functions to support stereopsis. Thanks again for stereopsis.
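A "disparity detector" in the sense of the quote can be imitated in software by block matching: slide a small window from the left image along the same row of the right image and keep the shift with the best match. The sketch below works on a single scanline and uses the sum of absolute differences (SAD) as the matching cost. It is a generic illustration with made-up pixel values, not a function from the poster's library, and it assumes a rectified pair in which a feature at column x in the left image appears at column x - d in the right image.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length windows."""
    return sum(abs(p - q) for p, q in zip(a, b))

def best_disparity(left, right, x, half_window=2, max_disparity=8):
    """Disparity at column x of the left scanline: the shift d whose window
    in the right scanline has the lowest SAD against the left window."""
    if x - half_window < 0 or x + half_window >= len(left):
        raise ValueError("window around x must fit inside the scanline")
    left_patch = left[x - half_window:x + half_window + 1]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        start = x - d - half_window
        if start < 0:
            break  # the shifted window would leave the right scanline
        right_patch = right[start:start + 2 * half_window + 1]
        cost = sad(left_patch, right_patch)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Made-up intensity pattern; the left view sees it shifted 3 pixels right.
pattern = [10, 20, 80, 40, 200, 90, 30, 160, 50, 120, 70, 220, 15, 180]
right = pattern + [0, 0, 0]
left = [0, 0, 0] + pattern

print(best_disparity(left, right, x=8))
```

A real implementation would run this for every pixel of a 2D image and would have to deal with repeated texture, occlusion and noise, all of which this toy pattern avoids.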
|
|
|
|
|
|
My answer will be simple: artificial intelligence is still nowhere.
|
|
|
|
|
YvesDaoust wrote:
My answer will be simple: artificial intelligence
is still nowhere.
I can't fully agree with you. Look at chess: can you beat a computer at its highest skill level? I don't think so! The problem with today's machines is that they lack perceptual skills, or what is called sensory perception; they are given buttons for people to push rather than a complex sensing device. And the belief among humans that there is no such thing as artificial intelligence discourages researchers. Remember, people in the old days thought that humans would never fly, but we now have very heavy man-made machines called planes that can fly. All we need is a breakthrough, especially in machine perception, to make everybody's jaw drop.
|
|
|
|
|
BCDXBOX360 wrote: i can't fully agree with you, look at chess, can you beat a computer at it's highest skill level, i don't think so!
That's only because computers can look at millions or even billions of times as many moves as a human can. Their actual understanding of the game is extremely weak. Look at some other games where brute-force methods are less practical; Go and Arimaa are good examples. There was even a million-dollar prize offered if anyone could write a good Go program by the year 2000, and nobody collected it (http://senseis.xmp.net/?IngPrize), and there is currently a prize for Arimaa (http://arimaa.com/arimaa/).
IBM's Watson was much more impressive (http://www.youtube.com/results?search_query=watson&aq=f), but even it is only a souped-up search engine. It doesn't really understand the words it's talking about; it just looks at how words are used together.
BCDXBOX360 wrote: the problem with machines of today is that they just lack perceptual skills or what is called sensory perception
I totally agree with you. True AI will never happen till the AI actually understands the meaning of the words it is talking about.
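The brute-force point above comes down to simple arithmetic: a full search to a fixed depth visits roughly b^d positions, where b is the average number of legal moves. The branching factors below (about 35 for chess, about 250 for Go) are commonly quoted rough estimates, not figures from this thread, and the function ignores pruning entirely.

```python
CHESS_BRANCHING = 35   # rough, commonly quoted average for chess
GO_BRANCHING = 250     # rough, commonly quoted average for Go

def positions_searched(branching_factor, depth):
    """Number of leaf positions in a full search tree of the given depth."""
    return branching_factor ** depth

for depth in (2, 4, 6):
    chess = positions_searched(CHESS_BRANCHING, depth)
    go = positions_searched(GO_BRANCHING, depth)
    print(depth, chess, go, go // chess)
```

Even at a shallow depth of 4 plies the Go tree is thousands of times larger than the chess tree under these assumptions, which is why looking at more moves alone carries a program much less far in Go.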
|
|
|
|
|
mikemarquard wrote: That's only cause computers can look at millions or even billions of times as
many moves as a human can.
Then we definitely need better solutions. I have also realised that computer processing power has increased drastically, but algorithms haven't gotten smarter; just running on powerful CPUs makes them appear smart. This is definitely disappointing.
|
|
|
|
|
To help keep your ground-breaking revolutionary work at the fever-pitch required to personally transcend several centuries of person-years' work on computer vision ...
I think you need a good moniker, a rallying cry ... a logophone ...
May I suggest: GrandEyeOsity.
best, Bill
"Last year I went fishing with Salvador Dali. He was using a dotted
line. He caught every other fish." Steven Wright
|
|
|
|
|
BillWoodruff wrote: To help keep your ground-breaking revolutionary work at the fever-pitch required
to personally transcend several centuries of person-years' work on computer
vision ... I think you need a good moniker, a rallying cry ... a
logophone ... May I suggest: GrandEyeOsity. best,
Bill
It happens that "GrandEyeOsity" also sounds like "Grandiosity", which is not a good moniker. And stating that I have designed a computer-vision system capable of outperforming current state-of-the-art systems doesn't mean I did not reuse some earlier ideas. You should know that I did not have to reinvent the wheel, only to look at what other researchers have missed or overlooked, which happens to be ground-breaking. If you are a true developer, you should be able to understand what research and development means. What others have failed to achieve, I have managed, simple as that, and no nicknames please!
|
|
|
|
|
I sincerely wish you all the best in your Quest, Sir Knight !
And, when the "Grail of Vision" thing is done, I'll be happy to grant Thee the boon of being your Squire-Errant ...
... if you provide me with a donkey to ride named "El Rucio," and you ride a knight-worthy-nag whose name must be "Rocinante" ...
to accompany you on the Quest for the "recognition of sense of humor."
Windmills ahoy !
best, Bill
"Last year I went fishing with Salvador Dali. He was using a dotted
line. He caught every other fish." Steven Wright
|
|
|
|
|
You talk the talk, but can you walk the walk?
Unrequited desire is character building. OriginalGriff
I'm sitting here giving you a standing ovation - Len Goodman
|
|
|
|
|
|
|
BCDXBOX360 wrote: vision by machines seems 2 lag behind the simplest animal u can think of (like a cat or something else).
I doubt that a cat qualifies as the simplest animal I can think of; besides, if they had such a simple visual system, they probably wouldn't be the formidable predators that they are....
|
|
|
|
|
sgorozco wrote: I doubt that a cat qualifies as the simplest animal I can think of; besides, if
they had such a simple visual system, they wouldn't probably be the formidable
predators that they are....
You are right, cats are not the simplest animals. But "simple" in this context is a relative measure (let's say relative to the potential of computer/machine vision). Today's computers are capable of running a vision system advanced enough to make a cat look like a cockroach, but no such vision system exists because there aren't many efforts to build one, as many people like doing the easy stuff.
|
|
|
|