Computer Vision

Introduction

Computer vision is a field of artificial intelligence that trains computers to interpret and understand the visual world. Using digital images from cameras and videos together with deep learning models, machines can accurately recognize and classify objects and then react to what they “see”. In short, computer vision is the science of how machines can “see” as humans do.

Computer vision tasks include methods for acquiring, processing, analyzing, and understanding digital images, as well as methods for extracting data from the real world to produce numerical or symbolic information. In this context, understanding means converting visual images (input from the retina) into descriptions of the world that are meaningful to thought processes and can trigger appropriate actions. This image understanding can be seen as the separation of symbolic information from image data using models constructed with the help of geometry, physics, statistics, and learning theory.^[1]

Brief History of Computer Vision

1959, David Hubel and Torsten Wiesel

In 1959, neurophysiologists David Hubel and Torsten Wiesel published “Receptive fields of single neurones in the cat’s striate cortex”. Through visual experiments on cats, they discovered that neurons in the primary visual cortex are sensitive to moving edges, and they identified the columnar organization of the visual cortex. This work laid the foundation for visual neuroscience.

1963, Lawrence Roberts

In 1963, Lawrence Roberts’s “Machine Perception of Three-Dimensional Solids” described the process of deriving three-dimensional information from two-dimensional pictures. It is one of the precursors of modern computer vision and pioneered research aimed at understanding three-dimensional scenes.^[2]

1982, David Marr

In 1982, neuroscientist David Marr’s influential book “Vision: A Computational Investigation into the Human Representation and Processing of Visual Information” was published with the help of his students. Building on Hubel and Wiesel’s idea that visual processing does not start with whole objects, Marr introduced a vision framework in which low-level algorithms for detecting edges, curves, corners, and so on serve as stepping stones toward a high-level understanding of visual data. This book established computer vision as a stand-alone discipline.^[2]

1989, Yann LeCun

In 1989, Yann LeCun at AT&T Bell Labs developed a convolutional neural network that was successfully used to recognize postal codes. This marked the beginning of convolutional neural networks, which are widely used in computer vision today.^[2]

What is a Convolutional Neural Network?
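
One way to answer the question above is with a small example. A convolutional neural network is built from layers that slide a small grid of weights (a kernel) across an image and record how strongly each patch matches it. The Python sketch below is illustrative only: the function names conv2d and relu, the toy image, and the hand-picked kernel are assumptions made for this example, whereas a real network learns its kernel values from training data.

```python
# Minimal sketch of one convolutional step (illustrative names and values).
# Like most deep learning libraries, it does not flip the kernel, so strictly
# speaking it computes a cross-correlation.


def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1); return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the image patch currently under the kernel.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# Toy grayscale image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
# In a trained network these weights would be learned, not chosen by hand.
kernel = [
    [-1, 1],
    [-1, 1],
]

edges = relu(conv2d(image, kernel))
for row in edges:
    print(row)
```

The feature map is non-zero only in the column where the image changes from dark to bright, which is the kind of low-level edge detector that Marr's framework and early convolutional layers both rely on.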

1990s

In the late 1990s, computer vision as a field shifted its focus to a large extent. Around 1999, many researchers stopped trying to reconstruct objects by creating 3D models of them (the path proposed by Marr) and instead turned their efforts to feature-based object recognition.^[2]

21st Century

In the 21st century, no new paradigm emerged in computer vision; instead, the focus shifted to improving existing convolutional neural network algorithms and creating new deep learning algorithms. With the help of deep learning, researchers no longer had to provide data samples by hand. According to Forbes, users share more than 3 billion images online every day, and this data is used to train computer vision systems.

This has also led to amazing advances in computer vision, with object recognition and classification accuracy rates rising from 50 percent to 99 percent in less than a decade. Today’s systems are more accurate than humans at quickly detecting and responding to visual input.^[6]

How Computer Vision Affects Our Lives

Facial recognition

Computer vision is now integrated into many aspects of our lives, and facial recognition is a prominent example. Facial recognition technology matches a face in a photograph to a person’s identity. The most common application is unlocking devices with a face. More advanced uses include residential or commercial security systems that use an individual’s unique physical characteristics to verify their identity.
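
The matching step can be sketched in a few lines of Python. This is a simplified illustration, not how any particular product works: it assumes each face has already been turned into a short numeric vector (an "embedding") by some model, and the vectors, the names in the gallery, the identify function, and the 0.9 threshold are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(query, gallery, threshold=0.9):
    """Return the best-matching name, or None if nobody is similar enough.

    `threshold` is a hypothetical cut-off chosen for this example.
    """
    best_name, best_score = None, threshold
    for name, reference in gallery.items():
        score = cosine_similarity(query, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical enrolled faces: name -> made-up embedding vector.
gallery = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}

match = identify([0.88, 0.12, 0.31], gallery)  # close to alice's vector
stranger = identify([0.0, 0.0, 1.0], gallery)  # not close to anyone
print(match, stranger)
```

The threshold is what separates identification from guessing: without it, every face would be assigned to whichever enrolled person happens to be nearest, including faces that belong to no one in the gallery.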

Autonomous driving

Another major direction is autonomous driving, where computer vision allows cars to sense their surroundings and road conditions, plan routes automatically, and make driving decisions.

Medical field

Computer vision is also used in the medical field, most often to analyze medical images such as X-rays and MRI scans. According to Google, it can detect cancer metastases more accurately than a human doctor.^[7]

Negative Sides of Computer Vision

However, we also need to be wary of the downsides of computer vision, such as threats to personal privacy. Big companies like Facebook collect our facial data for their services, and if such sensitive information is leaked, it can cause a lot of trouble in our lives.

What is more, we need to worry about governments using facial recognition to catch criminals. Although this makes the police more efficient at solving crimes and gives us safer communities to live in, it also violates our privacy, because cameras record where we travel.

Computer Vision in the Future

Cooperating with other industries

Computer vision is an important area of machine learning, and more and more industries are incorporating it into their products to improve productivity. In agriculture, for example, computer vision can be used to monitor the status of crops around the clock so that appropriate measures can be taken.

In manufacturing, computer vision combined with robotic arms can complete operations more efficiently and more cheaply than human labor. Computer vision can also be combined with natural language generation to help visually impaired people understand their surroundings.

Artificial General Intelligence

However, I think the biggest future direction for computer vision is to serve as the eyes of Artificial General Intelligence (AGI), or Strong AI. An AGI is an AI whose intelligence matches or exceeds that of humans and that can exhibit all the intelligent behaviors a normal human can. For such an AI, how it sees the world is extremely important. If it perceives things only through their key features, the world it sees will be very different from the real world: it will grasp only the appearance of things, not their essence. So computer vision needs to give AGI a new mode of analysis that is not limited to the features of objects.

Conclusion

We often say that the eyes are the windows to the soul, and this is also true for computers. Computer vision gives computers the ability to “see” and understand the world around us. In the 70 years of development since the 1950s, computer vision has changed dramatically, and it can now be found in all aspects of life. Even if we do not understand what is in the black box of AI, that doesn’t stop computers from giving us a more convenient life.

References

[1] Wikipedia contributors. “Computer vision.” Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 22 Jun. 2021. Web. 22 Jun. 2021.

[2] Demush, Rostyslav. “A Brief History of Computer Vision (and Convolutional Neural Networks).” Hacker Noon, 26 Feb. 2019, hackernoon.com/a-brief-history-of-computer-vision-and-convolutional-neural-networks-8fe8aacc79f3.

[3] Amy, and Vivian. Stanford Artificial Intelligence Laboratory, 2016, ai.stanford.edu/~syyeung/cvweb/index.html.

[4] Babich, Nick. “What Is Computer Vision & How Does It Work? An Introduction: Adobe XD Ideas.” Ideas, 28 July 2020, xd.adobe.com/ideas/principles/emerging-technology/what-is-computer-vision-how-does-it-work/.

[5] Joshi, Naveen. “The Present And Future Of Computer Vision.” Forbes, Forbes Magazine, 27 June 2019, www.forbes.com/sites/cognitiveworld/2019/06/26/the-present-and-future-of-computer-vision.

[6] Marr, Bernard. “7 Amazing Examples Of Computer And Machine Vision In Practice.” Forbes, Forbes Magazine, 8 Apr. 2019, www.forbes.com/sites/bernardmarr/2019/04/08/7-amazing-examples-of-computer-and-machine-vision-in-practice.

[7] Stumpe, Martin, and Lily Peng. “Assisting Pathologists in Detecting Cancer with Deep Learning.” Google AI Blog, 3 Mar. 2017, ai.googleblog.com/2017/03/assisting-pathologists-in-detecting.html.