Book Review by Frank Cerwin:
The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI
By Fei-Fei Li – published by Flatiron Books, 2023
This book was named one of the Financial Times’ best books of 2023, and the author, Dr. Fei-Fei Li, was named one of Time magazine’s 100 most influential people in AI. Wired magazine called Dr. Li one of the few scientists responsible for AI’s recent advances.
The initial chapters take you through the author’s life as an immigrant: her family’s difficult times assimilating to life in the U.S., her interest in and study of physics, and the breakthroughs and contributions she made in the field of AI. Her studies and experiments, in which brain activity in the visual cortex was translated into audio signals, led her toward the creation of ImageNet, a key catalyst of modern AI. The author takes you through the development of ImageNet, from its creation at Princeton University and the initial hand-curating of 15 million images to the annual ImageNet Large Scale Visual Recognition Challenge, which was ultimately won by a convolutional neural network entry.
Text and numerical data have been searchable for many years, but visual understanding is far more complex. Digital imagery is stored in the form of pixels: individual points of color are encoded numerically, so an image appears to a machine as nothing more than a long list of integers. To see an image as a human does, in terms of meaningful concepts of people, places, and things, an algorithm must sift through this list of integers and identify the numeric patterns that correspond to those concepts. Consider the complexity of a human face in all its colors and proportions across an infinite range of angles, lighting conditions, and backgrounds.
An essential first step toward computer vision and machine intelligence was a better understanding of the human mind. Achieving this understanding requires a combination of computer science, psychology, and neuroscience. We perceive our surroundings not as an assemblage of colors and contours, but in terms of categories. These patterns must therefore be categorized through curation to form a hierarchy of branches in a tree structure. A neural network, much like our brain, works through the hierarchy from broad to fine-grained classification to identify a specific object. In her book, Dr. Li takes you through the challenges of object recognition, of identifying a single object in a field of vision containing multiple objects, and of understanding the relationships among those objects. She presents examples of how these relationships can be leveraged to gain new insights.
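For readers who like to see ideas in code, here is a small Python sketch of two points made above: that an image reaches a machine as a long list of integers, and that categories can be arranged as a tree running from broad to fine. It is an illustration only, not code from the book. The pixel values, the category labels, and the function names flatten_pixels and path_to are all made up for this example, and the tree is not ImageNet's actual hierarchy. It also does not show how a neural network learns anything; it only shows the data a vision system starts from and the kind of label structure curation produces.

```python
# A 2x2 RGB image: each pixel is three integers (red, green, blue), 0 to 255.
# These values are invented for illustration.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def flatten_pixels(img):
    """Return the image as one flat list of integers, row by row."""
    return [channel for row in img for pixel in row for channel in pixel]


# A toy category tree (hypothetical labels, not ImageNet's real hierarchy).
# Each key is a category; its value holds the finer categories beneath it.
hierarchy = {
    "animal": {
        "mammal": {"dog": {}, "cat": {}},
        "bird": {"sparrow": {}},
    },
    "vehicle": {"car": {}, "bicycle": {}},
}


def path_to(label, tree, prefix=()):
    """Depth-first search for a label; return its broad-to-fine path or None."""
    for name, children in tree.items():
        current = prefix + (name,)
        if name == label:
            return list(current)
        found = path_to(label, children, current)
        if found is not None:
            return found
    return None


flat = flatten_pixels(image)
print(flat)                       # the "long list of integers"
print(path_to("cat", hierarchy))  # broad-to-fine path through the tree
```

Even this four-pixel image becomes twelve numbers; a single ordinary photograph becomes millions, which gives a sense of why finding a face or a cat in that list is so hard.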
I honestly could not put this book down once I started it. It is extremely well written and conveys the complexity of the computer vision that is integral to autonomous vehicles, robots, and image search. Where did I find this book? In my local library. Rediscover your local library and its library-sharing app for many of the most recently published books.