When Erika Yeung arrived at Princeton, she knew she was drawn to the intersection of hardware and intelligence: the idea that physical systems like chips and sensors could not only compute, that is, process information, but also perceive the world, learn from data, and adapt over time. As a sophomore in the Electrical and Computer Engineering department, she took a bold step into that space through independent research with Professor Hossein Valavi. Her work focused on how neural networks, computer models inspired by the way the human brain processes information, can be redesigned to run efficiently on edge devices: small, local devices such as phones, sensors, and embedded systems that operate without relying on distant cloud servers.
At the heart of Erika’s work was quantization, a technique that reduces the numerical precision of a neural network’s weights, the internal values that determine how the model makes decisions. Instead of storing each weight as a high-precision floating-point number, quantization maps it to a smaller, more compact representation, such as an 8-bit integer. The model then takes up less memory and runs faster while still maintaining strong accuracy. This idea is central to fields like Edge AI and TinyML, which aim to move machine learning out of large data centers and into everyday devices, from wearable health monitors to autonomous systems operating far from the cloud. Running AI locally means the models must be not only accurate but also lightweight, fast, and energy efficient, and quantization offers one of the most promising ways to make that possible.
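To make the idea concrete, here is a minimal sketch of uniform 8-bit quantization written in Python with NumPy. It illustrates the general technique rather than Erika’s actual code: the function names, the affine (scale and zero-point) scheme, and the random example weights are all assumptions for the sake of the demo.

```python
import numpy as np

def quantize_int8(weights):
    """Map float32 weights onto the int8 range [-128, 127] (uniform affine scheme)."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0                 # width of one quantization step
    zero_point = int(round(-128 - w_min / scale))   # integer offset so w_min maps to -128
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from the compact int8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

# Hypothetical example: a small random matrix standing in for one layer's weights.
rng = np.random.default_rng(seed=0)
w = rng.normal(0.0, 0.1, size=(4, 8)).astype(np.float32)

q, scale, zp = quantize_int8(w)
w_hat = dequantize(q, scale, zp)

print(f"storage: {w.nbytes} bytes as float32 -> {q.nbytes} bytes as int8")
print(f"max absolute rounding error: {np.abs(w - w_hat).max():.6f}")
```

Each weight now occupies one byte instead of four, a 4x memory saving, at the cost of a small rounding error; the research question is how aggressively precision can be reduced before accuracy suffers. One common refinement, not shown here, is to compute a separate scale per output channel rather than a single scale for the whole tensor.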