As the holiday season draws to a close, most of us are familiar with online shopping. To shop on a website, we typically string a few words together to search for the product we want. Behind the scenes, however, matching those words to the right product is anything but simple: it remains one of the biggest challenges in information retrieval, particularly for online shopping.
To address this challenge, scientists from Rice University have partnered with Amazon to leverage compressed sensing and ‘slash’ the amount of time it takes to train computers for product search. The researchers tested their approach, called MACH (for ‘merged-average classifiers via hashing’), on an Amazon data set of more than 70 million queries covering more than 49 million products.
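The announcement does not spell out how MACH works, but the name points to the core trick: hash the enormous set of products into a much smaller number of buckets, train several small classifiers over buckets instead of one giant classifier over products, and average their scores at lookup time. The sketch below is a toy illustration of that idea, not the researchers’ code; the sizes, random hash assignments, and variable names are all assumptions chosen for readability.

```python
# A minimal, illustrative sketch of the "merged-average classifiers via hashing"
# idea: hash K products into B << K buckets with R independent hash functions,
# use R small bucket classifiers, and score a product by averaging the
# probabilities of the buckets it hashes to. Toy sizes and random weights only.
import numpy as np

rng = np.random.default_rng(0)

K = 10_000   # number of products (tiny here; ~49 million in the Amazon data set)
B = 100      # buckets per classifier
R = 4        # number of independent hash functions / classifiers
D = 32       # dimension of the query embedding

# Random bucket assignments standing in for independent hash functions.
hashes = rng.integers(0, B, size=(R, K))

# Stand-ins for R trained bucket classifiers: weights mapping a query
# embedding to B bucket logits.
W = rng.normal(size=(R, B, D))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def score_products(query_vec):
    """Average each product's bucket probability across the R classifiers."""
    scores = np.zeros(K)
    for r in range(R):
        bucket_probs = softmax(W[r] @ query_vec)   # shape (B,)
        scores += bucket_probs[hashes[r]]          # look up each product's bucket
    return scores / R

query = rng.normal(size=D)
top5 = np.argsort(score_products(query))[::-1][:5]
print("top products:", top5)
```

The payoff is that no single classifier ever holds a separate weight for every product, which is what keeps the memory footprint and training time small compared with one monolithic output layer.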
“Our training times are about 7-10 times faster, and our memory footprints are 2-4 times smaller than the best baseline performances of previously reported large-scale, distributed deep-learning systems,” said lead researcher Anshumali Shrivastava, an assistant professor of computer science at Rice.
“There are about 1 million English words, for example, but there are easily more than 100 million products online,” said Tharun Medini, a Ph.D. student at Rice University, describing the scale of the product-search problem.
Not just Amazon but other tech companies such as Google and Microsoft hold immense amounts of data on successful and unsuccessful product searches. Researchers have been combining this stored data with deep learning, a potentially effective way to deliver better results to searchers.
Deep learning systems, sometimes known as neural network models, are collections of mathematical equations that take in numbers called input vectors and transform them into different numbers called output vectors.
“A neural network that takes search input and predicts from 100 million outputs, or products, will typically end up with about 2,000 parameters per product,” Medini said. “So you multiply those, and the final layer of the neural network is now 200 billion parameters. And I have not done anything sophisticated. I’m talking about a very, very dead simple neural network model.”
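The arithmetic behind that 200-billion figure is easy to reproduce: a dense output layer connects a roughly 2,000-dimensional representation to 100 million products, so the weight count is simply the product of the two. The variable names below are mine, taken from the numbers in the quote.

```python
# Back-of-the-envelope check of the final-layer size Medini describes:
# a dense output layer from a ~2,000-dimensional representation to
# 100 million products.
hidden_dim = 2_000          # ~2,000 parameters per product
num_products = 100_000_000  # ~100 million products

final_layer_params = hidden_dim * num_products
print(f"{final_layer_params:,} parameters")   # 200,000,000,000 -> 200 billion
```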
“It would take about 500 gigabytes of memory to store those 200 billion parameters,” Medini said. “But if you look at current training algorithms, there’s a famous one called Adam that takes two more parameters for every parameter in the model, because it needs statistics from those parameters to monitor the training process. So, now we are at 200 billion times three, and I will need 1.5 terabytes of working memory just to store the model. I haven’t even gotten to the training data. The best GPUs out there have only 32 gigabytes of memory, so training such a model is prohibitive due to massive inter-GPU communication.”
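Medini’s numbers line up under a rough assumption of about 2.5 bytes per stored value (the exact figure depends on numeric precision, which the quote does not specify): 200 billion weights come to roughly 500 GB, and Adam’s two extra statistics per weight triple that to about 1.5 TB. A quick estimate, with the precision assumption labeled:

```python
# Rough memory estimate for a 200-billion-parameter layer plus Adam's
# per-parameter statistics (two extra values per parameter). The
# bytes-per-value figure is an assumption that reproduces the quote's ballpark.
params = 200e9
bytes_per_value = 2.5          # assumption; fp16 would be 2, fp32 would be 4
adam_states_per_param = 2      # first- and second-moment estimates

model_gb = params * bytes_per_value / 1e9
total_gb = params * (1 + adam_states_per_param) * bytes_per_value / 1e9
print(f"model: ~{model_gb:.0f} GB, with Adam state: ~{total_gb / 1000:.1f} TB")
# -> model: ~500 GB, with Adam state: ~1.5 TB
```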
Source: Rice University