09:15 to 10:00 |
Paul Hand (Rice University, Texas) |
Inverse Problems under a Learned Generative Prior (Lecture-1) Deep generative modeling has led to new and state-of-the-art approaches for enforcing structural priors in a variety of inverse problems. In contrast to priors given by sparsity, deep models can provide direct low-dimensional parameterizations of the manifold of images or signals belonging to a particular natural class, allowing recovery algorithms to be posed in a low-dimensional space. This dimensionality may even be lower than the sparsity level of the same signals when viewed in a fixed basis. In this talk, we will show rigorous recovery guarantees for solving inverse problems under a learned generative prior. First, we will discuss convergence guarantees for compressive sensing under random neural network priors. Then, we will show that generative priors allow for a significant advance to be made in the problem of compressive phase retrieval. To date, no known computationally efficient algorithm exists for solving phase retrieval under a sparsity prior at sample complexity proportional to the signal complexity. With generative priors, we introduce a new approach for compressive phase retrieval and establish rigorous guarantees with sample complexity proportional to the signal complexity.
|
|
|
10:00 to 10:30 |
-- |
Break |
|
|
10:30 to 10:50 |
Saketha Nath (IIT Hyderabad, India) |
Privacy vs learning trade-off in class-ratio estimation We present algorithms with guarantees for learning and privacy preservation in the context of the problem of class-ratio estimation. More interestingly, we derive learning bounds for the estimation with privacy constraints, which lead to important insights for the data publisher. Such results motivate the need to examine the privacy vs. learning trade-off in other ML applications.
|
|
|
10:55 to 11:15 |
Arijit Biswas (Amazon, India) |
A Generative Adversarial Network for E-commerce E-commerce websites such as Amazon, Alibaba, and Walmart typically process billions of orders every year. Semantic representation and understanding of these orders is critical for an e-commerce company. Each order can be represented as a tuple of <customer, product, price, date>. Exploring the space of all plausible orders could help us better understand the relationships between the various entities in an e-commerce ecosystem, namely the customers and the products they purchase. In this work, we propose a Generative Adversarial Network (GAN) for orders made on e-commerce websites. Once trained, the generator in the GAN can generate any number of plausible orders. Our contributions include: (a) creating a dense and low-dimensional representation of e-commerce orders, (b) training an ecommerceGAN (ecGAN) with real orders to show the feasibility of the proposed paradigm, and (c) training an ecommerce-conditional-GAN (ec^2GAN) to generate plausible orders involving a particular product. We propose several qualitative methods to evaluate ecGAN and demonstrate its effectiveness.
Bio: Arijit Biswas is currently a Senior Machine Learning Scientist at the India machine learning team in Amazon, Bangalore. His research interests are mainly in deep learning, machine learning, Natural Language Processing and Computer Vision. Earlier he was a research scientist at Xerox Research Centre India (XRCI) from June, 2014 to July, 2016. He received his PhD in Computer Science from University of Maryland, College Park in April 2014. His PhD thesis was on Semi-supervised and Active Learning Methods for Image Clustering. His thesis advisor was David Jacobs and he closely collaborated with Devi Parikh and Peter Belhumeur during his stay at UMD. While doing his PhD, Arijit also did internships at Xerox PARC and Toyota Technological Institute at Chicago (TTIC). He has published papers in CVPR, ECCV, ACM-MM, BMVC, IJCV, ECML-PKDD and CVIU. Arijit is also a recipient of the MIT Technology Review Innovators under 35 award from India in 2016.
|
|
|
11:15 to 12:00 |
Prateek Jain (Microsoft Research, India) |
Resource Efficient ML in 2KB of RAM Several critical applications require ML inference on resource-constrained devices. In this talk, we will discuss two new methods, FastGRNN and EMI-RNN, that can enable time-series inference on devices as small as the Arduino Uno, which has 2KB of RAM. Our methods can provide as much as 70x speed-up and compression over state-of-the-art methods like LSTM and GRU, while also providing strong theoretical guarantees.
Bio: Prateek Jain is a member of the Machine Learning and Optimization and the Algorithms and Data Sciences Group at Microsoft Research, Bangalore, India. He is also an adjunct faculty member in the Computer Science department at IIT Kanpur. His research interests are in machine learning, non-convex optimization, high-dimensional statistics, and optimization algorithms in general. He is also interested in applications of machine learning to privacy, computer vision, text mining and natural language processing. He completed his PhD at the University of Texas at Austin under Prof. Inderjit S. Dhillon.
|
|
|
12:00 to 13:30 |
-- |
Lunch |
|
|
13:30 to 14:15 |
Swagatam Das (ISI Kolkata, India) |
Machine Learning in Face of Data Irregularities: Some Practical as well as Theoretical Challenges Most machine learning systems rely on some implicit regularity assumptions about the data. For example, many classifiers assume that all classes have an equal number of representatives, that all the sub-concepts within the classes are characterized by an equal number of representatives, and that all classes have similar class-conditional distributions. Further, both classifiers and clustering methods assume that all features are defined and observed for all data instances. However, many real-world datasets violate one or more of these assumptions, giving rise to data irregularities which can induce undue bias in the learning systems or even render the systems inapplicable to the data. Starting with a taxonomy of the various data irregularities, in this talk we examine some key practical difficulties that learning systems face in handling one or a combination of such data irregularities, given that not all of these can be remedied through pre-processing. We will also highlight some major theoretical challenges in analyzing the behavior of the learning systems (e.g. in terms of test error bounds for classifiers on imbalanced datasets) in the face of irregular datasets.
|
|
|
14:15 to 14:35 |
Karthik Gurumoorthy (Amazon, India) |
ProtoDash: Fast Interpretable Prototype Selection Estimating causal impact in terms of change in spending patterns for various customer events is fundamental to a large e-commerce company like Amazon. Questions like "Do (treatment) customers who sign up for Prime membership or make a first purchase in a category, say books, spend 'x' dollars more in the subsequent year than (control) customers who do not perform these events?" are invaluable for boosting revenues, streamlining business operations, inventory planning, marketing and recommendations. Computing such causal inferences of customer events from observational data requires one to reduce the bias due to confounding variables that could be found in an estimate of the treatment effect obtained from simply comparing outcomes (one-year spends) among units that received the treatment versus those that did not. One approach to reducing bias is to identify weighted prototypical examples from the control population that match the treatment distribution. To this end, we will describe ProtoDash, a fast algorithm for selecting prototypical examples. We associate non-negative weights with the selected prototypes, which aid in interpreting the importance of each prototype in matching the treatment distribution. Though the non-negativity requirement sacrifices (strong) submodularity, we show that the problem is weakly submodular and derive approximation guarantees for our ProtoDash algorithm. We demonstrate the efficacy of our method on diverse domains such as digit recognition (MNIST), 40 publicly available health questionnaires obtained from the Centers for Disease Control and Prevention (CDC) website, and retail.
Bio: Karthik Gurumoorthy graduated with a dual master's degree in Mathematics and Computer Science in 2009 and 2010 respectively and earned a doctorate degree in Computer Science in 2011 from the University of Florida, Gainesville. He continued at the same institution for a year in the capacity of a post-doctoral researcher and later joined GE Global Research, Bangalore as a Research Scientist in 2012 pursuing research in the field of medical image analysis. After completing a year and 3 months at GE, he accepted an AIRBUS post-doctoral fellowship position at the International Center for Theoretical Sciences, Tata Institute of Fundamental Research (ICTS-TIFR), Bangalore where he conducted research in data assimilation and filtering theory for over a year and 6 months. He currently works at Amazon Development Center, Bangalore as a Machine Learning Scientist. He has worked on a wide gamut of problems covering domains like signal processing, machine learning, density estimation, filtering theory, computer vision and image compression and is motivated by problems which are mathematical in nature.
|
|
|
14:15 to 14:35 |
Niloy Ganguly (IIT Kharagpur, India) |
A Deep Generative Model for Molecular Graphs Deep generative models have been praised for their ability to learn smooth latent representations of images, text, and audio, which can then be used to generate new, plausible data. However, current generative models are unable to work with molecular graphs due to their unique characteristics: their underlying structure is not Euclidean or grid-like, they remain isomorphic under permutation of the node labels, and they come with a different number of nodes and edges. In this paper, we propose NeVAE, a novel variational autoencoder for molecular graphs, whose encoder and decoder are specially designed to account for the above properties by means of several technical innovations. In addition, by using masking, the decoder is able to guarantee a set of valid properties in the generated molecules. Experiments reveal that our model can discover plausible, diverse and novel molecules more effectively than several state-of-the-art methods. Moreover, by utilizing Bayesian optimization over the continuous latent representation of molecules our model finds, we can also find molecules that maximize certain desirable properties more effectively than alternatives.
|
|
|
15:00 to 15:30 |
-- |
Break |
|
|
15:30 to 16:15 |
Abhijnan Chakraborty (Max Planck Institute for Software Systems, Germany) |
Fairness in Algorithmic Decision Making: Classification and Beyond In recent years, machine learning has been increasingly used to predict, enhance, and even replace human decision making in a wide variety of offline and online applications. Often the design goal is to maximize some performance metric of the system (for example, overall prediction accuracy). However, there is a growing concern that these automated decisions can lead, even in the absence of intent, to a lack of fairness, i.e., their outcomes can disproportionately impact particular social groups (e.g., Blacks, women). In this talk, I'll cover a set of techniques designed in recent years to ensure that machine learning methods fueling algorithmic decisions are fair to all while satisfying the performance requirements.
|
|
|
16:15 to 17:00 |
Shankar Krishnan (IIT Bombay, India) |
The Loss Landscape of Deep Neural Networks Many complex systems can be analyzed by examining the spectra of matrices central to them. The graph Laplacian of large social networks and the Hessian of the empirical loss function arising from fitting models to data in machine learning are prime examples. Naive estimation of the eigenvalues is computationally infeasible even if the matrices are explicitly given; most often, the only access to these matrices is through matrix-vector products. In this talk, I will describe a scalable and accurate algorithm for estimating the eigenvalue density of large, symmetric matrices, with a particular focus on the Hessian of the loss function in deep neural networks. We use our algorithm to study how the loss landscape changes throughout the training process, and how re-parameterizations like Batch Normalization affect the conditioning of the surface.
|
|
|
17:00 to 17:45 |
Rajeev Rastogi (Amazon, India) |
Machine Learning @ Amazon In this talk, I will first provide an overview of key problem areas where we are applying Machine Learning (ML) techniques within Amazon, such as product demand forecasting, product search, and information extraction from reviews, along with the associated technical challenges. I will then talk about two specific applications where we use a variety of methods to learn semantically rich representations of data: question answering, where we use deep learning techniques, and product size recommendations, where we use probabilistic models.
Bio: Rajeev Rastogi is a Director of Machine Learning at Amazon where he is developing ML platforms and applications for the e-commerce domain. Previously, he was Vice President of Yahoo! Labs Bangalore and the founding Director of the Bell Labs Research Center in Bangalore, India. Rajeev is an ACM Fellow and a Bell Labs Fellow. He is active in the fields of databases, data mining, and networking, and has served on the program committees of several conferences in these areas. He currently serves on the editorial board of the CACM, and has been an Associate editor for IEEE Transactions on Knowledge and Data Engineering in the past. He has published over 125 papers, and holds over 50 patents. Rajeev received his B. Tech degree from IIT Bombay, and a PhD degree in Computer Science from the University of Texas, Austin.
|
|
|
17:45 to 18:00 |
-- |
Break |
|
|
18:00 to 19:00 |
-- |
Industry Panel |
|
|
19:00 to 21:00 |
-- |
Catered Dinner |
|
|