Publications, Preprints and Submissions

Audio-Visual Waypoints for Navigation

Accepted in TBD

In audio-visual navigation, an agent intelligently travels through a complex, unmapped 3D environment using both sights and sounds to find a sound source (e.g., a phone ringing in another room). Existing models learn to act at a fixed granularity of agent motion and rely on simple recurrent aggregations of the audio observations. We introduce a reinforcement learning approach to audio-visual navigation with two key novel elements: 1) audio-visual waypoints that are dynamically set and learned end-to-end within the navigation policy, and 2) an acoustic memory that provides a structured, spatially grounded record of what the agent has heard as it moves. Both new ideas capitalize on the synergy of audio and visual data for revealing the geometry of an unmapped space. We demonstrate our approach on the challenging Replica environments of real-world 3D scenes. Our model improves the state of the art by a substantial margin, and our experiments reveal that learning the links between sights, sounds, and space is essential for audio-visual navigation.

Changan Chen, Sagnik Majumder, Ziad Al-Halah, Ruohan Gao, Santhosh K. Ramakrishnan, Kristen Grauman. "Audio-Visual Waypoints for Navigation" In: arXiv. https://arxiv.org/abs/2008.09622

Model Agnostic Answer Reranking System for Adversarial Question Answering

Accepted in TBD

Numerous methods have been proposed as defenses against adversarial examples in question answering (QA) tasks. Despite appealing to various principles, these techniques are often model specific, require retraining of the model, and give only marginal improvements in performance over vanilla models on popular adversarial QA datasets. In this work, we present a simple model-agnostic approach to this problem that can be applied directly to any QA model without any retraining. Our method employs an explicit answer candidate reranking mechanism that scores candidate answers on the basis of their content overlap with the question before making the final prediction. Combined with a strong base QA model, our method can outperform state-of-the-art baselines, calling into question how strong these adversarial testbeds are and how well sophisticated defense techniques are actually doing.

Citation TBD
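
The reranking step described in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the stopword list, the choice of scoring the sentence that contains each candidate, the overlap measure, and the names content_tokens and rerank are all hypothetical. The base QA model is represented only by the scores it has already assigned to its candidates, which is what makes the step model-agnostic.

```python
import re

# Hypothetical stopword list; a real system would likely use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "is", "was",
             "what", "who", "which", "did", "for", "and", "by"}


def content_tokens(text):
    """Lowercase the text, split on non-alphanumerics, and drop stopwords."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOPWORDS}


def rerank(question, candidates):
    """Reorder candidates by content overlap with the question.

    candidates is a list of (answer, supporting_sentence, model_score)
    tuples produced by any base QA model. The overlap is the fraction of
    question content words found in the supporting sentence; the base
    model score is used only to break ties.
    """
    q_tokens = content_tokens(question)

    def key(candidate):
        _answer, sentence, model_score = candidate
        shared = q_tokens & content_tokens(sentence)
        overlap = len(shared) / max(len(q_tokens), 1)
        return (overlap, model_score)

    return sorted(candidates, key=key, reverse=True)


if __name__ == "__main__":
    question = "Who won Super Bowl 50?"
    candidates = [
        # An adversarial distractor that the base model scores higher.
        ("Jeff Dean", "Champ Bowl 40 was won by Jeff Dean.", 0.6),
        ("Denver Broncos", "The Denver Broncos won Super Bowl 50.", 0.4),
    ]
    print(rerank(question, candidates)[0][0])
```

In this toy case the distractor sentence shares only two of the four question content words, so the reranker prefers the second candidate even though the base model scored it lower.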

Meta-Learning Convolutional Neural Architectures for Multi-Target Concrete Defect Classification With the COncrete DEfect BRidge IMage Dataset

Accepted in CVPR 2019, Long Beach

Recognition of defects in concrete infrastructure, especially in bridges, is a costly and time-consuming crucial first step in the assessment of the structural integrity. Large variation in appearance of the concrete material, changing illumination and weather conditions, a variety of possible surface markings as well as the possibility for different types of defects to overlap, make it a challenging real-world task. In this work we introduce the novel COncrete DEfect BRidge IMage dataset (CODEBRIM) for multi-target classification of five commonly appearing concrete defects. We investigate and compare two reinforcement learning based meta-learning approaches, MetaQNN and efficient neural architecture search, to find suitable convolutional neural network architectures for this challenging multi-class multi-target task. We show that learned architectures have fewer overall parameters in addition to yielding better multi-target accuracy in comparison to popular neural architectures from the literature evaluated in the context of our application.

Martin Mundt, Sagnik Majumder, Sreenivas Murali, Panagiotis Panetsos, Visvanathan Ramesh, "Meta-Learning Convolutional Neural Architectures for Multi-Target Concrete Defect Classification With the COncrete DEfect BRidge IMage Dataset" In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019. https://arxiv.org/abs/1904.08486

Open Set Recognition Through Deep Neural Network Uncertainty: Does Out-of-Distribution Detection Require Generative Classifiers?

Accepted in ICCV SDLCV 2019, Seoul

We present an analysis of predictive uncertainty based out-of-distribution detection for different approaches to estimate various models’ epistemic uncertainty and contrast it with extreme value theory based open set recognition. While the former alone does not seem to be enough to overcome this challenge, we demonstrate that uncertainty goes hand in hand with the latter method. This seems to be particularly reflected in a generative model approach, where we show that posterior based open set recognition outperforms discriminative models and predictive uncertainty based outlier rejection, raising the question of whether classifiers need to be generative in order to know what they have not seen.

Martin Mundt, Iuliia Pliushch, Sagnik Majumder, Visvanathan Ramesh, "Open Set Recognition Through Deep Neural Network Uncertainty: Does Out-of-Distribution Detection Require Generative Classifiers?" In: International Conference on Computer Vision (ICCV) 2019, Statistical Deep Learning for Computer Vision (SDLCV) Workshop. https://arxiv.org/abs/1908.09625

Rethinking Layer-wise Feature Amounts in Convolutional Neural Network Architectures

Accepted in NeurIPS CRACT 2018, Canada

We characterize convolutional neural networks with respect to the relative amount of features per layer. Using a skew normal distribution as a parametrized framework, we investigate the common assumption of monotonously increasing feature-counts with higher layers of architecture designs. Our evaluation on models with VGG-type layers on the MNIST, Fashion-MNIST and CIFAR-10 image classification benchmarks provides evidence that motivates rethinking of our common assumption: architectures that favor larger early layers seem to yield better accuracy.

Martin Mundt, Sagnik Majumder, Tobias Weis, Visvanathan Ramesh, "Rethinking Layer-wise Feature Amounts in Convolutional Neural Network Architectures" In: International Conference on Neural Information Processing Systems (NeurIPS) 2018, Critiquing and Correcting Trends in Machine Learning (CRACT) Workshop. https://arxiv.org/abs/1812.05836
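
The idea in the abstract above of using a skew normal distribution to set the relative number of features per layer can be sketched as follows. This is a sketch under stated assumptions, not the paper's procedure: the mapping from density values to layer widths, the fixed total feature budget, and the names skew_normal_pdf and layer_feature_counts are hypothetical. The density itself is the standard skew normal, 2/scale * phi(z) * Phi(shape * z) with z = (x - loc) / scale.

```python
import math


def skew_normal_pdf(x, loc, scale, shape):
    """Standard skew normal density with location, scale and shape."""
    z = (x - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # normal pdf
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))  # normal cdf
    return 2.0 / scale * phi * big_phi


def layer_feature_counts(num_layers, total_features, loc, scale, shape):
    """Split a feature budget across layers in proportion to the density.

    Layer i is placed at x = i (an assumption); each layer gets at least
    one feature, so the counts may differ slightly from the budget after
    rounding.
    """
    weights = [skew_normal_pdf(i, loc, scale, shape) for i in range(num_layers)]
    total = sum(weights)
    return [max(1, round(total_features * w / total)) for w in weights]


if __name__ == "__main__":
    # Mass concentrated on early layers versus on late layers.
    print(layer_feature_counts(5, 1000, loc=0.0, scale=2.0, shape=0.0))
    print(layer_feature_counts(5, 1000, loc=4.0, scale=2.0, shape=0.0))
```

Moving loc toward the first layer yields the "larger early layers" shape that the abstract reports as favorable, while moving it toward the last layer reproduces the conventional widening design.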

Modeling and simulation of temperature drift for ISFET-based pH sensor and its compensation through machine learning techniques

Accepted in IJCTA 2019

The paper presents modeling and simulation of ion-sensitive field-effect transistor (ISFET)-based pH sensor with temperature-dependent behavioral macromodel and proposes to compensate the temperature drift in the sensor using intelligent machine learning (ML) models. The macromodel is built using SPICE by introducing electrochemical parameters in a metal-oxide-semiconductor field-effect transistor (MOSFET) model to simulate ISFET characteristics. We account for the temperature dependence of electrochemical and semiconductor parameters in our macromodel to increase its robustness. The macromodel is then exported as a subcircuit element, which is used to design the readout interface circuit. A simple constant-voltage, constant-current (CVCC) topology is utilized to generate the data for temperature drift in ISFET pH sensor, which is used to train and test state-of-the-art ML-based regression models in order to compensate the drift behavior. The experimental results demonstrate that the random forest (RF) technique achieves the best performance with very high correlation and low error rate. Corresponding curves for output signal using the trained models show highly temperature-independent characteristics when tested for pH 2, 4, 7, 10, and 12, and we obtained a root mean squared error (RMS) variation of ΔpH ≤ 0.024 over a temperature range of 15 °C to 55 °C in comparison with ΔpH ≤ 1.346 for uncompensated output signal. This work establishes the framework for integration of ML techniques for drift compensation of ISFET chemical sensor to improve its performance.

Rishabh Bhardwaj, Soumendu Sinha, Nishad Sahu, Sagnik Majumder, Pratik Narang, Ravindra Mukhiya, "Modeling and simulation of temperature drift for ISFET-based pH sensor and its compensation through machine learning techniques" In: International Journal of Circuit Theory and Applications. https://onlinelibrary.wiley.com/doi/abs/10.1002/cta.2618

Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition

Accepted in TBD

We introduce a unified probabilistic approach for deep continual learning based on variational Bayesian inference with open set recognition. Our model combines a probabilistic encoder with a generative model and a generative linear classifier that get shared across tasks. The open set recognition bounds the approximate posterior by fitting regions of high density on the basis of correctly classified data points and balances open-space risk with recognition errors. Catastrophic interference for both generative models is significantly alleviated through generative replay, where the open set recognition is used to sample from high density areas of the class specific posterior and reject statistical outliers. Our approach naturally allows for forward and backward transfer while maintaining past knowledge without the necessity of storing old data, regularization or inferring task labels. We demonstrate compelling results in the challenging scenario of incrementally expanding the single-head classifier for both class incremental visual and audio classification tasks, as well as incremental learning of datasets across modalities.

Martin Mundt, Sagnik Majumder, Iuliia Pliushch, Visvanathan Ramesh, "Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition" In: arXiv. https://arxiv.org/abs/1905.12019

Handwritten Digit Recognition by Elastic Matching

Accepted in JCP 2018, Australia

A simple model of MNIST handwritten digit recognition is presented here. The model is an adaptation of a previous theory of face recognition. It realizes translation and rotation invariance in a principled way instead of being based on extensive learning from large masses of sample data. The presented recognition rates fall short of other publications, but due to its inspectability and conceptual and numerical simplicity, our system commends itself as a basis for further development.

Sagnik Majumder, C. von der Malsburg, Aashish Richhariya, Surekha Bhanot, "Handwritten Digit Recognition by Elastic Matching" In: Journal of Computers, vol. 13, no. 9, pp. 1067-1074, 2018. http://www.jcomputers.us/index.php?m=content&c=index&a=show&catid=201&id=2862

Temperature compensation of ISFET based pH sensor using artificial neural networks

Accepted in IEEE RSM 2017, Malaysia

This paper presents a new Machine Learning based temperature compensation technique for Ion-Sensitive Field-Effect Transistor (ISFET). The circuit models for various electronic devices like MOSFET are available in commercial Technology Computer Aided Design (TCAD) tools such as LT-SPICE but no built-in model exists for ISFET. Considering SiO2 as the sensing film, an ISFET circuit model was created in LT-SPICE and simulations were carried out to obtain characteristic curves for SiO2 based ISFET. A Machine Learning (ML) model was trained using the data collected from the simulations performed using the ISFET macromodel in the read-out circuitry. The simulations were performed at various temperatures and the temperature drift behavior of ISFET was fed into the ML model. Constant pH (predicted by the system) curves were obtained when the device is tested for various pH (7 and 10) solutions at different ambient temperatures.

Rishabh Bharadwaj, Sagnik Majumder, Pawan K. Ajmera, Soumendu Sinha, Rishi Sharma, R. Mukhiya, Pratik Narang, "Temperature compensation of ISFET based pH sensor using artificial neural networks" In: Micro and Nanoelectronics (RSM), 2017 IEEE Regional Symposium on. IEEE. 2017, pp. 155–158. https://ieeexplore.ieee.org/document/8069141/