Proceedings of the Northern Lights Deep Learning Workshop https://eludamos.org/index.php/nldl <p>Deep learning is an emerging subfield in machine learning that has in recent years achieved state-of-the-art performance in image classification, object detection, segmentation, time series prediction and speech recognition, to name a few. This workshop will gather researchers at both the national and international level to exchange ideas, encourage collaborations and present cutting-edge research.</p> en-US sigurd.lokse@uit.no (Sigurd Løkse) septentrio@ub.uit.no (Septentrio Academic Publishing) Mon, 28 Mar 2022 08:59:17 +0200 OJS 3.3.0.7 http://blogs.law.harvard.edu/tech/rss 60 The effect of dataset confounding on predictions of deep neural networks for medical imaging https://eludamos.org/index.php/nldl/article/view/6302 <p>The use of Convolutional Neural Networks (CNNs) in medical imaging has often outperformed previous solutions and even specialists, making CNNs a promising technology for Computer-aided Diagnosis (CAD) systems. However, recent work suggests that CNNs may generalise poorly to new data, for instance data generated in different hospitals. Uncontrolled confounders have been proposed as a common reason. In this paper, we experimentally demonstrate the impact of confounding data in unknown scenarios. We assessed the effect of four confounding configurations: total, strong, light and balanced. We found that the confounding effect is especially prominent in total confounder scenarios, while the effect in light and strong confounding scenarios may depend on the dataset robustness. Our findings indicate that the confounding effect is independent of the architecture employed. These findings might explain why models can report good metrics during the development stage but fail to translate to real-world settings.
We highlight the need for thorough consideration of these commonly unattended aspects, to develop safer CNN-based CAD systems.</p> Beatriz Garcia Santa Cruz, Andreas Husch, Frank Hertel Copyright (c) 2022 Beatriz Garcia Santa Cruz, Andreas Husch, Frank Hertel https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6302 Mon, 18 Apr 2022 00:00:00 +0200 Multi-input segmentation of damaged brain in acute ischemic stroke patients using slow fusion with skip connection https://eludamos.org/index.php/nldl/article/view/6223 <p>Time is a fundamental factor during stroke treatments. A fast, automatic approach that segments the ischemic regions helps treatment decisions. In clinical use today, a set of color-coded parametric maps generated from computed tomography perfusion (CTP) images is investigated manually to decide a treatment plan. We propose an automatic method based on a neural network using a set of parametric maps to segment the two ischemic regions (core and penumbra) in patients affected by acute ischemic stroke. Our model is based on a convolution-deconvolution bottleneck structure with multi-input and slow fusion. A loss function based on the focal Tversky index addresses the data imbalance issue. The proposed architecture demonstrates effective performance and results comparable to the ground truth annotated by neuroradiologists. A Dice coefficient of 0.81 for penumbra and 0.52 for core over the large vessel occlusion test set is achieved.
The full implementation is available at: <a href="https://git.io/JtFGb">https://git.io/JtFGb</a>.</p> Luca Tomasetti, Mahdieh Khanmohammadi, Kjersti Engan, Liv Jorunn Høllesli, Kathinka Dæhli Kurz Copyright (c) 2022 Luca Tomasetti, Mahdieh Khanmohammadi, Kjersti Engan, Liv Jorunn Høllesli, Kathinka Dæhli Kurz https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6223 Mon, 28 Mar 2022 00:00:00 +0200 Expert Q-learning: Deep Reinforcement Learning with Coarse State Values from Offline Expert Examples https://eludamos.org/index.php/nldl/article/view/6237 <p>In this article, we propose a novel algorithm for deep reinforcement learning named Expert Q-learning. Expert Q-learning is inspired by Dueling Q-learning and aims to incorporate semi-supervised learning into reinforcement learning through splitting Q-values into state values and action advantages. We require that an offline expert assesses the value of a state in a coarse manner using three discrete values. An expert network is designed in addition to the Q-network, which updates each time following the regular offline minibatch update whenever the expert example buffer is not empty. Using the board game Othello, we compare our algorithm with the baseline Q-learning algorithm, which is a combination of Double Q-learning and Dueling Q-learning. Our results show that Expert Q-learning is indeed useful and more resistant to the overestimation bias. 
The baseline Q-learning algorithm exhibits unstable and suboptimal behavior in non-deterministic settings, whereas Expert Q-learning demonstrates more robust performance with higher scores, illustrating that our algorithm is indeed suitable to integrate state values from expert examples into Q-learning.</p> Li Meng, Anis Yazidi, Morten Goodwin, Paal Engelstad Copyright (c) 2022 Li Meng, Anis Yazidi, Morten Goodwin, Paal Engelstad https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6237 Mon, 28 Mar 2022 00:00:00 +0200 LayeredCNN: Segmenting Layers with Autoregressive Models https://eludamos.org/index.php/nldl/article/view/6254 <p>We address a subclass of segmentation problems where the labels of the image are structured in layers. We propose applying autoregressive CNNs which, when given an image and a partial segmentation of layers, complete the segmentation. Initializing the model with a user-provided partial segmentation allows for choosing which layers the model should segment. Alternatively, the model can produce an automatic initialization, albeit with some performance loss. The model is trained exclusively on synthetic data from our data generation algorithm. It yields impressive performance on the synthetic data and generalizes to real data it has never seen.</p> Jakob L. Christensen, Patrick Møller Jensen, Morten Rieger Hannemose, Anders Bjorholm Dahl, Vedrana Andersen Dahl Copyright (c) 2022 Jakob L. 
Christensen, Patrick Møller Jensen, Morten Rieger Hannemose, Anders Bjorholm Dahl, Vedrana Andersen Dahl https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6254 Mon, 28 Mar 2022 00:00:00 +0200 Deep Reinforcement Learning for Detection of Abnormal Anatomies https://eludamos.org/index.php/nldl/article/view/6280 <p>Automatic detection of abnormal anatomies or malformations of different structures of the human body is a challenging task that could provide support for clinicians in their daily practice. Anatomical abnormalities are rare in patients compared with normative anatomies, and the great variation within malformations makes it challenging to design deep learning frameworks for automatic detection. We propose a framework for anatomical abnormality detection, which benefits from using a deep reinforcement learning model for landmark detection trained on normative data. We detect the abnormalities using the variability between the predicted landmark configurations in a subspace based on a point distribution model of landmarks using Procrustes shape alignment and principal component analysis projection from normative data. We demonstrate the performance of this implementation on clinical CT scans of the inner ear, and show how synthetically created abnormal cochlea anatomy can be detected using the prediction of five landmarks around the cochlea. Our approach shows a Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) of 0.97, and 96% accuracy for the detection of abnormal anatomy on synthetic data.</p> Paula López Diez, Kristine Aavild Juhl, Josefine Vilsbøll Sundgaard, Hassan Diab, Jan Margeta, François Patou, Rasmus R. Paulsen Copyright (c) 2022 Paula López Diez, Kristine Aavild Juhl, Josefine Vilsbøll Sundgaard, Hassan Diab, Jan Margeta, François Patou, Rasmus R.
Paulsen https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6280 Tue, 29 Mar 2022 00:00:00 +0200 Add a SideNet to your MainNet https://eludamos.org/index.php/nldl/article/view/6286 <p>As the performance and popularity of deep neural networks have increased, so too has their computational cost. There are many effective techniques for reducing a network’s computational footprint, such as quantisation, pruning and knowledge distillation, but these lead to models whose computational cost is the same regardless of their input. Our human reaction times vary with the complexity of the tasks we perform: easier tasks (e.g. telling apart dogs from boats) are executed much faster than harder ones (e.g. telling apart two similar-looking breeds of dogs). Driven by this observation, we develop a method for adaptive network complexity by attaching a small classification layer, which we call SideNet, to a large pretrained network, which we call MainNet. Given an input, the SideNet returns a classification if its confidence level, obtained via softmax, surpasses a user-determined threshold, and only passes it along to the large MainNet for further processing if its confidence is too low. This allows us to flexibly trade off the network’s performance with its computational cost.
Experimental results show that simple single hidden layer perceptron SideNets added onto pretrained ResNet and BERT MainNets allow for substantial decreases in compute with minimal drops in performance on image and text classification tasks.</p> Adrien Morisot Copyright (c) 2022 Adrien Morisot https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6286 Mon, 28 Mar 2022 00:00:00 +0200 Continuous Metric Learning For Transferable Speech Emotion Recognition and Embedding Across Low-resource Languages https://eludamos.org/index.php/nldl/article/view/6300 <p>Speech emotion recognition (SER) refers to the technique of inferring the emotional state of an individual from speech signals. SERs continue to garner interest due to their wide applicability. While the domain is mainly founded on signal processing, machine learning and deep learning methods, generalizing over languages remains a challenge. To improve performance over languages, in this paper we propose a denoising autoencoder with semi-supervision using a continuous metric loss. The novelty of this work lies in our proposal for continuous metric learning, which is among the first proposals on the topic to the best of our knowledge. Furthermore, we contribute labels corresponding to the dimensional model, which were used to evaluate the quality of the embedding (the labels will be made available by the time of the publication). We show that the proposed method consistently outperforms the baseline method in terms of classification accuracy and correlation with respect to the dimensional variables.</p> Sneha Das, Nicklas Leander Lund, Nicole Nadine Lønfeldt, Anne Katrine Pagsberg, Line H. Clemmensen Copyright (c) 2022 Sneha Das, Nicklas Leander Lund, Nicole Nadine Lønfeldt, Anne Katrine Pagsberg, Line H.
Clemmensen https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6300 Wed, 06 Apr 2022 00:00:00 +0200 Re-Identification of Giant Sunfish using Keypoint Matching https://eludamos.org/index.php/nldl/article/view/6234 <p>We present the first work where re-identification of the Giant Sunfish (Mola alexandrini) is automated using computer vision and deep learning. We propose a pipeline that scores an mAP of 60.34% on a full rank of the novel TinyMola dataset, which includes 31 IDs and 91 images. The method requires no domain adaptation or training, which makes it especially suited for low-budget or volunteer-based projects, like <em>Match My Mola</em>, as part of a human-in-the-loop model.</p> <p>The pipeline includes segmentation, keypoint detection and description, keypoint matching, and ranking. The choice of feature descriptor has the largest impact on the performance, and we show that the deep learning based SuperPoint descriptor greatly outperforms handcrafted descriptors like SIFT and RootSIFT independent of the segmentation level and matching method. Combining SuperPoint and the graph neural network based SuperGlue matching method produces the best results.</p> Malte Pedersen, Joakim Bruslund Haurum, Thomas B. Moeslund, Marianne Nyegaard Copyright (c) 2022 Malte Pedersen, Joakim Bruslund Haurum, Thomas B. Moeslund, Marianne Nyegaard https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6234 Mon, 28 Mar 2022 00:00:00 +0200 Detection of forest roads in Sentinel-2 images using U-Net https://eludamos.org/index.php/nldl/article/view/6246 <p>This paper presents a new method for semi-automatic detection of nature interventions in Sentinel-2 satellite images with 10 m spatial resolution. The Norwegian Environment Agency is maintaining a map of undisturbed nature in Norway.
U-Net was used for automated detection of new roads, as these are often the cause whenever the area of undisturbed nature is reduced. The method was able to detect many new roads, but with some false positives and possibly some false negatives (i.e., missing new roads). In conclusion, we have demonstrated that automated detection of new roads, for the purpose of updating the map of undisturbed nature, is possible. We have also suggested several improvements to increase the method's usefulness.</p> Øivind Trier, Arnt-Børre Salberg, Ragnvald Larsen, Ole Torbjørn Nyvoll Copyright (c) 2022 Øivind Trier, Arnt-Børre Salberg, Ragnvald Larsen, Ole Torbjørn Nyvoll https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6246 Mon, 28 Mar 2022 00:00:00 +0200 An analysis of over-sampling labeled data in semi-supervised learning with FixMatch https://eludamos.org/index.php/nldl/article/view/6269 <p>Most semi-supervised learning methods over-sample labeled data when constructing training mini-batches. This paper studies whether this common practice improves learning and how. We compare it to an alternative setting where each mini-batch is uniformly sampled from all the training data, labeled or not, which greatly reduces direct supervision from true labels in typical low-label regimes. However, this simpler setting can also be seen as more general and even necessary in multitask problems where over-sampling labeled data would become intractable. Our experiments on semi-supervised CIFAR-10 image classification using FixMatch show a performance drop when using the uniform sampling approach, which diminishes when the amount of labeled data or the training time increases. Further, we analyse the training dynamics to understand how over-sampling of labeled data compares to uniform sampling.
Our main finding is that over-sampling is especially beneficial early in training but gets less important in the later stages when more pseudo-labels become correct. Nevertheless, we also find that keeping some true labels remains important to avoid the accumulation of confirmation errors from incorrect pseudo-labels.</p> Miquel Marti, Sebastian Bujwid, Alessandro Pieropan, Hossein Azizpour, Atsuto Maki Copyright (c) 2022 Miquel Marti, Sebastian Bujwid, Alessandro Pieropan, Hossein Azizpour, Atsuto Maki https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6269 Thu, 07 Apr 2022 00:00:00 +0200 CryoSat-2 waveform classification for melt event monitoring https://eludamos.org/index.php/nldl/article/view/6284 <p>Measuring the mass balance of ice sheets is important for understanding, among other things, sea level rise, glacier dynamics, global ocean circulation and marine ecosystems. One important parameter of the mass balance is surface melt, which can be estimated from different satellite data sources. In this study we investigate the potential of utilizing machine learning techniques for CryoSat-2 (CS2) radar altimeter waveform classification in order to derive melt information. Training data is derived by spatio-temporal matching of CS2 measurements with MODIS land surface temperature measurements. We propose a time convolution network with a fully connected classifier tail for CS2 waveform classification. In addition, a non-deep-learning model is implemented as a baseline. One of the main challenges is the high class imbalance, as surface temperatures on the interior of Greenland rarely reach the freezing point. The model performance is measured by several metrics: F1 score, average recall and Matthews correlation coefficient.
The results of this proof-of-concept study indicate feasibility.</p> Martijn Vermeer, David Völgyes, Malcolm McMillan, Daniele Fantin Copyright (c) 2022 Martijn Vermeer, David Völgyes, Malcolm McMillan, Daniele Fantin https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6284 Mon, 28 Mar 2022 00:00:00 +0200 Towards Understanding of User Perceptions for Smart Border Control Technologies using a Fine-Tuned Transformer Approach https://eludamos.org/index.php/nldl/article/view/6292 <p>Smart Border Control (SBC) technologies became a hot topic in recent years when the European Union (EU) Commission announced the Smart Borders Package to improve the efficiency and security of the border crossing points (BCPs). Although BCP technologies have potential benefits in terms of enabling the processing of travellers' data, they still pose acceptability and usability challenges when used by travellers. The success of technologies depends on user acceptance. Sentiment analysis is one of the primary techniques to measure user acceptance. Although a variety of studies in the literature have used sentiment analysis to understand user acceptance in different domains, to the best of our knowledge no study has used sentiment analysis to measure the user acceptance of SBC technologies. Thus, in this study, we propose a fine-tuned transformer model along with an automatic sentiment label generation technique to perform sentiment analysis as a step towards getting insights into user acceptance of BCP technologies. The results obtained in this study are promising, given that no training data is available from BCPs.
The proposed approach was validated against the IMDB reviews dataset and achieved a weighted F1-score of 79% for the sentiment analysis task.</p> Sarang Shaikh, Sule Yildirim Yayilgan, Erjon Zoto, Mohamed Abomhara Copyright (c) 2022 Sarang Shaikh, Sule Yildirim Yayilgan, Erjon Zoto, Mohamed Abomhara https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6292 Mon, 28 Mar 2022 00:00:00 +0200 Modelling Phytoplankton Behaviour in the North and Irish Sea with Transformer Networks https://eludamos.org/index.php/nldl/article/view/6229 <p>Climate change will affect how water sources are managed and monitored. Continuous monitoring of water quality is crucial to detect pollution, to ensure that various natural cycles are not disrupted by anthropogenic activities and to assess the effectiveness of beneficial management measures taken under defined protocols. One such disruption is algal blooms, in which populations of phytoplankton increase rapidly, affecting biodiversity in marine environments. The frequency of algal blooms will increase with climate change as it presents favourable conditions for reproduction of phytoplankton. Machine learning has been used for early detection of algal blooms previously, with the focus mostly on single closed bodies of water in Far East Asia with short time ranges. In this work, we study four locations around the North Sea and the Irish Sea with different characteristics, predicting activity over longer time spans and explaining the importance of the input with regard to the output of the prediction model.
This work helps domain experts monitor potential changes to the ecosystem over longer time ranges and take action when necessary.</p> Onatkut Dagtekin, Nina Dethlefs Copyright (c) 2022 Onatkut Dagtekin, Nina Dethlefs https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6229 Mon, 28 Mar 2022 00:00:00 +0200 Fast accuracy estimation of deep learning based multi-class musical source separation https://eludamos.org/index.php/nldl/article/view/6241 <p>Music source separation represents the task of extracting all the instruments from a given song. Recent breakthroughs on this challenge have gravitated around a single dataset, MUSDB, which is limited to four instrument classes. Larger datasets with more instruments make collecting data and training deep neural networks (DNNs) costly and time-consuming. In this work, we propose a fast method to evaluate the separability of instruments in any dataset without training and tuning a DNN. This separability measure helps to select appropriate samples for the efficient training of neural networks. Based on the oracle principle with an ideal ratio mask, our approach is an excellent proxy to estimate the separation performance of state-of-the-art deep learning approaches such as TasNet or Open-Unmix. Our results contribute to revealing two essential points for audio source separation: 1) the ideal ratio mask, although light and straightforward, provides an accurate measure of the audio separability performance of recent neural nets, and 2) new end-to-end learning methods such as TasNet, which operate directly on waveforms, are, in fact, internally building a Time-Frequency (TF) representation, so that they encounter the same limitations as the TF-based methods when separating audio patterns overlapping in the TF plane.
</p> Alexandru Mocanu, Benjamin Ricaud, Milos Cernak Copyright (c) 2022 Alexandru Mocanu, Benjamin Ricaud, Milos Cernak https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6241 Mon, 28 Mar 2022 00:00:00 +0200 Mutual information estimation for graph convolutional neural networks https://eludamos.org/index.php/nldl/article/view/6257 <p>Measuring model performance is a key issue for deep learning practitioners. However, we often lack the ability to explain why a specific architecture attains superior predictive accuracy for a given data set. Often, validation accuracy is used as a performance heuristic quantifying how well a network generalizes to unseen data. Mutual information can be used as a measure of the quality of internal representations in deep learning models, and the information plane provides insights into whether the model exploits the available information in data.</p> <p>The information plane has previously been explored for fully connected neural networks and convolutional architectures. We present an architecture-agnostic method for tracking a network's internal representations during training, which are then used to create the mutual information plane. The method is exemplified for a graph convolutional neural network fitted on the Cora citation data.
We compare how the inductive bias introduced in the graph convolutional architecture changes the mutual information plane relative to a fully connected neural network.</p> Marius Cervera Landsverk, Signe Riemer-Sørensen Copyright (c) 2022 Marius Cervera Landsverk, Signe Riemer-Sørensen https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6257 Mon, 28 Mar 2022 00:00:00 +0200 SparseMeshCNN with Self-Attention for Segmentation of Large Meshes https://eludamos.org/index.php/nldl/article/view/6281 <p>In many clinical applications, 3D mesh models of human anatomies are important tools for visualization, diagnosis, and treatment planning. Such 3D mesh models often have a high number of vertices to capture the complex shape, and processing these large meshes on readily available graphics cards can be a challenging task. To accommodate this, we present a sparse version of MeshCNN called SparseMeshCNN, which can process meshes with more than 60 000 edges. We further show that adding non-local attention in the network can mitigate the small receptive field and improve the results.
The developed methodology was applied to separate the Left Atrial Appendage (LAA) from the Left Atrium (LA) on 3D mesh models constructed from medical images, but the method is general and can be put to use in any application within mesh classification or segmentation where memory can be a concern.</p> Bjørn Hansen, Mathias Lowes, Thomas Ørkild, Anders Dahl, Vedrana Dahl, Ole de Backer, Oscar Camara, Rasmus Paulsen, Christian Ingwersen, Kristine Sørensen Copyright (c) 2022 Bjørn Hansen, Mathias Lowes, Thomas Ørkild, Anders Dahl, Vedrana Dahl, Ole de Backer, Oscar Camara, Rasmus Paulsen, Christian Ingwersen, Kristine Sørensen https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6281 Fri, 08 Apr 2022 00:00:00 +0200 Optimizing Slimmable Networks for Multiple Target Platforms https://eludamos.org/index.php/nldl/article/view/6288 <div class="page" title="Page 1"> <div class="layoutArea"> <div class="column"> <p>In this work, we extend platform-aware adaptive training to the weighted average of multiple target platforms, where the weighting is determined e.g. by the market share of the target platform. To simulate different market regimes, we generate different weight settings by a Chinese restaurant process to benchmark optimization strategies. We use a neural architecture search framework based on Markov Random Fields to efficiently find the optimal channel configurations for each platform, and investigate different sampling strategies to train a single slimmable network that can be deployed to multiple platforms at the same time. Empirical results on CIFAR-100 demonstrate improved performance over the original slimmable network across different weight settings, while maintaining efficient training.</p> </div> </div> </div> Zifu Wang, Matthew B. Blaschko Copyright (c) 2022 Zifu Wang, Matthew B. 
Blaschko https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6288 Fri, 17 Jun 2022 00:00:00 +0200 Extracting Rules from Neural Networks with Partial Interpretations https://eludamos.org/index.php/nldl/article/view/6301 <p>We investigate the problem of extracting rules, expressed in Horn logic, from neural network models. Our work is based on the exact learning model, in which a learner interacts with a teacher (the neural network model) via queries in order to learn an abstract target concept, which in our case is a set of Horn rules. We consider partial interpretations to formulate the queries. These can be understood as a representation of the world where part of the knowledge regarding the truth values of propositions is unknown. We employ Angluin’s algorithm for learning Horn rules via queries and evaluate our strategy empirically.</p> Cosimo Persia, Ana Ozaki Copyright (c) 2022 Cosimo Persia, Ana Ozaki https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6301 Tue, 29 Mar 2022 00:00:00 +0200 Forecasting Aquaponic Systems Behaviour With Recurrent Neural Networks Models https://eludamos.org/index.php/nldl/article/view/6236 <p>Aquaponic systems provide a reliable solution for growing vegetables while cultivating fish (or other aquatic organisms) in a controlled environment. The main advantage of these systems compared with traditional soil-based agriculture and aquaculture installations is the ability to produce fish and vegetables with low water consumption. Aquaponics requires a robust control system capable of optimizing fish and plant growth while ensuring safe operation. To support the control system, this work explores the design process of Deep Learning models based on Recurrent Neural Networks to forecast one hour of pH values in small-scale industrial Aquaponics. This implementation guides us through the machine learning life-cycle with industrial time-series data, i.e.
data acquisition, pre-processing, feature engineering, architecture selection, training, and model verification.</p> Juan Cardenas-Cartagena, Mohamed Elnourani, Baltasar Beferull-Lozano Copyright (c) 2022 Juan Cardenas-Cartagena, Mohamed Elnourani, Baltasar Beferull-Lozano https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6236 Mon, 28 Mar 2022 00:00:00 +0200 Unsupervised Time Series Classification for Climate Data https://eludamos.org/index.php/nldl/article/view/6250 <div class="page" title="Page 1"> <div class="layoutArea"> <div class="column"> <p>The outstanding success of Convolutional Neural Network (CNN) image classification in the last few years has influenced the application of this technique to a variety of embeddable entities. CNN image classification methods achieve high accuracy, but they are based on supervised machine learning, which requires labeling of input data and does not help to understand unknown data. In this study we introduce an unsupervised machine learning model that categorizes entity pairs into classes of similar and non-similar pairs by converting pairs of entities to mirror vectors, transforming mirror vectors to Gramian Angular Field (GAF) images and classifying the images using CNN transfer learning classification. Based on climate data, we demonstrate several scenarios that show when this model is reliable for time series pair classification.</p> </div> </div> </div> Alex Romanova Copyright (c) 2022 Alex Romanova https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6250 Mon, 28 Mar 2022 00:00:00 +0200 Learnable filter-banks for CNN-based audio applications https://eludamos.org/index.php/nldl/article/view/6279 <p>We investigate the design of a convolutional layer where kernels are parameterized functions. This layer aims at being the input layer of convolutional neural networks for audio applications or applications involving time-series.
The kernels are defined as one-dimensional functions having a band-pass filter shape, with a limited number of trainable parameters. Building on the literature on this topic, we confirm that networks having such an input layer can achieve state-of-the-art accuracy on several audio classification tasks. We explore the effect of different parameters on the network accuracy and learning ability. This approach reduces the number of weights to be trained and enables larger kernel sizes, an advantage for audio applications. Furthermore, the learned filters bring additional interpretability and a better understanding of the audio properties exploited by the network.</p> Helena Peic Tukuljac, Benjamin Ricaud, Nicolas Aspert, Laurent Colbois Copyright (c) 2022 Helena Peic Tukuljac, Benjamin Ricaud, Nicolas Aspert, Laurent Colbois https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6279 Mon, 28 Mar 2022 00:00:00 +0200 Photo-Realistic Continuous Image Super-Resolution with Implicit Neural Networks and Generative Adversarial Networks https://eludamos.org/index.php/nldl/article/view/6285 <p>Implicit neural networks (INNs) can represent images in the continuous domain. They consume raw (X, Y) coordinates and output a color value. Therefore, they can represent and generate images at arbitrarily high resolutions, in contrast to convolutional neural networks (CNNs), which output a constant-sized array of pixels. In this work, we show how to super-resolve a single image using an INN to produce sharp and photo-realistic images. We employ a random patch-based coordinate sampling method to obtain patches with context and structure; we use these patches to train the INN in an adversarial setting. We demonstrate that the trained network retains the desirable properties of INNs while the output is sharper than in previous work.
We also show qualitative and quantitative comparisons with INN and CNN baselines on the benchmark datasets DIV2K, Set5, Set14, Urban100, and B100. Our code will be made public.</p> Muhammad Sarmad, Leonardo Ruspini, Frank Lindseth Copyright (c) 2022 Muhammad Sarmad, Leonardo Ruspini, Frank Lindseth https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6285 Mon, 28 Mar 2022 00:00:00 +0200 Uncertainty Quantification of Surrogate Explanations: an Ordinal Consensus Approach https://eludamos.org/index.php/nldl/article/view/6294 <p>Explainability of black-box machine learning models is crucial, in particular when they are deployed in critical applications such as medicine or autonomous cars. Existing approaches produce explanations for the predictions of models; however, how to assess the quality and reliability of such explanations remains an open question. In this paper we take a step further in order to provide the practitioner with tools to judge the trustworthiness of an explanation. To this end, we produce estimates of the uncertainty of a given explanation by measuring the ordinal consensus amongst a set of diverse bootstrapped surrogate explainers. While we encourage diversity by using ensemble techniques, we propose and analyse metrics to aggregate the information contained within the set of explainers through a rating scheme. We empirically illustrate the properties of this approach through experiments on state-of-the-art Convolutional Neural Network ensembles. 
Furthermore, through tailored visualisations, we show specific examples of situations where uncertainty estimates offer concrete actionable insights to the user beyond those arising from standard surrogate explainers.</p> Jonas Schulz, Raul Santos-Rodriguez, Rafael Poyiadzi Copyright (c) 2022 Jonas Schulz, Raul Santos-Rodriguez, Rafael Poyiadzi https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6294 Mon, 28 Mar 2022 00:00:00 +0200 Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning https://eludamos.org/index.php/nldl/article/view/6231 <p>Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work investigates, by an empirical study, the potential for transfer learning of such models to the Swedish language. DialoGPT, an English-language pre-trained model, is adapted by training on three different Swedish-language conversational datasets obtained from publicly available sources. Perplexity score (an automated intrinsic language model metric) and surveys by human evaluation were used to assess the performance of the fine-tuned models, with results that indicate that the capacity for transfer learning can be exploited with considerable success. Human evaluators asked to score the simulated dialogue judged over 57% of the chatbot responses to be human-like for the model trained on the largest (Swedish) dataset. 
We provide the demos and model checkpoints of our English and Swedish chatbots on the HuggingFace platform for public use.</p> Tosin Adewumi, Rickard Brännvall, Nosheen Abid, Maryam Pahlavan, Sana Sabah, Foteini Liwicki, Marcus Liwicki Copyright (c) 2022 Tosin Adewumi, Rickard Brännvall, Nosheen Abid, Maryam Pahlavan, Sana Sabah, Foteini Liwicki, Marcus Liwicki https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6231 Mon, 28 Mar 2022 00:00:00 +0200 A mammography classification model trained from image labels only https://eludamos.org/index.php/nldl/article/view/6244 <p>The Cancer Registry of Norway organises a population-based breast cancer screening program in which 250 000 women participate each year. The interpretation of the screening mammograms is a manual process, but deep neural networks are showing potential in mammographic screening. Most approaches rely on models trained from pixel-level annotations, but these require expertise and are time-consuming to produce. Image-level annotations, however, are readily available from the screenings. In this work we present a few models trained from image-level annotations from the Norwegian dataset: a holistic model, an attention model and an ensemble model. We compared their performance with that of pretrained models based on pixel-level annotations, trained on international datasets. 
We found that models trained on our local data with image-level annotations performed considerably better than the pretrained models from external data, even though the latter were based on pixel-level annotations.</p> Fredrik Dahl, Marit Holden, Olav Brautaset, Line Eikvil Copyright (c) 2022 Fredrik Dahl, Marit Holden, Olav Brautaset, Line Eikvil https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6244 Mon, 28 Mar 2022 00:00:00 +0200 Surrogate-data-enriched Physics-Aware Neural Networks https://eludamos.org/index.php/nldl/article/view/6268 <p>Neural networks can be used as surrogates for PDE models. They can be made physics-aware by penalizing violations of the underlying equations or of the conservation of physical properties in the loss function during training. Current approaches additionally allow data from numerical simulations or experiments to be incorporated in the training process. However, this data is frequently expensive to obtain and thus only scarcely available for complex models. In this work, we investigate how physics-aware models can be enriched with computationally cheaper, but inexact, data from other surrogate models like Reduced-Order Models (ROMs). In order to avoid trusting too-low-fidelity surrogate solutions, we develop an approach that is sensitive to the error in inexact data. 
As a proof of concept, we consider the one-dimensional wave equation and show that the training accuracy is increased by two orders of magnitude when inexact data from ROMs is incorporated.</p> Raphael Leiteritz, Patrick Buchfink, Bernard Haasdonk, Dirk Pflüger Copyright (c) 2022 Raphael Leiteritz, Patrick Buchfink, Bernard Haasdonk, Dirk Pflüger https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6268 Mon, 28 Mar 2022 00:00:00 +0200 Compressing CNN Kernels for Videos Using Tucker Decompositions: Towards Lightweight CNN Applications https://eludamos.org/index.php/nldl/article/view/6282 <p>Convolutional Neural Networks (CNNs) are the state of the art in the field of visual computing. However, a major problem with CNNs is the large number of floating point operations (FLOPs) required to perform convolutions for large inputs. When considering the application of CNNs to video data, convolutional filters become even more complex due to the extra temporal dimension. This leads to problems when the respective applications are to be deployed on devices with less computational power, such as smartphones, tablets, or micro-controllers. <br>Kim et al. proposed using a Tucker decomposition to compress the convolutional kernel of a pre-trained network for images in order to reduce the complexity of the network, i.e. the number of FLOPs. In this paper, we generalize the aforementioned method for application to videos (and other 3D signals) and evaluate the proposed method on a modified version of the THETIS data set, which contains videos of individuals performing tennis shots. We show that the compressed network reaches comparable accuracy while achieving memory compression by a factor of 51. 
However, the actual computational speed-up (factor 1.4) does not meet our theoretically derived expectation (factor 6).</p> Tobias Engelhardt Rasmussen, Line KH Clemmensen, Andreas Baum Copyright (c) 2022 Tobias Engelhardt Rasmussen, Line KH Clemmensen, Andreas Baum https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6282 Mon, 28 Mar 2022 00:00:00 +0200 Self-Communicating Deep Reinforcement Learning Agents Develop External Number Representations https://eludamos.org/index.php/nldl/article/view/6291 <p>Symbolic numbers are a remarkable product of human cultural development. The developmental process involved the creation and progressive refinement of material representational tools, such as notched tallies, knotted strings, and counting boards. In this paper, we introduce a computational framework that allows the investigation of how material representations might support number processing in a deep reinforcement learning scenario. In this framework, agents can use an external, discrete state to communicate information to solve a simple numerical estimation task. We find that different perceptual and processing constraints result in different emergent representations, whose specific characteristics can facilitate the learning and communication of numbers.</p> Silvester Sabathiel, Trygve Solstad, Alberto Testolin, Flavio Petruzzellis Copyright (c) 2022 Silvester Sabathiel, Trygve Solstad, Alberto Testolin, Flavio Petruzzellis https://creativecommons.org/licenses/by/4.0 https://eludamos.org/index.php/nldl/article/view/6291 Thu, 16 Jun 2022 00:00:00 +0200