This roundup collects the year's most intriguing AI research publications, spanning innovations in artificial intelligence (AI) and data science. The list is organized chronologically, and each entry includes a link to a lengthier article.

Target-Independent Domain Adaptation for WBC Classification Using Generative Latent Search

Automating the classification of microscope images of White Blood Cells (WBCs) and related cell types has become vital because it accelerates the slow manual process of review and diagnosis. Many State-Of-The-Art (SOTA) methods based on Deep Convolutional Neural Networks suffer from domain shift: their performance degrades sharply when they are tested on data (the target) collected under conditions that differ from the training set (the source). Such shifts arise from differences in cameras, microscopes, lenses, lighting, and so on. Unsupervised Domain Adaptation (UDA) techniques can address this problem, but standard UDA algorithms assume that enough unlabeled target data is available, which is not always the case with medical images.

In this study, the researchers devise a UDA method that requires no target data at training time. For a given test image from the target domain, they retrieve its "closest clone" from the source distribution and use that clone as a proxy input to the classifier. They argue that such a clone exists because an unlimited number of points can be sampled from the source distribution. To find it, they propose a latent-variable generative model based on variational inference that simultaneously samples from the source distribution and searches for the closest clone via an optimization in the latent space. They show that the proposed method outperforms several SOTA UDA approaches for WBC classification across datasets with different imaging modalities and settings.
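To make the core idea concrete, here is a minimal sketch of latent search, assuming a decoder (generator) pretrained on source data and a classifier trained on source labels. The function and variable names are hypothetical, and the paper's actual method performs this search within a variational-inference framework rather than plain gradient descent on a single latent vector.

```python
import torch

# Hedged sketch: search the latent space of a source-trained decoder G
# for the "closest clone" G(z*) of a target test image, then classify
# the clone with the source-trained classifier. `decoder` and
# `classifier` are assumed to be pretrained; names are illustrative.

def closest_clone(decoder, x_target, latent_dim=64, steps=500, lr=1e-2):
    z = torch.randn(1, latent_dim, requires_grad=True)  # start from the prior
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x_hat = decoder(z)  # candidate image drawn from the source distribution
        loss = torch.nn.functional.mse_loss(x_hat, x_target)  # distance to target
        loss.backward()
        opt.step()
    return decoder(z).detach()  # the "closest clone" found by the search

# Usage (both models pretrained on source data only):
# clone = closest_clone(decoder, x_target)
# prediction = classifier(clone).argmax(dim=1)
```

Because the clone is drawn from the source distribution, the classifier never sees out-of-distribution input, which is the mechanism by which the method sidesteps domain shift without any target data.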

Streaming Coresets for Symmetric Tensor Factorization

Tensor factorization has recently become crucial to many machine learning pipelines, particularly for latent variable models. The researchers show how to perform it quickly and accurately in the streaming setting. They give methods for selecting, from a stream of n vectors in Rd, a sublinear number of vectors to serve as a coreset. These methods guarantee that the CP decomposition of the p-moment tensor of the coreset is close to the corresponding decomposition of the p-moment tensor computed from the full data. The researchers develop two new algorithmic techniques, online filtering and kernelization, and present six algorithms that combine them to obtain different trade-offs between coreset size, update time, and working space, matching or beating other state-of-the-art algorithms.
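As a rough illustration of the online-filtering idea, one can stream vectors one at a time and keep only those that are poorly explained by the span of the vectors already kept. This is a simplified heuristic, not the authors' exact selection rule, which also attaches weights to kept points so that the p-moment tensor is preserved.

```python
import numpy as np

# Simplified online filtering: keep an incoming vector only if its
# residual outside the span of previously kept vectors is large.

def online_filter(stream, dim, tol=0.1):
    basis = np.zeros((0, dim))  # orthonormal basis of kept directions
    coreset = []
    for v in stream:
        residual = v - basis.T @ (basis @ v)  # component outside current span
        if np.linalg.norm(residual) > tol * np.linalg.norm(v):
            coreset.append(v)  # v carries new information: keep it
            basis = np.vstack([basis, residual / np.linalg.norm(residual)])
    return coreset

# Example: 1,000 vectors in R^20 that actually lie in a 5-dim subspace.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 5)) @ rng.normal(size=(5, 20))
print(len(online_filter(data, dim=20)))  # far fewer than 1,000 vectors kept
```

The appeal in the streaming setting is that each vector is examined once, and the working space grows with the coreset rather than with the stream.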

On Coresets for Regularized Regression

The researchers study the effect of norm-based regularization on the size of coresets for regression problems. In particular, they examine coreset sizes for regularized variants of regression. Prior work showed that ridge regression admits a smaller coreset than least squares regression, its unregularized counterpart. The researchers show that, in general, a coreset for regularized regression cannot be smaller than the optimal coreset for the unregularized variant. The well-known lasso problem falls into this category, which rules out using a coreset smaller than the one required for least squares regression.
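For intuition about why ridge regression is so closely tied to least squares, recall the standard identity that ridge is ordinary least squares on an augmented system. The snippet below verifies this numerically; it is background intuition, not the paper's coreset construction.

```python
import numpy as np

# Standard identity, verified numerically:
#   min ||Ax - b||^2 + lam * ||x||^2
# = min ||[A; sqrt(lam) I] x - [b; 0]||^2

rng = np.random.default_rng(1)
n, d, lam = 100, 5, 0.5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

# Closed-form ridge solution
x_ridge = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)

# Same solution via the augmented least squares system
A_aug = np.vstack([A, np.sqrt(lam) * np.eye(d)])
b_aug = np.concatenate([b, np.zeros(d)])
x_aug, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)

print(np.allclose(x_ridge, x_aug))  # True
```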

The researchers then propose a modified version of the lasso problem and obtain for it a coreset smaller than the one required for least squares regression. They demonstrate empirically that, like the original lasso, the modified version induces sparsity in the solution. They also extend their methods to multi-response regularized regression and conclude with empirical evidence on the performance of the modified lasso.
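The modified lasso itself is specific to the paper, but the sparsity behavior it is measured against is easy to see with the standard lasso. The sketch below contrasts lasso's sparse solution with ridge's dense one; the data and parameter values are arbitrary.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Illustration of the sparsity claim using the STANDARD lasso
# (not the paper's modified variant): lasso zeroes out coefficients,
# ridge does not.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 50))
true_w = np.zeros(50)
true_w[:5] = 3.0  # only 5 informative features out of 50
y = X @ true_w + 0.1 * rng.normal(size=200)

lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=0.1).fit(X, y)
print((lasso.coef_ != 0).sum())  # roughly 5 nonzero coefficients
print((ridge.coef_ != 0).sum())  # all 50 nonzero
```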
