PhD Position in Coding Theory and Machine Learning
IMT Atlantique has an open PhD position in machine learning in support of coding theory & practice: learning to code better and decode smarter for reliable and efficient transmission or storage of digital data in uncertain environments.

Context and research overview

Digital communication links transmit information bits from here to there. Computer memories and other digital storage systems carry them from now to then. Cloud storage or distributed computing systems do both at the same time. In all cases, noise or system failures will cause part of the received data to differ from the original message. Forward error-correction (FEC) is the art of introducing controlled redundancy into a digital message, in the form of additional check symbols, to achieve a high degree of reliability over space and time, despite errors or failures. Coding theory is the science of designing efficient FEC codes that are simple to encode, simple to decode, and meet the target level of performance with as few check bits as possible.

After seventy years of continuing research and practice in coding theory, error-correction at medium to long block length has now reached a mature state. A fair amount of experience has been accumulated on the design of FEC codes and the implementation of message-passing decoders that achieve or closely approach the ultimate performance limit (channel capacity) at data rates of several tens or even hundreds of Gb/s [1]. Still, many challenges remain in the field of channel coding. In particular, short-packet machine-to-machine communications at the heart of the Internet of Things (IoT) have revived interest in the design of efficient short codes for messages ranging from a few tens up to a few hundred bits. Yet much less is known or understood regarding optimal coding at short length. The rules or methods that drive the design of capacity-approaching Low-Density Parity-Check (LDPC), Turbo, or Polar codes are for the most part asymptotic in nature, and thus fail to produce short codes that perform well under message-passing decoding. Experimental evidence collected so far suggests that an unavoidable increase in decoding complexity has to be paid in order to closely approach the short-block-length performance limits [2].

Meanwhile, over the past decade, deep learning has achieved superior performance on problems that are notoriously difficult to tackle by conventional approaches, in areas such as computer vision or speech processing. This has fuelled a resurgence of interest in applying machine learning (ML) techniques to a variety of communication problems. The availability of off-the-shelf learning software packages now makes it possible to parametrize communication systems and algorithms, or even replace them with generic, black-box deep neural networks, and to train them to perform at least as well as the state of the art [3]. Application to FEC code design and decoding is no exception to this general trend, as both operations require searching very high-dimensional spaces [4, 5]. Yet the application of ML to channel coding is still in its early stages, and its potential benefits for the field are neither clear nor well understood.

The central goal of this PhD thesis is to investigate thoroughly how ML can help improve current knowledge and practice in the design and the decoding of short FEC codes. The focus will be placed on LDPC codes, a family of powerful error-correcting codes that has found widespread use in storage and wireless communication systems, e.g. in Wi-Fi or in 5G. Three key and inter-related problems will be addressed: 1) how to design short LDPC codes that perform well under iterative message-passing decoding; 2) how to decode short LDPC codes close to their best possible performance (maximum-likelihood decoding or near); 3) how to design robust LDPC decoders that automatically adapt, from the received data, to unknown channel parameters or mismatches in the assumed channel model. Not only are these three problems regarded as ideal playgrounds for assessing the potential benefits of ML for channel coding, but any advances on these questions are also expected to find rapid application in emerging communication systems. In addressing these three problems, the PhD candidate will be exposed to a variety of learning approaches, including, but not limited to, supervised Deep Neural Networks (DNN), Reinforcement Learning (RL), and Meta-Learning. Here the intent is not to replace expert knowledge with black-box algorithms, but ultimately to learn from the machine. Accordingly, the driving methodology will be to follow a model-based learning approach, whereby existing code design methods or decoders will be supplemented with learning algorithms wherever relevant, and then the trained solutions will be analyzed in order to gain new insight that could translate into improved code design tools and decoders.

Expected start date for the PhD is September 2022 or sooner.

About the project

This PhD work will be supported by the French ANR-21 AI4CODE research project (funding is secured). The AI4CODE project brings together 6 research teams with strong expertise in the design, decoding and standardization of forward-error-correction codes. The aim is to develop skills in artificial intelligence and machine learning, and to explore how learning techniques can contribute to the improvement of code design methods (by using fewer parameters, more relevant heuristics, producing stronger codes) and decoders (better performance, reduced complexity or energy consumption), in selected scenarios of practical interest for which a full theoretical understanding is still lacking. Our ultimate goal is to obtain new theoretical insight that could translate into better codes and decoders.

Project homepage: https://ai4code.projects.labsticc.fr/

About the research team

The PhD candidate will join the CODES team of the UMR CNRS Lab-STICC laboratory, and will be located at the Mathematical & Electrical Engineering (MEE) Dept of IMT Atlantique, Brest. Some of the core FEC solutions at the heart of the 3G/4G and DVB-T2 standards were invented at IMT Atlantique.

The CODES research team designs and develops channel and source coding solutions for emerging digital communication and storage systems. The primary challenge and driver at the heart of our research work is to propose codes and algorithms that can operate close to the fundamental limits in error-correction and source coding while simultaneously addressing the needs of practical applications, either in terms of system specifications (e.g. short packets, distributed network architectures, uncoordinated communications, etc.) or hardware constraints (energy consumption, ultra-high-rate transmission, non-faulty computations, etc.).

The PhD work will be jointly supervised by Prof. Charbel Abdel-Nour, Dr Raphaël Le Bidan, and Dr Elsa Dupraz.

Candidate profile

MSc degree or equivalent in one of the following areas: information theory, communication theory, signal processing, or applied mathematics. Some previous experience with Machine Learning algorithms will be highly appreciated. The candidate should also be familiar and comfortable with Python programming.

How to apply

Candidates are invited to send an application email to [email protected] with copy to [email protected] and [email protected], describing in a few lines their interest in and skills for the proposed research work, along with:

  • A full CV with a list of past projects and courses related to the subject
  • Complete academic records, from Bachelor to MSc
  • The names of one or two referees (past advisors) whom we could contact

Applications will be reviewed and processed on a rolling basis until a candidate is selected.

References

[1] N. Wehn, “Channel Coding for Tbit/s Communications: An Implementation Centric View”, Proc. European Conf. on Networks and Communications (EuCNC), Valencia, Spain, June 2019.

[2] M. C. Coskun, G. Durisi, T. Jerkovits, G. Liva, W. Ryan, B. Stein, and F. Steiner, “Efficient error-correcting codes in the short blocklength regime”, Phys. Commun., vol. 34, June 2019. arXiv:1812.08562

[3] T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer”, IEEE Trans. Cogn. Commun. Networks, vol. 3, no. 4, Dec. 2017. arXiv:1702.00832

[4] L. Huang et al., “AI coding: Learning to construct error-correcting codes”, IEEE Trans. Commun., vol. 68, no. 1, Jan. 2020. arXiv:1901.05719

[5] E. Nachmani et al., “Deep learning methods for improved decoding of linear codes”, IEEE J. Selec. Topics Signal Process., vol. 12, no. 1, Feb. 2018. arXiv:1706.07043