MTH6108 Coding theory

MTH6108 Coding theory

This is the home page for the Coding theory module, to be given in the second semester of 2009–10 at Queen Mary, University of London.


Lectures, classes and office hours

Lectures

  • Monday 10, Eng 325
  • Monday 12, CMLT
  • Thursday 3, Eng 324

Classes

  • Friday 11, Geog 226
(The second class has been cancelled, due to lack of attendance.)

Office hours (in 152)

  • Mondays 1:20–2:20
  • Thursdays 1:40–2:40

(Note: in practice I can manage most days if you want to see me about the course; but it will help a lot if you email the day before.)


Lecture summaries

I hope to put here scans of all the summaries I display at the start of lectures.


Coursework questions

There will probably be one coursework sheet every week except in weeks 1,7 and 12. That makes nine courseworks, and recent policy (which I expect to follow again) is that the best seven attempts out of nine will count. I'll put the sheets here as well as giving out paper copies in lectures, and there will be some model solutions.


Some other things

Here is the table of known values of A2(n,d) that I showed a few times.

Here is the graph I showed of the capacity of a binary channel.


Background to the course

Modern communications are subject to noisy interference, and messages can get distorted. Therefore a popular technique is to include redundant information in the message in order that the recipient can detect, and possibly even correct, errors. The very simplest example of this is just to transmit the message twice; then if the recipient receives two non-identical messages, he knows that there's been an error in transmission and he can ask for the message to be sent again. You'll have experienced this yourself: when you sign up to a website which asks you to create a password, it will ask you type in your password twice; then if the two passwords you type in are not identical, it will tell you that you've made an error. (In this case, the "error in transmission" is human error.)

A slightly less simple example is supermarket bar codes. Some of these are eight digits and some are thirteen; let's stick to the thirteen-digit ones. Often in the supermarket the bar code won't scan because it's torn or creased, and the cashier has to type in the number. If the cashier makes a single error in typing in the thirteen digits, then the product doesn't register as something different; instead, the cashier gets an error message. The way this works is that not every thirteen-digit number is allowed to be used on a bar code; in fact, only one tenth of the possible thriteen-digit numbers are allowed, and it's set up in such a way that there aren't two bar codes that differ by a single digit.

To summarise: we can get improved accuracy by only allowing ourselves to use a certain set of "words" (or in the bar code case, thirteen-digit numbers). In this course, a set of words is called a code, and the course is about finding codes which are "good", in the sense that:

  1. any two words in the codes are sufficiently different that one can't be turned into another by a transmission error; and
  2. there are as many different words in the code as possible (so that we can transmit a variety of messages without having to transmit a lot of words).

We'll see some constructions of nice codes, as well as some theorems that tell us how good codes can be. Also we'll consider the problem of decoding, i.e. correcting transmission errors.


Prerequisites

The only official prerequisite for this module is MTH5112 Linear Algebra I. Note, however, that this course being a prerequisite doesn't mean that you should have been to the lectures for this course, or that you should have taken the exam for this course, or even that you should have done well in the exam for this course. It means that you are familiar and comfortable with the material in the course. The particular parts of Linear Algebra I that I'd like you to be familiar with are:

This will all be revised, but, as with all lecture courses, it will be much easier to understand if you've already learnt/revised the material.

Also, you will need to know very basic combinatorial concepts, in particular binomial coefficients. This doesn't mean you need to have done MTH6109 Combinatorics, but you need to be receptive to some of the same ideas.


Syllabus

The concept of an error-correcting code is a very important one, with wide applications in communications. This module approaches the subject from a pure-mathematics perspective, to give the student a thorough grounding in construction of codes and decoding algorithms, and the main coding theory problem.

Topics

  1. Basic concepts: codes, minimum distance, equivalence of codes.
  2. The Main Coding Theory Problem. The Hamming, Singleton and Plotkin bounds.
  3. Noisy channels and nearest-neighbour decoding. Statement of Shannon's Theorem.
  4. Linear codes. Generator matrices and parity-check matrices. Syndrome decoding. (A very brief review of the required linear algebra will be given.)
  5. Examples: Hamming, Reed–Muller and MDS codes. The Gilbert–Varshamov bound.

Learning outcomes

By the end of the course, a student should be able to:

  1. define error-detecting and error-correcting codes, explain their significance and construct simple examples, such as repetition and parity-check codes;
  2. define the constants Aq(n,d) and calculate small examples;
  3. understand the definition and advantages of linear codes;
  4. define, construct and manipulate generator matrices and parity-check matrices;
  5. decode linear codes via Slepian arrays and syndrome decoding;
  6. discuss error probabilities and state Shannon's theorem;
  7. give (with proof) bounds on the sizes of codes – the Hamming, Singleton, Plotkin and Gilbert–Varshamov bounds;
  8. construct Hamming codes and MDS codes.

The exam

Rubric

There has been a bit of confusion about exam rubrics; the new regulations about answering all the questions do not apply to level 6. In recent years, the rubric for Coding Theory has been "answer any four questions", with the number of questions on the paper ranging from 6 to 8. I can't see a reason why this will be any different in 2010.

Some past exam papers

Some revision tips