Abstract
A fundamental problem in coding theory concerns
the computation of the maximum cardinality of a
set S of length n code words over an alphabet of
size q, such that every pair of code words has Hamming distance at least d, and the set of additional
constraints U on S is satisfied. This problem has
application in several areas, one of which is the
design of DNA codes where q = 4 and the alphabet is { A, C, G, T }. We describe a new constraint model for this problem and demonstrate that
it improves on previous solutions (computes better lower bounds) for various instances of the problem. Our approach is based on a clustering of DNA
words into small sets of words. Solutions are then
obtained as the union of such clusters. Our approach is SAT based: we specify constraints on
clusters of DNA words and solve these using a
Boolean satisfiability solver