The goal of this project is to provide implementations of algorithms and data structures from Introduction to Algorithms, Fourth Edition, and from Solutions to exercises and problems. The book was written by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. The solutions are developed by myself, and are currently work in progress. The implementations here serve as a companion project to the solutions and play an invaluable role in verifying the theoretical work in practice. They help detecting and fixing bugs or limitations in the suggested algorithms and data structures.
- Adaptation of pseudocodes from the textbook and from the solutions to a real programming language.
- Implementation of algorithms and data structures with no explicit pseudocodes, yet precisely described in any of the source positions.
- Testing the implementations, especially those from the solutions, to increase confidence in algorithms correctness.
- Laying a foundation for a more practical (or "production-ready") library of algorithms and data structures for real use cases.
A couple of years ago I started implementing the algorithms from the second edition of the book while working on the solutions in Polish. On GitHub there are legacy projects written in Java and Python. Soon after the fourth edition of the book got published, I deprecated those projects and began working on the solutions to the fourth edition by migrating the old material, i.e., adapting it to the new edition and translating it to English.
Python 3 was chosen as a programming language for several reasons:
- Python is widely used in academia and is often the programming language of choice in introductory computer science courses. It is also widely used in many business applications. That makes it widely known in many communities, both academic and professional.
- Python syntax and semantic show many similarities to pseudocode used in the textbook and the solutions. This enables easily transitioning between the two ways of expressing algorithms.
- Python does not limit developers with a single programming paradigm, making it elastic to adapt to different models found in pseudocodes.
- Python typing system resembles the rules followed in pseudocode, particularly dynamic typing and duck typing.
The implementations are written in a way to be as close as possible to the algorithms in the textbook or in the solutions, both in terms of syntax, and behavior. This principle lead to using the procedural paradigm whenever possible and applicable, as well as translating pseudocode instructions to most relevant Python statements or expressions. Algorithms with no pseudocodes provided are implemented in a more flexible and often more concise way, more resembling an idiomatic Python code. The code in such cases is often appropriately structured to increase readability.
The code is located in the src directory split into the book and solutions subdirectories.
This structure makes a clear distinction between the code that implements algorithms and data structures provided by the book, from the code that implements algorithms and data structures suggested in the solutions.
Both subdirectories are further split to organize the code by the chapter they come from.
Aside the src directory there is the similarly structured test directory containing unit tests.
The unit tests are written in the unittest
framework, and most of them follow a property-based testing approach using the Hypothesis library.
To run all tests, run the following command from the main project directory:
PYTHONPATH=./src python3 -m unittest discover -v test/
The table below lists the rules followed in the implementation from translating pseudocode constructs into Python instructions. A pseudocode may also contain lines that don't match any of the categories in this table, like instructions in plain English or more sophisticated mathematical expressions. In such cases the rule is to implement these lines as a separate helper function in Python, named appropriately. The function should implement only the step reflected by that line. Pseudocode comments are ignored.
Category | Pseudocode construct | Python code | Remarks |
---|---|---|---|
predefined constant |
![]() |
True |
|
![]() |
False |
||
![]() |
None |
||
![]() |
math.inf |
||
custom constant |
![]() |
NO_SUCH_PATH |
Defined as a standalone or an enum value. |
variable |
![]() |
x |
|
![]() |
y2_ |
||
![]() |
best_score |
||
object's attribute |
![]() |
T.root |
|
![]() |
A.heap_size |
||
![]() |
x.key[i] |
key is implemented as an Array . |
|
assignment |
![]() |
x = y |
|
scalar comparison |
![]() |
x == y |
x and y are scalars. |
![]() |
x != y |
||
![]() |
x < y |
||
![]() |
x > y |
||
![]() |
x <= y |
||
![]() |
x >= y |
||
pointer comparison |
![]() |
x is y |
x and y are pointers. |
![]() |
x is not y |
||
logical operation |
![]() |
<condition1> and <condition2> |
|
![]() |
<condition1> or <condition2> |
||
![]() |
not <condition> |
||
arithmetic operation |
![]() |
x += y |
|
![]() |
x -= y |
||
![]() |
x + y |
||
![]() |
x - y |
||
![]() |
x * y |
||
![]() |
x / y |
For real x and y . |
|
x // y |
For integer x and y . |
||
![]() |
x // y |
||
![]() |
-(x // -y) |
||
![]() |
x % y |
||
![]() |
x ** y |
||
![]() |
math.sqrt(x) |
||
![]() |
math.floor(x) |
||
![]() |
math.ceil(x) |
||
![]() |
min(x, y) |
||
![]() |
max(x, y) |
||
value exchange |
![]() |
x, y = y, x |
|
![]() |
|||
printing |
![]() |
print(x) |
|
procedure definition |
![]() |
def insertion_sort(A, n):
<block> |
May include type hints for the parameters or the return value. |
procedure call |
![]() |
insertion_sort(A, n) |
|
procedure return |
![]() |
return |
|
![]() |
return x |
||
![]() |
return x, y |
||
array creation |
![]() |
A = Array(0, n) |
The created array consists of None values on each position. |
array cell reference |
![]() |
A[i] |
Either as an accessor or a mutator. |
set creation |
![]() |
S = set() |
|
set membership |
![]() |
x in S |
|
sets union |
![]() |
S | {x} |
|
set cardinality |
![]() |
len(S) |
|
error |
![]() |
raise ValueError('overflow') |
|
if statement |
![]() |
if <condition1>:
<block1>
elif <condition2>:
<block2>
else:
<block3> |
Can have zero or more elif branches and zero or one else branch. |
for loop |
![]() |
for i in range_of(x, to=y):
<block> |
range_of(x, to=y) is equivalent to range(x, y + 1) . |
![]() |
for i in range_of(y, downto=x):
<block> |
range_of(y, downto=x) is equivalent to range(y, x - 1, -1) . |
|
for-each loop |
![]() |
for v in V:
<block> |
|
while loop |
![]() |
while <condition>:
<block> |
|
repeat loop |
![]() |
while True:
<block>
if <condition>:
break |
|
Currently, the implementations are being migrated and adapted from the legacy project by moving the legacy code and keeping it consistent, as I work on solving exercises in each chapter. Therefore, the progress on the implementations almost entirely depends on the progress in the solutions.