docs: ported docs
nkapila6 committed Aug 14, 2024
1 parent 8885a58 commit fdb3889
Showing 28 changed files with 1,351 additions and 270 deletions.
148 changes: 0 additions & 148 deletions docs/conf.py

This file was deleted.

28 changes: 9 additions & 19 deletions docs/source/intro.md → docs/docs/about.md
## Overview
mlrose is a Python package for applying some of the most common randomized optimization and search algorithms to a range of different optimization problems, over both discrete- and continuous-valued parameter spaces.

### Project Background
mlrose was initially developed to support students of Georgia Tech’s OMSCS/OMSA offering of CS 7641: Machine Learning.

It includes implementations of all randomized optimization algorithms taught in this course, as well as functionality to apply these algorithms to integer-string optimization problems, such as N-Queens and the Knapsack problem; continuous-valued optimization problems, such as the neural network weight problem; and tour optimization problems, such as the Travelling Salesperson problem. It also has the flexibility to solve user-defined optimization problems.

At the time of development, no single Python package collected all of this functionality together in one location.

### Main Features
**Randomized Optimization Algorithms**

* Implementations of: hill climbing, randomized hill climbing, simulated annealing, genetic algorithm and (discrete) MIMIC;
* Solve both maximization and minimization problems;
* Define the algorithm’s initial state or start from a random state;
* Define your own simulated annealing decay schedule or use one of three pre-defined, customizable decay schedules: geometric decay, arithmetic decay or exponential decay.

**Problem Types**

* Solve discrete-value (bit-string and integer-string), continuous-value and tour optimization (travelling salesperson) problems;
* Define your own fitness function for optimization or use a pre-defined function;
* Pre-defined fitness functions exist for solving the: One Max, Flip Flop, Four Peaks, Six Peaks, Continuous Peaks, Knapsack, Travelling Salesperson, N-Queens and Max-K Color optimization problems.

**Machine Learning Weight Optimization**

* Optimize the weights of neural networks, linear regression models and logistic regression models using randomized hill climbing, simulated annealing, the genetic algorithm or gradient descent;
* Supports classification and regression neural networks.
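
For example, the weights of a small classifier can be fitted with randomized hill climbing instead of backpropagation. The sketch below is illustrative only, assuming the `mlrose_ky` import name and using scikit-learn for data loading and scaling:

```python
import mlrose_ky as mlrose
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Binary classification data, scaled to [0, 1] so the weight search behaves well.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# A one-hidden-layer network whose weights are found by randomized hill climbing.
nn = mlrose.NeuralNetwork(hidden_nodes=[4], activation='relu',
                          algorithm='random_hill_climb', max_iters=1000,
                          learning_rate=0.1, early_stopping=True,
                          max_attempts=100, random_state=1)
nn.fit(X_train, y_train)

# ravel() flattens the (n, 1) prediction array before scoring.
print(accuracy_score(y_test, nn.predict(X_test).ravel()))
```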

<a id="install"></a>

### Installation

mlrose-ky was written in Python 3 and requires NumPy, SciPy and Scikit-Learn (sklearn).

The latest released version is available at the [Python package index](https://pypi.org/project/mlrose-ky/) and can be installed using pip:

```
pip install mlrose-ky
```
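
Once installed, a problem can be defined and solved in a few lines. A minimal sketch, assuming the package imports as `mlrose_ky` and using the pre-defined One Max fitness function:

```python
import mlrose_ky as mlrose

# A 10-bit One Max problem: maximize the number of ones in the bit string.
fitness = mlrose.OneMax()
problem = mlrose.DiscreteOpt(length=10, fitness_fn=fitness, maximize=True, max_val=2)

# Solve with simulated annealing; the seed makes the run reproducible.
best_state, best_fitness = mlrose.simulated_annealing(problem, max_attempts=10,
                                                      max_iters=1000, random_state=1)
print(best_state, best_fitness)  # expect a vector of ones and a fitness of 10.0
```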

### Licensing, Authors, Acknowledgements
mlrose was written by Genevieve Hayes and is distributed under the [3-Clause BSD license](https://github.com/gkhayes/mlrose/blob/master/LICENSE). The source code is maintained in a [GitHub repository](https://github.com/gkhayes/mlrose).

You can cite mlrose in research publications and reports as follows:

* Hayes, G. (2019). *mlrose: Machine Learning, Randomized Optimization and SEarch package for Python*. [https://github.com/gkhayes/mlrose](https://github.com/gkhayes/mlrose). Accessed: *day month year*.

BibTeX entry:

```bibtex
@misc{Hayes19,
  author = {Hayes, G},
  title  = {{mlrose: Machine Learning, Randomized Optimization and SEarch package for Python}},
  year   = 2019,
  howpublished = {\url{https://github.com/gkhayes/mlrose}},
  note   = {Accessed: day month year}
}
```
143 changes: 143 additions & 0 deletions docs/docs/algorithms.md
## Algorithms
Functions to implement the randomized optimization and search algorithms.

> [!NOTE] Recommendation
> The functions below are implemented within mlrose-ky; however, it is highly recommended to use the [Runners](/runners/) for assignments.

### Hill Climbing

> [!INFO] Function
> `hill_climb`(_problem_, _max\_iters=inf_, _restarts=0_, _init\_state=None_, _curve=False_, _random\_state=None_)

Use standard hill climbing to find the optimum for a given optimization problem.

**Parameters:**

* **problem** (_optimization object_) – Object containing fitness function optimization problem to be solved. For example, `DiscreteOpt()`, `ContinuousOpt()` or `TSPOpt()`.
* **max\_iters** (_int, default: np.inf_) – Maximum number of iterations of the algorithm for each restart.
* **restarts** (_int, default: 0_) – Number of random restarts.
* **init\_state** (_array, default: None_) – 1-D Numpy array containing starting state for algorithm. If `None`, then a random state is used.
* **curve** (_bool, default: False_) – Boolean to keep fitness values for a curve. If `False`, then no curve is stored. If `True`, then a history of fitness values is provided as a third return value.
* **random\_state** (_int, default: None_) – If random\_state is a positive integer, random\_state is the seed used by np.random.seed(); otherwise, the random seed is not set.

**Returns:**

* **best\_state** (_array_) – Numpy array containing state that optimizes the fitness function.
* **best\_fitness** (_float_) – Value of fitness function at best state.
* **fitness\_curve** (_array_) – Numpy array containing the fitness at every iteration. Only returned if input argument `curve` is `True`.
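
As an illustration, a minimal sketch applying `hill_climb` with restarts to the pre-defined Four Peaks problem (the `mlrose_ky` import name is assumed):

```python
import mlrose_ky as mlrose

problem = mlrose.DiscreteOpt(length=12, fitness_fn=mlrose.FourPeaks(t_pct=0.1),
                             maximize=True, max_val=2)

# Plain hill climbing stops at the first local optimum, so allow a few restarts.
best_state, best_fitness = mlrose.hill_climb(problem, restarts=5, random_state=1)
print(best_state, best_fitness)
```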

#### References

Russell, S. and P. Norvig (2010). _Artificial Intelligence: A Modern Approach_, 3rd edition. Prentice Hall, New Jersey, USA.

### Random Hill Climbing

> [!INFO] Function
> `random_hill_climb`(_problem_, _max\_attempts=10_, _max\_iters=inf_, _restarts=0_, _init\_state=None_, _curve=False_, _random\_state=None_)

Use randomized hill climbing to find the optimum for a given optimization problem.

**Parameters**:

* **problem** (_optimization object_) – Object containing fitness function optimization problem to be solved. For example, `DiscreteOpt()`, `ContinuousOpt()` or `TSPOpt()`.
* **max\_attempts** (_int, default: 10_) – Maximum number of attempts to find a better neighbor at each step.
* **max\_iters** (_int, default: np.inf_) – Maximum number of iterations of the algorithm.
* **restarts** (_int, default: 0_) – Number of random restarts.
* **init\_state** (_array, default: None_) – 1-D Numpy array containing starting state for algorithm. If `None`, then a random state is used.
* **curve** (_bool, default: False_) – Boolean to keep fitness values for a curve. If `False`, then no curve is stored. If `True`, then a history of fitness values is provided as a third return value.
* **random\_state** (_int, default: None_) – If random\_state is a positive integer, random\_state is the seed used by np.random.seed(); otherwise, the random seed is not set.

**Returns**:

* **best\_state** (_array_) – Numpy array containing state that optimizes the fitness function.
* **best\_fitness** (_float_) – Value of fitness function at best state.
* **fitness\_curve** (_array_) – Numpy array containing the fitness at every iteration. Only returned if input argument `curve` is `True`.
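
A minimal sketch, assuming the `mlrose_ky` import name, that also records the fitness curve:

```python
import mlrose_ky as mlrose

problem = mlrose.DiscreteOpt(length=16, fitness_fn=mlrose.FlipFlop(),
                             maximize=True, max_val=2)

# max_attempts bounds how many non-improving neighbors are tolerated per restart.
best_state, best_fitness, fitness_curve = mlrose.random_hill_climb(
    problem, max_attempts=20, restarts=10, curve=True, random_state=1)
print(best_fitness)
print(fitness_curve[:5])  # fitness history for the first five iterations
```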

#### References

Brownlee, J (2011). _Clever Algorithms: Nature-Inspired Programming Recipes_. [http://www.cleveralgorithms.com](http://www.cleveralgorithms.com/).

### Simulated Annealing

> [!INFO] Function
> `simulated_annealing`(_problem_, _schedule=mlrose.GeomDecay()_, _max\_attempts=10_, _max\_iters=inf_, _init\_state=None_, _curve=False_, _random\_state=None_)

Use simulated annealing to find the optimum for a given optimization problem.

**Parameters**:

* **problem** (_optimization object_) – Object containing fitness function optimization problem to be solved. For example, `DiscreteOpt()`, `ContinuousOpt()` or `TSPOpt()`.
* **schedule** (schedule object, default: `mlrose.GeomDecay()`) – Schedule used to determine the value of the temperature parameter.
* **max\_attempts** (_int, default: 10_) – Maximum number of attempts to find a better neighbor at each step.
* **max\_iters** (_int, default: np.inf_) – Maximum number of iterations of the algorithm.
* **init\_state** (_array, default: None_) – 1-D Numpy array containing starting state for algorithm. If `None`, then a random state is used.
* **curve** (_bool, default: False_) – Boolean to keep fitness values for a curve. If `False`, then no curve is stored. If `True`, then a history of fitness values is provided as a third return value.
* **random\_state** (_int, default: None_) – If random\_state is a positive integer, random\_state is the seed used by np.random.seed(); otherwise, the random seed is not set.

**Returns**:

* **best\_state** (_array_) – Numpy array containing state that optimizes the fitness function.
* **best\_fitness** (_float_) – Value of fitness function at best state.
* **fitness\_curve** (_array_) – Numpy array containing the fitness at every iteration. Only returned if input argument `curve` is `True`.
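
A minimal sketch, assuming the `mlrose_ky` import name, that swaps the default schedule for a customized geometric decay:

```python
import mlrose_ky as mlrose

problem = mlrose.DiscreteOpt(length=20, fitness_fn=mlrose.OneMax(),
                             maximize=True, max_val=2)

# Start hotter and cool more slowly than the default schedule.
schedule = mlrose.GeomDecay(init_temp=10.0, decay=0.95, min_temp=0.001)

best_state, best_fitness = mlrose.simulated_annealing(
    problem, schedule=schedule, max_attempts=10, max_iters=2000, random_state=1)
print(best_fitness)
```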

#### References

Russell, S. and P. Norvig (2010). _Artificial Intelligence: A Modern Approach_, 3rd edition. Prentice Hall, New Jersey, USA.

### Genetic Algorithm

> [!INFO] Function
> `genetic_alg`(_problem_, _pop\_size=200_, _mutation\_prob=0.1_, _max\_attempts=10_, _max\_iters=inf_, _curve=False_, _random\_state=None_)

Use a standard genetic algorithm to find the optimum for a given optimization problem.

**Parameters**:

* **problem** (_optimization object_) – Object containing fitness function optimization problem to be solved. For example, `DiscreteOpt()`, `ContinuousOpt()` or `TSPOpt()`.
* **pop\_size** (_int, default: 200_) – Size of population to be used in genetic algorithm.
* **mutation\_prob** (_float, default: 0.1_) – Probability of a mutation at each element of the state vector during reproduction, expressed as a value between 0 and 1.
* **max\_attempts** (_int, default: 10_) – Maximum number of attempts to find a better state at each step.
* **max\_iters** (_int, default: np.inf_) – Maximum number of iterations of the algorithm.
* **curve** (_bool, default: False_) – Boolean to keep fitness values for a curve. If `False`, then no curve is stored. If `True`, then a history of fitness values is provided as a third return value.
* **random\_state** (_int, default: None_) – If random\_state is a positive integer, random\_state is the seed used by np.random.seed(); otherwise, the random seed is not set.

**Returns**:

* **best\_state** (_array_) – Numpy array containing state that optimizes the fitness function.
* **best\_fitness** (_float_) – Value of fitness function at best state.
* **fitness\_curve** (_array_) – Numpy array of arrays containing the fitness of the entire population at every iteration. Only returned if input argument `curve` is `True`.
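
A minimal sketch, assuming the `mlrose_ky` import name, applying the genetic algorithm to a small travelling salesperson instance:

```python
import mlrose_ky as mlrose

# Eight cities given as (x, y) coordinates; TSPOpt builds the tour problem.
coords = [(0, 0), (3, 0), (3, 2), (2, 4), (1, 5), (0, 4), (4, 1), (1, 2)]
problem = mlrose.TSPOpt(length=8, coords=coords, maximize=False)

# Larger populations explore more of the search space at the cost of run time.
best_state, best_fitness = mlrose.genetic_alg(problem, pop_size=200,
                                              mutation_prob=0.1, max_attempts=20,
                                              random_state=1)
print(best_state)    # a permutation of the city indices
print(best_fitness)  # total tour length (minimized)
```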

#### References

Russell, S. and P. Norvig (2010). _Artificial Intelligence: A Modern Approach_, 3rd edition. Prentice Hall, New Jersey, USA.

### MIMIC

> [!INFO] Function
> `mimic`(_problem_, _pop\_size=200_, _keep\_pct=0.2_, _max\_attempts=10_, _max\_iters=inf_, _curve=False_, _random\_state=None_, _fast\_mimic=False_)

Use MIMIC to find the optimum for a given optimization problem.

> [!DANGER] Warning
> MIMIC cannot be used for solving continuous-state optimization problems.

**Parameters**:

* **problem** (_optimization object_) – Object containing fitness function optimization problem to be solved. For example, `DiscreteOpt()` or `TSPOpt()`.
* **pop\_size** (_int, default: 200_) – Size of population to be used in algorithm.
* **keep\_pct** (_float, default: 0.2_) – Proportion of samples to keep at each iteration of the algorithm, expressed as a value between 0 and 1.
* **max\_attempts** (_int, default: 10_) – Maximum number of attempts to find a better neighbor at each step.
* **max\_iters** (_int, default: np.inf_) – Maximum number of iterations of the algorithm.
* **curve** (_bool, default: False_) – Boolean to keep fitness values for a curve. If `False`, then no curve is stored. If `True`, then a history of fitness values is provided as a third return value.
* **random\_state** (_int, default: None_) – If random\_state is a positive integer, random\_state is the seed used by np.random.seed(); otherwise, the random seed is not set.
* **fast\_mimic** (_bool, default: False_) – Activate fast mimic mode to compute the mutual information in vectorized form. Faster speed but requires more memory.

**Returns**:

* **best\_state** (_array_) – Numpy array containing state that optimizes the fitness function.
* **best\_fitness** (_float_) – Value of fitness function at best state.
* **fitness\_curve** (_array_) – Numpy array containing the fitness at every iteration. Only returned if input argument `curve` is `True`.
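
A minimal sketch, assuming the `mlrose_ky` import name, applying MIMIC to a small Knapsack instance (a discrete problem, per the warning above):

```python
import mlrose_ky as mlrose

weights = [10, 5, 2, 8, 15]
values = [1, 2, 3, 4, 5]
problem = mlrose.DiscreteOpt(length=5, fitness_fn=mlrose.Knapsack(weights, values),
                             maximize=True, max_val=2)

# keep_pct controls what fraction of the fittest samples seed the next distribution.
best_state, best_fitness = mlrose.mimic(problem, pop_size=200, keep_pct=0.2,
                                        max_attempts=10, random_state=1)
print(best_state, best_fitness)
```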

#### References

De Bonet, J., C. Isbell, and P. Viola (1997). MIMIC: Finding Optima by Estimating Probability Densities. In _Advances in Neural Information Processing Systems_ (NIPS) 9, pp. 424–430.

