Changes after sphinx run

tirthajyoti · Jul 22, 2019 · b71b1d7
1 parent 4826d17
commit b71b1d7

Showing 8 changed files with 659 additions and 0 deletions.
diff --git a/.gitignore b/.gitignore
@@ -102,3 +102,7 @@ venv.bak/
 
 # mypy
 .mypy_cache/
+
+_build
+_static
+_templates
diff --git a/MANIFEST.in b/MANIFEST.in
@@ -0,0 +1,2 @@
+# Include the license file
+include LICENSE.txt
diff --git a/README.md b/README.md
@@ -1,3 +1,8 @@
+# DOEPY (`pip install doepy`)
+---
+![doe-1](https://raw.githubusercontent.com/tirthajyoti/doepy/master/images/doe_1.PNG)
+#### Authored and maintained by Dr. Tirthajyoti Sarkar, Fremont, CA 94536 (https://tirthajyoti.github.io)
+
 ## Introduction
 [Design of Experiment (DOE)](https://en.wikipedia.org/wiki/Design_of_experiments) is an important activity for any scientist, engineer, or statistician planning to conduct experimental analysis. This exercise has become **critical in this age of rapidly expanding field of data science and associated statistical modeling and machine learning**. A well-planned DOE can give a researcher meaningful data set to act upon with optimal number of experiments preserving critical resources.
 
@@ -16,6 +21,8 @@ Need for careful design of experiment arises in all fields of serious scientific
 ### Options for open-source DOE builder package in Python?
 Unfortunately, majority of the state-of-the-art DOE generators are part of commercial statistical software packages like [JMP (SAS)](https://www.jmp.com/) or [Minitab](www.minitab.com/en-US/default.aspx). However, a researcher will surely be benefited if there exists an open-source code which presents an intuitive user interface for generating an experimental design plan from a simple list of input variables. There are a couple of DOE builder Python packages but individually they don’t cover all the necessary DOE methods and they lack a simplified user API, where one can just input a CSV file of input variables’ range and get back the DOE matrix in another CSV file.
 
+---
+
 ## Features
 This set of codes is a collection of functions which wrap around the core packages (mentioned below) and generate **design-of-experiment (DOE) matrices** for a statistician or engineer from an arbitrary range of input variables.
 
@@ -43,6 +50,8 @@ In this way, ***the only API user needs to be exposed to, are input and output C
 * Halton sequence based,
 * Uniform random matrix
 
+---
+
 ## How to use it?
 ### What supporting packages are required?
 First make sure you have all the necessary packages installed. You can simply run the .bash (Unix/Linux) and .bat (Windows) files provided in the repo, to install those packages from your command line interface. They contain the following commands,
@@ -108,6 +117,8 @@ read_write.write_csv(df_lhs,filename=filename)
 
 You should see a `lhs.csv` file in your directory.
 
+---
+
 ## Acknowledgements and Requirements
 The code was written in Python 3.7. It uses following external packages that needs to be installed on your system to use it,
 * pydoe: A package designed to help the scientist, engineer, statistician, etc., to construct appropriate experimental designs. [Check the docs here](https://pythonhosted.org/pyDOE/).

diff --git a/README.rst b/README.rst
@@ -0,0 +1,266 @@
+=======
+DOEPY
+=======
+----------------------------------------------------------------------
+A Python package for easily generating design of experiment tables
+----------------------------------------------------------------------
+.. image:: https://raw.githubusercontent.com/tirthajyoti/doepy/master/images/doe_1.PNG
+
+Authored and maintained by `Dr. Tirthajyoti Sarkar <https://www.linkedin.com/in/tirthajyoti-sarkar-2127aa7/>`_, Fremont, California.
+
+Check my website: https://tirthajyoti.github.io
+
+Introduction
+------------
+
+`Design of Experiment
+(DOE) <https://en.wikipedia.org/wiki/Design_of_experiments>`__ is an
+important activity for any scientist, engineer, or statistician planning
+to conduct experimental analysis. This exercise has become **critical in
+this age of rapidly expanding field of data science and associated
+statistical modeling and machine learning**. A well-planned DOE can give
+a researcher meaningful data set to act upon with optimal number of
+experiments preserving critical resources.
+
+    After all, aim of Data Science is essentially to conduct highest
+    quality scientific investigation and modeling with real world data.
+    And to do good science with data, one needs to collect it through
+    carefully thought-out experiment to cover all corner cases and
+    reduce any possible bias.
+
+What is a scientific experiment?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In its simplest form, a scientific experiment aims at predicting the
+outcome by introducing a change of the preconditions, which is
+represented by one or more `independent
+variables <https://en.wikipedia.org/wiki/Dependent_and_independent_variables>`__,
+also referred to as “input variables” or “predictor variables.” The
+change in one or more independent variables is generally hypothesized to
+result in a change in one or more `dependent
+variables <https://en.wikipedia.org/wiki/Dependent_and_independent_variables>`__,
+also referred to as “output variables” or “response variables.” The
+experimental design may also identify `control
+variables <https://en.wikipedia.org/wiki/Controlling_for_a_variable>`__
+that must be held constant to prevent external factors from affecting
+the results.
+
+What is Experimental Design?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Experimental design involves not only the selection of suitable
+independent, dependent, and control variables, but planning the delivery
+of the experiment under statistically optimal conditions given the
+constraints of available resources. There are multiple approaches for
+determining the set of design points (unique combinations of the
+settings of the independent variables) to be used in the experiment.
+
+Main concerns in experimental design include the establishment of
+`validity <https://en.wikipedia.org/wiki/Validity_%28statistics%29>`__,
+`reliability <https://en.wikipedia.org/wiki/Reliability_%28statistics%29>`__,
+and `replicability <https://en.wikipedia.org/wiki/Reproducibility>`__.
+For example, these concerns can be partially addressed by carefully
+choosing the independent variable, reducing the risk of measurement
+error, and ensuring that the documentation of the method is sufficiently
+detailed. Related concerns include achieving appropriate levels of
+`statistical power <https://en.wikipedia.org/wiki/Statistical_power>`__
+and
+`sensitivity <https://en.wikipedia.org/wiki/Sensitivity_and_specificity>`__.
+
+Need for careful design of experiment arises in all fields of serious
+scientific, technological, and even social science
+investigation — \ *computer science, physics, geology, political
+science, electrical engineering, psychology, business marketing
+analysis, financial analytics*, etc…
+
+Options for open-source DOE builder package in Python?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Unfortunately, majority of the state-of-the-art DOE generators are part
+of commercial statistical software packages like `JMP
+(SAS) <https://www.jmp.com/>`__ or
+`Minitab <https://www.minitab.com/en-US/default.aspx>`__. However, a researcher
+will surely be benefited if there exists an open-source code which
+presents an intuitive user interface for generating an experimental
+design plan from a simple list of input variables. There are a couple of
+DOE builder Python packages but individually they don’t cover all the
+necessary DOE methods and they lack a simplified user API, where one can
+just input a CSV file of input variables’ range and get back the DOE
+matrix in another CSV file.
+
+--------------
+
+Features
+--------
+
+This set of codes is a collection of functions which wrap around the
+core packages (mentioned below) and generate **design-of-experiment
+(DOE) matrices** for a statistician or engineer from an arbitrary range
+of input variables.
+
+Limitation of the foundation packages used
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Both the core packages, which act as foundations to this repo, are not
+complete in the sense that they do not cover all the necessary functions
+to generate DOE table that a design engineer may need while planning an
+experiment. Also, they offer only low-level APIs in the sense that the
+standard output from them are normalized numpy arrays. It was felt that
+users, who may not be comfortable in dealing with Python objects
+directly, should be able to take advantage of their functionalities
+through a simplified user interface.
+
+Simplified user interface
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+**User just needs to provide a simple CSV file with a single table of
+variables and their ranges (2-level i.e. min/max or 3-level).** Some of
+the functions work with 2-level min/max range while some others need
+3-level ranges from the user (low-mid-high). Intelligence is built into
+the code to handle the case if the range input is not appropriate and to
+generate levels by simple linear interpolation from the given input. The
+code will generate the DOE as per user's choice and write the matrix in
+a CSV file on to the disk.
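
The level generation described above can be sketched as follows. This is a minimal illustration of simple linear interpolation, not the code doepy actually runs: the helper name `interpolate_levels` is hypothetical and does not appear in this commit.

```python
def interpolate_levels(low, high, num_levels):
    """Return num_levels evenly spaced values from low to high (inclusive).

    Hypothetical helper for illustration only; not part of doepy.
    """
    if num_levels < 2:
        raise ValueError("num_levels must be at least 2")
    step = (high - low) / (num_levels - 1)
    return [low + i * step for i in range(num_levels)]


# A 2-level (min/max) range expanded to the 3 levels (low-mid-high)
# that some designs need.
pressure_levels = interpolate_levels(40, 70, 3)
print(pressure_levels)
```

With a min/max of 40 and 70, the midpoint 55 matches the `Pressure` levels used in the Quick start section of this README.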
+
+In this way, **the only API the user needs to be exposed to is the input and
+output CSV files. These files can then be used in any engineering
+simulator, software, or process-control module, or fed into process
+equipment.**
+
+Designs available
+~~~~~~~~~~~~~~~~~
+
+-  Full factorial,
+-  2-level fractional factorial,
+-  Plackett-Burman,
+-  Sukharev grid,
+-  Box-Behnken,
+-  Box-Wilson (Central-composite) with center-faced option,
+-  Box-Wilson (Central-composite) with center-inscribed option,
+-  Box-Wilson (Central-composite) with center-circumscribed option,
+-  Latin hypercube (simple),
+-  Latin hypercube (space-filling),
+-  Random k-means cluster,
+-  Maximin reconstruction,
+-  Halton sequence based,
+-  Uniform random matrix
+
+--------------
+
+How to use it?
+--------------
+
+What supporting packages are required?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+First make sure you have all the necessary packages installed. You can
+simply run the .bash (Unix/Linux) and .bat (Windows) files provided in
+the repo, to install those packages from your command line interface.
+They contain the following commands,
+
+::
+
+    pip install numpy
+    pip install pandas
+    pip install pydoe
+    pip install diversipy
+
+How to install the package?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You can pip install the package!
+
+``pip install doepy``
+
+Quick start
+~~~~~~~~~~~
+
+Let's say you have a design problem with the following table for the
+parameters range. Imagine this as a generic example of a chemical
+process in a manufacturing plant. You have 3 levels of ``Pressure``, 3
+levels of ``Temperature``, 2 levels of ``FlowRate``, and 2 levels of
+``Time``.
+
+| ``Pressure``: 40/55/70
+| ``Temperature``: 290/320/350
+| ``FlowRate``: 0.2/0.4
+| ``Time``: 5/8
+
+First, import the ``build`` module from the package,
+
+``from doepy import build``
+
+Then, try a simple example by building a **full factorial design**. We will use the ``build.full_fact()`` function for this. You have to pass a dictionary object to the function which encodes your experimental data.
+
+::
+
+    build.full_fact({'Pressure':[40,55,70],'Temperature':[290, 320, 350],
+    'Flow rate':[0.2,0.4], 'Time':[5,8]})
+
+If you build a full-factorial DOE out of this, you should get a table with 3 x 3 x 2 x 2 = 36 entries.
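
The 36-run count can be checked independently of doepy. The sketch below enumerates every combination of levels with the standard library's `itertools.product`; it only illustrates what a full factorial design contains and makes no claim about how `build.full_fact()` is implemented or how it orders or formats its output.

```python
from itertools import product

# Same factor levels as the dictionary passed to build.full_fact() above.
factors = {
    "Pressure": [40, 55, 70],
    "Temperature": [290, 320, 350],
    "Flow rate": [0.2, 0.4],
    "Time": [5, 8],
}

# One run per element of the Cartesian product of all level lists.
runs = [dict(zip(factors, combo)) for combo in product(*factors.values())]

print(len(runs))   # number of runs: 3 * 3 * 2 * 2
print(runs[0])     # first combination of levels
```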
+
+Other functions to try on
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Try other functions like ``build.space_filling_lhs()`` to construct a
+`space-filling Latin hypercube
+design <https://en.wikipedia.org/wiki/Latin_hypercube_sampling>`__.
+
+Or try from one of the following available design options...
+
+-  Full factorial: ``build.full_fact()``
+-  2-level fractional factorial: ``build.frac_fact_res()``
+-  Plackett-Burman: ``build.plackett_burman()``
+-  Sukharev grid: ``build.sukharev()``
+-  Box-Behnken: ``build.box_behnken()``
+-  Box-Wilson (Central-composite) with center-faced option: ``build.central_composite()`` with ``face='ccf'`` option
+-  Box-Wilson (Central-composite) with center-inscribed option: ``build.central_composite()`` with ``face='cci'`` option
+-  Box-Wilson (Central-composite) with center-circumscribed option: ``build.central_composite()`` with ``face='ccc'`` option
+-  Latin hypercube (simple): ``build.lhs()``
+-  Latin hypercube (space-filling): ``build.space_filling_lhs()``
+-  Random k-means cluster: ``build.random_k_means()``
+-  Maximin reconstruction: ``build.maximin()``
+-  Halton sequence based: ``build.halton()``
+-  Uniform random matrix: ``build.uniform_random()``
+
+Read from and write to CSV files
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Internally, you pass on a dictionary object and get back a Pandas
+DataFrame. But, for reading from and writing to CSV files, you have to
+use the ``read_write`` module of the package.
+
+::
+
+    from doepy import read_write
+    data_in=read_write.read_variables_csv('../Data/params.csv')
+
+Then you can use this ``data_in`` object in the DOE generating
+functions.
+
+For writing back to a CSV,
+
+::
+
+    df_lhs=build.space_filling_lhs(data_in,num_samples=100)
+    filename = 'lhs'
+    read_write.write_csv(df_lhs,filename=filename)
+
+You should see a ``lhs.csv`` file in your directory.
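
For readers curious what such a round trip involves, here is a standard-library sketch. It rests on an assumption: the layout of `params.csv` is not shown in this commit, so the example assumes one column per variable with the levels listed in the rows below the header. The function names `read_variables` and `write_rows` are hypothetical and are not the `read_write` module's API.

```python
import csv
import io

# Assumed input layout (not confirmed by the source): header row of
# variable names, then one level per row.
csv_text = "Pressure,Temperature\n40,290\n70,350\n"


def read_variables(text):
    """Parse CSV text into a dict mapping variable name -> list of levels."""
    reader = csv.DictReader(io.StringIO(text))
    variables = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            variables[name].append(float(value))
    return variables


def write_rows(rows, fieldnames):
    """Serialize a list of run dictionaries back to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


data_in = read_variables(csv_text)
print(data_in)
print(write_rows([{"Pressure": 40.0, "Temperature": 290.0}], list(data_in)))
```

The dictionary produced by `read_variables` has the same shape as the one passed to the `build` functions in the Quick start section.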
+
+--------------
+
+Acknowledgements and Requirements
+---------------------------------
+
+The code was written in Python 3.7. It uses the following external packages,
+which need to be installed on your system:
+
+-  ``pydoe``: A package designed to help the scientist, engineer,
+   statistician, etc., to construct appropriate experimental designs.
+   `Check the docs here <https://pythonhosted.org/pyDOE/>`__.
+-  ``diversipy``: A collection of algorithms for sampling in hypercubes,
+   selecting diverse subsets, and measuring diversity. `Check the docs
+   here <https://www.simonwessing.de/diversipy/doc/>`__.
+-  ``numpy``
+-  ``pandas``
diff --git a/docs/Makefile b/docs/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    ?=
+SPHINXBUILD   ?= sphinx-build
+SOURCEDIR     = .
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
diff --git a/docs/conf.py b/docs/conf.py
@@ -0,0 +1,55 @@
+# Configuration file for the Sphinx documentation builder.
+#
+# This file only contains a selection of the most common options. For a full
+# list see the documentation:
+# http://www.sphinx-doc.org/en/master/config
+
+# -- Path setup --------------------------------------------------------------
+
+# If extensions (or modules to document with autodoc) are in another directory,
+# add these directories to sys.path here. If the directory is relative to the
+# documentation root, use os.path.abspath to make it absolute, like shown here.
+#
+# import os
+# import sys
+# sys.path.insert(0, os.path.abspath('.'))
+
+
+# -- Project information -----------------------------------------------------
+
+project = 'doepy'
+copyright = '2019, Tirthajyoti Sarkar'
+author = 'Tirthajyoti Sarkar'
+
+# The full version, including alpha/beta/rc tags
+release = '0.0.1'
+
+
+# -- General configuration ---------------------------------------------------
+
+# Add any Sphinx extension module names here, as strings. They can be
+# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
+# ones.
+extensions = [
+]
+
+# Add any paths that contain templates here, relative to this directory.
+templates_path = ['_templates']
+
+# List of patterns, relative to source directory, that match files and
+# directories to ignore when looking for source files.
+# This pattern also affects html_static_path and html_extra_path.
+exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
+
+
+# -- Options for HTML output -------------------------------------------------
+
+# The theme to use for HTML and HTML Help pages.  See the documentation for
+# a list of builtin themes.
+#
+html_theme = 'alabaster'
+
+# Add any paths that contain custom static files (such as style sheets) here,
+# relative to this directory. They are copied after the builtin static files,
+# so a file named "default.css" will overwrite the builtin "default.css".
+html_static_path = ['_static']