Calculate paired T-test

Installation

The app is a python application and can be installed with pip:

pip install anon_excel-1.0.0-py3-none-any.whl

This will also install all dependencies (pandas and scipy).

Usage

usage: anonex [-h] [-a] [-c] [-o] [-s] [-t] [-x] folder

This app scans multiple sets of surveys. It offers an option to clean and store the survey data, and also an option to perform and store a T-test analysis. The T-test is only possible when both pre- and post- survey is available.
Optionally personal information is removed.

positional arguments:
  folder           Specify the folder with the excel report(s)

options:
  -h, --help       show this help message and exit
  -a, --anonymize  Anonymize personal data (default = No)
  -c, --color      Add colors in excel file with clean ranked data (default = No)
  -o, --overwrite  Overwrite existing excel outputs (default = No)
  -s, --strip      Strip leading s-char from s-number (default = No)
  -t, --ttest      Perform T-test calculation (default = No)
  -x, --clean      Save cleaned data (default = No)

The surveys

Inputs are surveys with common questions and students. The surveys are in Excel format. Surveys come in a pre-course survey and a post-course survey. The pre-course survey contains the survey result before lecture/workshop/course. The post-course survey contains survey results obtained after the lecture/workshop/course.

Only a single set of pre- and post-course surveys is expected. The filenames of pre- and post-course surveys are unique: the pre-course survey starts with Pre, the post-course surveys starts with Post. For example:

Pre-Course Survey_ Perceived Sense of Community in Blended Learning(1-89).xlsx
Post-Course Survey_ Perceived Sense of Community in Blended Learning(1-34).xlsx

Properties of the surveys:

Both contain student ID's and thus can identify individuals
The lists of students do not have to be identical
The lists of questions do not have to be the same
The answers of the questions are categorical: (Strongly agree (SA), Agree (A), Neutral (N), Disagree (D), Strongly Disagree (SD)

Ranking table

An additional input is the translation table to link the categorical survey values to numerical values. The location is assumed to be in the same folder as the survey data files. For now the name of this ranking table is fixed to:

Scoring.xlsx

[!Note] The ranking table is unique for each set of surveys.

Application requirements

The application can generate multiple outputs: both cleaned data and T-test are optional. When a T-test is calculated cleaning will also be run, but saving the cleaned data is still optional. Cleaned data excel file:

Output of cleanup up data for both pre-survey and post-survey data
Optionally anonymized student information (student number)
Optionally records are color-coded:
- green record: appears in both survey results
- purple/blue record: appears only in pre-survey results
- red record: appears only in post-survey results

Analysis output excel file:

Paired T-test results, for both dimensions: questions and students
Descriptive statistics for the questions
Table with numerical rankings per question for all students: one table with the pre-survey questions, one with the post-survey questions and a third one with the questions exclusive to the post-survey Only questions common to pre- and post- survey are included in the first two tables.

Approach

Find sets of surveys, if none found stop.
Optionally encrypt the student data to remove possibility of identification
Recode the categorical ranking into numerical values
Clean the survey data; for cleaning only post-survey data is not required
Optionally save the cleanup data. Data will be stored in subfolder cleaned
App is finished if no T-test is specified
Extract only the common questions from both surveys
Filter the data to only use data from students participating in both surveys
Calculate the T-test for each question
Calculate the T-test for each student
Save the result to the subfolder analysis

Anonymize student ID's

Turn ID codes in a unique code. This is done with a hashing function called "blake2b". This is a stable hashing function to guarantuee that the hashcode will be the same for the same student each time, as well as unique.

The hashed code will be only be visible in the optional cleaned data, and when the anonymize command line parameter is set appropriately. In the analysis result the anonymized ID's are replaced with human readable ID's: student_nn; all other identifyable data is absent from the analysis results.

The user can control how data is anonymized with the -a option:

if not specified: no anomyzation is stored in the output
if -a is specified: anonymization is applied on the student ID column and an additional column is added, the original ID column is retained.
if -aa is specified: anonymization is applied on the student ID column, an additional column is added, the original ID column is removed

Example command lines

Display help text anonex -h

Clean pre-survey data, removing letters from student numbers anonex -s -x <survey data folder>

Clean pre-survey data, removing non-digits from student numbers and anonymizing student numbers, overwriting any existing previous results. anonex -s -x -aa -o <survey data folder>

Clean pre-survey data, removing non-digits from student numbers and anonymizing student numbers, performing t-test analysis, overwriting any existing previous results. anonex -s -x -t -aa -o <survey data folder>

Name		Name	Last commit message	Last commit date
Latest commit History 97 Commits
src/anon_excel		src/anon_excel
tests		tests
.coveragerc		.coveragerc
.flake8		.flake8
.gitignore		.gitignore
license.txt		license.txt
pyproject.toml		pyproject.toml
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Calculate paired T-test

Installation

Usage

The surveys

Ranking table

Application requirements

Approach

Anonymize student ID's

Example command lines

About

Releases 1

Packages

Contributors 2

Languages

License

WillemNieuwenhuis/anon_excel

Folders and files

Latest commit

History

Repository files navigation

Calculate paired T-test

Installation

Usage

The surveys

Ranking table

Application requirements

Approach

Anonymize student ID's

Example command lines

About

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 2

Languages

Packages