Commit

Another refactoring
ypriverol committed Sep 23, 2024
1 parent 7a08e82 commit 5e56b21
Showing 2 changed files with 8 additions and 8 deletions.
10 changes: 5 additions & 5 deletions docs/README.data.md
@@ -1,9 +1,9 @@
-## fsspark - data structures
+## fslite - data structures

---

-`fsspark` is a Python package that provides a set of tools for feature selection in Spark.
-Here we describe the main data structures used in `fsspark` and how to use them.
+`fslite` is a Python package that provides a set of tools for feature selection in Spark.
+Here we describe the main data structures used in `fslite` and how to use them.

### Input data

@@ -32,7 +32,7 @@ The following is an example of a TSV file with a binary response variable:

### Import functions

-`fsspark` provides two main functions to import data from a TSV file.
+`fslite` provides two main functions to import data from a TSV file.

- `import_table` - Import data from a TSV file into a Spark Data Frame (sdf).

@@ -57,7 +57,7 @@ psdf = import_table_as_psdf('data.tsv.bgz',

### The Feature Selection Spark Data Frame (FSDataFrame)

-The `FSDataFrame` (**Figure 1**) is a core functionality of `fsspark`. It is a wrapper around a Spark Data Frame (sdf)
+The `FSDataFrame` (**Figure 1**) is a core functionality of `fslite`. It is a wrapper around a Spark Data Frame (sdf)
that provides a set of methods to facilitate feature selection tasks. The `FSDataFrame` is initialized
with a Spark Data Frame (sdf) or a Pandas on Spark Data Frame (psdf) and two mandatory arguments:
`sample_col` and `label_col`. The `sample_col` argument is the name of the column in the sdf that
6 changes: 3 additions & 3 deletions docs/README.methods.md
@@ -1,10 +1,10 @@

-# fsspark - features selection methods
+# fslite - features selection methods

---

-`fsspark `includes a set of methods to perform feature selection and machine learning based on spark.
-A typical workflow written using `fsspark` can be divided roughly in four major stages:
+`fslite `includes a set of methods to perform feature selection and machine learning based on spark.
+A typical workflow written using `fslite` can be divided roughly in four major stages:

1) data pre-processing.
2) univariate filters.