This project is the answer to a challenge I’ve been asked for a job position. I decided to answer to it with the Scala language.
See the challenge description in the documentation.
- Make sure you have the following dependencies installed :
- Clone this repository to your computer:
git clone https://github.com/undx/challenge.git
You’re ready, go to your cloned repository.
The project relies on simple-build-tool (sbt). The first time you’ll run sbt in the project, all internal dependencies will be fetched.
Here are the main targets:
sbt test
- Run the projects’ tests.
sbt generateDatasets
- Build some sample datasets that you can use for
running the application. There are five of them in the
./data/
folder :tiny.csv
-> 100 rows.small.csv
-> 1000 rows.medium.csv
-> 100,000 rows.large.csv
-> 1,000,000 rows.huge.csv
-> 10,000,000 rows.
Other targets :
sbt doc
- Build Scaladoc of the project.
sbt oneJar
- Build a standalone jar containing code and dependencies for running the project with java
in the target folder. To use the standalone jar:
$ java -jar target/scala-2.11/challenge_2.11-1.0.0-one-jar.jar --help
For running project inside sbt, first step into a sbt session (sbt
), then use the command
run <params>
- Launch the executable. There’s only one mandatory
parameter
-f <file>
, the input source. To see the list of parameters and help, just runsbt run --help
:Usage: challenge [options] -f, --filename <file> input filename to process. -d, --department <dept> department selection. You can select only one in HR1, HR2, etc. -r, --rooms <r1,r2,...> rooms to include (A to F). -o, --output <path> where to write files produced. Defaults to filename's folder. --help prints this usage text
Here’s an example:
run -f data/large.csv -d HR7 -r A,C -o /home/me/tmp/challenge