Release v1.0.0-beta.3 (#31)
Release v1.0.0-beta.3
bluerogue251 authored Dec 17, 2017
1 parent 0e8f624 commit 9e1819d
Showing 4 changed files with 46 additions and 50 deletions.
42 changes: 19 additions & 23 deletions README.md
@@ -31,7 +31,7 @@ Feel free to open a GitHub ticket if you would like support for a different data

```bash
# Download the DBSubsetter.jar file
$ wget https://github.com/bluerogue251/DBSubsetter/releases/download/v1.0.0-beta.2/DBSubsetter.jar --output-document /path/to/DBSubsetter.jar
$ wget https://github.com/bluerogue251/DBSubsetter/releases/download/v1.0.0-beta.3/DBSubsetter.jar --output-document /path/to/DBSubsetter.jar

# Show explanation and examples of how to configure multiple schemas,
# multiple base queries, missing foreign or primary keys, columns to exclude,
@@ -72,28 +72,24 @@ Whether it is to fix a typo, improve the documentation, report or fix a bug, add
The only condition for contributing to this project is to follow our [code of conduct](CODE_OF_CONDUCT.md) so that everyone is treated with respect.


## Related projects and acknowledgments

DBSubsetter was inspired by and borrowed ideas from:

* [Jailer](http://jailer.sourceforge.net/home.htm)
* [rdbms-subsetter](https://github.com/18F/rdbms-subsetter)

Here are some other similar or related resources:

* [db_subsetter](https://github.com/lostapathy/db_subsetter)
* [DataBee](https://www.databee.com/)
* [pg_sample](https://github.com/mla/pg_sample)
* [DATPROF](http://www.datprof.com/products/datprof-subset/)
* [abridger](https://github.com/freewilll/abridger)
* [postgres-subset](https://github.com/BeautifulDestinations/postgres-subset)

DBSubsetter is written in [Scala](https://www.scala-lang.org/) using:

* [Akka Streams](https://doc.akka.io/docs/akka/2.5.8/stream/index.html?language=scala)
* [Chronicle-Queue](https://github.com/OpenHFT/Chronicle-Queue)
* [scopt](https://github.com/scopt/scopt)
* [Slick](http://slick.lightbend.com/)
## Related projects

DBSubsetter was inspired by
[Jailer](http://jailer.sourceforge.net/home.htm) and
[rdbms-subsetter](https://github.com/18F/rdbms-subsetter).
Other related resources include
[db_subsetter](https://github.com/lostapathy/db_subsetter),
[DataBee](https://www.databee.com/),
[pg_sample](https://github.com/mla/pg_sample),
[DATPROF](http://www.datprof.com/products/datprof-subset/),
[abridger](https://github.com/freewilll/abridger), and
[postgres-subset](https://github.com/BeautifulDestinations/postgres-subset).

DBSubsetter is written in [Scala](https://www.scala-lang.org/) using
[Akka Streams](https://doc.akka.io/docs/akka/2.5.8/stream/index.html?language=scala),
[Chronicle-Queue](https://github.com/OpenHFT/Chronicle-Queue), and
[scopt](https://github.com/scopt/scopt).
[Slick](http://slick.lightbend.com/) is used for testing.

## License

2 changes: 0 additions & 2 deletions TODO.txt
@@ -1,5 +1,3 @@
* RELEASE v1.0.0.beta.3

* Automatic `--skipPkStore` option
- Including case where a unique index means we might _not_ have to store PKs for a table
- Include config option to print out which tables' PKs will be stored
2 changes: 1 addition & 1 deletion build.sbt
@@ -1,6 +1,6 @@
name := "DBSubsetter"

version := "v1.0.0-beta.2"
version := "v1.0.0-beta.3"

scalaVersion := "2.12.3"

50 changes: 26 additions & 24 deletions src/main/scala/trw/dbsubsetter/config/CommandLineParser.scala
@@ -5,7 +5,7 @@ import trw.dbsubsetter.db.{ColumnName, SchemaName, TableName}

object CommandLineParser {
val parser: OptionParser[Config] = new OptionParser[Config]("DBSubsetter") {
head("DBSubsetter", "v1.0.0-beta.2")
head("DBSubsetter", "v1.0.0-beta.3")
help("help").text("Prints this usage text\n")
version("version").text("Prints the application version\n")

@@ -101,7 +101,7 @@ object CommandLineParser {

opt[String]("primaryKey")
.maxOccurs(Int.MaxValue)
.valueName("<schema1>.<table1>(<column1>, <column2>, ...)")
.valueName("<schema>.<table>(<column1>, <column2>, ...)")
.action { case (fk, c) =>
val regex = """^\s*(.+)\.(.+)\((.+)\)\s*$""".r
fk match {
@@ -116,6 +116,23 @@ object CommandLineParser {
| Can be specified multiple times
|""".stripMargin)
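
The hunk above elides what happens once the `--primaryKey` regex matches, so the following is only a sketch of how a `<schema>.<table>(<column1>, <column2>, ...)` value could be split into its parts. The regex is copied from the diff. `PrimaryKeyParsing`, `PkSpecParts`, and `parse` are hypothetical names, and splitting the column list on commas and trimming each entry is an assumption, not DBSubsetter's confirmed behavior.

```scala
object PrimaryKeyParsing {
  // Pattern copied from the --primaryKey option in CommandLineParser.scala.
  private val PkSpec = """^\s*(.+)\.(.+)\((.+)\)\s*$""".r

  // Hypothetical result type, used only for this illustration.
  final case class PkSpecParts(schema: String, table: String, columns: List[String])

  // Hypothetical helper: returns None when the value does not match the pattern.
  def parse(arg: String): Option[PkSpecParts] = arg match {
    case PkSpec(schema, table, cols) =>
      // Assumption: columns are comma-separated and surrounding spaces are ignored.
      Some(PkSpecParts(schema, table, cols.split(",").map(_.trim).toList))
    case _ =>
      None
  }

  def main(args: Array[String]): Unit = {
    println(parse("public.users(id)"))
    println(parse("public.order_items(order_id, line_number)"))
    println(parse("users(id)")) // no schema prefix, so the pattern does not match
  }
}
```

Whitespace after the closing parenthesis is absorbed by the trailing `\s*`, because the `)` marks where the column group ends.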

opt[String]("excludeTable")
.valueName("<schema>.<table>")
.maxOccurs(Int.MaxValue)
.action { (str, c) =>
val regex = """^\s*(.+)\.(.+)\s*$""".r
str match {
case regex(schema, table) =>
c.copy(excludeTables = c.excludeTables ++ Set((schema, table)))
case _ => throw new RuntimeException
}
}
.text(
"""Exclude a table from the resulting subset
| Also ignore all foreign keys to and from this table
| Can be specified multiple times
|""".stripMargin)
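
The `--excludeTable` regex above can be exercised in isolation. This sketch reuses the pattern from the diff. The `ExcludeTableParsing` object and its `parse` helper are hypothetical, and unlike the real option (which throws a `RuntimeException` on a bad value and keeps the captured groups unchanged) the helper returns `None` and trims the table name.

```scala
object ExcludeTableParsing {
  // Pattern copied from the --excludeTable option in CommandLineParser.scala.
  private val SchemaTable = """^\s*(.+)\.(.+)\s*$""".r

  // Hypothetical helper, not part of DBSubsetter.
  def parse(arg: String): Option[(String, String)] = arg match {
    case SchemaTable(schema, table) =>
      // The greedy second group also captures trailing spaces, so trim it here.
      Some((schema, table.trim))
    case _ =>
      None
  }

  def main(args: Array[String]): Unit = {
    println(parse("public.users"))    // Some((public,users))
    println(parse(" audit.events "))  // Some((audit,events))
    println(parse("db.public.users")) // Some((db.public,users))
    println(parse("no_schema"))       // None
  }
}
```

Two properties of the pattern are worth keeping in mind. Because `(.+)` is greedy, a value with several dots splits at the last dot, so `db.public.users` yields schema `db.public` and table `users`. Leading whitespace is consumed by `^\s*`, but trailing whitespace lands inside the table group, which is why the sketch trims it.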

opt[String]("excludeColumns")
.valueName("<schema>.<table>(<column1>, <column2>, ...)")
.maxOccurs(Int.MaxValue)
@@ -137,23 +154,6 @@ object CommandLineParser {
| Can be specified multiple times
|""".stripMargin)

opt[String]("excludeTable")
.valueName("<schema>.<table>")
.maxOccurs(Int.MaxValue)
.action { (str, c) =>
val regex = """^\s*(.+)\.(.+)\s*$""".r
str match {
case regex(schema, table) =>
c.copy(excludeTables = c.excludeTables ++ Set((schema, table)))
case _ => throw new RuntimeException
}
}
.text(
"""Exclude a table from the resulting subset
| Also ignore all foreign keys to and from this table
| Can be specified multiple times
|""".stripMargin)

opt[String]("skipPkStore")
.valueName("<schema>.<table>")
.maxOccurs(Int.MaxValue)
@@ -166,26 +166,28 @@ object CommandLineParser {
}.text(
"""Skip runtime in-memory storage for a table's primary keys
| For large tables, this can significantly reduce DBSubsetter's memory footprint.
| Right now, this is not well documented, and involves understanding how DBSubsetter
| This currently is not well documented. It involves understanding how DBSubsetter
| works and knowing that a given table's rows will all only be processed once.
| Feel free to open a GitHub ticket to ask for more information about this.
| A future release of DBSubsetter will hopefully automate this step.
| A future release of DBSubsetter will hopefully automate this.
| Can be specified multiple times
|""".stripMargin)

opt[Int]("preTargetBufferSize")
.valueName("<int>")
.action((int, c) => c.copy(preTargetBufferSize = int))
.text(
"""Buffer up to this many target database insert statements in memory if the target database is not yet ready for them
| This can sometimes improve performance at the cost of increased RAM usage
"""Buffer up to this many target database insert statements in memory
| This can sometimes improve performance at the cost of an increased memory footprint
| The default buffer size is 100
|""".stripMargin)

opt[Unit]("singleThreadedDebugMode")
.action((_, c) => c.copy(isSingleThreadedDebugMode = true))
.text(
"""Run DBSubsetter in debug mode (NOT recommended)
| Uses a simplified architecture which avoids akka-streams and parallel computations
| Uses a simplified, single-threaded architecture
| Avoids using Akka Streams and Chronicle-Queue
| Ignores `--originDbParallelism` and `--targetDbParallelism` and uses one connection per database
| Subsetting may be significantly slower
| The resulting subset should be exactly the same as in regular mode
