This is the code for the "Better Resemblance without Bigger Patterns: Making Context-sensitive Decisions in WFC" paper by Bahar Bateni, Isaac Karth and Adam Smith. The paper was published at FDG 2023 and is available at https://dl.acm.org/doi/10.1145/3582437.3582441.
The paper proposes a small and easy to implement modification to the WaveFunctionCollapse algorithm which significantly improves the quality fo results.
WaveFunctionCollapse (WFC) is a Procedural Content Generation algorithm which generates an image similar to some input image. The input image is broken into tiles or patterns which can be composed by the algorithm to create a new image. A set of constraints extracted from the input are satisfied during the process which results in the similarity between the input and output. For more information about the WFC algorithm refer to https://github.com/mxgmn/WaveFunctionCollapse.
For more information about this modification refer to the paper. To run this code, refer to the How to Run section. If you want to implement the context sensitive heuristic in your own work, refer to the Porting Context-sensitive Heuristic section. To use an interactive tool and get some outputs, use the colab notebook available at [TODO add link].
The list of requirements are available in the requirements.txt
file. You can install these requirements with pip
by using:
pip install -r requirements.txt
It is suggested to use a virtual environment when installing the requirements. You can use virtualenv by running:
pip install virtualenv
virtualenv venv
[windows] venv\Scripts\activate
[linux] source venv/bin/activate
pip install -r requirements.txt
To learn more, refer to virtualenv documents.
You can run the WFC algorithm by running the main script:
python main.py
This would results in generating one output per each of the Decision Heuristics for the zelda example.
The result will differ each time. To get the same results everytime, change the GLOBAL_HASH_TYPE
in the utility.py
file:
GLOBAL_HASH_TYPE = HASH_TYPE.NUMBER_HASH
This will run a deterministic hashing algorithm which is significantly slower than the other implementation, but will remove the randomness and generate the same output every time the algorithm is executed with the same seed.
The following options are available in the main.py
file to get the desired outputs. The examples in the main.py
file show how the same code can be used for bigger patterns,
as well as visualizing those patterns.
backtrack=False
shows if backtracking will be used for satisfying all the constraints. The default value isFalse
, meaning that when there are no possible options for a tile, that tile can be left blank (showing as a white tile). By usingbacktrack=True
the execution time will significantly increase based on how limiting the constraints are, but the algorithm will fill all the tile positions with valid tiles.weighting_option
is the heuristic which gives weights to each tile option in the Decicion process. For more information refer to the Decition Heuristic section.entropy_option
is the heuristic which gives entropy values to each position in the Selection process. For more information refer to the Selection Heuristic section.
The main loop of WFC does the following:
- Start with an empty grid of the desired size
- Iteratively, while an empty position exists on the grid:
- Select an empty position on the grid (sometimes refered to as the Selection Heuristic)
- Decide which one of the valid tiles to put in the target position (refered to as the Decision Process). By deciding this tile, we say that the wave function for this position has collapsed to one single tile value.
- Update the set of valid tiles. Collapsing one position may eliminiate some of the options at other positions based on the constraints.
- Once all positions have been filled with tiles, the output has been generated and the algorithm terminates.
A tile is considered valid for a position if the tile is not resulting in contradiction with the available constraints. These constraints decide which pair of tiles can appear as neighbors in the output based on if the same pair have appeared as neighbors in the input. For instance, if tile A have never been an immediate neighbor above tile B in the input, then tile B can't be chosen for a position that has A immediately above it in the output.
As mentioned above, Selection Heuristic is the process in which an empty position on the output grid is selected to be filled with a tile.
Different Selection Heuristics have been implemented in this project, and can be selected by changing the entropy_option
argument. Please note that
even though changing the Selection Heuristics can change the quality of the result, the paper is focused on the effect of the Decision Heurisitic.
The following options are available for the Selection Heuristic:
TOP_LEFT
starts by selecting the top-left corner tile, then row by row (left to right) until the bottom-right corner tile.TOP_RIGHT
starts by selecting the top-right corner tile, then row by row (right to left) until the bottom-left corner tile. The reason to have this heuristic is for comparison with the previous one and showing the bias these two heuristics are causing.NUMBER_OF_OPTIONS
selects the position with minimum number of options at each step.SHANNON
uses shannon entropy to calculate the entropy of each position and chooses the position with minimum entropy. Note that the shannon entropy is defined based on the probability of each option, which means that the Decision Heuristic, which defines this probability, will also affect the Selection process. For example, when the Decision Heuristic is set toUniform
,SHANNON
option will be the same asNUMBER_OF_OPTIONS
.
For visualizing the Selection Heuristics side-by-side, visualize_wfc_selection_heuristics
function in the main.py
file can be used. An example of this generated with the CONTEXT_SENSITIVE
Decision Heuristic on zelda example is shown below:
As you can see, using the TOP_LEFT
and TOP_RIGHT
option results in some diagonal pattern artifacts in the output.
The Decision Heuristic is the process which decides which one of the valid tiles to use for some target position. The heuristic assigns probabilities to valid tile options and then samples these probabilities. The following Decision Heuristics are available in this implementation:
UNIFORM
uses a uniform probability distribution for the tile options. In other words, it chooses one of the valid options randomly.TILE_FREQUENCY
uses the frequency of tiles to assign probability to options. The frequency for each tile is the number of times that tile have appeared in the input. The probability of each option is its frequency devided by sum of all the frequencies for all valid options.CONTEXT_SENSITIVE
uses the frequency of tiles in the same context to assign the probability of options. We define context as a tuple of tiles adjacent to the target tile (which is a tuple of 4 tiles in this 2D implementation). Basically, for the target position, we first extract the context around it (for any tile which haven't been decided yet or which is out of bound, a special unknown toke is used which is shown byNone
in this implementation). Then we access a lookup table to get the frequency of each of the tile options in the same context in the input. We then devide this frequency by sum of the same frequencies for all the options to get a probability. If the context is new to us, we use the general frequency of tile in a completely unknown context, which is the same asTILE_FREQUENCY
option.
For visualizing the Decision Heuristics side-by-side, visualize_wfc_decision_heuristics
function in the main.py
file can be used. An example of this generated with the NUMBER_OF_OPTIONS
Selection Heuristic on zelda example is shown below:
This implementation contains the following files:
image.py
defines an image unit as a 2D array of pixels which can be converted into a single number with the_get_number
function. In this README file we refer to image unit as tile, but it can be any unit of image we assign to positions on the image. Some example of this unit which are implemented in here areTile
, and two biggerPattern
s callednxmPattern
andUpLeftLPattern
. ThenxmPattern
is a rectangular pattern with the shape n by m (nxm tiles in total). TheUpLeftLPattern
is a L shaped pattern containing a center tile, n tiles above it, and m tiles left of it (n+m+1 tiles in total). New patterns can be easily defined by listing the set of tiles in the pattern as relative indices to a center tile.tiled_image.py
defines a process of extracting the image units from an image. The image contains a 2D array of numbers representing tiles at each position of the input, and a one-to-one mapping from number to tile.image_distribution
defines theImageDistribution
class which extracts the frequency of tiles, frequency of tile pairs, and frequency of tiles in different contexts. It also extracts the valid tile pairs. TheImageDistribution
class represents the training data of WFC and can be trained on as manyTiledImage
objects as desired.WFC.py
is the WFC implemetation containing different heuristics. For more details about how this is implemented, refer to the WFC Implementation subsection.utility.py
contains some utility functions used by other files.
As mentioned above, the core loop of WFC is implemented in the WFC.py
file. This core loop uses a supermap to run the WFC. Supermap is a 2D map representing the output.
Each position (i, j) in the supermap (i.e. supermap[i, j]
) is a list of possible valid options available for the position (i, j). If only one option is available at some position, that position have already collapsed. If all positions have only one option each, the output is ready. If at some position there are no options available, depending on the backtrack
paramter it either returns None for that position or uses backtracking to revert steps and try other options.
Each option is a single number which represents a tile or a pattern. The one-to-one mapping from this number to pixel values is in the TiledImage
object, and WFC only works with the number representing that option. To get the final pixel values, the result of WFC.generate
function should be passed to TiledImage.from_generated
function.
The WFC object is initiated based on a ImageDistribution
object which represents its training data. The same WFC
object can then be used as many times as desired to
generate outputs with different heuristics, with or without backtraking, and with different sizes.
The Selection Heuristic is implemented in a single function in the WFC.py
file called _get_entropy
. Given the supermap, x, y, and output size, this function returns the entropy of position (x, y) of supermap. The entropy can be any number, and the WFC algorithm will select the empty position with minimum entropy at each step to collapse.
For instance, the TOP_LEFT
entropy option returns i*n+j
as entropy of position (i, j), with n being the number of columns. This means that an empty position of (i, j)
is prioritized over positions (i + 1, j) and (i, j + 1). The NUMBER_OF_OPTIONS
entropy option returns the number of options at position (i, j), which is the length of
supermap[i, j]
.
To add a new Selection Heuristic, simply add your heuristic name as an EntropyOptions
in WFC.py
. Then add an if statement in the _get_entropy
function to reutrn the
entropy value for a given position. If you want to change how this entropy is used, for example using the maximum entropy position instead of minimum entropy, you
can change the _get_position_to_collapse
function in WFC.py
.
The Decision Heuristic is implemented in a single function in the WFC.py
file called _get_weights
. Given the supermap, x and y, this function returns a list of
weights for position (x, y). This list should have the same length as supermap[x, y]
, because it represents the weight of each option at position (x, y).
The WFC algorithm will then convert these weights to probabilities by deviding them by their sum. This probability will then be sampled to decide a single tile
for position (i, j). This probability will also be used in calculating the entropy if the SHANNON
entropy option is selected.
For instance, the UNIFORM
weighting option, which samples the uniform distribution, is implemented by returning an array of 1s with the same lenght as supermap[i, j]
. The
TILE_FREQUENCY
option returns the frequency of each tile in supermap[i, j]
. The frequency can be accessed via ImageDistribution
object
trained on the input images available at self.dist
.
To add a new Decision Heuristic, add your heuristic name as an WeightingOptions
in WFC.py
. Then add an if statement in the _get_weights
function to return the
weight values for a given position. The weight values should be positive numbers, but there are not other requirements. The weights will be automatically normalized to
probability values in the _get_probabilities
function. The WFC then samples this probability to select a single tile in the _collapse
function.
The Context-sensitive heuristic is implemented in two main parts:
- Pre-process to count the frequency of each tile in each context and storing these values in a lookup table. This process should be done once before running the WFC function. The same lookup table can then be used in generating as many outputs as desired without any change to the table.
- Calculate the context-sensitive heuristic in the main loop of WFC. This is done by finding the context around target tile position, and then looking up the frequency of (x, context) for each option x of the available tile options at the target position. For more information refer to the steps below.
In the preprocessing of WFC, we have a set of inputs as tile numbers on a grid. The valid tile adjacencies are infered by looking at adjacent tiles on this grid. To add the context-sensitive heuristic, in the same pre-processing step we calculate a lookup table for frequency of tiles in different context.
We define the context around some tile as an array or tuple of its neighbors. In 2D, this means that the context around position (x, y) is defined as:
# target position (x, y)
def get_context(tiles, x, y):
return (tiles[x+1, y], tiles[x, y+1], tiles[x-1, y], tiles[x, y-1])
In M dimentional environment, this would be a tuple of 2M tiles. The context object should have the same order for each adjacent tile (e.g. tile above always as the first tile in context).
If any of the the tiles in the context is outside the boundary of the input, we use a special unknown (UNK
) token to represent that tile.
For example the context around tile (0, 0) would be:
# target (0, 0)
context = (tiles[1, 0], tiles[0, 1], UNK, UNK)
In WFC's main loop, in most cases the full context of our target tile is not available. As the output is being generated, target points are selected in positions that may have some of their adjacent tiles still empty (in super-position, not collapsed yet). It also might happen that some of the adjacent tiles are out of bounds. Because of that, we have to also populate these partial context in our lookup table. We define the list of partial context around position (x, y) as:
# target position (x, y)
def get_partial_context_list(tiles, x, y):
N_1, N_2, N_3, N_4 = tiles[x+1, y], tiles[x, y+1], tiles[x-1, y], tiles[x, y-1]
return [
(UNK, UNK, UNK, UNK),
(N_1, UNK, UNK, UNK),
(UNK, N_2, UNK, UNK),
(UNK, UNK, N_3, UNK),
(UNK, UNK, UNK, N_4),
(N_1, N_2, UNK, UNK),
(N_1, UNK, N_3, UNK),
(N_1, UNK, UNK, N_4),
(UNK, N_2, N_3, UNK),
(UNK, N_2, UNK, N_4),
(UNK, UNK, N_3, N_4),
(N_1, N_2, N_3, UNK),
(N_1, N_2, UNK, N_4),
(N_1, UNK, N_3, N_4),
(UNK, N_2, N_3, N_4),
(N_1, N_2, N_3, N_4),
]
Please note that it should be considered that some of these context might be empty. In that case, the duplicates context keys in the context list should be eliminated. As an example, if the target position is at (0, 0):
# target (0, 0)
N_1, N_2 = tiles[x+1, y], tiles[x, y+1]
partial_context_list = [
(UNK, UNK, UNK, UNK)
(N_1, UNK, UNK, UNK)
(UNK, N_2, UNK, UNK)
(N_1, N_2, UNK, UNK)
]
We define a lookup table for our context frequencies. This lookup table is populated as follows:
For each position (x, y) of some input, with tile value T
:
- Find the list of partial context around position (x, y)
- For each partial context
C
in this list, we increase the frequency of key = (T, C). If the key doesn't exist, we add it to the lookup table with frequency 1.
# target position (x, y)
T = tiles[x, y]
partial_context_list = get_partial_context_list(tiles, x, y)
for C in partial_context_list:
table[(T, C)] += 1
In the decision process, we have a target position (x, y), and a list of valid options for that position. These options have been filtered to guarantee satisfying the constraints.
To choose one of these tiles, in context-sensitive heuristic we sample a probability distribution calculated by the frequency of context keys for each option.
# target position (x, y)
def decide_tile(output_tiles, x, y)
C = get_context(output_tiles, x, y)
weights = [table[(T, C)] for T in options]
probabilities = weights / sum(weights)
return choose(options, probabilities)
- The paper is available at https://dl.acm.org/doi/10.1145/3582437.3582441
- WFC was intially introduced by Maxim Gumin, and the implementation is availabe at https://github.com/mxgmn/WaveFunctionCollapse. The repository also contains a list of implementations in other programming languages.
- The Zelda map image available in this repository is from https://nesmaps.com/maps/Zelda/ZeldaOverworldQ1BG.html
The following features have been implemetned but are not part of this repository yet:
- Interactive WFC notebook
- Resemblance Metrics
- 3D WFC: my implementation also contains the code to read/write minecraft data as an exmaple of this. Below you can see the results of genenerating a 16x16x16 minecraft chunk with different heuristics (uniform, tile frequency, context-senstive (ours) in this order). In all of them a 4x16x16 chunk of the output is copied from an existing minecraft world, and then WFC is used to continue filling that chunk while attending to the constraints.
- M Dimensional Implementation: my implementation for M dimensions is a bit complicated and harder to read than the 2D implementation. While very similar, the 2D implementation is a lot easier to understand because it doesn't use any fancy functions. For example it uses a nested
for
loop instead of anndenumerate
loop. - Expressive Range Analysis
If you think some other features should be added here or some bugs should be fixed, please submit an issue.
Alternatively if you have already implemented some features or fixed something that you want to add to the repository, please submit a pull request.
If you want to talk about this, feel free to contact me by emailing me at bbateni@ucsc.edu.
Possible areas of investigation:
- What is the effect of changing the context definition?
- How can we implement this for a WFC on a graph?
- Can we further improve the resemblance by modifying the selection heuristic?
Some tiles in the output are blank (completely white). What's wrong?
This is because the algorithm can't find any valid tiles for those positions. Try running the algorithm with backtracking by setting backtrack=True
. This will increase the
exection time but will search all the available options to get a valid result. If there are no valid results available, the code will return a Not possible
exception.
There are some artifact in the results: there is a diagonal pattern seen in the outputs. What should I do?
This shows some bias in the output, which can be a result of selection heuristic. Changing the selection heuristic to SHANNON
or NUMBER_OF_OPTIONS
will resolve this problem.
Running the WFC algorithm with the same seed results in different values. How can I solve this?
The reason is that the main process which turns a tile to a number uses the python hash function. The python hash function is undeterministic and changes every time you run the program because of security reasons. To change this, you can change hash method by changing
GLOBAL_HASH_TYPE = HASH_TYPE.PYTHON_HASH
to
GLOBAL_HASH_TYPE = HASH_TYPE.NUMBER_HASH
in the utility.py
file. The NUMBER_HASH
hashing method is deterministic, but will make the tile extraction process significantly slower. Note that this doesn't have any impact on the execution time of other parts such as the WFC algorithm and only affects the pre-processing of input image.