Run tests:
$ docker-compose up -d && pytest
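The endpoint tests run by the command above follow roughly this shape (a minimal sketch assuming pytest-django is installed; the `/dsrs/` path is hypothetical and should be adjusted to the routes in `openapi.md`):

```python
# Minimal endpoint test sketch; assumes pytest-django, and "/dsrs/" is a
# hypothetical path to be matched against openapi.md.
import pytest
from django.test import Client


@pytest.mark.django_db
def test_list_dsrs_returns_ok():
    response = Client().get("/dsrs/")
    assert response.status_code == 200
```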
Build and run container:
$ docker-compose -f docker-compose.prod.yml up --build
Extra question:
DSPs report DSRs containing hundreds of millions of usages. If you were to deploy this solution to production, would you make any changes to the database or process in order to import the usages? Which ones?
- Denormalize the data to minimize joins, if business requirements allow it. Currencies and territories can be represented as string enumerations in DSR rows.
- Wrap `ingest_dsr` in a consumer to perform ingestion asynchronously (see the sketch after this list). Add an `IN_PROGRESS` status for DSRs. Store ingestion logs.
- Consider faster validation/serialization, possibly with a more lightweight framework.
- Delegate file upload to a highly available storage service proxied by a dedicated thin microservice. Offload gzip decompression to infrastructure, e.g. nginx.
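As a rough illustration of the consumer idea, a Celery task could wrap the existing `ingest_dsr`; everything here other than `ingest_dsr` itself (module paths, the status field and its values) is an assumption, not part of the provided code:

```python
# Sketch of asynchronous ingestion with Celery; module paths, the status
# field, and its values are assumptions, not part of the provided code.
from celery import shared_task

from dsrs.models import DSR            # assumed app layout
from dsrs.ingestion import ingest_dsr  # the existing synchronous importer


@shared_task(bind=True, max_retries=3)
def ingest_dsr_task(self, dsr_id):
    """Consume an ingestion job: mark the DSR, run the import, record the outcome."""
    dsr = DSR.objects.get(pk=dsr_id)
    dsr.status = "IN_PROGRESS"  # hypothetical status field
    dsr.save(update_fields=["status"])
    try:
        ingest_dsr(dsr)
    except Exception as exc:
        dsr.status = "FAILED"
        dsr.save(update_fields=["status"])
        raise self.retry(exc=exc, countdown=60)
    dsr.status = "COMPLETED"
    dsr.save(update_fields=["status"])
```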
A lot of our work is about connecting digital service providers (DSPs) like Spotify or YouTube with societies like SGAE or SACEM, who represent music creators. DSPs provide digital sales reports (DSRs), which contain information about music metadata and revenue generated. We crunch this data and give societies the information they need.
For this test, we provide several DSRs that represent the usages and revenue from different countries. The aim is to parse the contents of the DSRs and insert them into a database to extract statistics through an API. Each line of a DSR represents a sound recording and its associated usage data. In detail, it contains the following fields (a parsing sketch follows the list):
dsp_id: the unique identifier of a sound recording provided by the DSP.
title: sound recording title.
artists: pipe-separated list of artists.
isrc: International Sound Recording Code.
usages: number of plays for this sound recording, territory and period.
revenue: revenue generated by this sound recording, territory and period.
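A minimal sketch of parsing one such line; the tab delimiter is an assumption about the file layout, so check it against the actual files in data/:

```python
# Parsing sketch for one DSR line; the tab delimiter is an assumption.
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class DSRLine:
    dsp_id: str
    title: str
    artists: list[str]  # pipe-separated in the raw file
    isrc: str
    usages: int
    revenue: Decimal


def parse_line(raw: str) -> DSRLine:
    dsp_id, title, artists, isrc, usages, revenue = raw.rstrip("\n").split("\t")
    return DSRLine(dsp_id, title, artists.split("|"), isrc, int(usages), Decimal(revenue))
```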
DSR filenames specify metadata related to the DSR, such as Territory, Period, and Currency. You will find the DSRs in the data/ directory.
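The exact naming scheme isn't spelled out here, so the pattern below (including the example name) is hypothetical; adjust the regex to the real files:

```python
# Hypothetical filename pattern, e.g. DSR_ES_202001_EUR.tsv.gz; adjust
# the regex to the real names found in the data/ directory.
import re

FILENAME_RE = re.compile(
    r"DSR_(?P<territory>[A-Z]{2})_(?P<period>\d{6})_(?P<currency>[A-Z]{3})"
)


def parse_filename(name):
    match = FILENAME_RE.search(name)
    if match is None:
        raise ValueError(f"Unrecognized DSR filename: {name}")
    return match.groupdict()  # {"territory": ..., "period": ..., "currency": ...}
```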
The API is defined by the OpenAPI specification in openapi.md.
Our current (and incomplete) database contains the following tables, sketched below:
- DSR: Models the DSR file and stores some relevant information.
- Currency: Models a currency.
- Territory: Models a territory.
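A sketch of how these three tables might look as Django models; the field names beyond the table names are assumptions about the provided (incomplete) schema:

```python
# Sketch of the three tables as Django models; all field names here are
# assumptions about the provided schema.
from django.db import models


class Currency(models.Model):
    code = models.CharField(max_length=3, unique=True)  # e.g. "EUR"


class Territory(models.Model):
    code = models.CharField(max_length=2, unique=True)  # ISO 3166-1 alpha-2


class DSR(models.Model):
    path = models.CharField(max_length=256)
    period_start = models.DateField()
    period_end = models.DateField()
    territory = models.ForeignKey(Territory, on_delete=models.PROTECT)
    currency = models.ForeignKey(Currency, on_delete=models.PROTECT)
```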
Deliverables:
- A way to import the contents of DSRs to the DB.
- Complete the API according to the OpenAPI specification.
- A form in the admin page to delete DSRs and their contents (see the sketch after this list).
- Tests for each API endpoint, using any preferred testing framework.
- Dockerfile
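For the admin deliverable, one possible shape is a custom action; the `Usage` model holding the imported rows and the module layout are assumptions:

```python
# Sketch of an admin action deleting DSRs together with their imported
# rows; the Usage model and module layout are assumptions.
from django.contrib import admin

from dsrs.models import DSR, Usage  # assumed app layout


@admin.register(DSR)
class DSRAdmin(admin.ModelAdmin):
    actions = ["delete_with_contents"]

    def delete_with_contents(self, request, queryset):
        Usage.objects.filter(dsr__in=queryset).delete()  # remove imported rows first
        queryset.delete()

    delete_with_contents.short_description = "Delete selected DSRs and their contents"
```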
Requirements:
- Django 3.1
- Python 3.9
Extra questions:
- DSPs report DSRs containing hundreds of millions of usages. If you were to deploy this solution to production, would you make any changes to the database or process in order to import the usages? Which ones?
Note:
To manage Python dependencies, use any tool (e.g. pipenv) that interprets the Pipfile placed in the root folder. For example, with pipenv it's enough to run:
$ pipenv sync --dev