In this project, we work with multi-dimensional data. Given a potentially large set of points, each represented as a d-dimensional vector, we detect interesting points by implementing scalable and efficient algorithms. We complete three tasks:
Task1. Given a set of d-dimensional points, return the set of points that are not dominated by any other point. This set is also known as the skyline.
Task2. Given a set of d-dimensional points, return the k points with the highest dominance score. The dominance score of a point p is defined as the total number of points dominated by p.
Task3. Given a set of d-dimensional points, return the k points from the skyline with the highest dominance score.
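The three tasks can be illustrated with a minimal brute-force sketch in Python. It is not the project's implementation and makes no attempt at scalability. It assumes the common convention that smaller values are better, so p dominates q when p is no larger than q in every dimension and strictly smaller in at least one. The text above does not state which direction is preferred. All function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Assumed convention: smaller is better in every dimension."""
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Task1: points that no other point dominates (O(n^2) comparisons)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def dominance_score(p: Point, points: List[Point]) -> int:
    """Number of points in the full data set that p dominates."""
    return sum(1 for q in points if dominates(p, q))


def top_k_dominating(points: List[Point], k: int) -> List[Point]:
    """Task2: the k points with the highest dominance score."""
    ranked = sorted(points, key=lambda p: dominance_score(p, points), reverse=True)
    return ranked[:k]


def top_k_skyline(points: List[Point], k: int) -> List[Point]:
    """Task3: the k skyline points with the highest dominance score."""
    sky = skyline(points)
    ranked = sorted(sky, key=lambda p: dominance_score(p, points), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    data = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5), (4, 4)]
    print(skyline(data))
    print(top_k_dominating(data, 1))
    print(top_k_skyline(data, 1))
```

Every function compares all pairs of points, so the cost grows quadratically with the number of points. That is fine for checking results on small inputs, but large data sets need a partitioned or distributed approach.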
Aristotle University of Thessaloniki (AUTh)
Technologies for Big Data Analytics Course
Contributors: Kyriaki Potamopoulou, Vasiliki Zarkadoula