This is a tracking issue for solving precision issues.
Problem Overview
Precision is one of the major obstacles to the adoption of hyperbolic geometry in machine learning.
As shown in Representation Tradeoffs for Hyperbolic Embeddings, there is a tradeoff between precision and dimensionality when representing points in hyperbolic space with floats, independent of the model that is used.
Hyperlib should have a solution to this in its core components. Ideally the solution will satisfy the following.
- reasonably efficient: it doesn't incur significant overhead compared to Euclidean methods and is GPU compatible
- easy to use: it's abstracted away from the API so that a casual user doesn't have to touch it
- general: it's general enough to be used with different models of hyperbolic space
Approaches
Hope for the best
We see many papers that simply accept the precision errors and try to mitigate them, or go to higher dimensions.
E.g. our current approach in the Poincare model is to cast to tf.float64, which only gets us 53 bits of precision.
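To make the 53-bit limit concrete, here is a small illustration (plain Python, not Hyperlib code) of why float64 runs out of room near the boundary of the Poincare ball, where the hyperbolic distance from the origin is 2*artanh(r):

```python
import math

# A point intended to sit at radius 1 - 1e-20 rounds to exactly 1.0,
# because the offset 1e-20 is smaller than float64's 2^-53 resolution near 1.
r_stored = 1.0 - 1e-20
print(r_stored == 1.0)               # True: the point collapses onto the boundary

# The largest radius float64 can represent strictly inside the ball:
r_max = math.nextafter(1.0, 0.0)     # 1 - 2^-53
d_max = 2 * math.atanh(r_max)        # farthest representable hyperbolic distance
print(d_max)                         # ~37.4 -- everything beyond this is lost
```

So with float64, no point farther than about 37.4 from the origin is representable at all, regardless of dimension.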
Multiprecision
In the Sarkar embeddings, we use the multi-precision library mpmath to represent points. As far as multiprecision arithmetic goes it is fast (assuming it uses the gmpy backend). However, its support for vector operations is poor and it cannot easily interoperate with numpy or tensorflow. We also do not yet have a good method for automatically determining the precision setting (for example, sarkar_embedding uses far too much precision by default).
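For reference, a minimal sketch of the mpmath approach (the radius and precision here are illustrative, not what sarkar_embedding actually chooses): set the working precision once via mp.dps, and boundary-adjacent distances that underflow in float64 become computable.

```python
from mpmath import mp, mpf, atanh

mp.dps = 50                     # 50 decimal digits of working precision
r = mpf(1) - mpf(10)**-30       # radius 1 - 1e-30: unrepresentable in float64
d = 2 * atanh(r)                # hyperbolic distance from the origin
print(d)                        # ~69.8, computed without underflow
```

The catch noted above: every scalar is a Python object, so there is no vectorization, and tensors must be converted element by element at the numpy/tensorflow boundary.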
Avoiding the Problem
One common approach to avoid precision errors, especially in hyperbolic SGD, is to map from the (Euclidean) tangent space and do all operations there instead. We should definitely experiment with and support this method in Hyperlib. This will work for all models via the exponential map. However, it only solves part of the problem.
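A sketch of the tangent-space trick for the Poincare ball with curvature -1 (a standalone illustration, not Hyperlib's API): keep the trainable parameter as an unconstrained Euclidean vector v and only map it into the ball with the exponential map at the origin, exp_0(v) = tanh(|v|) * v / |v|.

```python
import numpy as np

def exp0(v, eps=1e-15):
    """Exponential map at the origin of the Poincare ball (curvature -1)."""
    n = np.linalg.norm(v)
    if n < eps:
        return np.zeros_like(v)          # exp_0(0) = 0
    return np.tanh(n) * v / n

v = np.array([3.0, 4.0])                 # unconstrained tangent vector, |v| = 5
x = exp0(v)
print(np.linalg.norm(x))                 # tanh(5) ~ 0.9999 -- strictly inside the ball
```

Gradient updates happen on v in flat space, so no projection back onto the ball is ever needed; the remaining limitation, as noted above, is that tanh saturates, so points far from the origin still collide in floating point.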
Multi-Component Float
Multi-Component Floats (MCF) are an alternative, vectorizable representation of floats, proposed by Yu and De Sa as a way to do calculations in the upper half-space model. IMO this is the most promising approach if it can be extended to other models of hyperbolic space.
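The core building block behind multi-component representations is an error-free transformation such as Knuth's two-sum: a value is stored as an unevaluated sum of several floats, so a two-component ("double-double") number carries roughly 2x53 bits of significand using only ordinary float64 ops, which is why it vectorizes. A minimal sketch (generic double-double machinery, not the Yu and De Sa construction itself):

```python
def two_sum(a: float, b: float):
    """Knuth's error-free two-sum: returns (s, e) with s = fl(a + b)
    and a + b == s + e exactly in float64 arithmetic."""
    s = a + b
    v = s - a
    e = (a - (s - v)) + (b - v)
    return s, e

# 1e-17 vanishes in a plain float64 addition to 1.0...
print(1.0 + 1e-17 == 1.0)       # True
# ...but two_sum captures it exactly in the error component:
s, e = two_sum(1.0, 1e-17)
print(s, e)                      # 1.0 1e-17 -- no information lost
```

Since every operation is a fixed sequence of float64 adds and multiplies, the whole scheme maps cleanly onto GPU tensor ops, which is what makes MCF attractive for Hyperlib compared to mpmath.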
Todos
- [ ] Spike: implement MCF for the upper half-space model