Neurips24 #1970

lenhoanglnh · 2024-05-15T19:10:04Z

Description

Initially the goal was mostly data analysis and experiments for NeurIPS 24 submissions.
However, this has spurred two changes to Solidago source files:

Added import of other csv files of Tournesol public dataset.
Modified qr_quantile, which was previously flawed. More precisely, qr_quantile had the undesirable property that the addition of a new user with (near) infinite uncertainty decreased the qr_quantile estimate. With the new version of qr_quantile, infinite uncertainty is equivalent to zero voting right.

Checklist

I described my changes and my decisions in the PR description
I read the development guidelines of the CONTRIBUTING.md
The tests pass and have been updated if relevant
The code quality check pass

❤️ Thank you for your contribution!

Fixed tiny_tournesol.zip file for testing. Added data_analysis for dataset submission. WIP Runtime error on icml24 experiments to be fixed

…than additional term. This implies that the addition of a new user with huge uncertainties will not affect the quantile much.

amatissart · 2024-05-16T13:25:34Z

solidago/src/solidago/primitives.py

+    if quantile == 0.5:
+        return regularization + forces.sum()
+
+    left_strength = min(1.0, quantile / (1-quantile))
+    right_strength = min(1.0, (1-quantile) / quantile)
+
+    forces = np.where(
+        forces < 0,
+        forces * left_strength,
+        forces * right_strength,
+    )

-    return regularization + quantile_term + forces.sum()
+    return regularization + forces.sum()


@lenhoanglnh This change seems to change significantly the behaviour of the "zero shift" on current Tournesol data. Is it expected? Should we adjust the quantile parameter?

On "main", after applying the shift with score_shift_quantile = 0.15, about 13% of the individual scores are negative. On this branch "neurips24", that would be 37%.

As a consequence the distribution of Tournesol would be modified, with fewer videos reaching the recommendability threshold (1238 instead of 3013).

(I used the "legacy2023" pipeline, currently deployed on production. But I expect it would similar with the new pipeline).

This is unsatisfactory indeed.
I'm a bit disturbed. It feels like the quantile is now poorly estimated.
Maybe this is because videos with lower scores have higher uncertainty? Or less trust?

OK I looked at the data and indeed, the uncertainties for bad videos are smaller than for good videos, which explains why the quantile increased with the new quantile definition. I see two simple fixes:

Reduce score_shift_quantile = 0.15 to score_shift_quantile = 0.05.

Remove uncertainties in quantile estimation.

The former is much more satisfactory.

… of neg. log likelihood by 1 (#1973) --------- Co-authored-by: Louis Faucon <lpfaucon@gmail.com>

…ournesol into solidago-pipeline-docs-1

[solidago] Update docstrings and add simple API for `Pipeline`

…nge_threshold in gbt args

…_score different from 0

…tent with existing tournesol tests

…pilations

lenhoanglnh added 2 commits May 12, 2024 17:25

Added import for vouchers and scores in pipline/inputs

72af6d9

Fixed tiny_tournesol.zip file for testing. Added data_analysis for dataset submission. WIP Runtime error on icml24 experiments to be fixed

Important change: Modified qr_quantile using asymmetric Huber rather …

df57fbf

…than additional term. This implies that the addition of a new user with huge uncertainties will not affect the quantile much.

amatissart reviewed May 16, 2024

View reviewed changes

amatissart added 4 commits May 16, 2024 15:50

cleanup docstrings in Solidago (wip)

9f0ddb4

implement 'get_pipeline_kwargs' in TournesolInput

fd1fb49

fix experiments script

049d72e

read vouches in TournesolInput

dde4c9f

This was referenced May 16, 2024

[solidago] Update docstrings and add simple API for Pipeline #1971

Merged

[solidago] gbt: estimate asymmetrical uncertainties based on increase of loss by 1 #1973

Merged

GresilleSiffle added the Solidago Tournesol algorithms library label May 23, 2024

amatissart and others added 20 commits June 1, 2024 18:41

[solidago] gbt: estimate asymmetrical uncertainties based on increase…

82e9c4f

… of neg. log likelihood by 1 (#1973) --------- Co-authored-by: Louis Faucon <lpfaucon@gmail.com>

cleanup docstrings in Solidago (wip)

c58e424

implement 'get_pipeline_kwargs' in TournesolInput

5e6d598

fix experiments script

051f088

read vouches in TournesolInput

3483609

Fixed experiments calls to Tournesol inputs API

498f4a3

Merge branch 'solidago-pipeline-docs-1' of github.com:tournesol-app/t…

fde2a83

…ournesol into solidago-pipeline-docs-1

fix docstring

afc32d4

Merge pull request #1971 from tournesol-app/solidago-pipeline-docs-1

0032c86

[solidago] Update docstrings and add simple API for `Pipeline`

fix numerical issues in gbt implementations

a2dfbaa

normalize weight per user in Standardize

39c5652

normalize weight per user in QuantileZeroShift

fdd40f3

Merge remote-tracking branch 'origin/main' into neurips24

23b6da3

solidago: fix numerical instability in gbt

3b911d8

try to stabilize lbfgs

ef9819c

fix wrong usage of 'med' in qr_uncertainty, expose high_likelihood_ra…

de2434e

…nge_threshold in gbt args

add QuantileShift (in addition to QuantileZeroShift) to define target…

7aca462

…_score different from 0

lbfgs: raise error when max_iter is reached

b75f802

cleanup pairs.py

615225e

update ml_train to call new pipeline, tweaks in solidago to be consis…

ab7b578

…tent with existing tournesol tests

amatissart added 4 commits October 3, 2024 23:47

fix test_mehestan in solidago, standardize typing to reduce numba com…

cf5c7ed

…pilations

fix mehestan after refactoring

3197563

update test about scalings

8f1ff44

fix lbfgs initialization when past scores are available

5da2c22

amatissart approved these changes Oct 24, 2024

View reviewed changes

amatissart merged commit e5c96e5 into main Oct 24, 2024
10 checks passed

amatissart deleted the neurips24 branch October 24, 2024 09:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Neurips24 #1970

Neurips24 #1970

lenhoanglnh commented May 15, 2024

amatissart May 16, 2024

lenhoanglnh May 16, 2024

lenhoanglnh May 17, 2024 •

edited

Loading

Neurips24 #1970

Neurips24 #1970

Conversation

lenhoanglnh commented May 15, 2024

Description

Checklist

amatissart May 16, 2024

Choose a reason for hiding this comment

lenhoanglnh May 16, 2024

Choose a reason for hiding this comment

lenhoanglnh May 17, 2024 • edited Loading

Choose a reason for hiding this comment

lenhoanglnh May 17, 2024 •

edited

Loading