-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AllOverlaps for Excited States #302
base: develop
Are you sure you want to change the base?
Conversation
Pull Request Test Coverage Report for Build 12899684348Details
💛 - Coveralls |
Benchmark resultJudge resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsA ratio greater than
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfoTarget
Baseline
Target resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsBelow is a table of this job's results, obtained by running the benchmarks.
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfo
Baseline resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsBelow is a table of this job's results, obtained by running the benchmarks.
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfo
Runtime information
Architecture: x86_64
Benchmark resultJudge resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsA ratio greater than
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfoTarget
Baseline
Target resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsBelow is a table of this job's results, obtained by running the benchmarks.
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfo
Baseline resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsBelow is a table of this job's results, obtained by running the benchmarks.
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfo
Runtime information
Architecture: x86_64
Benchmark resultJudge resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsA ratio greater than
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfoTarget
Baseline
Target resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsBelow is a table of this job's results, obtained by running the benchmarks.
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfo
Baseline resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsBelow is a table of this job's results, obtained by running the benchmarks.
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfo
Runtime information
Architecture: x86_64
Benchmark resultJudge resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsA ratio greater than
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfoTarget
Baseline
Target resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsBelow is a table of this job's results, obtained by running the benchmarks.
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfo
Baseline resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsBelow is a table of this job's results, obtained by running the benchmarks.
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfo
Runtime information
Architecture: x86_64
Benchmark resultJudge resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsA ratio greater than
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfoTarget
Baseline
Target resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsBelow is a table of this job's results, obtained by running the benchmarks.
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfo
Baseline resultBenchmark Report for /home/runner/work/Rimu.jl/Rimu.jlJob Properties
ResultsBelow is a table of this job's results, obtained by running the benchmarks.
Benchmark Group ListHere's a list of all the benchmark groups executed by this job:
Julia versioninfo
Runtime information
Architecture: x86_64
|
src/StatsTools/reweighting.jl
Outdated
num_spectral = length(Set( | ||
[name[1:findlast('_',name)] for name in names(df) if startswith(name,"norm")] | ||
)) | ||
num_reps = div(length(filter(startswith("norm"), names(df))),num_spectral) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm this feels a bit clunky and probably isn't backwards-compatible. Maybe we should store num_spectral
and num_replicas
in the data frame and use that. If the metadata is not available, we could resort to assuming it's a data frame from and old version of Rimu and use the old version of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the fields "num_replicas"
and "num_spectral_states"
in the report metadata!
append!( | ||
names, | ||
[name[1:findlast('_',name)-1]*"_s$j"* | ||
name[findlast('_',name):end] for name in names_j] | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
append!( | |
names, | |
[name[1:findlast('_',name)-1]*"_s$j"* | |
name[findlast('_',name):end] for name in names_j] | |
) | |
append!( | |
names, | |
name[1:findlast('_',name)-1]*"_s$j"* | |
name[findlast('_',name):end] for name in names_j | |
) |
I think it should work like this (or maybe you need to put the for-loop expression in regular parentheses), so it doesn't need to allocate a bunch of temporary vectors. Also, indexing into strings is generally not recommended, but I don't immediately see a cleaner solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all_overlaps
now generates the full names in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a quick comment so far.
For multiple spectral states, the column names are of the form `c{i}_dot_s{l}_c{j}` for | ||
vector-vector overlaps, and `c{i}_Op{k}_s{l}_c{j}` for operator overlaps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the best naming scheme is here. This feels a bit unintuitive to me. I guess l
would be the spectral state?
How about c{i}_dot_c{j}:{s}
, where s
numbers the spectral state? Similarly c{i}_Op{k}_c{j}:{s}
. Or is there a good reason not to use a colon (:
) in a field name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so the colon is not a good idea. My current thinking is that the best format would be
r{i}s{m}_dot_r{j}s{m}
for the overlaps of the m
th spectral state of replicas i
and j
, and correspondingly for the operators r{i}s{m}_Op1_r{j}s{m}
.
Overlaps mixing different spectral states would not be generated by default but could be enabled by a keyword argument, e.g. mixed_spectral_overlaps=true
, in which case you would get column names like r{i}s{m}_dot_r{j}s{n}
.
To make this consistent, we should also rename the single-replica column names, e.g. shift_r{i}s{m}
. The only special case I would allow is when there is only a single replica and single spectral state, in which case it'll just be shift
.
Now this is a fairly big change, as some StatsTools
code will have to be changed. The functions that read the DataFrame
should though determine the number of replicas and spectral states from the metadata. Which states they will act on could be specified by a command-line argument (or by passing explicitly the names of the DataFrame
columns). It will also break some of the older scripts used for analysis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now implemented, with all_overlaps
now taking a matrix of vectors and calculating all replica overlaps for each spectral state, as well as all replica overlaps for different spectral states if mixed_spectral_overlaps=true
(in this case the computation can be very slow).
For the StatsTools
, most of them allow explicitly passing the column names. For the rayleigh_replica_estimator
, replica_fidelity
, and variational_energy_estimator
, I added a keyword argument for which spectral state to use. Let me know if anything should be added or done differently.
Just a quick comment here: This looks good. I see that there are conflicts with It may also be good to add some more tests to bump the coverage to >90% 😄 |
…om/joachimbrand/Rimu.jl into feature/AllOverlaps-excited-states
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments, but looks great otherwise!
throw(ArgumentError( | ||
"No replicas found. Use keyword \ | ||
`replica_strategy=AllOverlaps(n)` with n≥2 in `ProjectorMonteCarloProblem` to set up replicas!" | ||
)) | ||
end | ||
@assert num_replicas ≥ 2 "At least two replicas are needed, found $num_replicas" | ||
|
||
num_overlaps = length(filter(startswith(r"c._dot"), names(df))) | ||
num_overlaps = length(filter(startswith(Regex("r[0-9]+s$(spectral_state)_dot_r[0-9]+s$(spectral_state)")), names(df))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does Regex(r"...")
work? Shouldn't it be just
num_overlaps = length(filter(startswith(Regex("r[0-9]+s$(spectral_state)_dot_r[0-9]+s$(spectral_state)")), names(df))) | |
num_overlaps = length(filter(startswith("r[0-9]+s$(spectral_state)_dot_r[0-9]+s$(spectral_state)"), names(df))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it has to be done with Regex("...")
when you have interpolation (the "r"
is just part of the string).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Matija probably meant this way of writing it:
num_overlaps = length(filter(startswith(Regex("r[0-9]+s$(spectral_state)_dot_r[0-9]+s$(spectral_state)")), names(df))) | |
num_overlaps = length(filter(startswith(r"r[0-9]+s$(spectral_state)_dot_r[0-9]+s$(spectral_state)"), names(df))) |
I'm not sure whether this would make it more readable (if it works, I haven't tried it), so happy to leave it as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think I'd prefer to store the number of overlaps in the DataFrame and retrieve it with a function num_overlaps()
similar to num_replicas()
and num_spectral_states
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I misread that! I think it needs to be done this way if you want to do string interpolation, so it's all good!
Co-authored-by: mtsch <matijacufar@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks very good now.
Please see my optional comments below.
throw(ArgumentError( | ||
"No replicas found. Use keyword \ | ||
`replica_strategy=AllOverlaps(n)` with n≥2 in `ProjectorMonteCarloProblem` to set up replicas!" | ||
)) | ||
end | ||
@assert num_replicas ≥ 2 "At least two replicas are needed, found $num_replicas" | ||
|
||
num_overlaps = length(filter(startswith(r"c._dot"), names(df))) | ||
num_overlaps = length(filter(startswith(Regex("r[0-9]+s$(spectral_state)_dot_r[0-9]+s$(spectral_state)")), names(df))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Matija probably meant this way of writing it:
num_overlaps = length(filter(startswith(Regex("r[0-9]+s$(spectral_state)_dot_r[0-9]+s$(spectral_state)")), names(df))) | |
num_overlaps = length(filter(startswith(r"r[0-9]+s$(spectral_state)_dot_r[0-9]+s$(spectral_state)"), names(df))) |
I'm not sure whether this would make it more readable (if it works, I haven't tried it), so happy to leave it as it is.
num_replicas = parse(Int, metadata(df, "num_replicas")) | ||
if num_replicas == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should instead have a method for num_replicas(::DataFrame)
, e.g. defined in pmc_simulation.jl
that could be called here.
The method could fall back to the old definition (filtering names(df)
) for older DataFrames but throw an informative error if the DataFrame was not created by Rimu.jl? We will need that function more often.
throw(ArgumentError( | ||
"No replicas found. Use keyword \ | ||
`replica_strategy=AllOverlaps(n)` with n≥2 in `ProjectorMonteCarloProblem` to set up replicas!" | ||
)) | ||
end | ||
@assert num_replicas ≥ 2 "At least two replicas are needed, found $num_replicas" | ||
|
||
num_overlaps = length(filter(startswith(r"c._dot"), names(df))) | ||
num_overlaps = length(filter(startswith(Regex("r[0-9]+s$(spectral_state)_dot_r[0-9]+s$(spectral_state)")), names(df))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think I'd prefer to store the number of overlaps in the DataFrame and retrieve it with a function num_overlaps()
similar to num_replicas()
and num_spectral_states
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
AllOverlaps
implemented for multiple spectral states.