-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor init() to smaller functions #34
Conversation
…xing accidental comment associated w/this file
include/topmodel_backup.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove from this PR, the git history will contain the original file if you are concerned about loosing the previous implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. Will focus on removing extraneous changes/files from PR, better documenting within the code, and giving special attention to assumption around memory allocation. If this is not already done somewhere in the original code I may need to touch base with you on that sometime next week to make sure I'm going about it properly. Thanks for the feedback!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove from the PR, the original is in the version history
src/WorkingFiles/topmodel_backup.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove from PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like maybe you ended up staging some working files you didn't intend to. I see now that a couple of these exist in src/WorkingFiles
. You can always add to your local .gitignore
to prevent spurious working directories from getting staged.
src/topmodel.c
Outdated
* **********************************/ | ||
|
||
|
||
// FUNCTION TO CONVERT DISTANCE/AREA FORM TO TIME DELAY HISTOGRAM ORDINATES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely reccomend documenting function paramers inline. using doxygen style comments can also help auto generate human readable docs in the future. Consider a comment like this
/**
* Function to convert Distance/Area from T0 time delay historgram ordinates.
*
* <additional details, also known as long description, here e.g.>
* This function converts paramters to m/time step DT with exception of XK0
* which must stay in m/h.
* Q0 is already in m/time step
* T0 is input as Ln(To)
*
* @param[in] dist_from_outlet pointer to array of length num_channels containing WHAT
* @param[in] num_channels the number of channels, corresponds to the length of dist_from_outlet and tch arrays
* @param[in] chv <is what???>
* @param[in] rv <is what???>
* @param[in] dt delta time, or time step, used to convert chv and rv
* @param[out] tch <what is tch, generated by this function>
*/
I'll also note that I don't see an XK0
or T0
parameter anywhere in this function, so perhaps that comment needs to be dropped???
|
||
for(ir=1;ir<=(*num_time_delay_histo_ords);ir++) | ||
{ | ||
time=(double)(*num_delay)+(double)ir; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, this cast is suspicious, why does time need to be a double? both num_delay and ir are ints, so should time simply be an int as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also from original code.
topmodel/refs/original_code_c/tmod9502.c
Line 733 in 9452b7b
time=(double)(*num_delay)+(double)ir; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hellkite500, can you make heads or tails of this casting? The old Fortran code does it, too. Maybe a legacy of it being over 40 years old?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing I can think of is that time
was expected to be large, and a double type was used to avoid overflowing an integer with large enough time
values? For signed (positive/negative) ints, an 8 bit int can hold numbers up to 127, a 16 bit int 32,767, 32-bit int 2,147,483,647, and a 64-bit int 9,223,372,036,854,775,807.
It is possible that original code was developed on something with only 16 bit integers and double was used to increase the largest possible time representation.
If we really want to ensure we can hold large representations for time, we can change the type of time to int64_t
and use
#include <stdint.h>
to have a portable, compliant 64 bit int to put that into...
We can also use unsigned types if we know the values are not ever going to be negative.
The other point I'll make here is that even if time
does need to be a floating point, you don't have to cast the int, you can safely add ints and store them in a double, the compiler will handle that just fine.
double time, a1, sumar, a2; | ||
int j, ir; | ||
|
||
if((*time_delay_histogram)==NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this function's responsibility is to calc the histogram, it probably should also be responsible for initializing memory. At the same time, its probably a good practice for any of these functions that accept pointers as INPUT to check the pointer is valid (e.g. not NULL) and simply throw an error if it is.
assert( time_delay_historgram != NULL )
assert( *time_delay_histogram != NULL )
assert( num_time_delay_histo_ords != NULL )
assert( num_delay != NULL )
assert( cum_dist_area_with_dist != NULL )
src/topmodel.c
Outdated
// updates values associated with model structure | ||
// outputs used in TOPMOD() include num_time_delay_histo_ords, time_delay_histogram, | ||
// num_delay, and Q | ||
extern void calc_time_delay_histogram(int max_time_delay_ordinates, int num_channels, double area, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A common C style guide/implementation is to put all input params (things that are NOT modified by the function) first in the arg list, and all out or in/out params (things that will be modified by the function) last. I strongly recommend this!
Also its imperative in C codes to document assumptions on memory already being allocated vs the memory to be allocated and returned by the function. As noted below, this function could NOT check and allocate the time_delay_histogram
, which is valid, but it should be documented in the doc string (see above comment on doxygen style doc strings) either way. Keeping the allocation his is OK, as long as the caller knows it was allocated in this function and can potentially deallocate it later.
src/topmodel.c
Outdated
// updates values associated with model structure | ||
// outputs used in TOPMOD() include sbar, bal, deficit_root_zone | ||
|
||
extern void init_water_balance(double **deficit_local, int max_atb_increments, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another good example of ordering params and documenting behavior. Most "init" functions are safe to assume are doing some allocation/initialization, but being clear about WHICH variables are being allocated is important
) | ||
{ | ||
// declare local variables | ||
double sum; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can declare/init sum in one line for berevity
double sum = 0.0;
Left some initial review comments, and mashed the button before I could elaborate on my requested changes. Definitely need to address some of the array declaration/passing with static sizes (11) and remove the unnecessary files. The rest of the comments are stylistic and documentation related, but would likely be worth addressing to set the tone for future refactoring and/or contributions/maintainers -- they will thank us later! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ben-Choat, please focus on making the changes suggested by @hellkite500, particularly the removal of all temporary files (as he noted, GitHub does the version tracking—no need for WorkingFiles
or backup
) and the addition of comments reflecting the newly implemented order of parameters.
You can also suggest @ajkhattak and @madMatchstick as reviewers while I'm out. Thanks!
// FUNCTION TO CONVERT DISTANCE/AREA FORM TO TIME DELAY HISTOGRAM ORDINATES | ||
// outputs used in subsequent functions include tch | ||
extern void convert_dist_to_histords(double *dist_from_outlet, int num_channels, | ||
double chv, double rv, double dt, double tch[11]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hellkite500, my reading of the original code is that TCH(10)
is a hard-coded limit corresponding to the number of channels, a parameter value (num_channels
) read in from subcat.dat
.
This does not seem like the correct way of implementing things because there is no guardrail, as far as I know, preventing num_channels
from being > 10. How would you recommend handling this?
Also, as a C dilettante, why would it be TCH(10)
in Fortran and tch[11]
in C (the 10 vs. 11 part, not the rest of the syntax)?
|
||
for(ir=1;ir<=(*num_time_delay_histo_ords);ir++) | ||
{ | ||
time=(double)(*num_delay)+(double)ir; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hellkite500, can you make heads or tails of this casting? The old Fortran code does it, too. Maybe a legacy of it being over 40 years old?
@Ben-Choat - Looks good overall atm gh actions/workflows check standalone and unit test scripts, all of which passed.
-thx! |
Has been addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updates run in ngen
coupled with PET, as outlined in wiki. Output seems reasonable.
Refactored init() within topmodel.c to four smaller functions. Ultimately, this was done to allow calibratable parameters to update secondary variables withib bmi_topmodel.c. For now, only topmodel.c has been edited.
Additions
double chv, double rv, double dt, double tch[11]))
double tch[11], int *num_time_delay_histo_ords, int *num_delay,
double **time_delay_histogram, double *cum_dist_area_with_dist)
int num_topodex_values, double dt, double *sr0,
double *szm, double *Q0, double *t0, double tl,
double **stor_unsat_zone, double *szq,
double **deficit_root_zone, double *sbar,
double *bal)
int *num_time_delay_histo_ords, double **time_delay_histogram)
Removals
Changes
Testing
Screenshots
Notes
Todos
Checklist
Target Environment support
Accessibility
Other