Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AtomGroup.resids should return an array of equal size to AtomGroup #385

Closed
dotsdl opened this issue Aug 6, 2015 · 9 comments
Closed

AtomGroup.resids should return an array of equal size to AtomGroup #385

dotsdl opened this issue Aug 6, 2015 · 9 comments

Comments

@dotsdl
Copy link
Member

dotsdl commented Aug 6, 2015

AtomGroup.resids gives an array of resids that is equal in size to AtomGroup.residues, but it would be more consistent to make AtomGroup.resids yield an array of resids equal in size to AtomGroup. Although the current behavior is useful, there are instances where one actually wants an array of the resids associated with each atom in the AtomGroup, but currently one has to build this array with, e.g. a list comprehension like:

resids = np.array([atom.resid for atom in atomgroup])

The current behavior can also be achieved with AtomGroup.residues.resids, so changing this won't eliminate functionality.

This issue was originally raised in #372.

Edit: we would also want to change AtomGroup.segids to also return an array of equal size to AtomGroup.

@dotsdl dotsdl added this to the 0.11 milestone Aug 6, 2015
@richardjgowers
Copy link
Member

I like this. We've got ResidueGroup for working at the Residue level.. eg:

ag.masses += 1.0  # not sure why you would, but it works

ag.resids += 1  # nope!
ag.resids = [atom.resid + 1 for atom in ag]  # fugly!

# the resids of the ATOMS
ag.resids

# the resids of the RESIDUES
ag.residues.resids

@dotsdl
Copy link
Member Author

dotsdl commented Aug 6, 2015

Thanks for the example of where this would directly benefit @richardjgowers! I know I've had to do backflips around this behavior before, but I can't remember the specific context of what I was trying to do.

@orbeckst
Copy link
Member

orbeckst commented Aug 6, 2015

You have my blessing :-).

... just make sure that the tests and docs are up-to-date and that this is noted for #377 and #381.

@richardjgowers
Copy link
Member

Thanks for taking this on @dotsdl

With the tests for AtomGroup properties, I've added a new set right at the end of test_atomgroup which uses a test generator to apply the same test to all properties (test for identical behaviour by applying identical tests). This currently doesn't do resids because of this issue, but once it does if you could add it to those tests it would be good.

@dotsdl
Copy link
Member Author

dotsdl commented Aug 30, 2015

Almost done with this. It's been a pretty major effort, especially since AtomGroups and ResidueGroups had some pretty bad stale cache problems that I've made some fixes to mitigate (which actually fix the known_errors in the AtomGroup tests (#202)). There remain some cacheing issues that I don't think can be fixed with the current object model of AtomGroups, ResidueGroups, and SegmentGroups, but hopefully can finally be resolved when we move to the structured-array approach in the next release cycle.

Anyway, a new question: ResidueGroup and SegmentGroup are subclasses of AtomGroup, and for properties including resids, resnums, and segids they each yield arrays equal in length to themselves e.g.

ag = u.atoms
rg = ag.residues

len(ag.resids) == len(ag)
len(rg.resids) == len(rg)

For properties that refer "downward" in the hierarchy, an array of size equal to the appropriate level is returned e.g.

ag = u.atoms
rg = ag.residues
sg = ag.segments

len(rg.indices) == len(rg.atoms) 
len(sg.resids) == len(sg.residues)

I think this adds some needed consistency. However, there are other properties, such as rg.charges or rg.masses that currently return arrays of len(rg.atoms) but could make sense and be useful as len(rg), where the values for the atoms in each residue are summed.

Thoughts?

richardjgowers added a commit that referenced this issue Aug 31, 2015
@orbeckst
Copy link
Member

orbeckst commented Sep 1, 2015

Re: automatic accumulative properties/methods: Your suggestion to automatically aggregate over extensive quantities sounds elegant. There is, however, a danger for this becoming a bit confusing because you won't really know which properties will behave in this way and then we are loosing consistency again (after all, you lobbied hard for Residues.resids to return a resid for each atom!).

In this instance I'm inclined to sacrifice elegance (that would be mainly appreciated by the people knowing the code very well) in favor of explicit naming, e.g. total_charges or residue_charges or have a general method ResidueGroup.byresidue('charge') or ResidueGroup.byresidue('mass') and Segment.byresidue('charge') or SegmentGroup.bysegment('charge') (probably with a better naming convention...).

@richardjgowers
Copy link
Member

Ah, I missed the extra bit about summing residue properties, but it is something I have thought of before. It gets more interesting/confusing (depending on your taste) if you think about positions too. Should residue.position be the center of mass? Then should be residuegroup.positions return an array of the center of mass for each residue?

This would be useful for when you want to select residues based on position, I think the current best method for that is something like:
rg = u.select_atoms('prop x etc etc').residues
So you're selecting atoms then propagating upwards.

Maybe you should be able to do:
rg = u.select_residues('prop x etc etc')

If @dotsdl 's idea isn't practical, then I'd like to remove residuegroup.charges altogether (so people have to explicitly go via (residuegroup.atoms.charges). It would be very simple for every residuegroup input and output to be len(rg). So a residuegroup is useful for querying and working at the residue level only.

@orbeckst
Copy link
Member

orbeckst commented Sep 4, 2015

Can we please open a new issue for the discussion of what ResidueGroup and SegmentGroup should do with per-atom properties?

@dotsdl
Copy link
Member Author

dotsdl commented Sep 4, 2015

Done: #411

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants