CHANGES BETWEEN 5.1.1 and 5.1.2 Official Releases
Bugs Fixed:
- Supramap issue where dates in Google Earth are not normalized. (Thanks, Dan Janies)
- Added 'nucleotides' for prealigned read-option (nucleotide was always present).
- Removed arguments for clear_memory() command, they were unnecessary.
- Fixed issues with OCaml 4.01.0+ where format strings were not being converted
properly in our logging module.
- Fixed issues with clang compilation
- Removed function deprecations for newer versions of OCaml
- Issue with trailing/leading spaces in CSV files for supramap (Thanks Dan Janies!)
- Added support for OSX Yosemite (veclib -> Accelerate framework)
- Fixed bug in sankoff distance calculation. The bug affects branch lengths and a
heuristic on the cost of the tree, and not the actual cost.
Command Changes
- removed transform(prealigned) (confusion with transform(ia)).
- removed transform td, ti, trailing_insertion, trailing_deletion,
auto_sequence_partition, auto_static_approx commands. These are internal to the system
during the search. These commands had modified the ordering of results which is not
appropriate for reporting the data.
CHANGES BETWEEN 5.1.0 and 5.1.1 Official Releases
Bugs Fixed:
- Command swap(parallel) had issues with convergence and filtering trees.
- Issues with gather after parallel search procedure in certain situations.
Known Issues:
- We have experienced segmentation faults on POY compiled on OSX 10.9
Mavericks with clang/LLVM3.3
CHANGES BETWEEN 5.0.0 and 5.1.0 Official Releases
New Features:
- Script Analyzer has some resolutions for parallel analysis that should
result in better performance of parallel execution.
- swap(parallel) command, see below.
- OPAM distribution via our github repo and basic_builder.sh. Please see our
opam-amnh repo (http://github.com/amnh/opam-amnh)
- Added initial level transformation of Aminoacid characters (during read);
previously we reported this option was being ignored.
Bugs Fixed:
- Reading morphology under Nexus had errors in processing SYMBOLS tag.
- Reading Nexus files with single quoted taxa-names failed.
- Parsing issue for trees with branch lengths in scientific notation and a
plus sign in their exponent, 1.0e+09. The '+' symbol is now accepted. (Thanks Cyrille)
- Under bremer (w/ negative constraints), nodes with excluded partitions were
not properly scored (as infinity); this resulted in trees with lower score
than expected during the bremer search.
- Correctly implemented the prealigned amino-acid characters to accept
resampling support techniques (jackknife and bootstrap).
- Minor issues with basic_builder.sh with install and ./configure with parmap
- Certain level arguments were ignored since there are multiple ways to assign
level arguments, through the cost matrix assignment, or as an additional
argument (aminoacids and custom alphabet characters only).
- Fixed inlining C functions to support CLANG / OSX Mavericks.
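The branch-length parsing fix listed above (accepting a plus sign in the exponent, as in 1.0e+09) can be illustrated with a small sketch. This is not POY's parser; the pattern and the name parse_branch_length are hypothetical and only show the number format that is described as accepted.

```python
import re

# Hypothetical pattern: optional sign, digits with an optional fraction,
# and an optional exponent whose sign may be '+' or '-'.
BRANCH_LENGTH = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def parse_branch_length(text):
    """Return the float encoded by text, or raise ValueError if it is malformed."""
    if BRANCH_LENGTH.fullmatch(text) is None:
        raise ValueError(f"not a branch length: {text!r}")
    return float(text)
```

A parser that only allowed '-' or no sign after the 'e' would reject 1.0e+09; the `[+-]?` in the exponent is the part that corresponds to the fix.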
New Commands (see manual for full explanations):
- Added documentation for commands in swap/build at_random, first, last. These
control how trees of the same cost are selected in the search/build. The
default is last, that is, the tree in memory is always replaced by the
newest found during a search/swap/build/fuse.
- swap(parallel) will swap in parallel a particular tree. This will allow
large trees that require a lot of computation (dynamic likelihood) to be
parallelized when a small number of trees are being analyzed. For example,
...
build(1)
transform(likelihood:(gtr,mpl,gap:coupled))
swap(parallel,all)
...
This will allow searching on the selected tree in parallel when the number
of nodes in the MPI run is > 1. The 'all' option is recommended so that each
node is fully utilized.
Known Issues :
- We have experienced segmentation faults on POY compiled on OSX 10.9
Mavericks with clang/LLVM3.3
CHANGES BETWEEN 5.0 beta2 and 5.0 Official Release
New Features:
- Status messages for progress when using Model Selection and Fixed States
transformation commands.
- report(diagnosis) produces columns of only relevant information.
- Improved error messages.
- Updated Documentation.
- Updated Test Suite and Tutorials.
- New option for optimization level of dynamic likelihood (see below).
Bugs Fixed:
- Script analysis would drop graphsupports command in finalized script.
- Due to a number of issues when loading trees with taxa missing from the
loaded data, we do not allow these trees to be loaded. The application
will report the missing trees and skip loading the information. This can
be used to generate a select command to filter terminals, like
select(terminals, not files:("missing_terminals_file"))
- General NonAdditive Characters caused an error under Exhaustive DO. The
appropriate functions were not being called from previous testing.
- Logic backwards in processing the level and orientation arguments for
reading BreakInv and chromosome characters.
- Loading data with unequal number of fragments (separated by #) did not
cause an error, instead filled sections with "missing" data.
- Issues with synonym files and processing trees (Ron Clouse)
- Build under dynamic likelihood joined two nodes with different models.
- Order of nexus blocks had Assumption block first, not last.
New Commands (see manual for full explanations) :
- set( opt:exhaustive_dyn) will optimize the dynamic likelihood model
directly, instead of via an implied alignment.
CHANGES BETWEEN 5.0 beta1 and 5.0 beta2
New Features:
- report trees with branches would work for likelihood characters, but would
not report parsimony branch lengths. The new command allows printing the
branch lengths from three different methods (see below); these options are
ignored under likelihood as we always report the parameter value that
maximizes the log-likelihood.
- report graphtrees, asciitrees, and trees with collapsible property have
been extended to use the branch lengths mentioned above (see command
description below).
- Custom Alphabet transformation to static character via static approx. This
was an issue with the characters that can be represented as prealigned and
how to properly transform between them.
- Added support for ocaml PARMAP for generating fixed-state cost matrices.
- Added command report(robinson_foulds) (see command description below)
Bugs Fixed:
- Selecting terminals caused an error in ncurses display when writing to an
incorrect window (Louise Crowley).
- Transforming to likelihood under a model selection criterion (aic,aicc,bic)
from a tcm under parsimony with different gap cost than substitution cost
resulted (in failure) with added data that represents the cost of a gap.
These should be filtered from the transformation, as in other likelihood
transformations on those characters.
- Custom-alphabet implied alignment produced an alignment with extra indels.
This was due to an encoding issue that became relevant when mixing custom
alphabet characters and levels.
- Issue with missing data in custom-alphabet prealigned characters resolved.
- generation of cost-matrix with all-elements row/col and non-zero diagonal
cost matrix --was being replaced with zeros, should have been min of row.
- No reported error when reading in sequence data with unequal fragments.
This forced the computation to proceed as if those fragments were
missing. Now (as in POY4) we report an error to the user.
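The unequal-fragments check described in the last item above can be sketched as follows. This is an illustration only, not POY's code: the function name split_fragments and the dictionary input are hypothetical, and the only detail taken from this changelog is that fragments are separated by '#'.

```python
def split_fragments(sequences):
    """Split each taxon's sequence on '#' and require equal fragment counts.

    sequences maps a taxon name to its raw sequence string.
    """
    fragments = {taxon: seq.split("#") for taxon, seq in sequences.items()}
    counts = {len(parts) for parts in fragments.values()}
    if len(counts) > 1:
        # Report an error instead of silently padding with missing data.
        raise ValueError(f"unequal number of fragments across taxa: {sorted(counts)}")
    return fragments
```

For example, `split_fragments({"t1": "ACG#TT", "t2": "ACGTT"})` raises, because one taxon has two fragments and the other has one.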
New Commands (see manual for full explanations) :
- report command for parsimony branch lengths
report(trees:(branches:min))
- minimum number of changes is reported on branches
report(trees:(branches:max))
- maximum number of changes is reported on branches
report(trees:(branches:single))
- number of changes reported based on single assignment of dynamic
characters
- report command for tree distances using Robinson Foulds distance metric
report(robinson_foulds)
- will print matrix to terminal window
report("OUTFILE", robinson_foulds)
- will print matrix to file OUTFILE
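For readers unfamiliar with the Robinson Foulds metric used by report(robinson_foulds), here is a minimal sketch. It assumes rooted trees written as nested tuples of leaf names and counts the clades found in one tree but not the other; this changelog does not say how POY represents trees or whether it uses the rooted or unrooted variant, so all names below are hypothetical.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree on these leaves
    return found


def robinson_foulds(first, second):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(first) ^ clades(second))


def rf_matrix(trees):
    """Pairwise distance matrix, one row and one column per tree."""
    return [[robinson_foulds(a, b) for b in trees] for a in trees]
```

The matrix is symmetric with zeros on the diagonal, which matches the kind of pairwise table the command is described as printing.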
Changed Commands (see manual for full explanations) :
- report command for collapsed branches has been changed. Instead of
collapse:true or collapse:false, we've extended the command like the
branch lengths above to collapse:min, collapse:max, collapse:single. The
branch is collapsed when the length as defined above is equal to 0.0.
report(trees:(collapse:single))
report("tree.pdf", graphtrees:collapse:min)
report( asciitrees:collapse:max )
CHANGES BETWEEN 5.0 alpha3 and 5.0 beta
New Features:
- Consistency in alignment procedures across the application. The trace-back
procedures produce the same result in affine (with gap opening equal to 0)
as normal alignment procedures, as well as alignment procedures with speed
increases (newkkonen), and the space saving algorithm.
- Continuous characters are fully supported above the range of 0-255, now
0 to the maximum size of integers on the machine. Although this change is
slightly slower, characters that do fit in the 255 range are vectorized as
previously implemented.
- build(N,random) does not do a random Wagner build, but generates and
diagnoses a random topology.
- Sankoff and Sequence characters with matrices of non-0 diagonal elements
have been implemented.
- Information theoretic model selection procedures have been designed under
static and dynamic likelihood (see new command below). This command can be
used on multiple character sets and types of data in which case the model
selected for each data-set are combined on the final tree(s). A tree must
be in memory when the command is executed. The command analyzes all the
models possible for each tree and selects the best based on the
information criteria selected.
- Updated build/compile procedures for different environments.
- Implemented No Common Mechanism likelihood model (see new command below).
- Bootstrap Probabilities under likelihood have been implemented. We use the
same command as BP for other characters previously.
- Better memory usage when loading multiple files of the same TCM.
- Speed increases in the diagnosis of normal and affine sequence alignment.
- Changed most (all found/possible) functions to tail-recursion to avoid
stack-overflows, should also increase speeds.
Bugs Fixed:
- Character Selection procedures (through the IDENTIFIERS in the command
structure) have been verified and implement special cases of each-other
when necessary for lower chances of future bugs. (Thanks to Fernando
Marques).
- Bug-fix with partitioned dynamic likelihood characters of multiple models
or in combination with static characters causing optimization failures.
- Error in transform(prealigned) on static characters --command only works
on dynamic characters. These characters should have been ignored.
- Likelihood model optimization routine returning matrix of NAN when branch
lengths were sub-normal; minimum value has been used to avoid this.
- Error in report(seq_stats) when missing data is present.
- Proper usage of missing data in iterative:exact and iterative:approx.
Previously missing data could be assigned in the median nodes of
characters in certain situations, resulting in 0 costs assignments in
subtrees, as well as errors in the median assignment functions. (Thanks to
Denis Jacob Machado)
- elikelihood for static likelihood characters had a bug in counting
'uninformative' data in estimating transition probabilities.
- Single assignment functions for the newkkonen alignment procedure were not
calling the proper median function.
- Static likelihood takes '?' into account correctly under fifth state (gap
as an additional state) models. Previously it was interpreted as a gap,
now it is interpreted as missing, like gaps in four-state models.
- Static likelihood was not taking into account missing data correctly.
- Correction for tree diagnosis in non-0 diagonal tcm matrix.
- selecting unique topologies may choose suboptimal likelihood trees if the
models/branches are different, we now select the lower of the two.
- transform(likelihood(...) -> transform(parsimony) resulted in an error
state of the application and incorrect costs from before the likelihood
transform command. Now the transform can be done to recover the parsimony
costs or vice versa.
- Join caused a failure in fuse, causing failure in diagnosis of tree.
New Commands (see manual for full explanations) :
- Command to set the optimization thoroughness for the likelihood procedures
has been added. The command,
set(opt:coarse)
set(opt:exhaustive)
set(opt:no_opt)
determines the number of passes for the optimization algorithm, and the
convergence factors for the numerical routines.
- Information theoretic model selection for likelihood uses the same
transform command as (e)likelihood, but replaces the model (ie. jc69, gtr)
with an information theoretic criterion --aic, aicc, or bic. ie,
transform(likelihood:(aic,rates:gamma:(4)))
- No Common Mechanism (ncm) has been added as an additional model under
likelihood. This is for static characters only. ie,
transform(likelihood:(ncm))
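The aic, aicc, and bic criteria named above have standard textbook definitions, sketched below. This is not POY's implementation: the function names are hypothetical, and this changelog does not state what POY uses as the sample size n or how it counts the free parameters k.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood, k, n):
    """AIC with the small-sample correction term 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_likelihood


def select_model(scores):
    """Return the model name with the lowest criterion value."""
    return min(scores, key=scores.get)
```

Under any of the three criteria, the candidate with the lowest value is preferred, so `select_model({"jc69": 30.0, "gtr": 24.0})` picks "gtr".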
Changed Commands (see manual for full explanations) :
- reading prealigned characters (outside of nucleotides) has been unified
with the normal command structure. For example,
read( custom_alphabet:("DATAFILE", "MATRIXFILE"; init3D:true) )
now is,
read( prealigned:( custom_alphabet:("DATAFILE"), tcm:("MATRIXFILE")) )
The previous command structure was only briefly implemented in an alpha.
Known Issues :
- Pre-aligned affine data reports an incorrect cost. This option for
analysis has been turned off and an error is reported.
- Diagnosis on level over 5 creates a seg-fault. This is probably a memory
issue as the function would exceed many computers' limits.
- Newkkonen space saving (set(space_saving_alignment)) command has been
de-activated due to segfault.
- 'help' commands generated from latex docs are a mess.
CHANGES BETWEEN 5.0 alpha2 and 5.0 alpha3
Bugs Fixed:
- Continuous characters are fully supported from Hennig86/Nona files. The
previous format that POY reads is the same (integers separated by spaces,
missing data represented as '?', and ranges defined in square brackets
separated by spaces). The data limit for the continuous characters requires
a maximum range of 255. (Thanks to Edmundo Gonzalez)
- Costs displayed on trees were incorrect for partitions/sets of characters.
This is fixed to represent the overall tree cost.
CHANGES BETWEEN 5.0 alpha1 and 5.0 alpha2
New Features:
- orientation set to true by default for breakinversion data-type.
- faster alignment under low-mem settings
Bugs Fixed:
- Fixed Makefile in src directory to perform the install, and removed the
Makefile/configuration in the root directory. These files were synonyms
for the ones in the src directory and added no value to the compilation
process.
- Default for configuring with --enable-mpi is to set interface to flat.
This is a requirement that is oft forgotten and there is no reason why we
cannot facilitate that requirement. Setting interface to anything else
will over-write that choice and report a warning.
- Added error message for input sequences with different numbers of
fragments (fragments are divided by '#'). (bug report by Torsten Dikow).
- Fixed minor errors in the Makefile and configure scripts. Also removed the
Makefile and configure script from the root directory to avoid confusion
and ease maintenance. (bug report by Jan De Laet).
- Report lkmodel (to report likelihood model), was not working properly in
certain situations; without identifiers. (bug report by John Denton).
- Transform likelihood with multiple types (for example a combination of
static and dynamic) would fail in the transform due to alphabet size
issues. We partition the data now between static and dynamic and then
apply the transform to the characters. (bug report by Fernando Marques).
- Used non-affine alignment for affine models under parsimony; this has
been reverted correctly, and also includes affine low-mem ukkonen.
- Compiling supramap and cmxs libraries dynamic linking rule was missing
in our myocamlbuild file. (bug report by Travis Treseder).
- Diagnosing Static and Dynamic Likelihood mixed models caused errors due to
demarcation of sets of data. Resolved so we group by the type of model
being applied to the characters as well as pre-defined sets and data
classes.
- Diagnosis for Dynamic Likelihood characters did not work on leaf nodes.
- Backtrace works the same in POY4 and POY5 for normal alignment; the issue
is in regard to the preference in inserting indels in which sequence.
- Affine alignment bug with aligning two sequences at a point each having
gap polymorphisms. This is a very rare instance.
New Commands:
- set(space_saving_alignment)
- set(normal_alignment)
- commands turn on/off low-memory alignment procedure. Default off.
Changed Commands:
- transform(chromosome:(newkkonen,..)
- transform(genome:(newkkonen,... ))
- this option is specified in the low-memory/space-saving alignment
procedure mentioned above.
CHANGES BETWEEN 4.1.2.1 and 5.0 alpha1
New Features
- Added likelihood criterion for diagnosing trees.
- Added methods of optimization for likelihood during build/swap/fuse
- Support for dynamic and static characters under a variety of models.
- Most Parsimonious and Maximum Average Likelihood cost models.
- Static and Dynamic character support for likelihood, including any
alphabet size (ie, discrete morphological characters, amino acid, ...)
- Added a variety of median solvers for rearrangements.
- Support for Genome and Chromosome characters with annotator Mauve.
- New selection method for polymorphic data in fixed state characters.
- Added level support on all alphabets sizes.
- Changed default TCM to 1,1.
- Updated configure scripts for newer versions of gcc and ocaml.
- Low memory option for alignment of sequences.
- Changed command for transform for dealing with identifiers.
- Require one type of delimiters in data files.
- Choice of equally costly medians can be user specified.
- pre-aligned for custom-alphabet and amino-acid characters.
- Allow additional medians by search-based command for fixed state
characters.
- Internal assignment in diagnosis output of fixed state characters print
taxon name.
- Default for amino-acid to not use 3D alignment in up-pass
- Better support for manipulating and calling character sets in data
- Graphic output for mauve outlining alignment of blocks and rearrangements
Bugs Fixed:
- Memory leaks in grappa interface.
- Building random trees does not use a modified Wagner build.
- Missing data is presented as a '?' in output from implied alignments.
- Avoid rediagnosing trees before certain operations.
- Issues in reading prealigned phylip files.
- Better support for all features of NEXUS files.
- Added POY block to nexus files for our specific needs, including
likelihood, chromosome, genome, and dynamic character information.
- Detecting file types is more accurate.
- Parsing file types has better support.
- Can transform Break Inversion and Custom alphabet with cost matrix file.
- Replaced command dynamic_pam to deal with new datatypes.
- now chromosome, genome, breakinv, etc.
- Priority for backtrace in the alignments standardized between floating
point alignment (dynamic likelihood), affine, and sequence characters.
- Nexus output is fully produced when called in the report command
--includes trees, set information, data, etc.
- Support for scientific notation of floating point numbers in parsers.
- Custom Alphabet and Break Inversion data characters case sensitive.
- Fixed and improved POY help documentation
- Initial cost for downpass in fixed state characters was incorrect (up-pass
and final costs were correct).
- Status messages during branch and bound build (after every 1% complete).
- 3D option set to false is observed after re-diagnosis.
- all element code (X) in amino acid is treated as a polymorphism
- Improved max_time behavior in searching
- Fixed cost issue in rearrangement for annotated characters
Features eliminated:
- dynamic_pam command has been replaced by the commands chromosome, genome,
breakinv, and custom_alphabet.
New Commands:
- transform( likelihood:( ... ) )
- transform( genome:( ... ) )
- transform( chromosome:( ... ) )
- transform( breakinv:( ... ) )
- transform( custom_alphabet:( ... ) )
- transform( parsimony )
- transform( level:INT )
- set( partition:( ... ))
- set( codon_partition:( ... ))
- swap/fuse/build( optimize:(model:(...),branches:(...)) )
- report( trees:(branches) )
- report( lkmodel )
Changed Commands:
- transform( [IDS], (transformations,...) )
- read( custom_alphabet:([datafile],[costmatrix],[prealigned]))
CHANGES BETWEEN 4.1.2 and 4.1.2.1
New Features:
- Added support for NEXUS output in the portal binary.
Bugs Fixed:
- Output of NEXUS files when terminals are filtered,
or dynamic homology characters are not present could
produce errors or a file that POY itself could not
read (Felipe G. Grazziotin).
CHANGES BETWEEN 4.1.1 and 4.1.2
New Features:
- Added build (nj) to build a tree using the neighbor joining
algorithm. The algorithm implementation is deterministic, so
only one tree can be produced for each dataset.
- Added read (prealigned:(....., gap_opening:INT)) to read
prealigned sequences and assign each indel block a gap opening
cost.
- The build process now uses ocamlbuild instead of plain
Makefiles.
- The distributed binaries are the first to support plugins for
POY, but the feature is still experimental.
- Improved the NEXUS file format support, by adding the POY block,
which now includes: GAPOPENING, TCM, and WTSET. All the
characters in POY can now be stored in one NEXUS file that can
be reloaded later. The support for NEXUS files has improved to
allow more transparent interaction with other applications.
- POY now interprets all the ASSUMPTION block commands in NEXUS
files that the program can apply (e.g. EXSET, TYPESET, and
USERTYPE).
- Added report (nexus) and report (trees:(nexus)) to generate
output in nexus format.
- report ("out.ext", data, trees) produce NEXUS or hennig format
depending on ext. If the extension of the filename to which data
and trees are generated is "nexus" or "nex" then the data and
trees are generated in NEXUS format. If the extension is "hen",
"hennig", or "ss", then the format is Hennig86. For example,
report ("out.nexus", data,trees) is equivalent to report
("out.nexus", nexus, trees:(nexus)).
- Replace spaces with underscores in taxon names of NEXUS files.
- Added support to newick files with branch lengths. The branches
are ignored.
- Added support to 1~4 ranges.
- Bremer support values can be output on the branches of a consensus
tree or any user provided input tree. (See the new commands
below.)
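The extension rule for report ("out.ext", data, trees) described in the list above can be sketched as a small lookup. The function name report_format is hypothetical. How POY treats upper-case extensions or extensions outside the two listed groups is not stated here, so this sketch matches the listed spellings exactly and returns None for anything else.

```python
import os

NEXUS_EXTENSIONS = {"nexus", "nex"}
HENNIG_EXTENSIONS = {"hen", "hennig", "ss"}


def report_format(filename):
    """Return 'nexus', 'hennig', or None based on the filename extension."""
    extension = os.path.splitext(filename)[1].lstrip(".")
    if extension in NEXUS_EXTENSIONS:
        return "nexus"
    if extension in HENNIG_EXTENSIONS:
        return "hennig"
    return None  # assumption: behavior for other extensions is not documented here
```

With this rule, "out.nexus" and "out.nex" select NEXUS output, while "out.hen", "out.hennig", and "out.ss" select Hennig86.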
New Commands:
- report (nexus trees:(nexus)).
- transform (dynamic_pam:(locus:dcj:INT))
- report (supports:bremer:of_file:("file1", "file2", ...))
Features eliminated:
- Dropped ti and td
- ':' is not accepted in a taxon name, required for newick file
format. For example "mytaxon:1" is not an acceptable taxon name
anymore.
Bugs Fixed:
- Fixed issue 72.
- Fixed incorrect description of select (missing) in the
documentation.
- Annotated chromosomes could produce incorrect costs
- Fixed compilation problems of the portal
- Data.get_tcm2d error (Buz Wilson, Katrina Menard).
- The user defined weight was not applied to Sankoff characters.
- Branch collapsing in Sankoff and Additive characters could be
incorrect.
- Tree scores with additive characters may be incorrect (Taran
Grant). This only happened if the additive character had state
7.
- search (constraint) and swap (constraint) have multiple fixes to
better guarantee the restrictions set by the constraint.
- 0 length branches now appear with 0 bremer support when bremer is
reported.
- Fixed compilation error in Mac OS X 64 bits using OCaml 3.11.0.
- The total rearrangement cost included indels for rearranged
elements.
CHANGES BETWEEN 4.1 and 4.1.1
Improvements:
- Simplified naming of single fragments and partitioned sequences:
now the number only appears if more than one fragment is loaded.
For example, if a file a.fas has only one fragment, the old
naming convention for that character was a.fas:0. The name for
the character is now a.fas.
- Improved the message for bad NEXUS files coming from Mesquite.
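The simplified naming rule for fragments described above can be sketched as follows. The function name character_names is hypothetical, and the zero-based "file:index" form for multiple fragments is inferred from the a.fas:0 example rather than stated as a full specification.

```python
def character_names(filename, fragment_count):
    """Return the character names for a file holding fragment_count fragments."""
    if fragment_count < 1:
        raise ValueError("a file must contain at least one fragment")
    if fragment_count == 1:
        # Single fragment: the index suffix is omitted.
        return [filename]
    return [f"{filename}:{index}" for index in range(fragment_count)]
```

So a single-fragment a.fas yields the name a.fas, while a three-fragment file would yield a.fas:0, a.fas:1, and a.fas:2 under this assumption.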
Bugs Fixed:
- search () used too much memory in some cost regimes, causing a
dramatic drop in the application performance when searching
those costs.
CHANGES BETWEEN 4.0.2911 AND 4.1
Improvements:
- Improved detection of inconsistencies in synonym files.
- ci and ri could fail with a Not_found error when static
homology characters were missing in an input file.
- The characters produced from an implied alignment keep
the name of the original sequence character. For example,
the first base in the implied alignment of chel.aln:0 gets
the name chel.aln:0:ia:0.
- Improved the graphical output, now in pdf format.
- If the synonyms file is not found, POY stops the script
execution.
- New command to report bremer values using multiple input files
containing trees collected during a search. For example, if a
user has 5 separate files containing trees collected
independently using swap (visited:"file1"), swap
(visited:"file2")..., swap (visited:"file5"), then the created
files can be used to produce the bremer support values using
report (supports:bremer:("file1", "file2", ..., "file5"))
- New command report (supports:[jackknife|bootstrap]:"FILE") and
report (graphsupports:[jackknife|bootstrap]:"FILE"). FILE
contains input trees, and the report contains those trees with
the corresponding support annotations.
- Added support for compression using the zlib library in bremer
files using swap (visited:"file"). The files produced can be
decompressed with gunzip.
- Removed the need to make depend before compiling any target.
Bugs Fixed:
- swap with non-additive characters can fail with segfault.
- swap (visited:"file") fails in windows.
- Diagnosis of Sankoff characters could fail.
- Sankoff characters diagnosis in XML did not print correct human
symbols.
- Bogus warning message when reading a list of trees.
- Compressed bremer files fail to use all of the trees.
- Additive characters do not appear in the diagnosis.
- Improved the precision of the maximum time when using search ()
- Fixed stack overflow / seg fault when reading a large number of
trees.
New Compilation Requirements:
- Made OCaml 3.10.2 or superior a requirement (due to Camlp4
bugs).
- POY now needs the zlib library.
CHANGES BETWEEN 2885 AND 2911
BUGS FIXED:
- Printing long trees in parenthetical notation could cause a crash.
- After compiling with --enable-xslt poy behaves as if not enabled.
- Transform (static_approx) with aa characters can fail.
- Test build/build9.poy fails.
- Issue#69 and some improvements to the Makefile rules.
- Multiple improvements and bugfixes in the swappers and tabu managers.
- Compressed files did not handle \r\n -> \n conversion in win32.
- Crash in Mac OS X - PPC.
- Fixed bug in the phastwinclad output.
- Hennig and Nexus parsers do not handle polymorphisms.
- Issue #67
- Iterative pass under affine and missing data could assign power sequences
(and therefore incorrect tree cost).
- Reading some very large files could cause a stack overflow error.
- Some input file channels were never closed, and many forks can be used.
- Windows fails on report (supports:bremer:"x")
- read (prealigned:("x", y)) sometimes fails.
- reading trees containing synonyms after rename () does not work.
- report (supports:bremer:"file") contained values for the root.
IMPROVEMENTS:
- Significant improvements in the performance of search (): it finds
better trees.
- Added -enable-xslt to the top configure for help purposes.
- Added support to make install DESTDIR=path
- Upon reading trees, POY verifies that all the loaded terminals are
present.
- Sometimes WinClada fails to read trees in Hennig format.
- Reduced memory consumption in XML conversion functions.
- Changed from Error to Warning the "You are loading a non-metric TCM"
message.
- Added XML file for SWAMI bioinformatics portals, and the companion
poy_server program.
CHANGES BETWEEN 2880 AND 2885:
- Bugfix: Reading some very large files could cause a stack
overflow error.
- swap (visited:"f") and report (supports:bremer:"f") compress the
file f.
- Bugfix: swap (visited) could print the wrong cost when only
static homology characters have been loaded (Taran Grant).
- New Arguments: report->searchstats, build->threshold,
build->lookahead, transform->max_kept_wag
- Improved the handling of the program arguments for better
execution in more implementations of MPI.
- Bugfix: Static approx fails to add gaps in non-affine sankoff
matrices.
- Bugfix: Make sure that characters that should be ignored
according to the Nexus and Hennig files are indeed ignored.
- Bugfix: help () did not work! Issue# 65.
- Multiple documentation improvements.
- Bugfix: Missing sequences could cause bogus 0-length branches in
the tree.
- Improved build (_mst).
- Bugfix: Duplicate character names in input files overwrite older
ones.
- Bugfix: Hennig/TNT parser does not accept an empty comment in the
xread command.
- Bugfix: The ncurses interface sometimes, mysteriously,
internally deletes characters that are not deleted on the
screen, producing illegal commands.
- Improved error messages for Mesquite's TITLE command in NEXUS
files.
- Improved detection of Hennig/NONA/TNT files.
- Eliminated the full help on command errors. This is only
confusing.
- Improved the behavior of the GenBank file format parser.
- Bugfix: Equate within polymorphisms didn't parse in
Hennig/Nexus.
- Hennig parser now interprets the nstates command.
CHANGES BETWEEN 2870 AND 2880:
- Improved static approx for annotated chromosomes.
- Fixed multiple bugs in the implied alignment for chromosomal and genome characters
- Issue #55: Sorted the characters by code in report(crossreferences:names:("x")).
- Issue #57: Use non-additive characters in more cases after static approx.
- Fixed typo in report (memory) (Campations is Compactions).
- Issue #58: read ("unexistant") causes a hang in POY when running in parallel.
- Fixed bug in genome alignment
- Documentation update: Reference filed partially updated to reflect more uniform 2 initial format.
- Fixed Issue #56 and added set (iterative:false).
- Cleaned some harsh words in source code.
- Fixed cost matrix problem in the genome character
- Improved various details in the set (iterative:false) algorithm.
- Fixed Issue# 59 and removed bogus user messages.
- Added the options iterative:exact and iterative:approximate.
- Fixed Issue #60
- Modified heuristic of search (), and fixed diagnosis bug under iterative.
- Fix a bug in suffix tree creation.
- Bugfix: transform (weight) does not automatically update trees in memory.
- Improved the behavior of the tree selection under negative weights.
- Bugfix: accept negative values for random number generator seed.
- Minor fixes in report(data)
- Added seq.h header to eliminate a compile time warning.
- Added files to ignore from the documentation and compilation accessory files.
- Bugfix: Issue #61
- Bugfix: help (search) shows many results instead of just the search command.
- Bugfix: help (search) shows a paragraph with vertical boxing (missing brackets).
- Bugfix: help (search) shows all the See Also items in one line.
- Added man pages.
- Bugfix: Issue #62
- Bugfix: The Fixed States field of report (data) is always empty.
- Handle negative segments in chromosomes when creating implied alignment
- Take into account inverted segments in annotated chromosomes
- Bugfix: Fixed a wrong case handling in the diagonal extension opening a gap.
- Simplified the execution of the changes introduced in the previous commit.
- The iterative algorithm now follows a postorder traversal.
- Added search->visited argument.
- Added search->constraint
- Fixed bug in file constraint ignored during break.
- Change kept_wag default value from 3 to 2
- Activated iterative algorithms for chromosome characters
- Improved 3D-chromosome and genome alignments
- Fixed bug in build with internal transform.
- Fixed bug in analyzer for expressions like build (10, transform (static_approx))
CHANGES BETWEEN 2635 AND 2870:
Improvements:
- Added a new initial assignment of sequences using fixed states.
- Added the new argument transform -> direct_optimization and
modified transform -> fixed_states.
- Multiple crashes in parallel execution have been corrected.
- Massive improvements in affine gap cost.
- Added filters for redundant tree evaluations during spr and tbr.
- Improved TBR.
- Changed the default swap strategy from alternate to TBR.
- Improved the command search ().
- Newick is the new default format for all the parenthetical tree
output functions.
- Reduced memory consumption by more than half for DH chars.
- Improvements in the performance of perturb operations.
- Added the command set->timer:INT.
- Limited number of rediagnosis performed during search.
- Improved the performance of swap(constraint).
- Disabled iterative in non-sequence chars.
- Modified the following arguments for better readability:
1. seq_to_breakinv -> custom_to_breakinv
2. breakinv_to_seq -> breakinv_to_custom
3. breakpoint -> locus_breakpoint
4. inversion -> locus_inversion
5. approx -> med_approx
6. sig_block_len -> min_loci_len
7. rearranged_len -> min_rearrangement_len
- Speedup tree fusing.
- Modified timeout's behavior (see documentation.)
- Moved the select (terminals) report from stdout to stderr, where
it belongs.
- Improved transform (static_approx) when using affine and sankoff
tcm's.
- Simplified representation of SankCS.t to reduce memory
consumption.
- Reduced the number of reports of the current search state.
- Improved tree fusing by not swapping those trees that are
already in the population.
Compilation Changes:
- New Configure Option: --enable-large-messages. See ./configure --help.
Bugs Fixed:
Bugfix:
Symptom:
Searches under gap_opening:x can fail with a segmentation
fault (Buz Wilson).
Problem:
The computation of the medians under affine is not atomic,
yet some heuristics assumed it was, breaking some
invariants.
Solution:
Store the complete median for use, and don't recompute the
individual medians.
Bugfix:
Symptom:
Certain scripts containing wipe () fail to be analyzed
(Lara Lopardo).
Problem:
The analyzer has a glitch for wipe () and use ().
Solution:
Analyze the portions separated by wipe () and use ()
independently.
Bugfix:
Symptom:
Orientation and init3D are true even if the user doesn't
set it to that value.
Problem:
We check if the list of options does not contain
`Orientation false or `Init3D false, but the default
selection contains an empty list, making this true.
Solution:
Check if the list of options contains `Orientation true
or `Init3D true.
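A minimal sketch of why the old test succeeded on the default empty list. The type opt and both function names are hypothetical and only illustrate the entry above; they are not POY's actual definitions.

```ocaml
(* Hypothetical option type; the real POY option list is richer. *)
type opt = [ `Orientation of bool | `Init3D of bool ]

(* Old check: true whenever the "false" option is absent, so the empty
   default list enables the option. *)
let orientation_old (opts : opt list) : bool =
  not (List.mem (`Orientation false) opts)

(* New check: true only when the user explicitly requested it. *)
let orientation_new (opts : opt list) : bool =
  List.mem (`Orientation true) opts

let () =
  assert (orientation_old []);             (* the bug: on by default *)
  assert (not (orientation_new []));       (* fixed: off by default *)
  assert (orientation_new [ `Orientation true ])
```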
Bugfix:
Symptom:
Reading an input tree when some terminals have been
selected causes a Not_found error.
Problem:
We assign code in the tree by counting the number of
leaves instead of the number of taxa loaded in Data.d.
Solution:
Use the parameter in Data.d.
Bugfix:
Symptom:
transform (randomize_terminals) fails with a Not_found
error.
Problem:
We eliminate all the data and recompute the tree de novo.
But before recomputing we compare the nodes to see if
something really needs to be recomputed, the problem is
that some of the nodes are missing!
Solution:
Force the update and if the error occurs, trigger the
update.
Bugfix:
Symptom:
report (diagnosis) may show some bogus character states.
Problem:
The reported characters remain classified in groups,
therefore, for some of the input characters, the reported
states may not match, though the overall cost is correct.
Solution:
Reload the tree with the raw nodes, without
classification.
Bugfix:
Symptom:
transform () together with use () and store () may result
in empty datasets.
Problem:
The nodes are not necessarily regenerated when the data is
loaded back with the use () command.
Solution:
To avoid this problem, we store both the data and the
nodes, and use them when requested. This has 0 speed
penalty, and some memory penalty.
Bugfix:
Symptom:
If there are 0 characters loaded, a Random.int error is
raised when the calculate_support (bootstrap) command is
issued.
Problem:
We don't verify if the number of characters is greater
than 0 before doing the resample, and Random.int requires
a positive integer.
Solution:
If there are 0 characters, there is nothing to resample.
Just return the same array.
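A minimal sketch of the guard described above, assuming resampling with replacement over an array of characters. The name resample is hypothetical and is not POY's actual function.

```ocaml
(* Resample an array with replacement. [resample] is a hypothetical
   name used only for illustration. *)
let resample (chars : 'a array) : 'a array =
  let n = Array.length chars in
  if n = 0 then chars (* nothing to resample; Random.int 0 would raise *)
  else Array.init n (fun _ -> chars.(Random.int n))

let () =
  assert (resample [||] = [||]);
  assert (Array.length (resample [| 1; 2; 3 |]) = 3)
```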
Bugfix:
Symptom:
Assertion failure when the input tree contains terminals
that don't exist in the input files.
Problem:
We don't verify the leaves before attempting to build the
tree, which now has a predefined set of codes. If the
input tree contains _more_ leaves than the input data
requires to produce a tree, the program will break an
assertion.
Solution:
Check the leaf names before attempting to load the tree.
Bugfix:
Symptom:
read ("bleh") produces background File Not Found
messages in Windows, which can appear anywhere on the
screen.
Problem:
We don't use proper handling of stderr in the Windows port
for this kind of command.
Solution:
Redirect stdin, stderr, and stdout, ignoring the first two
completely in every architecture, by using not the system,
but directly OCaml's Unix module.
Bugfix:
Symptom:
Sometimes POY prints a large number instead of INF for
trivial Bremer support values.
Problem:
We check if the number is large, but floating point
comparison may fail.
Solution:
We now do a rough comparison for a number half of the
internal infinity number (Pervasives.float_of_int
(Pervasives.max_int / 4)) and if larger, then we assume it
is just infinity.
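A minimal sketch of the rough comparison, using the threshold quoted in the entry above. The helper name bremer_to_string is hypothetical, and the sketch uses the unqualified float_of_int and max_int rather than the Pervasives prefix.

```ocaml
(* Threshold taken from the entry above; any value larger than it is
   treated as infinite. *)
let inf_threshold : float = float_of_int (max_int / 4)

(* [bremer_to_string] is a hypothetical helper name. *)
let bremer_to_string (v : float) : string =
  if v > inf_threshold then "INF" else string_of_float v

let () =
  assert (bremer_to_string (float_of_int (max_int / 2)) = "INF");
  assert (bremer_to_string 3. <> "INF")
```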
Fixed typo in Perturbing message (was Perburbing) (Buz Wilson).
Bugfix:
Symptom:
Reading a nexus file with comments inside the matrix
itself failed with Segmentation fault.
Problem:
We have an endless loop that causes a Stack Overflow
error, which can be a segmentation fault in some
architectures.
Solution:
Fix the endless loop by incrementing the counter before
making the recursive call.
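A minimal sketch of the pattern, not the Nexus parser itself. It assumes a hypothetical function that counts the characters of a matrix row lying outside [bracketed] comments; every branch advances the index before recursing, which is what guarantees termination.

```ocaml
(* Count the characters of a row that lie outside [bracketed] comments.
   [count_outside_comments] is a hypothetical name for illustration. *)
let count_outside_comments (s : string) : int =
  let n = String.length s in
  let rec go i in_comment acc =
    if i >= n then acc
    else
      match s.[i] with
      (* Recursing with i instead of i + 1 here would never terminate. *)
      | '[' -> go (i + 1) true acc
      | ']' -> go (i + 1) false acc
      | _ -> go (i + 1) in_comment (if in_comment then acc else acc + 1)
  in
  go 0 false 0

let () = assert (count_outside_comments "ACG[note]T" = 4)
```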
Bugfix:
Symptom:
Reading UNALIGNED blocks in NEXUS files fails.
Problem:
Internally we convert the NEXUS matrix into a FASTA file,
but the generated file is ... incorrect.
Solution:
Make sure that the resulting file is correct.
Symptom:
read (aminoacids:("A")) transform (tcm:(1,1)) report (ia)
fails with a Not_found error (Boyan Alexandrov).
Problem:
The gap code is not the last code of the alphabet, which is an
assumption of the tcm generator.
Solution:
Exchange the integer code of the X and the gap.
Symptom:
Tree fusing fails when running static homology characters only
(Ward Wheeler).
Problem:
When using static homology characters only, no information is
attached to an edge, but the functions assume (and assert)
that there is indeed information associated with an edge.
Solution:
Generalize the code to consider the other possible set of
algorithms.
Fixed bug in the serialization of the three dimensional cost
matrices.
Symptom:
During parallel execution, if the dataset includes additive
characters, POY may crash (Fernando Marques).
Problem:
The serialization functions for additive characters did not
use proper macros for 64 bit and 32 bit environments. They
also assumed that successive mem_malloc calls would produce
successive memory locations (clearly incorrect).
Solution:
Define the required macros to handle properly native integers,
and serialize and deserialize all the vectors independently.
Symptom:
poy script.txt when poy is compiled in parallel fails for the
following script:
read ("x")
build (10)
select ()
swap ()
report (trees)
with the Warning: "No trees in memory" (Fernando Marquez)
Problem:
The `GatherTrees command does not run the merging instructions
if only one process is being executed.
Solution:
For every branch of the tree exchange algorithm, run the
joiner set of instructions.
Symptom:
Issue 51 (loucrow and Fernando Marquez).
Problem:
The GatherTrees command had some bugs in the way it was
merging trees from different processes.
Solution:
Simply merge the stored_trees and trees fields from
Scripting.run and do not postprocess trees thanks to the
change described above.
Symptom:
Reading a NONA/TNT file produces trees rooted in the last
terminal of the file (Federico Lopez).
Problem:
The parser of NONA/TNT files, just as the Nexus parser,
produces the output in inverse order.
Solution:
Repeat the solution for Nexus files in 2635 with NONA/TNT
files.
Symptom:
select (terminals, "filename") outputs a table of included and
excluded for each slave.
Problem:
Although we filter output for slaves, table output is not
being filtered out.
Solution:
Filter the table output to include only output requests from
the slaves.
CHANGES BETWEEN 2602 AND 2635:
Bugfixes:
- read (prealigned:("file.txt", tcm:(1,1))) report (phastwinclad)
prints an error when file.txt contains fragments and some
fragments are missing (Julian Faivovich).