Brownian motion in cladistics
Under the conditions examined in our simulations, ordered parsimony performed best, thus confirming our first hypothesis. Simulating characters on a phylogeny has the advantage that the reference phylogeny is known without error. However, the Brownian motion model, an evolutionary model that is widely used in simulations of evolution and that we used here, affects the results and reflects theoretical assumptions. Under Brownian motion, character evolution is stochastic and unpredictable, as are many historical events, but follows a general pattern that reflects the phylogeny, which can be inferred by analyzing character state data. Here, because the characters were generated by discretizing continuous characters, the distribution of character states is unimodal (S6). The states are intrinsically ordered and thus represent morphoclines. Brownian motion, like most models of molecular evolution, such as GTR+I+Γ (Tavaré, 1986), and like the speciational model, is not directional (i.e., it implies no trends). Thus, the relationships between character states can be represented by unrooted trees (Fig. 1A). Brownian motion thus leads to an intrinsically ordered and unpolarized modeling (when the root condition is not specified in the simulation, as is the case here), contrary to models implying trends or irreversible evolution, which are intrinsically polarized. Over a short time under Brownian motion, character state 0 is more likely to evolve into state 1 than into state 2; the model was thus expected to favour ordered parsimony over unordered parsimony, and to a lesser extent, 3ta. However, this model, one of the simplest, seems applicable to various characters, such as ontogenetic sequence data (Poe and Wake, 2004; Poe, 2006).
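The claim that, over a short time, state 0 is more likely to evolve into state 1 than into state 2 can be illustrated with a minimal sketch. It is not the simulation procedure used in this study: the thresholds, starting value, branch length and rate below are hypothetical values chosen only for illustration, and the function names are ours. The sketch computes, for a continuous character evolving by Brownian motion along a single branch, the probability of ending in each discretized state.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def discretize(value, thresholds):
    """Return the discrete state (bin index) of a continuous value."""
    return sum(1 for t in thresholds if value >= t)

def transition_probabilities(start_value, branch_length, thresholds, rate=1.0):
    """Probability of ending in each discrete state after Brownian motion
    along one branch (variance = rate * branch_length)."""
    sd = math.sqrt(rate * branch_length)
    edges = [-math.inf] + list(thresholds) + [math.inf]
    probs = []
    for low, high in zip(edges[:-1], edges[1:]):
        p_high = 1.0 if high == math.inf else normal_cdf(high, start_value, sd)
        p_low = 0.0 if low == -math.inf else normal_cdf(low, start_value, sd)
        probs.append(p_high - p_low)
    return probs

# Hypothetical discretization: state 0 below 0.0, state 1 in [0.0, 1.0), state 2 from 1.0.
thresholds = [0.0, 1.0]
# Hypothetical ancestor in state 0 and a short branch.
probs = transition_probabilities(start_value=-0.5, branch_length=0.25,
                                 thresholds=thresholds)
for state, p in enumerate(probs):
    print(f"P(0 -> {state}) = {p:.4f}")
```

With these assumed values, the probability of reaching state 2 is much smaller than that of reaching state 1, because reaching state 2 requires the continuous value to cross both thresholds.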
Reversals in phylogenetics
A particularly controversial issue in cladistics concerns the treatment of reversals. Proponents of parsimony (Kluge, 1994; Farris et al., 1995; Farris, 1997; Farris and Kluge, 1998) and 3ta (De Laet and Smets, 1998; Siebert and Williams, 1998) have been deeply divided on this particular issue. In parsimony, a transformational approach to homology using the Wagner (Farris et al., 1970) and Fitch (Fitch, 1971) parsimony algorithms treats characters from the perspective of unrooted character-transformation trees (Slowinski, 1993). Reversals provide information and can serve for clade support because they are evidence of secondary homology with the appropriate test of maximizing congruence. This maximization of congruence amounts to searching for the pattern that requires the fewest ad hoc hypotheses, namely convergences and reversals. For Farris (2012), reversals can be inferred a priori in the inference of primary homologies but also from an analysis: ‘More fundamentally, even if (as I have seen other authors suggest) Hennig would have preferred to distinguish apomorphies from plesiomorphies before starting to construct the tree, he was obviously willing to revise assessments of plesiomorphy during tree construction, for Hennig did in fact recognize reversals and apply them as synapomorphies’. This is the most widespread point of view in cladistics, which prevails in systematic paleontology, and the only point of view represented in probabilistic methods, which prevail in molecular systematics and are starting to be used on phenotypic characters as well (Müller and Reisz, 2006). The assumptions of 3ta are much less familiar. In 3ta, hierarchical hypotheses of homology, i.e. nested sets of character states, are submitted to a test of congruence. The test either accepts or rejects the relevance of the hypothesis. Convergence is one of the multiple explanations of rejection. ‘Parsimony-like reversals’, i.e. the generation of hypotheses of homology not proposed by the systematist but generated by the method, violate hierarchical classifications. Thus, they cannot be justified under the rationale of 3ta. Evolutionary reversals, i.e., losses of instances of character-states, are not used in 3ta to support nodes; they represent only homoplasies, i.e. mistaken hypotheses of homology.
For instance, the evolutionary hypothesis (generated after an initial analysis by inferring character history on a tree) involving three conditions deduced a posteriori from an analysis: 0 (‘absent’), 1 (‘present’) and 0* (‘secondary absence’; scored the same in a matrix but interpreted differently from primitive absence on a tree) is interpreted differently under parsimony and 3ta. Under parsimony, the secondary absence can be explicitly represented as an apomorphy in the primary homology hypothesis (0(1(0*))). Another interpretation (3ta) consists in not treating secondary absence as synapomorphic but considering it as a particular case of absence: (0,1(0*)). Here, neither the absence nor the reversal is considered as a state (neither plesiomorphy nor apomorphy) because the absence is not a state in 3ta (in Fig. 3, 0*, 0** and 0*** are not considered in 3ta). Parsimony proponents favour the first option, which yields support for six clades in the Phasmatodea phylogeny (Fig. 3A). To summarize, the first interpretation (parsimony) considers a loss as a homology and a synapomorphy (because it supports a clade), a homoplasy (because the primary hypothesis is falsified by the distribution of the other characters) and a plesiomorphy (as defined in the matrix), according to Brower and de Pinna (2014). The second interpretation considers a loss as uninformative: it is neither a homology nor a synapomorphy (because it supports no clade), it is not a homoplasy (because the primary hypothesis is in agreement with the distribution of the other characters) and it is not a plesiomorphy (because the absence is not a state in 3ta; it is the root, including all). 3ta proponents favour this interpretation: only one clade in the Phasmatodea phylogeny is supported, and the only synapomorphy is the homology reflecting the first appearance of wings. These two interpretations are thus in perfect opposition.
Here we emphasize that Brownian motion is coherent only with the assumptions entailed by the first interpretation: reversals (i.e., secondary absence) are treated as apomorphies in the primary homology hypotheses (as an order with parsimony, or as a hierarchy with 3ta). Our simulations produce informative reversals under Brownian motion, which can be exploited only under a parsimony viewpoint of these reversals: our results quantify the loss in resolving power and the artefactual resolution in 3ta when true and informative reversals are present (i.e. when true reversals are simulated and ‘hidden’ in the same state as the plesiomorphy but support clades of the known tree). Thus, our results must be interpreted accordingly. Irreversible characters might yield different results and will be tackled in another study.
We take this opportunity to propose a nomenclatural clarification about reversals (based on the example in Fig. 3A) as secondary homology hypotheses; thus, this clarification is valid both for parsimony and for 3ta. First rounds of reversals are generally called ‘secondary losses’ (e.g. Carine and Scotland, 1999), when in fact only the absence should be considered secondary and the loss itself should be considered as an event that appeared for the first time (i.e., primary). Thus, a character state is primitively absent (primary absence; state 0 in Fig. 3). It can then appear; this is a primary appearance (of state 1), denoted +1 in Fig. 3A. It can be subsequently lost (-1, reversal to state 0, but identified as 0* in Fig. 3A, for greater clarity); this should be called a primary loss, which results in a secondary absence. After this, a secondary gain (+2) can lead to secondary presence (1* in Fig. 3A), and a secondary loss (-2) can lead to ternary absence (0** in Fig. 3), etc.
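The proposed nomenclature can be expressed as a simple labelling rule. The following sketch is only an illustration under stated assumptions: the function name `label_history` is ours, the input is a hypothetical sequence of binary conditions (0 = absent, 1 = present) along a single lineage from root to tip, and the lineage is assumed to start with primitive absence. Gains and losses are numbered independently, and each condition receives one asterisk per previous occurrence of the same condition.

```python
def label_history(states):
    """Label events and conditions along one lineage (root first).

    Assumes binary states and a lineage that starts with primitive absence (0).
    Returns (events, conditions): events such as '+1' or '-2', and one
    condition label per node, such as '0', '1', '0*' or '1*'.
    """
    gains = 0
    losses = 0
    events = []
    conditions = [str(states[0])]
    for previous, current in zip(states, states[1:]):
        if previous == 0 and current == 1:
            gains += 1
            events.append(f"+{gains}")   # k-th gain
        elif previous == 1 and current == 0:
            losses += 1
            events.append(f"-{losses}")  # k-th loss
        if current == 1:
            # k-th presence carries k-1 asterisks (primary presence: none).
            conditions.append("1" + "*" * max(gains - 1, 0))
        else:
            # Absence after k losses carries k asterisks (primary absence: none).
            conditions.append("0" + "*" * losses)
    return events, conditions

events, conditions = label_history([0, 1, 0, 1, 0])
print(events)      # ['+1', '-1', '+2', '-2']
print(conditions)  # ['0', '1', '0*', '1*', '0**']
```

In this hypothetical lineage, the first loss (-1) is a primary loss leading to a secondary absence (0*), and the second gain (+2) leads to a secondary presence (1*).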
We failed to find significant results on the impact of outgroup branch lengths and uncertainty in polarization on the neontological trees (Fig. 5A-C), except on some trees built from the symmetrical reference topology.
The effect of outgroup branch length appears to be much stronger on paleontological trees, perhaps because of the shorter branches in the ingroup. Tests on a non-ultrametric version of the equiprobable tree (Fig. 5D) with variable outgroup branch lengths show that the performance of all methods decreases when outgroup branch length increases (Fig. 7; Table 2). Results of ordered parsimony and 3ta are very similar, with unordered parsimony performing much more poorly. 3ta is more sensitive to outgroup branch length, an effect that might be linked to reversal treatment, but that in any case confirms our second hypothesis. It is however surprising to see that an incorrect polarization has so little effect on resolving power and artefactual resolution.
Character states and ordering schemes
Our simulations clearly show that unordered parsimony performs far worse than the two other methods when reliable criteria for character state ordering are ignored (Fig. 6). All states were ordered (except for unordered parsimony) using a similarity criterion, whereby transition costs (step-matrices in ordered parsimony) or state hierarchy (3ta) reflect similarity (and outgroup condition, for 3ta). Other ordering criteria exist (Hauser and Presch, 1991), but our results clearly indicate that ordering character states is preferable when characters can be shown to form morphoclines. These results suggest that the current tendency not to order characters in phylogenetic analyses is suboptimal, and show that important benefits could arise from considering ordering schemes when this appears biologically justified. Note that such ordering requires no prior knowledge of the phylogeny; only knowledge of the character distribution or likely evolutionary model (which can be gained from genetic or developmental data, among others) is required. On this particular point, ordered parsimony and 3ta resemble each other more than either resembles unordered parsimony.
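The difference between ordered and unordered transition costs can be illustrated with a minimal sketch. It is not the software used in this study: it builds a linear (Wagner-like) step-matrix and a uniform (Fitch-like) step-matrix for three states, and scores a small hypothetical tree with a generic step-matrix (Sankoff-style) dynamic programming routine. The function names and the example tree are assumptions made for illustration.

```python
import math

def ordered_costs(n_states):
    """Linear ordering: the cost of i -> j is the number of intermediate steps."""
    return [[abs(i - j) for j in range(n_states)] for i in range(n_states)]

def unordered_costs(n_states):
    """No ordering: every change between different states costs one step."""
    return [[0 if i == j else 1 for j in range(n_states)] for i in range(n_states)]

def tree_length(tree, costs):
    """Minimum number of steps on a tree given as nested tuples.

    Leaves are integers (observed states); internal nodes are tuples of subtrees.
    """
    n_states = len(costs)

    def node_costs(node):
        if isinstance(node, int):
            # A leaf can only be in its observed state.
            return [0 if s == node else math.inf for s in range(n_states)]
        child_vectors = [node_costs(child) for child in node]
        return [
            sum(min(costs[s][t] + vec[t] for t in range(n_states))
                for vec in child_vectors)
            for s in range(n_states)
        ]

    return min(node_costs(tree))

# Hypothetical tree on which states 0 and 2 each occur in both subclades.
tree = ((0, 2), (0, 2))
ordered_length = tree_length(tree, ordered_costs(3))
unordered_length = tree_length(tree, unordered_costs(3))
print(ordered_length, unordered_length)  # 4 2
```

On this hypothetical tree, each change between states 0 and 2 counts as two steps under the ordered scheme but as a single step under the unordered scheme, so the two schemes penalize the same distribution of states differently.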
Tree shape and branch length
Not all topologies are equally difficult to recover accurately. The pectinate topology (Fig. 5A), which shows the longest terminal branches (when its internal branches are all of about the same length), is the most difficult, followed by the equiprobable (Fig. 5C) and the symmetrical topology (although this may be linked with the fact that internal branch lengths of that tree were roughly proportional to the number of descendant taxa, except for the branch below the ingroup; Fig. 5B; Fig. 6; Table 1). This confirms our second hypothesis. Modifying the distribution of branch lengths yielded results that are more difficult to interpret (S5). Our first assumption was that the results would improve (i.e., the resolving power would be greater and the artefactual resolution lower) as the terminal/internal branch length ratio decreases. This assumption is corroborated for trees D, E and F (Fig. 5) for resolving power under ordered parsimony and 3ta, but only for trees D and E for unordered parsimony. Tree G yielded worse results, perhaps because some internal branches are shorter. Results on artefactual resolution are more complicated to interpret. Along with the branch lengths of a tree, the 3ts content of clades is another important parameter to consider when performance is assessed using the ITRI. It is directly connected to tree shape, i.e. the number of terminal taxa inside and outside each clade (Nelson and Ladiges, 1992). As a consequence, some clades, and the characters that support them, will affect the ITRI more than others. More specifically, clades with few taxa within them or with few taxa outside them have less 3ts content than clades containing an intermediate number of taxa (in our trees, clades with the maximal 3ts content have ten taxa inside and eleven taxa outside). These imbalanced clades will impact results only slightly compared to balanced clades. This may partly explain why tree shape influences phylogenetic reconstruction.
Comparisons of results between an ultrametric tree (Fig. 5C) and a paleontological tree of the same topology (Fig. 5D) show that the latter features better resolving power and less artefactual resolution for ordered parsimony and 3ta. This result is congruent with the claim that adding extinct taxa breaks long branches, which results in an important improvement in the resolution of the optimal trees. This claim is supported by both empirical data (Gauthier et al., 1988) and simulations (Huelsenbeck, 1991).
Implications for simulation-based studies and evolutionary models
Our results highlight the advantages of models under which the relationships between character states can be represented by an unrooted tree (all molecular models) over 3ta hierarchical coding, if reversals can be simulated as in our study. Our simulations under Brownian motion can be represented as in Fig. 8. First, characters are generated under a priori assumptions (here, the evolutionary model represented in Fig. 8A and the reference phylogeny shown in Fig. 8B). In a second step, these characters are interpreted as primary homology hypotheses by discretization and (for 3ta) conversion into hierarchical structures. Our procedure for simulation of characters, and by extension the use of evolutionary models, reflects quantitative characters that display informative reversals.
In parsimony, states that can be hypothesized to form morphoclines should be ordered if there is evidence for such a morphocline. In this case, reversals can be represented by primary homology hypotheses. Empirical studies suggest that Brownian motion is a reasonable model for several types of characters, such as body size (Laurin, 2004), bone microanatomy (Canoville and Laurin, 2010), etc. Some characters do not seem to follow a strictly Brownian motion model, but instead may follow a speciational model (in which change occurs in both daughter-lineages at or near speciation, but no anagenesis takes place), such as morphological shape data in ratites (Laurin et al., 2012), a punctuational model (similar to the speciational model, but change occurs only on one branch) or an Ornstein-Uhlenbeck model (Uhlenbeck and Ornstein, 1930; Felsenstein, 1988). Some of these alternative models can be obtained from Brownian motion by modifying the branch lengths of a tree (Garland et al., 1993). Hence, data produced using such models might give results close to those that we report here, which should be applicable to much biological data.
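The idea that an alternative model can be obtained from Brownian motion by modifying branch lengths can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the tree, its branch lengths, the rate and the function names are hypothetical. A continuous character is simulated by Brownian motion on a tree whose branches carry lengths, and a speciational variant is obtained by assuming that setting every branch length to the same value (here 1) makes the expected change identical on each branch, whatever its duration.

```python
import random

def simulate_bm(tree, root, root_value=0.0, rate=1.0, rng=None):
    """Simulate a continuous character by Brownian motion.

    tree maps each internal node to a list of (child, branch_length) pairs.
    The change along a branch is normal with variance rate * branch_length.
    """
    rng = rng or random.Random()
    values = {root: root_value}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child, length in tree.get(parent, []):
            sd = (rate * length) ** 0.5
            values[child] = values[parent] + rng.gauss(0.0, sd)
            stack.append(child)
    return values

def speciational(tree):
    """Return a copy of the tree with all branch lengths set to 1
    (assumed here to mimic a speciational model)."""
    return {parent: [(child, 1.0) for child, _ in children]
            for parent, children in tree.items()}

# Hypothetical three-taxon tree with arbitrary branch lengths.
tree = {
    "root": [("A", 2.0), ("n1", 0.5)],
    "n1": [("B", 1.5), ("C", 1.5)],
}
bm_values = simulate_bm(tree, "root", rng=random.Random(1))
spec_values = simulate_bm(speciational(tree), "root", rng=random.Random(1))
print(bm_values)
print(spec_values)
```

The same simulation routine is used in both cases; only the branch lengths passed to it differ, which is the sense in which the alternative model is derived from Brownian motion in this sketch.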
In 3ta, the same representation of change is used for groups of parts (homologues) and for groups of whole organisms (taxa). In a transformation, the transformed part is considered a different object. In differentiation or modification, the differentiated or modified part is a particular form of the unmodified part. Many systematists accept this difference for taxa (e.g. birds as a differentiated taxon nested in Dinosauria) but not for parts (e.g. feathers as a modified part, nested in scales). In this context, reversal as a loss is perfectly acceptable. However, reversal as synapomorphy is viewed as pointless.
All this suggests that irreversible characters (Nopcsa, 1923; Goldberg and Igić, 2008; Kohlsdorf et al., 2010) are ‘more compatible’ with 3ta. Further work should be done to compare the resolving power and artefactual resolution yielded by parsimony and 3ta when an irreversible evolution model is used (which can be understood as hierarchic). Another interesting field of research consists in developing methods that better simulate Darwinian evolution applied to digital life forms, as in the software Avida (Adami and Brown, 1994), to better understand the behaviour of parsimony and 3ta.