Peter's analysis of batches A to J created 3405 Newick files.
1476 of these 3405 contain 4 valid taxa or more.
552 of those 1476 contain the UNKNOWN entity and are discarded for this run.
This leaves us with 924 valid trees of 4 or more taxa.
Across those 924 files there are more than 11400 tips, a mean of just over 12 taxa per tree file.
These are the source trees for the supertree analysis.
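The filtering steps above can be sketched roughly as follows. This is only an illustration, not the script actually used: `tip_labels` and `keep_tree` are hypothetical helper names, the example trees are made up, and the sketch assumes simple Newick strings with unquoted labels. It also treats every tip as "valid", since the validity check itself is not described here.

```python
import re

def tip_labels(newick):
    # Hypothetical helper: a tip label is a run of label characters that
    # directly follows "(" or ",". Internal node labels follow ")" and
    # are therefore not matched. Assumes unquoted labels without spaces.
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)

def keep_tree(newick, min_taxa=4, banned="UNKNOWN"):
    # Keep trees with at least `min_taxa` tips and no UNKNOWN tip.
    tips = tip_labels(newick)
    return len(tips) >= min_taxa and banned not in tips

# Made-up example trees, not taken from the real data set.
trees = [
    "((A,B),(C,D));",            # 4 taxa: kept
    "((A,B),C);",                # 3 taxa: dropped
    "((A,UNKNOWN),(C,(D,E)));",  # contains UNKNOWN: dropped
]
kept = [t for t in trees if keep_tree(t)]
print(len(kept), "of", len(trees), "trees kept")
```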
These 924 trees were converted into an MRP matrix. That MRP matrix contains 5745 taxa and 9498 characters.
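For readers unfamiliar with MRP (matrix representation with parsimony), here is a minimal sketch of the usual Baum-Ragan coding. Each non-root clade in each source tree becomes one character: taxa inside the clade score 1, other taxa in that tree score 0, and taxa absent from that tree score ?. The function names `parse_newick`, `clades` and `mrp_matrix` are hypothetical. The sketch assumes rooted trees with unquoted labels and omits the all-zero outgroup row that MRP matrices often include. It is not necessarily the coding or software used to build the matrix described here.

```python
import re

def parse_newick(text):
    # Strip branch lengths and internal node labels, then parse the
    # remaining topology into nested lists of tip names.
    s = re.sub(r":[^,()]+", "", text.strip().rstrip(";"))
    s = re.sub(r"\)[^,()]+", ")", s)
    pos = 0

    def node():
        nonlocal pos
        if s[pos] == "(":
            children = []
            while s[pos] != ")":
                pos += 1              # step over "(" or ","
                children.append(node())
            pos += 1                  # step over ")"
            return children
        start = pos
        while pos < len(s) and s[pos] not in ",()":
            pos += 1
        return s[start:pos]

    return node()

def clades(tree):
    # Return (tips under this node, tip sets of all internal nodes
    # at or below this node).
    if isinstance(tree, str):
        return frozenset([tree]), []
    tips, found = frozenset(), []
    for child in tree:
        child_tips, child_found = clades(child)
        tips |= child_tips
        found += child_found
    found.append(tips)
    return tips, found

def mrp_matrix(newicks):
    taxa, columns = set(), []
    for text in newicks:
        all_tips, found = clades(parse_newick(text))
        taxa |= all_tips
        # The root clade holds every tip of the tree, so it is skipped.
        columns += [(all_tips, c) for c in found if c != all_tips]
    return {
        taxon: "".join(
            "1" if taxon in clade else "0" if taxon in tips else "?"
            for tips, clade in columns
        )
        for taxon in sorted(taxa)
    }

# Two made-up source trees that share taxa A and C.
matrix = mrp_matrix(["((A,B),(C,D));", "((A,C),E);"])
for taxon, row in matrix.items():
    print(taxon, row)
```

In this toy example the two trees yield five taxa and three characters. Taxon E is absent from the first tree, so it is scored ? for both of that tree's characters.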
There were easily visible errors in this matrix, such as misspelt taxa. Nevertheless, as this is an intermediate proof-of-concept analysis, I proceeded to analyse the MRP matrix with maximum parsimony in TNT.
I ran a single search replicate using one random addition sequence (RAS) followed by tree bisection and reconnection (TBR) branch swapping. It took 4 hrs 41 minutes to complete overnight on my HP Envy Sleekbook. The length of the output tree is 10206 steps. The files are provided on GitHub here: https://github.com/ContentMine/ijsem/tree/master/supertree-analysis
The output supertree is the 'strict.tre' file.
The supertree has 8775 nodes in total, of which 5745 are leaves.
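As a rough check on these node counts, the sketch below counts leaves and internal nodes directly from a Newick string, then works through the arithmetic for the reported figures. `count_nodes` is a hypothetical helper that assumes unquoted labels containing no commas or parentheses. The comparison with a fully bifurcating tree assumes the supertree is counted as rooted, which is my assumption rather than something stated above.

```python
def count_nodes(newick):
    # Every "(" opens exactly one internal node, and a tree with
    # L leaves always contains L - 1 commas.
    internal = newick.count("(")
    leaves = newick.count(",") + 1
    return leaves, internal

# Made-up example: 4 leaves, 3 internal nodes (root included).
print(count_nodes("((A,B),(C,D));"))

# Reported figures for the supertree.
total_nodes, leaves_reported = 8775, 5745
internal_reported = total_nodes - leaves_reported

# A fully bifurcating rooted tree on n leaves has n - 1 internal nodes.
max_internal = leaves_reported - 1
print(internal_reported, "internal nodes out of a possible", max_internal)
```

Under that assumption, the supertree has 3030 internal nodes out of a possible 5744, so a large share of it is unresolved.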