List Price: £26.99
Our Price: £20.31
Author:
Barry G. Hall
By Sinauer Associates Inc.,U.S.
Extraordinarily good, compact and value for money, 2008-06-03
Shortly after I got the second edition, I got the third. Within the covers of this slim volume is all the information you will need to generate the most elaborate hypotheses you may dare to venture on the basis of all the hard work you may have put into obtaining a data set of protein or DNA sequences.
The chapters are clearly laid out and this is a cookbook - you can follow the instructions chapter by chapter using the freely available software that it recommends, all the way from organizing your data, making an alignment and turning this into a simple tree, to computing genetic distances and finally producing trees by Bayesian Inference or Maximum Likelihood.
Within the relevant chapters are essays explaining the mathematical bases of tree building, the models employed, and the various kinds of trees and how they are built up, with chapters on Trees, Neighbour Joining, Maximum Parsimony, Maximum Likelihood and Bayesian Inference. There are also downloadable programs that work fine as per the instructions, like Fas2PhyNex, which instantly converts your FASTA file into PHYLIP and NEXUS formats. This book is a treasure trove of such packages and of the simplest available descriptions of the sort of work you're up to, at a level suitable for both beginners and those with some background (it tells it to you straight without dumbing down). Those who go through your conclusions later may admire you for your profound understanding of the subject, on the basis of this book.
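For readers curious what a FASTA to PHYLIP/NEXUS conversion involves, here is a minimal Python sketch. It is not how Fas2PhyNex itself works (the review does not describe its internals). The function names are made up for illustration, and it assumes the sequences are already aligned to equal length, that names contain no spaces, and that a relaxed (not strictly 10-character) PHYLIP name field is acceptable.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Assumption: keep only the first word of the header as the name.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else "unnamed"), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_aligned(records):
    """Return the shared alignment length, or raise if lengths differ."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return lengths.pop()


def to_phylip(records):
    """Relaxed sequential PHYLIP: a 'ntax nchar' header, then 'name sequence' rows."""
    nchar = check_aligned(records)
    lines = [f"{len(records)} {nchar}"]
    lines += [f"{name} {seq}" for name, seq in records]
    return "\n".join(lines) + "\n"


def to_nexus(records, datatype="DNA"):
    """Minimal NEXUS DATA block with a non-interleaved matrix."""
    nchar = check_aligned(records)
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(records)} NCHAR={nchar};",
        f"  FORMAT DATATYPE={datatype} MISSING=? GAP=-;",
        "  MATRIX",
    ]
    lines += [f"  {name} {seq}" for name, seq in records]
    lines += ["  ;", "END;"]
    return "\n".join(lines) + "\n"
```

Some programs insist on the strict PHYLIP layout with names padded or truncated to ten characters, so check what your tree-building software expects before relying on a relaxed file like this one.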
I feel the book could have tackled the following issues more clearly:
1. Establishing genetic distances for publications, uncorrected vs corrected. The book does explain how to establish distances using the p-distance model and the pairwise deletion option in MEGA, but I had to read further before realising it was more or less all there, and it is perhaps not emphasised enough (descriptions of new species frequently give at least uncorrected genetic distances).
2. Whether data from two different sequences for the same taxa should be combined into a single data set for tree building, and when or when not to consider this (you can also make separate trees and combine them into a consensus).
3. What to do about gaps in the data. For example, I had several intact sequences of around 1 kbp but a few of only 600 bp - what problems would the shorter sequences cause? In the event, I built full and truncated data sets and trees for comparison, and found that having the odd short sequence was not too problematic - or at least, that's how it appeared.
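To make the first point concrete: an uncorrected p-distance is simply the proportion of compared sites at which two aligned sequences differ, and pairwise deletion means a site is skipped only for the pair in which one of the two sequences has a gap or missing data there. The Python sketch below illustrates that arithmetic. It is not MEGA's code, the function name is made up, and treating `N` as missing data alongside `-` and `?` is an assumption of this sketch.

```python
# Assumption: these symbols mark gaps or missing data and are skipped per pair.
GAP_OR_MISSING = set("-?N")


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences, pairwise deletion."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Pairwise deletion: drop the site for this pair only.
        if a in GAP_OR_MISSING or b in GAP_OR_MISSING:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


# A full-length sequence against one truncated by two sites:
# six sites are compared and one differs, so the distance is 1/6.
d = p_distance("ACGTACGT", "ACGAAC--")
```

This also bears on the third point: with pairwise deletion a short sequence is compared only over the sites it shares with each partner, so its distances rest on fewer sites than those between full-length sequences.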
This book is almost perfect, but if you're a beginner it should be supplemented (as with many books). I strongly recommend Bioinformatics for Dummies. Using it, I realised that making alignments in T-Coffee was better than using the MEGA multiple alignment system, in terms of ironing out errors that can be fiddly to correct by hand.
For information, whereas PhyML was superb for ML trees, I found that its aLRT branch support option simply failed several times, so I went back to the traditional bootstrap method. In a similar vein, I found the older versions of PhyML more effective than the latest, given that the latest versions sometimes just did not deliver the goods.
I found the author to be totally perspicacious and helpful, and he has struggled so we don't have to. He has been wise to create such a slim volume - to paraphrase from the Seven Pillars of Wisdom by TE Lawrence - A Triumph. With a little supplementary reading (as this book recommends, do read the bits of the MrBayes manual it points to and take its other pointers into account) the book is a gem - well worth the small cost and the saving in library space. I certainly look forward to future editions, without begging for them, as this book is so good.
I followed it fairly religiously, and I will owe my upcoming award of a PhD in part to this helpful manual/friend.