Chapter 6 Learning Bioinformatics
6.1 Books
6.1.1 Sequence analysis
BLAST: An essential guide to the Basic Local Alignment Search Tool
Introductory book on sequence alignment in general and BLAST in particular. Includes Perl code.
Biological sequence analysis: Probabilistic models of proteins and nucleic acids
Advanced-level treatment of the theory of sequence alignment and hidden Markov models (HMMs).
Sequence Analysis in a Nutshell: A Guide to Tools and Databases
General introduction to many of the sequence analysis tools covered in this course.
Molecular evolution and phylogenetics
Introductory-level book on inferring phylogenies from sequence information.
6.1.2 Omics technologies
6.2 Online resources
6.2.1 Online courses
Many free online courses taught by instructors from reputable universities are available on websites such as edX. They cover many topics related to Bioinformatics, Computational Biology, Statistics, and Data Analysis.
6.2.2 YouTube
Many Bioinformatics resources invest considerable effort in teaching how to use their services, with an emphasis on newcomers. Some of them provide introductory tutorials and case studies as videos hosted on their YouTube channels. For example, there are channels for NCBI, UniProt, Ensembl, and Bioconductor, to mention a few.
6.2.3 How to use the command line
Using Bioinformatics tools often requires knowing how to use the command line, especially in Linux/Unix environments. http://linuxcommand.org has a nice introduction to the Linux command line. Another interesting resource is https://www.codecademy.com, which provides an interactive learning experience (it also offers paid courses). There is also a tutorial on Bash scripting here.
6.2.4 Programming languages
Learning a programming language is not strictly necessary for using most of the bioinformatics tools described in this book. However, analyzing high-throughput datasets does require one. In particular, Python and R are the most widely used languages in data science, including Bioinformatics. There are many free online resources for learning the basics of these programming languages. Check their web pages for beginner tutorials.
6.2.5 Support Communities
Learning Bioinformatics, as with any other complex topic, can be daunting. Bioinformatics software often comes with good, detailed documentation. Short tutorials that help newcomers get started are also frequently available. It is important to get a good understanding of the methods being used when the goal is to publish your results. The Bioinformatics community can sometimes help in the learning process by providing solutions to the most frequently encountered problems. Here I will highlight some of the most relevant online communities for the bioinformatician.
Biostars
Biostars is an online community devoted to answering questions about “bioinformatics, computational genomics and biological data analysis”. Users can post questions and get answers in the same post. A rating system lets users give credit to good questions and answers. It is also possible to add comments for further clarification and discussion.
Bioinformatics Stack Exchange
Another useful resource for general questions related to Bioinformatics is the Bioinformatics site of the Stack Exchange network. This site follows the same philosophy as Biostars (indeed, Biostars originated in the old version of Stack Exchange) but offers, in my opinion, a better user interface. The site is still in beta, but most of the Biostars community can also be found on this new site.
Bioconductor support site
The Bioconductor support site provides support for questions related to the use of Bioconductor, using a similar interface to Biostars.
How to use the support sites
It is important to remember that online support sites are usually driven by volunteers who lend their time and expertise to help others. Because the number of questions posted on these support sites can be very large, it is very helpful if the questions themselves are written in a way that simplifies the work of those willing to help. For example, it is important to write simple yet meaningful titles that summarize the problem. This way, members of the community can easily tell whether the question is about a topic in which they have expertise. When describing your issue, be as explicit as possible. Whenever possible, include a minimal reproducible example. This is particularly important if you have problems with programming code, but it can also apply to using a web server. For example, if you had a problem using a web server to generate a sequence alignment, include all the steps you performed and the error (if any) you obtained. If possible, and when needed, include a minimal example dataset that reproduces the error. In this example, it could be a small file with a few sequences that trigger the error.
The particulars will depend on the software and the problem you are having. As a rule of thumb, it is always helpful to put yourself in the place of those reading your question and ask whether what you wrote is enough for them to understand your problem.
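As one illustration of preparing a minimal example dataset, the following Python sketch keeps only the first few records of a FASTA file so that a small file can be attached to a question. It is only a sketch under stated assumptions: the input is plain FASTA text (header lines starting with ">"), and the function names read_fasta, first_records and to_fasta are hypothetical helpers written for this example, not part of any library. In practice you would pick the specific sequences that trigger your error rather than simply the first ones.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def first_records(lines: Iterable[str], n: int) -> List[Record]:
    """Return the first n records (fewer if the input is shorter)."""
    return list(islice(read_fasta(lines), n))


def to_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Format records as FASTA text, wrapping sequences at `width` columns."""
    out: List[str] = []
    for header, seq in records:
        out.append(">" + header)
        for i in range(0, len(seq), width):
            out.append(seq[i:i + width])
    return "".join(line + "\n" for line in out)


if __name__ == "__main__":
    # Made-up toy input; in real use, pass an open file object instead,
    # e.g. `with open(path) as handle: first_records(handle, 3)`.
    example = [">seq1 toy", "ACGTACGT", "ACGT", ">seq2 toy", "GGCC", ">seq3 toy", "TTAA"]
    print(to_fasta(first_records(example, 2)), end="")
```

Because the helpers accept any iterable of lines, the same code works on an open file or on a list of strings, which also makes it easy to paste a fully self-contained example into a question.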