Key Bioinformatics Computer Skills

I’ve been asked several times which computer skills are critical for bioinformatics. Important: I am only addressing the “computer skills” side of things here. This is my list for being a functional, comfortable bioinformatician.

  1. SQL and knowledge of databases. I always recommend that people start with MySQL, because it is cross-platform, very popular, and extremely well developed.
  2. Perl or Python. Preferably Perl. It kills me to write this, because I like Python so much more than Perl, but from a “getting the most useful skills” perspective, I think you have to choose Perl.
  3. Basic Linux. Actually, being at a semi-sysadmin level is even better. I always tell people to go “cold turkey”: just install Linux on their computer and commit to using it exclusively for a while. (Thanks to OpenOffice etc., this should be mostly doable these days.) This will force a person to get comfortable. Learning to use a Mac from the command line is an OK second option, as is Solaris etc. Still, I’d have to say Linux is preferred.
  4. Basic Bash shell scripting. There are still too many cases where this ends up being “just the thing to do”. And of course, this all applies to Mac as well.
  5. Some experience with Java or other “traditional languages”, or a real understanding of modern programming paradigms. This may seem lame or vague. But it is important to understand how traditional programming languages approach problems. At minimum, this ensures some exposure to concepts like object-oriented programming, functional programming, libraries, etc. I know that one can get all of this with Python and, yes, even Perl – but I fear that many bioinformatics people get away without knowing these things, to their detriment.
  6. R + Bioconductor. So many great packages in Bioconductor. Comfort with R can solve a lot of problems quickly. R is only growing; if I could buy stock in R, I would!

This may seem like a lot, but many of these items fit together very well. For example, one could go “cold turkey” and just use Linux and commit to doing bioinformatics with a combination of R, Perl, shell scripting, and an SQL-based database (MySQL). It is very common in bioinformatics to link these pieces, so… not so bad, in the end, I think.
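
To make “linking these pieces” a bit more concrete, here is a minimal sketch of the kind of glue script I have in mind: parse some sequence records with a scripting language, then load summary values into an SQL table and query them. A few loud assumptions: it is written in Python rather than Perl, it uses Python’s built-in sqlite3 module as a stand-in for MySQL so that it runs without a database server, and the table name, column names, and sample sequences are all made up for illustration.

```python
import sqlite3

# Hypothetical FASTA-style input; identifiers and sequences are invented.
FASTA_TEXT = """>seq1
ATGCGC
GGTA
>seq2
TTTTAA
"""


def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_fraction(seq):
    """Return the fraction of G and C bases in a sequence."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def load_sequences(conn, records):
    """Store name, length, and GC fraction for each record in an SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(name TEXT PRIMARY KEY, length INTEGER, gc REAL)"
    )
    conn.executemany(
        "INSERT INTO sequences (name, length, gc) VALUES (?, ?, ?)",
        [(name, len(seq), gc_fraction(seq)) for name, seq in records],
    )
    conn.commit()


# In-memory SQLite database (assumption: stands in for a MySQL server).
conn = sqlite3.connect(":memory:")
load_sequences(conn, parse_fasta(FASTA_TEXT))

rows = conn.execute(
    "SELECT name, length, gc FROM sequences ORDER BY name"
).fetchall()
for row in rows:
    print(row)
```

The same shape (parse, load, query) carries over if you swap in Perl for the parsing or MySQL for the database; from there, the table is something R can read for the statistics.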

As always, comments welcome…

Anti-Lifehacker: Why Lifehacker is probably bad for you

Lifehacker is a website with (mostly) technological solutions for productivity – and it is super-popular.

Lifehacker sounds good – who doesn’t want to improve their productivity or upgrade the way they approach a problem?

But there are deep, if slightly subtle, problems with Lifehacker:

1. Lifehacker ignores the big cost of installing a new piece of software: time and energy.
2. Lifehacker does minimal testing of software – and never does the “I used it daily for 3 months” type of testing.
3. Lifehacker values newness (“newly available!”) over robust, well-tested solutions.
4. Lifehacker does minimal comparative testing: if there are thirty “todo list” applications on the web, I want to know about the best ones – not just the names of all thirty. I really want someone to evaluate things for me.
5. Lifehacker focuses on free software but ignores one of the most important considerations: whether the software is mature, with a significant user base and robust support. Sure, I respect heroic, single-person efforts. But I’d rather have a piece of software with strong, sustained support.

So what’s good about Lifehacker?

1. It’s fun. That is, if you are a certain sort of person, it’s fun.
2. It does provide a snapshot – and repository – of new software developments in the productivity area.
3. It provides exposure for new software. And some of this software is probably great.

For me, I just worry about the time and energy… and the illusion that I am helping my productivity. So personally, I’ll spend my time writing these blog entries instead.