Our new ability to generate large amounts of biological data has produced an abundance of tools for the manipulation and data mining it requires. However, like every new tool, they have limitations, edge cases, and quirks. I've discovered that sometimes the old ways are the best ways: grep, sort, awk, and sed deal very well with large data sets and parallel processing, and I've found that I use them, quite successfully, for 90% of bioinformatics tasks. I will describe the types of problems I am dealing with and show why the shell is the optimal solution. A quick genetics intro, in computer terms, will be given, so no biological knowledge is needed.