You will find all my posts here. If you like what you read, do not hesitate to subscribe to my RSS feed (yes, the icon up there) or share the content.

February 16, 2016

Markdown for flexible reporting

Transparency, multiple devices, multiple operating systems, paper or electronic output… When we write documents, we often need to adapt to different constraints. This is why I think that separating the information from its presentation is an important step that makes you more efficient in the long run. It clearly has no immediate benefit, but over time I really started to see the advantage. Read more

May 27, 2015

Run jobs in parallel with python

How do you efficiently manage multiple jobs on a modern multi-core computer? This is a real problem that had me stuck for a while. I am working on a framework to analyze next-generation sequencing data; consequently, I have numerous files to manage and different types of data that each have to undergo a specific analysis. On the other hand, I have a few cores (24), some RAM (32 GB) and decent hard drives. Read more
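As an illustration of the general idea (not necessarily the approach taken in the full post), here is a minimal sketch that assumes each job is an external command. A thread pool keeps a fixed number of commands running at once; the actual work happens in child processes, so it spreads over the available cores. The names `run_command` and `run_in_parallel` and the placeholder jobs are hypothetical.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(command):
    """Run one external command and return (command, exit code, stdout)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return command, completed.returncode, completed.stdout


def run_in_parallel(commands, max_workers=4):
    """Run commands concurrently, at most max_workers at a time.

    Results come back in the same order as the input commands.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_command, commands))


if __name__ == "__main__":
    # Placeholder jobs: each one is a tiny Python process printing its index.
    jobs = [[sys.executable, "-c", f"print({i})"] for i in range(8)]
    for command, code, out in run_in_parallel(jobs, max_workers=4):
        print(code, out.strip())
```

On a machine like the one described, `max_workers` would be chosen with the core count and the memory needs of each job in mind, since 24 simultaneous jobs may not fit in 32 GB of RAM.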

June 3, 2012

Logging shell commands outputs with python

The analysis of sequencing data (at least in the first steps) is often a matter of starting multiple scripts one after another. I found that logging stdout and stderr with Python was not that obvious to me. I struggled quite a bit with the logging module, but I finally came up with the following: import os import subprocess import logging import logging.handlers import datetime as dt class ShellLaunchAndLog: ''' logging of shell commands output ''' def __init__(self): self. Read more
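The class above is cut off in this excerpt, so the following is not a reconstruction of it. It is only a minimal, independent sketch of the same idea, assuming Python 3.7 or later: run a command, then send its stdout to the log at INFO level and its stderr at ERROR level. The function name `run_and_log` and the logger name `shell` are hypothetical.

```python
import logging
import subprocess
import sys


def run_and_log(command, logger):
    """Run a command, log stdout at INFO and stderr at ERROR, return the exit code."""
    completed = subprocess.run(command, capture_output=True, text=True)
    for line in completed.stdout.splitlines():
        logger.info(line)
    for line in completed.stderr.splitlines():
        logger.error(line)
    return completed.returncode


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    log = logging.getLogger("shell")
    # Placeholder command: a tiny Python process that prints one line.
    run_and_log([sys.executable, "-c", "print('hello')"], log)
```

This version waits for the command to finish before logging anything, which is fine for short scripts; a long-running step would need its output read line by line while it runs.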

April 20, 2012

Multiple text file size: a small intro to find and awk

People are used to just opening a file explorer, selecting multiple files, right-clicking and checking the size. Unfortunately you can’t do that when you work in a terminal through an SSH connection. In bash, one can do the following to get the total size of all the files ending in .txt: find ./ -name "*.txt" -ls | awk '{total += $7} END {print "Total size: " total/1024/1024 " Mb"}' There are other commands to get this done; this is one I like because I can easily specify the file pattern that I am looking for. Read more
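For comparison, the same total can be computed from Python. This is a minimal sketch, assuming only the standard library; the function name `total_size_mib` is hypothetical. Like the one-liner, it walks the directory tree recursively, sums the file sizes in bytes, and divides by 1024 twice.

```python
from pathlib import Path


def total_size_mib(root=".", pattern="*.txt"):
    """Sum the sizes of regular files matching pattern under root, in MiB."""
    total_bytes = sum(
        path.stat().st_size
        for path in Path(root).rglob(pattern)
        if path.is_file()
    )
    return total_bytes / 1024 / 1024
```

Calling `total_size_mib(".")` covers the current directory, and the `pattern` argument plays the role of the `-name` option of find.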