You will find all my posts here. If you like what you read, do not hesitate to subscribe to my RSS feed (yes, the icon up there) or share the content.


What is deep-learning?

A kid-friendly answer

Photo by Vlad Tchompalov on Unsplash

Deep-learning and many applications of artificial intelligence are all about neural networks. What are they? How do they work?

What are neural networks?

Neural networks are special computer programs. They can learn and tell apart things they have seen before. For example, when we see an animal, we know that it is a dog because we have seen many dogs before. Neural networks work in a similar manner. They were created to copy the way our heads work. Read more

Kaggle competition: pseudo-labelling efficiency in regression tasks

I needed some practice on regression problems for a project. Luckily, I found out that the Kaggle Mercedes-Benz competition, which aims to develop a model that predicts how long a car in the manufacturing process stays on the test bench, had just started. I am starting to get a much better grasp of models and how they work, one by one. With the aim of learning new approaches and tricks, I tried several published kernels, among them one from Hakeem that showed me a new way to stack models and go beyond: Read more

February 16, 2016

Markdown for flexible reporting

Transparency, multiple devices, multiple operating systems, paper or electronic formats… When we write documents, we often need to adjust to different constraints. This is why I think that separating information from its presentation is an important step that makes you more efficient in the long run. It clearly has no immediate benefit, but over time I really started to see the advantage. Read more

May 27, 2015

Run jobs in parallel with python

How do you efficiently manage multiple jobs across modern computer cores? This is a real problem that had me stuck for a while. I am working on a framework to analyze next-generation sequencing data; consequently, I have numerous files to manage and different types of data that each have to undergo a specific analysis. On the other hand, I have a few cores (24), some RAM (32 GB) and decent hard drives. Let's combine everything so that many files can be analyzed on multiple cores at the same time. Of note, I mostly work with bacterial genomes, so my computing times might seem light compared to those for a human genome. A good example (not too simple, not too complex?) would be: how do I convert 10 SAM files to 10 BAM files IN PARALLEL? Read more
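To give a rough idea of what the question above involves, here is a minimal sketch and not necessarily the approach taken in the full post. It assumes `samtools` is installed and on the PATH, and that the input files are named `sample_0.sam` to `sample_9.sam`; those file names and the helper names `sam_to_bam_command`, `run_command` and `run_in_parallel` are made up for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def sam_to_bam_command(sam_path):
    """Build the samtools command line for one file (assumes samtools is on PATH)."""
    bam_path = Path(sam_path).with_suffix(".bam")
    return ["samtools", "view", "-b", "-o", str(bam_path), str(sam_path)]


def run_command(command):
    """Run one external command and return (command, exit code)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return command, completed.returncode


def run_in_parallel(commands, max_workers=4):
    """Run the commands concurrently.

    Threads are enough here: the heavy work happens in the child
    processes, so each thread only waits for its command to finish.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns the results in the same order as the inputs
        return list(pool.map(run_command, commands))


if __name__ == "__main__":
    # Hypothetical input files, one conversion job per file
    sam_files = [f"sample_{i}.sam" for i in range(10)]
    commands = [sam_to_bam_command(path) for path in sam_files]
    for command, code in run_in_parallel(commands, max_workers=10):
        print(code, " ".join(command))
```

With 24 cores available, `max_workers=10` lets all ten conversions run at once; a non-zero exit code in the printed output points to a file that failed to convert.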

June 3, 2012

Logging shell commands outputs with python

The analysis of sequencing data (at least in the first steps) is often a matter of starting multiple scripts one after another. I found that logging the stdout and the stderr with python was not that obvious to me. I struggled quite a bit with the logging module, but I finally came up with the following.

    import os
    import subprocess
    import logging
    import logging.handlers
    import datetime as dt


    class ShellLaunchAndLog:
        ''' logging of shell commands output '''

        def __init__(self):
            self._logFolder = '/tmp'

        def PopenLog(self, commandL, logFile):
            ''' Method to ensure shell scripts launch + logging '''
            # universal_newlines=True returns the output as text rather than bytes
            p = subprocess.Popen(commandL, shell=False, stdout=subprocess.PIPE,
                                 stderr=subprocess.PIPE, universal_newlines=True)
            (stdout, stderr) = p.communicate()
            # create logger
            logger = logging.getLogger('sequence conversion')
            logger.setLevel(logging.DEBUG)
            # create file handler which logs even debug messages
            fh = logging.handlers.RotatingFileHandler(logFile, mode='a', maxBytes=0,
                                                      backupCount=0, encoding=None, delay=0)
            fh.setLevel(logging.DEBUG)
            # create formatter and add it to the handlers
            formatR = logging.Formatter('%(asctime)s-%(levelname)s : %(message)s')
            fh.setFormatter(formatR)
            logger.addHandler(fh)
            logger.info(stdout)
            logger.error(stderr)
            # detach the handler so repeated calls do not write duplicate lines
            logger.removeHandler(fh)

        def FolderListing(self):
            ''' lists the files contained in the current folder and logs
            the output into a file. '''
            print(dt.datetime.now().strftime("%Y-%m-%d %H:%M") + ' Script started !')
            self.PopenLog(['ls', '-al'], os.path.join(self._logFolder, 'ls.log'))
            print(dt.datetime.now().strftime("%Y-%m-%d %H:%M") + ' Script finished !')

Using this class in an interactive python session allows you to log the results (and the errors) of 'ls -al' in a file named ls.log in the /tmp folder. Read more

Copyright 2025 - Mikael Koutero. All rights reserved.
