Pseudothreading with BASH

January 20, 2010 by

It’s more of a trick than real threading, but it beats repeating the same operation one sample at a time. Some explanations:

TH_NUM=`ps aux | grep Python | grep -v "grep" | wc -l`

  • TH_MAX is the maximum number of “threads” (really background processes) that can be executed at the same time; TH_NUM is the number currently running.
  • The first grep selects the processes whose command line contains “Python” (you can change this pattern, it depends on your script).
  • The second grep excludes the grep process itself, which would otherwise match its own pattern 😉
  • wc counts the number of lines. The first time, the result of the pipe is empty, so wc gives “0” as the result.

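#!/bin/bash
# Maximum number of analysis processes allowed to run at the same time
TH_MAX=10
for sample in `ls ./data`
do
    while true; do
        # Count the matching processes that are currently running
        TH_NUM=`ps aux | grep Python | grep -v "grep" | wc -l`
        # -lt (not -le), so that at most TH_MAX processes run at once
        if [ "$TH_NUM" -lt "$TH_MAX" ]
        then
            # Free slot: start the analysis in the background
            ./analyze_sample.py -s "${sample}" > /dev/null &
            echo -en " ${sample} "
            break
        else
            # All slots busy: print a progress dot and poll again
            echo -en "."
            sleep 1
        fi
    done
done
# Wait for the background jobs that are still running
wait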
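
For comparison, here is a rough sketch of the same throttling idea in Python. It is not part of the original trick. It tracks the child processes it started instead of grepping the output of ps, so unrelated Python processes on the machine are not counted. The names run_throttled, max_running and poll_interval are made up for this illustration, and the command in the demo is only a stand-in for ./analyze_sample.py.

```python
import subprocess
import sys
import time


def _reap(running, exit_codes):
    """Remove finished children from `running` and record their exit codes."""
    for index in list(running):
        code = running[index].poll()
        if code is not None:
            exit_codes[index] = code
            del running[index]


def run_throttled(commands, max_running=10, poll_interval=0.1):
    """Run each command as a child process, at most max_running at a time.

    Returns (exit_codes, peak): the exit codes in input order and the
    highest number of children that were alive at the same moment.
    """
    running = {}                          # position in `commands` -> Popen
    exit_codes = [None] * len(commands)
    peak = 0
    for index, command in enumerate(commands):
        # Same role as the while loop in the bash script: wait for a free slot.
        while len(running) >= max_running:
            _reap(running, exit_codes)
            if len(running) >= max_running:
                time.sleep(poll_interval)
        running[index] = subprocess.Popen(command, stdout=subprocess.DEVNULL)
        peak = max(peak, len(running))
    # Wait for the children that are still running at the end.
    for index, proc in running.items():
        exit_codes[index] = proc.wait()
    return exit_codes, peak


if __name__ == "__main__":
    # Stand-in workload (an assumption): each "sample" just sleeps briefly.
    demo = [[sys.executable, "-c", "import time; time.sleep(0.2)"]
            for _ in range(5)]
    print(run_throttled(demo, max_running=2))
```

Polling with a short sleep mirrors the bash loop. A process pool from the standard library would probably be the more usual choice in real code.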

Decode command line arguments given to a BASH script

August 25, 2009 by

Try the elegant way, at last!

For instance, imagine that your programme has the following arguments:
./myprog -s /source/directory -d /dest/directory -c deep

Then, the script myprog should contain a piece of code like this:


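# Print a usage message if no arguments were given
if [ $# -eq 0 ] ; then
    echo "Usage: $0 -s <source_dir> -d <dest_dir> -c <copy_mode>"
    exit 1
fi
# Consume the arguments pairwise: an option followed by its value
while [ $# -gt 1 ] ; do
    case "$1" in
        -s) source_dir="$2" ; shift 2 ;;
        -d) dest_dir="$2" ; shift 2 ;;
        -c) copy_mode="$2" ; shift 2 ;;
        # Unknown argument: skip it
        *) shift 1 ;;
    esac
done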

Of course, this is only a hint. Just tailor the decoding according to your needs.

RPy – simple and efficient access to R from Python

July 28, 2009 by

RPy is an interface that allows you to call R functions and handle R objects in Python.

R language

Using mutable objects as default parameter in Python

July 22, 2009 by

We just came across a weird Python behaviour (of course only weird if you don’t know why). If you use a mutable object such as a list as the default value of a parameter in a function declaration, you might end up with some unintended behaviour. Every time the function modifies the object (e.g. by appending to the list), the default value is in effect modified as well, and the change is visible in later calls. This is also explained in the Python documentation:

Default parameter values are evaluated when the function definition is executed. This means that the expression is evaluated once, when the function is defined, and that that same “pre-computed” value is used for each call. This is especially important to understand when a default parameter is a mutable object, such as a list or a dictionary: if the function modifies the object (e.g. by appending an item to a list), the default value is in effect modified. This is generally not what was intended. A way around this is to use None as the default, and explicitly test for it in the body of the function, …

Find the whole thing here.
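
A minimal sketch of the behaviour described in the quote, together with the None workaround it suggests. The function names append_shared and append_fresh are invented for this example.

```python
def append_shared(item, bucket=[]):
    """Pitfall: the default list is created once, when the def is executed."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Workaround from the documentation: default to None and test for it."""
    if bucket is None:
        bucket = []          # a new list for every call without a bucket
    bucket.append(item)
    return bucket


first = append_shared(1)
second = append_shared(2)
print(first is second)       # both names refer to the one default list
print(second)                # it now holds both items

print(append_fresh(1))       # a fresh list each time
print(append_fresh(2))
```

The shared list lives on the function object itself (it can be inspected through append_shared.__defaults__), which is why it survives between calls.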

A Tutorial on Independent Component Analysis

July 21, 2009 by

can be found here.

pthreads – Some useful links and solution for maximum thread number

July 21, 2009 by

Manual Reference Pages (including a list of functions that are not thread-safe)

Tutorial on pthreads

And if you wonder why you cannot create more than X threads on your system (for me this was always 382), this forum provides a solution.

Basically, the problem is that each thread created reserves space for its own stack. On my system, the default thread stack size is 8MB, so after 382 threads I simply run out of space.

Solution: Change the stack size to a smaller value, unless you really need 8MB.

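#include <limits.h>   /* PTHREAD_STACK_MIN */
#include <pthread.h>

pthread_attr_t tattr;
size_t size;
/* WHAT_ELSE_YOU_NEED: extra stack bytes your thread function requires */
size = PTHREAD_STACK_MIN + WHAT_ELSE_YOU_NEED;
pthread_attr_init(&tattr);
pthread_attr_setstacksize(&tattr, size);
/* then pass &tattr as the second argument of pthread_create() */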
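
To see why a smaller stack helps, here is a back-of-the-envelope calculation in Python. It uses only the two figures from above (382 threads, 8MB each). The "implied budget" is an assumption derived from those figures, not a measured limit, and the 64 KiB stack size is a hypothetical value chosen for illustration. Real limits also depend on what else the process has mapped and on system settings.

```python
MIB = 1024 * 1024


def max_threads(budget_bytes, stack_bytes):
    """How many thread stacks of stack_bytes fit into budget_bytes."""
    return budget_bytes // stack_bytes


# Figures from the post: 382 threads at the default stack size of 8 MB.
observed_threads = 382
default_stack = 8 * MIB

# Assumption: the space those stacks occupied is the whole budget.
implied_budget = observed_threads * default_stack

# Hypothetical smaller stack size, for illustration only.
small_stack = 64 * 1024

print(implied_budget // MIB)                      # 3056 (MiB, just under 3 GiB)
print(max_threads(implied_budget, small_stack))   # 48896
```

Under these assumptions, shrinking the stack from 8MB to 64 KiB raises the ceiling by a factor of 128. Whether a thread can really live with such a small stack depends on what it does.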

Another interesting page on pthreads is this.

Lectures about Gödel Escher and Bach

July 7, 2009 by

Here, they’re free and they look nice 🙂

UNIX Network Programming by Stevens

July 6, 2009 by

UNIX Network Programming Volume 1, Third Edition: The Sockets Networking API
by W. Richard Stevens; Bill Fenner; Andrew M. Rudoff

can be found online at Safari Books.

Update: I just realised that it is not the complete version, only previews. Sorry!

Profiling Code Using clock_gettime

July 6, 2009 by

Find a good explanation here written by Guy Rutenberg.

Set Thunderbird to use Gmail’s Trash folder

March 31, 2009 by

Sorting all my mail the other day, I realised that Gmail still keeps a copy of every email in the “All Mail” folder. Messages are eventually deleted only if they are moved to Gmail’s “Trash” folder. This cannot be set in Thunderbird directly afaik. Here is what you have to do.