How to install BioPerl on Cygwin and Mac OS X

Hi all!

Here are the instructions for installing BioPerl under Cygwin.

Instructions for Mac OS X or Unix follow them; scroll to the end.

In case of problems, please mail me at rarnold@kimlab.org!

Prerequisite: find a fast Internet connection and take some time off from other duties. The installation takes about 30-60 minutes,

but it is worth it: BioPerl can solve many problems for you in the future!

I suggest getting a cup of coffee or tea first....

To install BioPerl in Cygwin, the following steps must be done:

Achtung (Attention)!

For Windows XP, the following steps must be done first:

    1. Close all Cygwin windows.

    2. Open Start->Run

    3. rebaseall has to be run from an ash shell, so type C:\path\to\cygwin\bin\ash.exe (the Cygwin directory is probably C:\cygwin\)

    4. Once the shell window is open: $ cd /bin

    5. $ ./rebaseall

    6. $ exit to close the window

For all Windows versions, proceed as follows:

1. We need the following packages installed in Cygwin:

perl, make, binutils, and gcc

We install missing ones by re-running Cygwin's setup.exe (if you cannot find it anymore, simply download Cygwin again from the Internet).

Use the search function in the package selection dialog.

Choose a package by clicking in the 'New' column and activating the 'Bin' checkbox:

perl: Larry Wall's ....

make: the GNU version of the 'make' utility

binutils: binutils, the GNU assembler, linker and binary utilities

gcc: gcc-core

Then click Next...
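Before continuing, it may help to confirm from a Cygwin terminal that the four packages really ended up on the PATH. The following is only a minimal sketch: the function name missing_tools is made up for this example, and using 'ld' as the stand-in for the binutils package is an assumption.

```shell
#!/bin/sh
# Print the name of every tool that cannot be found on the PATH.
# Prints nothing if all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# 'ld' is used here as a representative of binutils (assumption of this sketch).
missing=$(missing_tools perl make gcc ld)
if [ -n "$missing" ]; then
    echo "Missing, re-run setup.exe and select:" $missing
else
    echo "All required tools found."
fi
```

If anything is reported as missing, re-run setup.exe and select the corresponding package.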

2. Open a Cygwin terminal

and type

perl -MCPAN -e shell

then type

o conf prerequisites_policy follow

o conf commit

install Bundle::CPAN

This installs the installation utility for further Perl add-ons (called modules; BioPerl is one of them).

When the installation is finished (nothing more happens in the console), run the following command:

install Module::Build
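To see whether a module was actually installed, one option is to ask Perl to load it from a normal (non-CPAN) terminal. This is a rough sketch, not part of the original procedure; the helper name have_module is hypothetical.

```shell
#!/bin/sh
# Succeeds (exit status 0) if Perl can load the named module, fails otherwise.
have_module() {
    perl -e 'eval "require $ARGV[0]; 1" or exit 1' "$1" 2>/dev/null
}

if have_module Module::Build; then
    echo "Module::Build is installed"
else
    echo "Module::Build is not installed yet"
fi
```

The same helper can be pointed at any other module name, for example CPAN.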

3. Now we install the current version of BioPerl.

Type:

d /bioperl/

and you will get a list of the available versions.

We use this one:

'CJFIELDS/BioPerl-1.6.1.tar.gz'

and install it with

install CJFIELDS/BioPerl-1.6.1.tar.gz

If you encounter problems with the firewall, allow FTP access.

If the process asks

Install [a]ll optional external modules, [n]one, or choose [i]nteractively? [n]

answer with 'n'

(to answer, type n and then press the Return key).

If it asks

Install [a]ll BioPerl scripts, [n]one, or choose groups [i]nteractively? [a]

answer with 'a'.

Then it will ask

Do you want to run tests that require connection to servers across the internet

answer with 'n'.

If the process asks for a decision like 'Install [a]ll optional external modules', you can also simply press Return (this selects 'n', the default shown in square brackets).

Now it takes a while. Don't be confused if error messages appear at first or the program seems to do nothing for a while....

The process is finished when output like this appears:

Writing /usr/lib/perl5/site_perl/5.10/i686-cygwin/auto/Bio/.packlist

Will try to install symlinks to /usr/local/bin

CJFIELDS/BioPerl-1.6.1.tar.gz

./Build install -- OK

If so, proceed:

type q

to quit the CPAN shell (the Perl update program).
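Back at the normal terminal prompt, a quick way to check the result is to ask Perl for the installed BioPerl version through the Bio::Root::Version module. This is a small sketch; the wrapper name bioperl_version is made up for this example.

```shell
#!/bin/sh
# Print the installed BioPerl version, or a note if BioPerl cannot be loaded.
bioperl_version() {
    perl -MBio::Root::Version -e 'print "$Bio::Root::Version::VERSION\n"' 2>/dev/null \
        || echo "BioPerl not found"
}

bioperl_version
```

If this prints "BioPerl not found", the install step above most likely did not complete.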

You might now create a test Perl program like this:

#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::Fasta;

# usage: perl test.pl <fasta file> <sequence id>
my $file = $ARGV[0];
my $id   = $ARGV[1];

my $db        = Bio::DB::Fasta->new($file);  # one file or many files
my $seqstring = $db->seq($id);               # get a sequence as string
my $seqobj    = $db->get_Seq_by_id($id);     # get a PrimarySeq obj
my $desc      = $db->header($id);            # get the header, or description line

print "$seqstring\n";

This reads a multi-FASTA file given as the first command-line argument and prints the sequence of the entry given as the second argument.
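To try the program without downloading anything, a tiny multi-FASTA file is enough. The sketch below writes one with two made-up sequences; the file names demo.fa and test.pl are assumptions chosen for this example.

```shell
#!/bin/sh
# Write a two-entry multi-FASTA file (made-up sequences) for a quick test.
cat > demo.fa <<'EOF'
>seq1 first demo entry
ACGTACGTACGT
>seq2 second demo entry
TTGGCCAATTGG
EOF

# Run the test program only if it has been saved as test.pl (assumed name).
if [ -f test.pl ]; then
    perl test.pl demo.fa seq2
else
    echo "save the program above as test.pl, then run: perl test.pl demo.fa seq2"
fi
```

This should print the sequence stored under seq2. Bio::DB::Fasta writes an index file next to the FASTA file, so the directory needs to be writable.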

Instructions for installing BioPerl 1.6.1 on Mac OS X or Unix:

Open a shell and type gcc. On Mac OS X, if the shell reports that the command cannot be found, you must first install Xcode from Apple.

Download BioPerl 1.6.1 (BioPerl-1.6.1.tar.gz; the CPAN shell lists it as CJFIELDS/BioPerl-1.6.1.tar.gz).

Then follow the instructions below.

This is an interactive installation, so if at some point it prompts you to download something from CPAN, respond with yes.

It takes a while and there will be a lot of output.

INSTALLING BIOPERL THE EASY WAY USING Build.PL

The advantage of this approach is that it is stepwise, so it's easy to stop and analyze in case of any problem.

Download, then unpack the tar file. For example:

>tar xvfz BioPerl-1.6.1.tar.gz

>cd BioPerl-1.6.1

Now issue the build commands:

>sudo perl Build.PL

>./Build test

If you've installed everything perfectly and all the network connections are working, then you may pass all the tests run in the './Build test' phase. It's also possible that you may fail some tests. Possible explanations: problems with the local Perl installation, network problems, a previously undetected bug in BioPerl, a flawed test script, problems with a CGI script used for sequence retrieval at a public database, and so on. Remember that there are over 900 modules in BioPerl and the test suite runs more than 12000 individual tests; a few failed tests may not affect your usage of BioPerl.

If you decide that the failed tests will not affect how you intend to use BioPerl and you'd like to install anyway, or if all tests were fine, do:

>sudo ./Build install

This is what most experienced BioPerl users would do. However, if you're concerned about a failed test and need assistance or advice, then contact bioperl-l@bioperl.org. (You could include the detailed results of the failed test(s): see the section 'THE TEST SYSTEM' in the INSTALL file of the BioPerl distribution for information on how to generate such results.)

To './Build install' you need write permission in the perl5/site_perl/source area (or similar, depending on your environment). Usually this will require you becoming root, so you will want to talk to your systems manager if you don't have the necessary privileges.
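Whether root is needed can be checked up front by asking Perl where it installs site modules and testing that directory for write access. This is a sketch only; the helper name dir_status is made up, and the directory Perl reports will differ between systems.

```shell
#!/bin/sh
# Print whether the given directory can be written to by the current user.
dir_status() {
    if [ -w "$1" ]; then
        echo "writable"
    else
        echo "not writable"
    fi
}

# installsitelib is the directory Perl uses for site-wide module installs.
site_dir=$(perl -MConfig -e 'print $Config{installsitelib}')
echo "$site_dir: $(dir_status "$site_dir")"
```

If the directory is reported as not writable, the install step needs sudo or help from your systems manager.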

It is also straightforward to install the package outside of this standard Perl5 location. See the section INSTALLING BIOPERL IN A PERSONAL MODULE AREA in the INSTALL file of the BioPerl distribution.
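The build commands from this section can also be collected into a small script that stops at the first failing step. The sketch below is a dry run by default, meaning it only prints the commands; the names step, build_bioperl and DRY_RUN are invented for this example.

```shell
#!/bin/sh
# Dry run by default: each step is printed instead of executed.
# Set DRY_RUN=0 in the environment to really run the commands.
DRY_RUN=${DRY_RUN:-1}

# Print a command and, unless this is a dry run, execute it.
step() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# The commands from this section, chained so that a failure stops the run.
build_bioperl() {
    step tar xvfz BioPerl-1.6.1.tar.gz &&
    step cd BioPerl-1.6.1 &&
    step sudo perl Build.PL &&
    step ./Build test &&
    step sudo ./Build install
}

build_bioperl
```

Note that with DRY_RUN=0 a failed test stops the chain before the install step. If you decide the failed tests do not matter for your use, run the install command by hand afterwards.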

You can now use the same test Perl program as shown at the end of the Cygwin instructions above; save it as test.pl.

Try downloading a proteome from Ensembl, such as Homo_sapiens.GRCh37.56.pep.all.fa, and query for a sequence using its Ensembl ID like this:

perl test.pl Homo_sapiens.GRCh37.56.pep.all.fa ENSP00000375979
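If a query returns nothing, it can help to list the IDs the file actually contains. The sketch below assumes that the ID is the first word after the '>' on each header line, which is how Bio::DB::Fasta picks IDs by default; the function name fasta_ids is made up for this example.

```shell
#!/bin/sh
# Print the ID (first word after '>') of every entry in a FASTA file.
fasta_ids() {
    sed -n 's/^>\([^[:space:]]*\).*/\1/p' "$1"
}

# Demo on a tiny made-up file; prints idA and idB on separate lines.
tmp=$(mktemp)
printf '>idA some description\nMKV\n>idB\nMLL\n' > "$tmp"
fasta_ids "$tmp"
rm -f "$tmp"
```

For the proteome above, fasta_ids Homo_sapiens.GRCh37.56.pep.all.fa | grep ENSP00000375979 shows whether the ID is present.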