How to install BioPerl on Cygwin and Mac OS X

Hi all!

Here are the instructions for installing BioPerl under Cygwin.

Instructions for Mac OS X or Unix follow after them; scroll to the end.

In case of problems, please mail me at rarnold@kimlab.org!

Prerequisite: find a fast Internet connection and take some time off from other duties. The installation takes about 30-60 minutes,
but it is worth it: BioPerl can solve many problems for you in the future!



I suggest getting a cup of coffee or tea before you start.


To install BioPerl in Cygwin, follow these steps:


Attention!

On Windows XP, the following steps must be done first:

   
  1. Close all cygwin windows.
  2. Open Start->Run
  3. rebase has to be run from an ash shell, so type C:\path\to\cygwin\bin\ash.exe  (the Cygwin directory is probably C:\cygwin)
  4. Once the shell window is open: $ cd /bin
  5. $ ./rebaseall
  6. $ exit to close the window



For all Windows versions, proceed as follows:

1. We need the following packages installed in Cygwin:



perl, make, binutils, and gcc

Install any missing ones by re-running Cygwin's setup.exe (if you cannot find it anymore, simply download Cygwin again from the Internet).

Use the search function in the package selection dialog.

Choose each package by clicking in the 'New' column and ticking its 'Bin' checkbox:

perl: Larry Wall's ....
make: the GNU version of the 'make' utility
binutils: binutils, the GNU assembler, linker and binary utilities
gcc: gcc-core

Click 'Next' to install the selected packages, then go on to step 2.
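Once setup has finished, you can check from a Cygwin terminal that the tools are really there. This is only a small sketch: the function name missing_tools is made up for this example, and 'ld' (the linker) is used as a stand-in for the binutils package.

```shell
# Print every tool from the argument list that is not found on the PATH.
# (missing_tools is a hypothetical helper, not part of Cygwin or BioPerl.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# 'ld' stands in for binutils here (assumption: the linker is what we need).
missing=$(missing_tools perl make ld gcc)
if [ -z "$missing" ]; then
    echo "all build tools found"
else
    echo "missing:" $missing
fi
```

If anything is reported as missing, run setup.exe again and select the corresponding package.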

2. Open a Cygwin terminal.

Type

    perl -MCPAN -e shell

Then, at the CPAN shell prompt, type

    o conf prerequisites_policy follow
    o conf commit
    install Bundle::CPAN


This installs the installation utility for further Perl add-ons (called modules; BioPerl is one of them).

When the installation is finished (no more output appears in the console), run the following command:

   
    install Module::Build


3. Install BioPerl

Now we install the current version of BioPerl.
Type:
    d /bioperl/

You will get a list of the available versions.

We use this one:

'CJFIELDS/BioPerl-1.6.1.tar.gz'


Install it with

    install CJFIELDS/BioPerl-1.6.1.tar.gz

If you run into problems with your firewall, allow FTP access.

If the process asks

Install [a]ll optional external modules, [n]one, or choose [i]nteractively? [n]

answer with 'n'.

(To answer, type n and then press the Return key.)

If it asks

Install [a]ll BioPerl scripts, [n]one, or choose groups [i]nteractively? [a]

answer with 'a'.

Then it will ask

Do you want to run tests that require connection to servers across the internet

Answer with 'n'.

Whenever the process asks for a decision such as 'Install [a]ll optional external modules', you can also just press Return; this selects the default shown in square brackets (here 'n').


Now it takes a while. Don't be confused if error messages appear at first or the program seems to do nothing for a while.

The process is finished when you see output like this:


Writing /usr/lib/perl5/site_perl/5.10/i686-cygwin/auto/Bio/.packlist
Will try to install symlinks to /usr/local/bin
  CJFIELDS/BioPerl-1.6.1.tar.gz
  ./Build install  -- OK


If so, proceed:


Type q

to quit the CPAN shell (the Perl update program).
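Back at the normal terminal prompt, a quick way to see whether the installation worked is to ask perl to load one of the BioPerl modules. This is a minimal sketch; has_perl_module is a helper name invented for this example.

```shell
# Succeed if perl can load the given module, fail otherwise.
# (has_perl_module is a hypothetical helper name.)
has_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

if has_perl_module Bio::DB::Fasta; then
    echo "Bio::DB::Fasta loads, BioPerl seems to be installed"
else
    echo "Bio::DB::Fasta not found, the installation probably failed"
fi
```

The same check works for any other module, for example Module::Build.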

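You can now create a test Perl program (saved, for example, as test.pl) like this:

#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::Fasta;

# Usage: perl test.pl <fasta file> <sequence id>
my ($file, $id) = @ARGV;
die "Usage: $0 <fasta file> <sequence id>\n" unless defined $id;

my $db        = Bio::DB::Fasta->new($file);  # one file or many files
my $seqstring = $db->seq($id);               # get a sequence as string
die "No sequence with id '$id' in $file\n" unless defined $seqstring;
my $seqobj    = $db->get_Seq_by_id($id);     # get a PrimarySeq obj
my $desc      = $db->header($id);            # get the header, or description line
print "$seqstring\n";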

This reads a multi-FASTA file given as the first command-line argument and prints the sequence of the entry whose ID is given as the second argument.

Instructions for installing BioPerl 1.6.1 on Mac OS X or Unix:

Open a shell and type gcc.  If you get a "command not found" error, you must install Xcode from Apple first. (A complaint about missing input files is fine; it means gcc is installed.)

Download the BioPerl 1.6.1 tarball (BioPerl-1.6.1.tar.gz).

Then follow the instructions below.

This is an interactive installation, so if at some point the installation asks whether to download something from CPAN, respond with yes.

It takes a while and there will be a lot of output.

INSTALLING BIOPERL THE EASY WAY USING Build.PL

The advantage of this approach is it's stepwise, so it's easy to stop and analyze in case of any problem.

Download, then unpack the tar file. For example:

 >tar xvfz BioPerl-1.6.1.tar.gz
 >cd BioPerl-1.6.1

Now issue the build commands:

 >sudo perl Build.PL
 >./Build test

If you've installed everything perfectly and all the network
connections are working then you may pass all the tests run in the
'./Build test' phase. It's also possible that you may fail some tests.
Possible explanations: problems with the local Perl installation, network
problems, a previously undetected bug in BioPerl, a flawed test script,
problems with a CGI script used for sequence retrieval at a public
database, and so on. Remember that there are over 900 modules in
BioPerl and the test suite runs more than 12000 individual
tests; a few failed tests may not affect your usage of BioPerl.

If you decide that the failed tests will not affect how you intend to
use BioPerl and you'd like to install anyway, or if all tests were
fine, do:

 >sudo ./Build install

This is what most experienced BioPerl users would do. However, if
you're concerned about a failed test and need assistance or advice
then contact bioperl-l@bioperl.org. (You could provide the detailed
results of the failed test(s): see the section 'THE TEST SYSTEM' in the
INSTALL file of the BioPerl distribution for information on how to
generate such results.)

To './Build install' you need write permission in the
perl5/site_perl/source area (or similar, depending on your
environment). Usually this will require you becoming root, so you will
want to talk to your systems manager if you don't have the necessary
privileges.

It is also straightforward to install the package outside of this
standard Perl5 location. See the section INSTALLING BIOPERL IN A
PERSONAL MODULE AREA in the INSTALL file of the BioPerl distribution.
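As a rough sketch of what such a personal-area install could look like: the following is an assumption based on Module::Build's --install_base option, not something taken from the instructions above, and $HOME/bioperl is a made-up target directory.

```shell
# Assumption: $HOME/bioperl is a hypothetical target; any directory you
# can write to will do.
INSTALL_BASE="$HOME/bioperl"

# Only run the build when we are inside the unpacked BioPerl directory.
if [ -f Build.PL ]; then
    perl Build.PL --install_base "$INSTALL_BASE"
    ./Build test
    ./Build install    # no sudo needed, the target is in your home directory
fi

# With install_base, Module::Build puts the modules under <base>/lib/perl5,
# so perl has to be told to look there.
PERL5LIB="$INSTALL_BASE/lib/perl5${PERL5LIB:+:$PERL5LIB}"
export PERL5LIB
echo "PERL5LIB=$PERL5LIB"
```

To make the setting permanent, the two PERL5LIB lines would go into your shell startup file.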

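You can now create a test Perl program (saved, for example, as test.pl) like this:

#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::Fasta;

# Usage: perl test.pl <fasta file> <sequence id>
my ($file, $id) = @ARGV;
die "Usage: $0 <fasta file> <sequence id>\n" unless defined $id;

my $db        = Bio::DB::Fasta->new($file);  # one file or many files
my $seqstring = $db->seq($id);               # get a sequence as string
die "No sequence with id '$id' in $file\n" unless defined $seqstring;
my $seqobj    = $db->get_Seq_by_id($id);     # get a PrimarySeq obj
my $desc      = $db->header($id);            # get the header, or description line
print "$seqstring\n";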

This reads a multi-FASTA file given as the first command-line argument and prints the sequence of the entry whose ID is given as the second argument.
Try downloading a proteome from Ensembl, such as Homo_sapiens.GRCh37.56.pep.all.fa, and query for a sequence using its Ensembl ID like this:

perl test.pl Homo_sapiens.GRCh37.56.pep.all.fa ENSP00000375979
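If you do not want to download a whole proteome just for a first test, a tiny hand-made FASTA file is enough. The file name tiny.fa, the IDs seq1 and seq2, and the sequences below are all invented for this sketch.

```shell
# Write a two-record FASTA file with made-up protein sequences.
cat > tiny.fa <<'EOF'
>seq1 first made-up test record
MKTAYIAKQRQISFVKSHFS
>seq2 second made-up test record
MSDNELQALLKEHGIDVSAE
EOF

# Count the records: every FASTA header line starts with '>'.
grep -c '^>' tiny.fa    # prints 2
```

Running perl test.pl tiny.fa seq2 should then print the second sequence. Bio::DB::Fasta writes an index file next to the FASTA file on first use, so the directory has to be writable.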
