Hi all!
Here are the instructions to install Bioperl under Cygwin.
Instructions for Mac OS X or Unix follow after that; scroll to the end.
In case of problems, please mail me at rarnold@kimlab.org !
Prerequisite: find a fast Internet connection and take some time off from other duties. The installation takes about 30-60 minutes,
but it is worth it: BioPerl can solve many problems for you in the future!
I suggest getting a cup of coffee or tea beforehand.
To install BioPerl in Cygwin, the following steps must be done.
Achtung (Attention)!
On Windows XP, the following must be done first:
Close all cygwin windows.
Open Start->Run
rebase has to be run from an ash shell, so type C:\path\to\cygwin\bin\ash.exe (the Cygwin directory is probably C:\cygwin)
Once the shell window is open:
$ cd /bin
$ ./rebaseall
$ exit
This closes the window.
For all Windows versions, proceed as follows:
1. We need the following packages installed in Cygwin:
perl, make, binutils, and gcc
Install any missing ones by re-running Cygwin's setup.exe (if you cannot find it anymore, simply download Cygwin again from the internet).
Use the search function in the package selection dialog.
Choose each package by clicking in the 'New' column and activating the 'Bin' checkbox:
perl: Larry Wall's ....
make: the GNU version of the 'make' utility
binutils: binutils, the GNU assembler, linker and binary utilities
gcc: gcc-core
Then go to Next...
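Before moving on, it can be worth confirming from a Cygwin terminal that the tools are actually on the PATH. The following is a minimal sketch and not part of the original instructions: the helper name missing_tools is made up for illustration, and ld is used as a stand-in for the binutils package because it is one of the programs binutils provides.

```shell
#!/bin/sh
# Hypothetical helper: print every tool from the argument list that is
# not found on the PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# perl, make and gcc come from the packages listed above (gcc from gcc-core);
# ld is assumed here as a representative binutils program.
missing=$(missing_tools perl make gcc ld)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing, re-run setup.exe to install:" $missing
else
    echo "All required tools found."
fi
```

If anything is reported as missing, re-run setup.exe and select the corresponding package.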
2. Open a Cygwin terminal and
type
perl -MCPAN -e shell
then type
o conf prerequisites_policy follow
o conf commit
install Bundle::CPAN
This installs the installation utility for further Perl add-ons (called modules; BioPerl is one of them).
When the installation is finished (nothing more happens in the console), run the following command:
install Module::Build
3. Now we install the current version of BioPerl.
type:
d /bioperl/
and you will get a list of different versions.
We use this one:
'CJFIELDS/BioPerl-1.6.1.tar.gz'
and install it by
install CJFIELDS/BioPerl-1.6.1.tar.gz
If you encounter problems with the firewall, allow FTP access.
If the process asks
Install [a]ll optional external modules, [n]one, or choose [i]nteractively? [n]
answer with 'n'
(to answer, type n and then press the Return key).
If it asks
Install [a]ll BioPerl scripts, [n]one, or choose groups [i]nteractively? [a]
answer with 'a'.
Then it will ask
Do you want to run tests that require connection to servers across the internet
answer with 'n'.
If the process again asks for a decision like 'Install [a]ll optional external modules', press Return (this selects 'n').
Now it takes a while. Don't be confused if error messages appear at first or the program seems to do nothing for a while....
The process is finished when you see output like this:
Writing /usr/lib/perl5/site_perl/5.10/i686-cygwin/auto/Bio/.packlist
Will try to install symlinks to /usr/local/bin
CJFIELDS/BioPerl-1.6.1.tar.gz
./Build install -- OK
If so, proceed:
type q
to quit the CPAN shell (the Perl update program).
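A quick way to confirm that the installation is visible to Perl is to try loading one of the modules from the terminal. This sketch is an addition, not part of the original instructions; the function name perl_has_module is hypothetical, and Bio::DB::Fasta is chosen only because the test program in this post uses it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if Perl can load the named module.
perl_has_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

if perl_has_module Bio::DB::Fasta; then
    echo "Bio::DB::Fasta loads; BioPerl appears to be installed."
else
    echo "Bio::DB::Fasta could not be loaded; check the install output."
fi
```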
You might now create a test Perl program like this:
#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::Fasta;
# Usage: perl test.pl <fasta file> <sequence id>
my $file = $ARGV[0];
my $id   = $ARGV[1];
my $db = Bio::DB::Fasta->new($file); # one file or many files
my $seqstring = $db->seq($id); # get a sequence as string
my $seqobj = $db->get_Seq_by_id($id); # get a PrimarySeq obj
my $desc = $db->header($id); # get the header, or description line
print $seqstring, "\n";
This reads a multi-FASTA file given as the first command line argument and prints the sequence of the entry given as the second argument.
Instructions for installing BioPerl 1.6.1 on Mac OS X or Unix:
Open a shell and type gcc. If the shell reports that the command is not found, you must install Xcode first.
Download the BioPerl 1.6.1 distribution (BioPerl-1.6.1.tar.gz).
Then follow the instructions below.
This is an interactive installation, so if at some point the installation prompts you to download something from CPAN, respond with yes.
It takes a while and there will be a lot of output.
INSTALLING BIOPERL THE EASY WAY USING Build.PL
The advantage of this approach is it's stepwise, so it's easy to stop and analyze in case of any problem.
Download, then unpack the tar file. For example:
>tar xvfz BioPerl-1.6.1.tar.gz
>cd BioPerl-1.6.1
Now issue the build commands:
>sudo perl Build.PL
>./Build test
If you've installed everything perfectly and all the network
connections are working then you may pass all the tests run in the
'./Build test' phase. It's also possible that you may fail some tests.
Possible explanations: problems with local Perl installation, network
problems, previously undetected bug in BioPerl, flawed test script,
problems with CGI script used for sequence retrieval at public
database, and so on. Remember that there are over 900 modules in
BioPerl and the test suite is running more than 12000 individual
tests; a few failed tests may not affect your usage of BioPerl.
If you decide that the failed tests will not affect how you intend to
use BioPerl and you'd like to install anyway, or if all tests were
fine, do:
>sudo ./Build install
This is what most experienced BioPerl users would do. However, if
you're concerned about a failed test and need assistance or advice
then contact bioperl-l@bioperl.org. (You could provide us the detailed
results of the failed test(s): see the `THE TEST SYSTEM' below for
information on how to generate such results.)
To './Build install' you need write permission in the
perl5/site_perl/source area (or similar, depending on your
environment). Usually this will require you becoming root, so you will
want to talk to your systems manager if you don't have the necessary
privileges.
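To see in advance whether root access will be needed, one can ask Perl where site modules are installed and test whether that directory is writable. This is a sketch under assumptions: can_write_dir is a hypothetical helper, and installsitelib from Perl's standard Config module is assumed to be the directory './Build install' writes to, which may not hold on customised setups.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the argument is an existing, writable directory.
can_write_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Ask Perl for its site library directory (assumed install target).
# Falls back to an empty string if perl cannot be run.
site_lib=$(perl -MConfig -e 'print $Config{installsitelib}' 2>/dev/null) || site_lib=""

if can_write_dir "$site_lib"; then
    echo "$site_lib is writable; sudo should not be needed."
else
    echo "'$site_lib' is not writable; use sudo or a personal module area."
fi
```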
It is also straightforward to install the package outside of this
standard Perl5 location. See INSTALLING BIOPERL IN A PERSONAL MODULE
AREA, below.
You might now create a test Perl program like this:
#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::Fasta;
# Usage: perl test.pl <fasta file> <sequence id>
my $file = $ARGV[0];
my $id   = $ARGV[1];
my $db = Bio::DB::Fasta->new($file); # one file or many files
my $seqstring = $db->seq($id); # get a sequence as string
my $seqobj = $db->get_Seq_by_id($id); # get a PrimarySeq obj
my $desc = $db->header($id); # get the header, or description line
print $seqstring, "\n";
This reads a multi-FASTA file given as the first command line argument and prints the sequence of the entry given as the second argument.
Try downloading a proteome from Ensembl, such as Homo_sapiens.GRCh37.56.pep.all.fa, and query for a sequence using its Ensembl ID like this:
perl test.pl Homo_sapiens.GRCh37.56.pep.all.fa ENSP00000375979
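If you would rather not download a whole proteome first, a tiny hand-made FASTA file is enough to exercise the script. This sketch is an illustration only: the file name tiny.fa, the IDs seq1 and seq2, the sequences, and the helper make_tiny_fasta are all made up, and it assumes the test program was saved as test.pl in the current directory.

```shell
#!/bin/sh
# Hypothetical helper: write a two-entry FASTA file to the given path.
make_tiny_fasta() {
    printf '>seq1 first made-up entry\nMKTAYIAKQR\n>seq2 second made-up entry\nGSHMLEDPVD\n' > "$1"
}

make_tiny_fasta tiny.fa

# Run the test program only if it is present in the current directory.
if [ -f test.pl ]; then
    perl test.pl tiny.fa seq2
fi
```

If everything works, this should print the sequence stored under seq2.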