Question

I've got an array full of accession numbers, and I'm wondering if there's a way to automatically save GenBank files using BioPerl. I know you can grab sequence information, but I want the entire GenBank record.

#!/usr/bin/env perl
use strict;
use warnings;
use Bio::DB::GenBank;

my @accession;
open (REFINED, "./refine.txt") || die "Could not open: $!";

while(<REFINED>){
    if(/^(\D+)\|(.*?)\|/){
        push(@accession, $2);
    }
}
close REFINED;

foreach my $number (@accession){
    my $db_obj = Bio::DB::GenBank->new;
}

Solution

You can save the full GenBank records with Bio::DB::EUtilities. The example below takes a list of IDs and saves the GenBank record for each one to a single file called myseqs.gb:

#!/usr/bin/env perl

use strict;
use warnings;
use Bio::DB::EUtilities;

my @ids = qw(1621261 89318838 68536103 20807972 730439);

my $factory = Bio::DB::EUtilities->new(-eutil   => 'efetch',
                                       -db      => 'protein',
                                       -rettype => 'gb',
                                       -email   => 'mymail@foo.bar',
                                       -id      => \@ids);

my $file = 'myseqs.gb';

# dump HTTP::Response content to a file (not retained in memory)
$factory->get_Response(-file => $file);

If you want to split the individual records returned instead of having them all in one file, this can easily be done with Bio::SeqIO. Check out the EUtilities HOWTO and the EUtilities Cookbook for more examples and explanation.
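As a rough sketch of that splitting step, something like the following should work, assuming myseqs.gb was already written by the script above. The helper name split_genbank_records and the "<accession>.gb" naming scheme are made up for illustration. Note that Bio::SeqIO parses each record and writes it out again, so the output may not be byte-for-byte identical to what NCBI returned.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::SeqIO;
use File::Spec;

# Hypothetical helper (not part of BioPerl): read every record from a
# multi-record GenBank file and write each one to "<accession>.gb" in $outdir.
# Returns the list of files written.
sub split_genbank_records {
    my ($infile, $outdir) = @_;

    my $in = Bio::SeqIO->new(-file => $infile, -format => 'genbank');
    my @written;

    while (my $seq = $in->next_seq) {
        # File name is an assumption: one file per accession number
        my $path = File::Spec->catfile($outdir, $seq->accession_number . '.gb');

        my $out = Bio::SeqIO->new(-file => ">$path", -format => 'genbank');
        $out->write_seq($seq);
        $out->close;

        push @written, $path;
    }
    return @written;
}

# Usage: perl split_gb.pl myseqs.gb [output_dir]
if (@ARGV) {
    my ($infile, $outdir) = @ARGV;
    $outdir = '.' unless defined $outdir;
    print "$_\n" for split_genbank_records($infile, $outdir);
}
```

If you need the records exactly as NCBI returned them, splitting the file on the "//" line that ends each GenBank record would be the alternative to going through Bio::SeqIO.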
