extract_seq_from_fasta

yoda_powers.bio.extract_seq_from_fasta(fasta_file, wanted_file, include=True)

Extract sequences from a FASTA file, keeping or excluding the sequence IDs listed in a second file.

Notes

This function requires the following modules:

  • pathlib

  • BioPython

Parameters
  • fasta_file (str) – path to the FASTA file

  • wanted_file (str) – path to a file listing sequence IDs, one per line

  • include (bool, optional) – if True (the default), keep the sequences whose IDs are listed in wanted_file; if False, keep the sequences whose IDs are not listed in wanted_file

Returns

a dictionary of the extracted sequences, mapping each sequence ID to its SeqRecord

Return type

dict

Raises
  • ValueError – If wanted_file or fasta_file does not exist.

  • ValueError – If wanted_file or fasta_file is not a valid file.

  • ValueError – If include is not a valid boolean.

Example

>>> dict_sequences = extract_seq_from_fasta(fasta_file, wanted_file)
>>> dict_sequences
{'Seq2': SeqRecord(seq=Seq('ATGCCGATCGATG', SingleLetterAlphabet()), id='Seq2', name='Seq2', description='Seq2', dbxrefs=[]),
 'Seq3': SeqRecord(seq=Seq('ATGCTCAGTCAGTAG', SingleLetterAlphabet()), id='Seq3', name='Seq3', description='Seq3', dbxrefs=[])}
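
The effect of the include flag can be illustrated with a small standard-library sketch. This is not the yoda_powers implementation: read_fasta and filter_by_ids are hypothetical helper names, the sketch returns plain strings rather than SeqRecord objects, and the Seq1 record is made-up demo data added so that the exclusion case has something to return.

```python
from pathlib import Path
import tempfile


def read_fasta(path):
    """Hypothetical helper: parse a FASTA file into {id: sequence string}.

    The id is taken as the first whitespace-delimited word after '>'.
    """
    records = {}
    current_id = None
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            current_id = header[0] if header else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may be wrapped, so append to the current record.
            records[current_id] += line
    return records


def filter_by_ids(fasta_file, wanted_file, include=True):
    """Hypothetical stand-in that mirrors the documented include semantics."""
    if not isinstance(include, bool):
        raise ValueError("include must be a boolean")
    for path in (fasta_file, wanted_file):
        if not Path(path).is_file():
            raise ValueError(f"{path} is not a valid file")
    # One ID per line; blank lines are ignored.
    wanted = {
        line.strip()
        for line in Path(wanted_file).read_text().splitlines()
        if line.strip()
    }
    records = read_fasta(fasta_file)
    # include=True keeps listed IDs, include=False keeps the unlisted ones.
    return {
        seq_id: seq
        for seq_id, seq in records.items()
        if (seq_id in wanted) == include
    }


# Demo with temporary files (Seq1 is invented demo data).
with tempfile.TemporaryDirectory() as tmp:
    fasta = Path(tmp) / "demo.fasta"
    wanted_ids = Path(tmp) / "wanted.txt"
    fasta.write_text(
        ">Seq1\nATGC\n>Seq2\nATGCCGATCGATG\n>Seq3\nATGCTCAGTCAGTAG\n"
    )
    wanted_ids.write_text("Seq2\nSeq3\n")
    kept = filter_by_ids(fasta, wanted_ids)
    others = filter_by_ids(fasta, wanted_ids, include=False)

print(sorted(kept))    # ['Seq2', 'Seq3']
print(sorted(others))  # ['Seq1']
```

With include left at its default, only the IDs listed in the wanted file survive; passing include=False returns the complement instead.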