Support for ligation of partially overlapping overhangs? #158

Open
manulera opened this issue Nov 22, 2023 · 4 comments

Comments

@manulera
Collaborator

Hi @BjornFJohansson,

@dgruano and I were wondering what would be the best way to ligate fragments with partially overlapping overhangs.

I guess one possibility would be to write a custom algorithm function for Assembly. Is there something already there?

@dgruano, for context: the class Assembly accepts a function as an argument that returns the regions that will be joined.

Here are the built-in algorithms:

https://github.com/BjornFJohansson/pydna/blob/master/src/pydna/common_sub_strings.py
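
To make the idea of an algorithm function concrete, here is a minimal sketch on plain strings rather than pydna objects. The name `suffix_prefix_matches` is hypothetical, and the `(start_in_x, start_in_y, length)` tuple layout is an assumption based on the return value of the snippet posted later in this thread, not a statement about pydna's documented API.

```python
def suffix_prefix_matches(x: str, y: str, limit: int = 1) -> list[tuple[int, int, int]]:
    """Return (start_in_x, start_in_y, length) for every suffix of x
    that is also a prefix of y and is at least `limit` long."""
    matches = []
    # Start at 1 at the earliest: x[-0:] would be the whole string, not an empty suffix.
    for k in range(max(limit, 1), min(len(x), len(y)) + 1):
        if x[-k:] == y[:k]:
            matches.append((len(x) - k, 0, k))
    return matches


# The last four bases of the first string equal the first four of the second.
print(suffix_prefix_matches("GGATCCAATT", "AATTCGGA", limit=2))  # [(6, 0, 4)]
```

A function for sticky ends would presumably follow the same shape, but compare the single-stranded overhangs instead of the full sequences.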

@BjornFJohansson
Owner

There is a new ligation module coming up! I'll push it and let you know.

@BjornFJohansson
Owner

Check out the "ligate" branch. It has a new "ligate" module.

@BjornFJohansson
Owner

Maybe I misunderstood. Is it ligation or homologous recombination? A custom algorithm would certainly be possible, but I am not sure how to model the assembly with imperfect ssDNA domains.

@manulera
Collaborator Author

manulera commented Dec 15, 2023

I am using the new implementation of the assembly for this, with the two functions below as the algorithm. The first is a slightly modified version of @dgruano's function that returns the overlap length rather than True/False.

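# Imports assumed from context; they were not part of the original snippet.
from Bio.Seq import reverse_complement
from pydna.dseq import Dseq
from pydna.dseqrecord import Dseqrecord as _Dseqrecord

def sum_is_sticky(seq1: Dseq, seq2: Dseq, partial: bool = False) -> int:
    """Return the overlap length if the 3' end of seq1 and the 5' end of seq2 are sticky and compatible for ligation, else 0."""
    type_seq1, sticky_seq1 = seq1.three_prime_end()
    type_seq2, sticky_seq2 = seq2.five_prime_end()

    # Full overlap: same overhang type and perfectly complementary overhangs.
    if 'blunt' != type_seq2 and type_seq2 == type_seq1 and str(sticky_seq2) == str(reverse_complement(sticky_seq1)):
        return len(sticky_seq1)

    if not partial:
        return 0

    # Partial overlap: express both overhangs in the same orientation before comparing.
    if type_seq1 != type_seq2 or type_seq2 == "blunt":
        return 0
    elif type_seq2 == "5'":
        sticky_seq1 = str(reverse_complement(sticky_seq1))
    elif type_seq2 == "3'":
        sticky_seq2 = str(reverse_complement(sticky_seq2))

    # Return the first i (counting up from 1) for which the last i bases of
    # sticky_seq1 equal the first i bases of sticky_seq2, i.e. the shortest match.
    ovhg_len = min(len(sticky_seq1), len(sticky_seq2))
    for i in range(1, ovhg_len+1):
        if sticky_seq1[-i:] == sticky_seq2[:i]:
            return i
    return 0

def sticky_end_sub_strings(seqx: _Dseqrecord, seqy: _Dseqrecord, limit=0):
    """Return [(start_in_seqx, start_in_seqy, length)] for a sticky-end overlap, or [] if there is none.

    For now, if limit is 0 / False, only full overlaps are considered."""
    overlap = sum_is_sticky(seqx.seq, seqy.seq, limit)
    if overlap:
        return [(len(seqx)-overlap, 0, overlap)]
    return []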
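
One detail that may be worth checking: the partial-overlap loop in `sum_is_sticky` counts up from 1 and returns on the first match, so it reports the shortest matching overlap even when a longer one exists. The sketch below compares the two scan orders on plain strings. The helpers `revcomp` and `overlap_length` are hypothetical stand-ins written for this illustration, not pydna functions, and whether shortest or longest is the right choice depends on how the ligation is meant to be modelled.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def overlap_length(left: str, right: str, longest_first: bool = True) -> int:
    """Length of a suffix of `left` that equals a prefix of `right`, or 0.

    Both overhangs must already be written in the same orientation."""
    max_len = min(len(left), len(right))
    lengths = range(max_len, 0, -1) if longest_first else range(1, max_len + 1)
    for k in lengths:
        if left[-k:] == right[:k]:
            return k
    return 0


# 5' overhangs: the left fragment's overhang lies on the bottom strand,
# so it is reverse complemented before the comparison.
left = revcomp("TAATT")  # "AATTA"
right = "ATTAC"

print(overlap_length(left, right, longest_first=False))  # 1 (single shared "A")
print(overlap_length(left, right, longest_first=True))   # 4 ("ATTA")
```

With the shortest-first scan, a single matching base at the junction is enough to hide the longer 4-base overlap.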
