Article From:https://www.cnblogs.com/lokwongho/p/9682667.html

Chapter2 WHICH DNA PATTERNS PLAY THE ROLE OF MOLECULAR CLOCKS 

Finding the order of modules

1.

Transcription factors regulate the transcriptional expression of genes by binding to specific sequences upstream of genes, but this sequence varies from individual to individual. This chapter describes greedy and stochastic algorithms for finding this sequence: finding the order of modules.

(NF-κB binding sites from the Drosophila melanogaster genome)

Two, some concepts:

1. Score、Profile The meaning is as follows

According to profile matrix, you can calculate a KMER in a pro.fileLower probability

Three.

Questions raised:Motif Finding Problem:

Given a collection of strings, find a set of k-mers, one from each string, that minimizes the score of the resulting motif.

Input: A collection of strings Dna and an integer k.

Output: A collection Motifs of k-mers, one from each string in Dna, minimizing Score(Motifs) among all possible choices of k-mers.

In a set of sequences, look for a group of k-mer whose Score is the lowest.(Or the minimum sum of Hamming distance from consensus sequence.

 

1 ergodic

MedianString(Dna, k)
        distance ← ∞
        for each k-mer Pattern from AA…AA to TT…TT
            if distance > d(Pattern, Dna)
                 distance ← d(Pattern, Dna)
                 Median ← Pattern
        return Median

 

2 Greedy method GreedyMotifSearch

GREEDYMOTIFSEARCH(Dna, k, t)
        BestMotifs ← motif matrix formed by first k-mers in each string
                      from Dna
        for each k-mer Motif in the first string from Dna
            Motif1 ← Motif
            for i = 2 to t
                form Profile from motifs Motif1, …, Motifi - 1
                Motifi ← Profile-most probable k-mer in the i-th string
                          in Dna
            Motifs ← (Motif1, …, Motift)
            if Score(Motifs) < Score(BestMotifs)
                BestMotifs ← Motifs
        output BestMotifs

Detailed explanation of http://www.mrgraeme.co.uk/greedy-motif-search/

 

*Greedy method GreedyMotifSearch with pseudocounts

pseudocounts:When profile matrix is formed, the 0 term is set to a smaller value.

GreedyMotifSearch(Dna, k, t)
        form a set of k-mers BestMotifs by selecting 1st k-mers in each string from Dna
        for each k-mer Motif in the first string from Dna
            Motif1 ← Motif
            for i = 2 to t
                apply Laplace's Rule of Succession to form Profile from motifs   Motif1, …, Motifi-1
                Motifi ← Profile-most probable k-mer in the i-th string in Dna
            Motifs ← (Motif1, …, Motift)
            if Score(Motifs) < Score(BestMotifs)
                BestMotifs ← Motifs
        output BestMotifs

 

3. Stochastic method Randomized Motif Search

RandomizedMotifSearch(Dna, k, t)
     #Randomly fetching k-mer from each DNA, generating a set of motifsRandomly select k-mers Motifs = (Motif1,... (Motift) in each stringFrom DnaBestMotifs MotifsWhile foreverProfile (Profile) (Motifs)Motifs forms Profile matrixMotifs (Motifs, Profile (Dna)) generates a set of motifs with the greatest probability from a set of DNA based on profile matrix.If Score (Motifs) < Score (BestMotifs)BestMotifs MotifsElseReturn BestMotifs

The reason that stochastic algorithms work is that a randomly selected set of Motifs may pick a potentially correct k-mer and then form a tilt in it until a better solution is found.

Improvement: In the previous algorithm, a new set of Motifs is generated randomly at each iteration, which may discard the potential correct module order by changing only one line at a time at random

GibbsSampler(Dna, k, t, N)
        randomly select k-mers Motifs = (Motif1, …, Motift) in each string from Dna
        BestMotifs ← Motifs
        for j ← 1 to N
            i ← Random(t)
            Profile ← profile matrix constructed from all strings in Motifs except for Motif[i]
            Motif[i] ← Profile-randomly generated k-mer in the i-th sequence
            if Score(Motifs) < Score(BestMotifs)
                BestMotifs ← Motifs
        return BestMotifs

 

Leave a Reply

Your email address will not be published. Required fields are marked *