Category: Python study notes
Article From:https://www.cnblogs.com/shmily2018/p/9123144.html

Why Python multithreading cannot use multiple CPU cores, even though multithreaded code really is concurrent and can be faster than a single thread.

 

1. Why can't multithreaded Python take advantage of a multi-core CPU?

Reason: the GIL (global interpreter lock). A CPython process has only one GIL. A thread must hold this lock to execute Python code, and it releases the lock when it reaches a blocking I/O operation. If the program is pure computation with no I/O, the interpreter forces the running thread to release the lock periodically so that other threads get a chance to run: in Python 2, after every 100 bytecode instructions (adjustable with sys.setcheckinterval); since Python 3.2, after a time interval (5 ms by default, adjustable with sys.setswitchinterval). Only the thread holding the GIL runs at any given moment; all other threads wait.

1. For CPU-intensive code (loops, calculations, and so on), the release threshold is reached very quickly, which triggers a GIL release and a new round of competition for the lock. Switching back and forth between threads wastes resources, so for CPU-intensive code multithreading is often slower than a single thread.

2. For I/O-intensive code (file processing, web crawlers, and so on), multithreading gives real concurrency (not parallelism): while thread A waits for I/O, the interpreter automatically switches to thread B, so the waiting time is not wasted and efficiency improves.
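The I/O-intensive case can be illustrated with a minimal sketch. It assumes that time.sleep is an acceptable stand-in for a blocking I/O call (it releases the GIL while waiting); the function names fake_io, run_sequential and run_threaded are hypothetical and chosen only for this illustration.

```python
import threading
import time


def fake_io(seconds):
    # Stand-in for a blocking I/O call: time.sleep releases the GIL while waiting
    time.sleep(seconds)


def run_sequential(n, seconds):
    # Perform n simulated I/O waits one after another and return the elapsed time
    start = time.perf_counter()
    for _ in range(n):
        fake_io(seconds)
    return time.perf_counter() - start


def run_threaded(n, seconds):
    # Perform the same n waits in n threads and return the elapsed time
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(seconds,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("sequential:", run_sequential(4, 0.2))
    print("threaded:  ", run_threaded(4, 0.2))
```

On a typical machine the threaded run should take roughly as long as one wait rather than four, because the waits overlap. Replacing fake_io with a pure-Python computation would be expected to remove that advantage.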

 

2. Other principles

Link: https://www.zhihu.com/question/23474039/answer/24695447

Python may be the only interpreted language that supports multithreading (Perl's multithreading is crippled, and PHP has none), but its multithreading is a compromise: at any moment only one thread in the interpreter is executing Python bytecode.

Update: as pointed out in the comments, Ruby also supports threads, and at least Ruby MRI has a GIL.

If your code is CPU-intensive, the code in multiple threads will most likely execute serially. In that case multithreading is of little value, and it may even be slower than a single thread because of context switches.

But if your code is IO-intensive, multithreading can improve efficiency significantly. Take a web crawler: most of the time it is waiting for a socket to return data. The underlying C code releases the GIL during that wait, so while one thread waits for IO, the other threads can keep running.

Conversely, you should not be writing CPU-intensive code in Python in the first place; its efficiency is what it is. If you really need concurrency in CPU-intensive code, use the multiprocessing library. It is built on multiple processes, offers a thread-like API, and uses pickle to share variables between processes.

Also, if you don't know whether your code is CPU-intensive or IO-intensive, here is a trick: the multiprocessing module has a dummy submodule that implements the multiprocessing API on top of threads.

Suppose you are using multiprocessing's Pool, which uses multiple processes to implement concurrency:

from multiprocessing import Pool

If you change the code to the following, it becomes a multithreaded implementation of concurrency:

from multiprocessing.dummy import Pool

Run it both ways and use whichever is faster.
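A minimal sketch of the import swap described above. The worker function square and the helper run are hypothetical names used only for illustration; the point is that the Pool.map call stays the same whichever import is active.

```python
from multiprocessing.dummy import Pool  # thread-backed Pool
# from multiprocessing import Pool      # process-backed Pool with the same API


def square(x):
    # Placeholder task; substitute the real work to compare both variants
    return x * x


def run(values, workers=4):
    # The code using the pool is identical for both imports
    with Pool(workers) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    print(run(range(10)))
```

To compare the two, time the script once with each import line active. With the process-backed Pool, the worker function must be defined at module level so that it can be pickled.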

Link: https://www.zhihu.com/question/23474039/answer/269526476


Before introducing threads in Python, let's make one thing clear: multithreading in Python is pseudo-multithreading. Why do we say so? First we need one concept, the global interpreter lock (GIL).

The execution of Python code is controlled by the Python virtual machine (the interpreter). Python was designed from the start so that only one thread executes in the main loop. This is like running multiple processes on a single-CPU system: many programs can sit in memory, but at any moment only one of them is running on the CPU.
Similarly, although the Python interpreter can run multiple threads, only one thread runs in the interpreter at any moment. Access to the Python virtual machine is controlled by the global interpreter lock (GIL), which ensures that only one thread is running at a time.

In a multithreaded environment, the Python virtual machine runs as follows.

1. Set the GIL.
2. Switch to a thread to execute.
3. Run the thread.
4. Put the thread to sleep.
5. Unlock the GIL.
6. Repeat the steps above.
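The cycle above can be mimicked with a toy model. This is only a sketch under a loud assumption: an ordinary threading.Lock named toy_gil stands in for the real GIL, and the names worker, simulate and run_log are hypothetical. It is not how CPython implements the GIL internally; it only shows the acquire, run, release, repeat pattern.

```python
import threading
from collections import Counter

toy_gil = threading.Lock()  # hypothetical stand-in for the real GIL
run_log = []                # records which thread ran each slice


def worker(name, slices):
    for _ in range(slices):       # step 6: repeat
        toy_gil.acquire()         # step 1: set (acquire) the lock
        try:
            run_log.append(name)  # steps 2-3: this thread is switched in and runs
        finally:
            toy_gil.release()     # steps 4-5: the thread steps aside and the lock is released


def simulate(n_threads=3, slices=5):
    # Run n_threads workers and count how many slices each one executed
    run_log.clear()
    threads = [
        threading.Thread(target=worker, args=(f"thread-{i}", slices))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return Counter(run_log)


if __name__ == "__main__":
    print(simulate())
```

Every slice runs while holding the lock, so at most one worker is ever inside the locked section, which is the property the GIL enforces for Python bytecode.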

For all I/O-oriented programs that call built-in operating system C code, the GIL is released before the I/O call, so other threads can run while the thread waits for I/O. A thread that does little I/O holds the processor and the GIL for its whole time slice.
In other words, I/O-intensive Python programs benefit far more from multithreading than compute-intensive ones. Normally, on a 4-core CPU for example, each core runs one thread per unit of time, so four threads run at once, and then the time slices rotate.
Python is different. No matter how many cores you have, only one thread runs per unit of time, and then the time slices rotate. Sounds incredible? That is the GIL at work. Before any Python thread executes, it must first acquire the GIL; then, after every 100 bytecode instructions, the interpreter automatically releases the GIL so that other threads get a chance to execute (in Python 3.2 and later the release is triggered by a time interval instead of an instruction count).
The GIL effectively puts one lock around the code of every thread, so threads in Python can only run alternately: even 100 threads on a 100-core CPU can use only one core. The interpreter we normally use is the official implementation, CPython; to truly use multiple cores with threads, you would have to rewrite the interpreter without a GIL.
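The "100 bytecodes" figure describes older interpreters. In Python 3.2 and later, the forced release is time-based and can be inspected and adjusted with sys.getswitchinterval and sys.setswitchinterval. The sketch below shows this; the helper name with_switch_interval is hypothetical.

```python
import sys


def with_switch_interval(seconds, func):
    # Temporarily change the thread switch interval, call func, then restore the old value
    old = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        sys.setswitchinterval(old)


if __name__ == "__main__":
    print("current switch interval:", sys.getswitchinterval())
    print("inside helper:", with_switch_interval(0.001, sys.getswitchinterval))
```

A shorter interval makes CPU-bound threads hand over the GIL more often, which may improve responsiveness at the cost of more switching overhead.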

We might as well do an experiment.

# coding=utf-8
from threading import Thread


def loop():
    # Busy loop: pure CPU work with no I/O
    while True:
        pass


if __name__ == '__main__':
    # Start three CPU-bound threads
    for i in range(3):
        t = Thread(target=loop)
        t.start()

    # Keep the main thread busy as well
    while True:
        pass

CPU utilization stays low, roughly at the level of a single fully loaded core.

What if we switch to processes?

Let’s change the code:

# coding=utf-8
from multiprocessing import Process


def loop():
    # Busy loop: pure CPU work with no I/O
    while True:
        pass


if __name__ == '__main__':
    # Start three CPU-bound processes; each has its own interpreter and its own GIL
    for i in range(3):
        t = Process(target=loop)
        t.start()

    # Keep the main process busy as well
    while True:
        pass

CPU utilization goes straight to 100%, which shows that multiple processes do use multiple cores.

To verify that this is the GIL's doing in Python, I wrote the same code in Java using threads. Let's take a look.

 

package com.darrenchan.thread;

public class TestThread {
    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            new Thread(new Runnable() {

                @Override
                public void run() {
                    while (true) {

                    }
                }
            }).start();
        }
        while(true){

        }
    }
}

As you can see, multithreading in Java does use multiple cores, which is true multithreading, whereas multithreading in Python can only use a single core, which is fake multithreading!

 

3. Solutions

Link: https://www.zhihu.com/question/23474039/answer/269526476
 
Is that really all? Is there no way to use multiple cores in Python? Of course there is. Multiple processes, as shown just now, are one solution; another is to call a shared library written in C. For all I/O-oriented programs that call built-in operating system C code, the GIL is released before the I/O call so that other threads can run while the thread waits for I/O. In the same way, we can write compute-intensive tasks in C and load the resulting .so shared library into Python. While the C code is executing, the GIL can be released (ctypes, for example, releases it for the duration of the foreign function call), so each core can run a thread.

Some readers may not be clear on what compute-intensive tasks and I/O-intensive tasks are.

Compute-intensive tasks are characterized by large amounts of calculation that consume CPU resources, such as computing pi or decoding high-definition video; they depend entirely on the CPU's computing power. Such work can be split across multiple tasks, but the more tasks there are, the more time is spent switching between them and the less efficiently the CPU executes them. So, to use the CPU most efficiently, the number of concurrent compute-intensive tasks should equal the number of CPU cores.

Because compute-intensive tasks mainly consume CPU resources, the running efficiency of the code is critical. Scripting languages such as Python run slowly and are poorly suited to compute-intensive tasks; for those, C is the best choice.

The second type of task is IO-intensive. Tasks involving network or disk IO are IO-intensive: they consume little CPU and spend most of their time waiting for IO operations to complete (because IO is far slower than the CPU and memory). For IO-intensive tasks, the more tasks there are, the higher the CPU efficiency, up to a limit. Most common tasks are IO-intensive, web applications for example.

While an IO-intensive task runs, 99% of the time is spent on IO and very little on the CPU, so replacing a slow scripting language like Python with very fast C does nothing to improve running efficiency. For IO-intensive tasks the most suitable language is the one with the highest development efficiency (the least code): scripting languages are the first choice, and C is the worst.

In summary, Python multithreading is equivalent to multithreading on a single core. Multithreading has two advantages, CPU parallelism and IO parallelism, so single-core multithreading is like cutting off one of your own arms. In Python you can use multithreading, but don't expect it to use multiple cores effectively. If you must use multiple cores through multithreading, that can only be done with C extensions, at the cost of Python's simplicity and ease of use. There is no need to worry too much, though: although Python cannot use multithreading for multi-core work, it can do so with multiple processes. Each Python process has its own independent GIL, and they do not affect one another.
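The multi-process route can be sketched as follows, assuming a simple prime-counting function as the compute-intensive task. The names count_primes and count_primes_parallel are hypothetical, and the pool size defaults to the number of cores, in line with the guideline above about matching the number of tasks to the number of cores.

```python
import os
from multiprocessing import Pool


def count_primes(limit):
    # CPU-bound work: count the primes below limit by trial division
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, processes=None):
    # One worker process per core by default; each process has its own GIL
    with Pool(processes or os.cpu_count()) as pool:
        return pool.map(count_primes, limits)


# Usage: keep the call under the main guard so that child processes
# can import this module without re-running it.
# if __name__ == "__main__":
#     print(count_primes_parallel([20000] * 4))
```

Because each worker is a separate process, the chunks can run on different cores at the same time. Arguments and results are pickled when they cross process boundaries, so this approach pays off mainly when each chunk of work is reasonably large.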

 
