A Detailed Guide to Python's asyncio Module


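I have been interested in the asyncio library for a long time. It is the module the official documentation recommends for writing highly concurrent code, and it has been part of the standard library since Python 3.4. Writing up these notes also gave me a much better understanding of how to use it.

What is asyncio for?

Asynchronous network operations, concurrency, and coroutines.

Python 3.0 era, asynchronous networking module in the standard library: select (very low level)

Python 3.0 era, third-party asynchronous networking library: Tornado

Python 3.4 era, asyncio: supports TCP and subprocesses

Many libraries now build on asyncio: aiohttp, aiodns, aioredis, and others. https://github.com/aio-libs lists what is already supported and is updated continuously.

asyncio is of course not the only way to get coroutines so far; Tornado and gevent both provide similar functionality.

Some key asyncio terms:

event_loop (event loop): the program starts an infinite loop and registers functions on it; when the event a function is waiting for occurs, the loop calls the corresponding coroutine function.

coroutine: a function defined with the async keyword. Calling it does not execute the function body right away; it returns a coroutine object. The coroutine object has to be registered with the event loop, and the event loop is what runs it.

task: a coroutine object is a native function that can be suspended; a task wraps a coroutine further and tracks the various states of its execution.

future: represents the result of a task that will run or has not run yet. It is not fundamentally different from a task.

async/await: the keywords for defining coroutines, available since Python 3.5. async defines a coroutine; await suspends the coroutine at a blocking asynchronous call.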
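The terms above can be seen together in one minimal sketch. It assumes Python 3.7 or later (for asyncio.run); the names add and main are illustrative and do not come from the rest of this article.

```python
import asyncio

async def add(a, b):
    # await suspends this coroutine and hands control back to the event loop
    await asyncio.sleep(0.01)
    return a + b

async def main():
    coro = add(1, 2)                    # calling an async function only builds a coroutine object
    task = asyncio.ensure_future(coro)  # a Task wraps the coroutine and schedules it on the loop
    is_future = isinstance(task, asyncio.Future)  # Task is a subclass of Future
    result = await task                 # suspend main until the task has a result
    return is_future, result

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Run as a script, this should print (True, 3): the task is a Future, and awaiting it yields the coroutine's return value.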

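Having read those terms you may be ready to walk away. I felt some resistance myself when I first started looking into asyncio, without quite knowing why, and for a long time I neither followed nor used the module. But as I kept running into performance problems with Python at work, I told myself it was time to learn it properly.

Defining a coroutine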


import time
import asyncio
now = lambda: time.time()
async def do_some_work(x):
    print("waiting:", x)
start = now()
# This only creates a coroutine object; do_some_work has not run yet
coroutine = do_some_work(2)
print(coroutine)
# Get an event loop
loop = asyncio.get_event_loop()
# Hand the coroutine to the event loop and run it to completion
loop.run_until_complete(coroutine)
print("Time:", now() - start)

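In the code above, the async keyword defines a coroutine. A coroutine cannot be run directly; it has to be handed to an event loop.

asyncio.get_event_loop obtains an event loop; run_until_complete then registers the coroutine on that loop and starts the loop.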
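On Python 3.7 and later, asyncio.run bundles these steps (create a loop, run the coroutine to completion, close the loop) into a single call. The following is a sketch of the same example in that style, not the code used in the rest of this article; the return value is added here only so that there is something to inspect.

```python
import asyncio

async def do_some_work(x):
    print("waiting:", x)
    return x  # added for illustration; the article's version returns nothing

if __name__ == "__main__":
    # asyncio.run creates a fresh event loop, runs the coroutine, then closes the loop
    result = asyncio.run(do_some_work(2))
    print("result:", result)
```

The explicit get_event_loop / run_until_complete form remains useful when the loop object itself is needed, as in the later examples that stop the loop or run it in another thread.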

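Creating a task

A coroutine object cannot run on its own. When it is registered with the event loop, run_until_complete actually wraps it in a task object. Task is a subclass of Future; it stores the state of the coroutine as it runs and is how the coroutine's result is retrieved later.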


import asyncio
import time
now = lambda: time.time()
async def do_some_work(x):
    print("waiting:", x)
start = now()
coroutine = do_some_work(2)
loop = asyncio.get_event_loop()
task = loop.create_task(coroutine)
print(task)
loop.run_until_complete(task)
print(task)
print("Time:",now()-start)

Output:


<Task pending coro=<do_some_work() running at /app/py_code/study_asyncio/simple_ex2.py:13>>
waiting: 2
<Task finished coro=<do_some_work() done, defined at /app/py_code/study_asyncio/simple_ex2.py:13> result=None>
Time: 0.0003514289855957031

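After the task is created and before the event loop has run it, its state is pending; once it completes, its state is finished.

Besides loop.create_task(coroutine), a task can also be created with asyncio.ensure_future(coroutine).

The official documentation for these two calls: https://docs.python.org/3/library/asyncio-task.html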


asyncio.ensure_future(coro_or_future, *, loop=None)
Schedule the execution of a coroutine object: wrap it in a future. Return a Task object.
If the argument is a Future, it is returned directly.

https://docs.python.org/3/library/asyncio-eventloop.html


AbstractEventLoop.create_task(coro)
Schedule the execution of a coroutine object: wrap it in a future. Return a Task object.
Third-party event loops can use their own subclass of Task for interoperability. In this case, the result type is a subclass of Task.
This method was added in Python 3.4.2. Use the async() function to support also older Python versions.
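The difference described in the two quotes can be checked with a short sketch. It assumes Python 3.7 or later (for asyncio.run and asyncio.get_running_loop) and uses a trimmed-down do_some_work; it is an illustration, not part of the quoted documentation.

```python
import asyncio

async def do_some_work(x):
    return x

async def main():
    loop = asyncio.get_running_loop()
    task = loop.create_task(do_some_work(1))          # create_task expects a coroutine
    same = asyncio.ensure_future(task)                # a Future or Task is returned unchanged
    wrapped = asyncio.ensure_future(do_some_work(2))  # a coroutine is wrapped in a new Task
    await asyncio.gather(task, wrapped)
    return same is task, isinstance(wrapped, asyncio.Task)

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both values should come back True: ensure_future hands an existing task straight back and only creates a new Task when given a bare coroutine.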

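Binding a callback

A callback bound to a task is called when the task finishes and can read its result. The callback's last parameter is the future object, through which the coroutine's return value is available.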


import time
import asyncio
now = lambda : time.time()
async def do_some_work(x):
    print("waiting:", x)
    return "Done after {}s".format(x)
def callback(future):
    print("callback:", future.result())
start = now()
coroutine = do_some_work(2)
loop = asyncio.get_event_loop()
task = asyncio.ensure_future(coroutine)
print(task)
task.add_done_callback(callback)
print(task)
loop.run_until_complete(task)
print("Time:", now()-start)

Output:


<Task pending coro=<do_some_work() running at /app/py_code/study_asyncio/simple_ex3.py:13>>
<Task pending coro=<do_some_work() running at /app/py_code/study_asyncio/simple_ex3.py:13> cb=[callback() at /app/py_code/study_asyncio/simple_ex3.py:18]>
waiting: 2
callback: Done after 2s
Time: 0.00039196014404296875

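add_done_callback attaches a callback to the task; when the task (in other words, the coroutine) finishes, the callback is called and reads the coroutine's result through its future parameter. The task we created and the future passed to the callback are in fact the same object.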
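Because the future always arrives as the last argument, extra arguments can be bound in front of it with functools.partial. The sketch below assumes Python 3.7 or later; the label "job-1" and the sink list are hypothetical names introduced only for this illustration.

```python
import asyncio
import functools

async def do_some_work(x):
    return "Done after {}s".format(x)

def callback(label, sink, future):
    # label and sink are pre-bound by functools.partial; the loop supplies future last
    sink.append((label, future.result()))

async def main():
    seen = []
    task = asyncio.ensure_future(do_some_work(2))
    task.add_done_callback(functools.partial(callback, "job-1", seen))
    await task
    await asyncio.sleep(0)  # one extra pass through the loop before reading the list
    return seen

if __name__ == "__main__":
    print(asyncio.run(main()))
```

The returned list should hold a single pair: the bound label and the coroutine's return value.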

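Blocking and await

async defines a coroutine, and await suspends it at a time-consuming operation, handing over control much as yield does in a generator. When a coroutine reaches an await, the event loop suspends it and runs other coroutines until they too suspend or finish, and then moves on to the next coroutine.

Time-consuming operations are usually I/O, such as network requests or file reads. Here asyncio.sleep simulates the I/O. Making such I/O asynchronous is exactly what coroutines are for.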


import asyncio
import time
now = lambda :time.time()
async def do_some_work(x):
    print("waiting:", x)
    # await is followed by the call to the time-consuming operation
    await asyncio.sleep(x)
    return "Done after {}s".format(x)
start = now()
coroutine = do_some_work(2)
loop = asyncio.get_event_loop()
task = asyncio.ensure_future(coroutine)
loop.run_until_complete(task)
print("Task ret:", task.result())
print("Time:", now() - start)

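At await asyncio.sleep(x), the sleep simulates a blocking or time-consuming operation, and control is handed over at that point. In other words, when a coroutine reaches a call that would block, await gives up control so that the loop can run other coroutines.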
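The hand-over of control becomes visible when two coroutines record what they do. This is a small sketch assuming Python 3.7 or later; worker, the names "a" and "b", and the short delays are made up for the illustration.

```python
import asyncio

async def worker(name, delay, log):
    log.append(name + " start")
    await asyncio.sleep(delay)  # control goes back to the loop here
    log.append(name + " end")

async def main():
    log = []
    # "a" is started first but sleeps longer than "b"
    await asyncio.gather(worker("a", 0.2, log), worker("b", 0.1, log))
    return log

if __name__ == "__main__":
    print(asyncio.run(main()))
```

The log should read a start, b start, b end, a end: "b" gets to run while "a" is suspended at its await, and finishes first because its delay is shorter.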

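Concurrency and parallelism

Concurrency means a system has several activities in progress at the same time.

Parallelism means using concurrency to make a system run faster. Parallelism can be applied at several levels of abstraction in an operating system.

So concurrency usually means that several tasks need to make progress over the same period, whereas parallelism means that several tasks are executing at the same instant.

The following example makes this concrete:

With concurrency, one teacher helps several students with their homework during the same period. With parallelism, several teachers each help a student at the same time. Put simply, it is the difference between one person eating three steamed buns at once and three people each eating one, where eating one bun counts as one task.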


import asyncio
import time
now = lambda :time.time()
async def do_some_work(x):
    print("Waiting:", x)
    await asyncio.sleep(x)
    return "Done after {}s".format(x)
start = now()
coroutine1 = do_some_work(1)
coroutine2 = do_some_work(2)
coroutine3 = do_some_work(4)
tasks = [
    asyncio.ensure_future(coroutine1),
    asyncio.ensure_future(coroutine2),
    asyncio.ensure_future(coroutine3)
]
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait(tasks))
for task in tasks:
    print("Task ret:", task.result())
print("Time:",now()-start)

Output:


Waiting: 1
Waiting: 2
Waiting: 4
Task ret: Done after 1s
Task ret: Done after 2s
Task ret: Done after 4s
Time: 4.004154920578003

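The total time is about 4s. The 4s wait is long enough for the first two coroutines to finish. Run synchronously, one after another, the tasks would need at least 7s. Here asyncio gives us concurrency. asyncio.wait(tasks) can be replaced with asyncio.gather(*tasks): the former takes a list of tasks, while the latter takes the tasks as separate positional arguments.

The official documentation on asyncio.gather and asyncio.wait:

asyncio.gather: https://docs.python.org/3/library/asyncio-task.html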


Return a future aggregating results from the given coroutine objects or futures.
All futures must share the same event loop. If all the tasks are done successfully, the returned future's result is the list of results (in the order of the original sequence, not necessarily the order of results arrival). If return_exceptions is true, exceptions in the tasks are treated the same as successful results, and gathered in the result list; otherwise, the first raised exception will be immediately propagated to the returned future.
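The return_exceptions behaviour in this quote can be tried with a short sketch. It assumes Python 3.7 or later; ok and boom are hypothetical coroutines written only for this illustration.

```python
import asyncio

async def ok(x):
    await asyncio.sleep(0.01)
    return x

async def boom():
    await asyncio.sleep(0.01)
    raise ValueError("boom")

async def main():
    # With return_exceptions=True the exception object takes the failed slot
    # in the result list instead of being raised out of gather
    return await asyncio.gather(ok(1), boom(), ok(3), return_exceptions=True)

if __name__ == "__main__":
    print(asyncio.run(main()))
```

The result list keeps the order of the arguments, with the ValueError instance sitting in the second position. Without return_exceptions=True, awaiting gather would raise the ValueError instead.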

asyncio.wait: https://docs.python.org/3/library/asyncio-task.html


Wait for the Futures and coroutine objects given by the sequence futures to complete. Coroutines will be wrapped in Tasks. Returns two sets of Future: (done, pending).
The sequence futures must not be empty.
timeout can be used to control the maximum number of seconds to wait before returning. timeout can be an int or float. If timeout is not specified or None, there is no limit to the wait time.
return_when indicates when this function should return.
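As a sketch of the return_when parameter mentioned in this quote, the following returns as soon as the first task finishes and then cancels the rest. It assumes Python 3.7 or later, and the delays are shortened so the example finishes quickly.

```python
import asyncio

async def do_some_work(x):
    await asyncio.sleep(x)
    return x

async def main():
    tasks = [asyncio.ensure_future(do_some_work(d)) for d in (0.05, 0.5)]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    first = [t.result() for t in done]
    for t in pending:
        t.cancel()  # wait() leaves unfinished tasks running, so cancel them here
    await asyncio.gather(*pending, return_exceptions=True)  # let the cancellations settle
    return first, len(pending)

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Only the 0.05s task should be in done; the slower one is still in pending when wait returns.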

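Nested coroutines

async defines a coroutine, and coroutines are used for time-consuming I/O. We can also wrap several I/O steps together, which gives nested coroutines: one coroutine awaits another, and they chain together this way.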


import asyncio
import time
now = lambda: time.time()
async def do_some_work(x):
    print("waiting:", x)
    await asyncio.sleep(x)
    return "Done after {}s".format(x)
async def main():
    coroutine1 = do_some_work(1)
    coroutine2 = do_some_work(2)
    coroutine3 = do_some_work(4)
    tasks = [
        asyncio.ensure_future(coroutine1),
        asyncio.ensure_future(coroutine2),
        asyncio.ensure_future(coroutine3)
    ]
    dones, pendings = await asyncio.wait(tasks)
    for task in dones:
        print("Task ret:", task.result())
    # results = await asyncio.gather(*tasks)
    # for result in results:
    #     print("Task ret:", result)
start = now()
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
print("Time:", now()-start)

If we replace this part of the code above:


dones, pendings = await asyncio.wait(tasks)
for task in dones:
    print("Task ret:", task.result())

with:


results = await asyncio.gather(*tasks)
for result in results:
    print("Task ret:", result)

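the result is a list of return values.

If main does not process the results itself and simply returns what it awaited, the outermost run_until_complete returns the result of the main coroutine. Change the code above to: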


import asyncio
import time
now = lambda: time.time()
async def do_some_work(x):
    print("waiting:", x)
    await asyncio.sleep(x)
    return "Done after {}s".format(x)
async def main():
    coroutine1 = do_some_work(1)
    coroutine2 = do_some_work(2)
    coroutine3 = do_some_work(4)
    tasks = [
        asyncio.ensure_future(coroutine1),
        asyncio.ensure_future(coroutine2),
        asyncio.ensure_future(coroutine3)
    ]
    return await asyncio.gather(*tasks)
start = now()
loop = asyncio.get_event_loop()
results = loop.run_until_complete(main())
for result in results:
    print("Task ret:", result)
print("Time:", now()-start)

Alternatively, main can return the result of suspending on asyncio.wait.

Change the code to:


import asyncio
import time
now = lambda: time.time()
async def do_some_work(x):
    print("waiting:", x)
    await asyncio.sleep(x)
    return "Done after {}s".format(x)
async def main():
    coroutine1 = do_some_work(1)
    coroutine2 = do_some_work(2)
    coroutine3 = do_some_work(4)
    tasks = [
        asyncio.ensure_future(coroutine1),
        asyncio.ensure_future(coroutine2),
        asyncio.ensure_future(coroutine3)
    ]
    return await asyncio.wait(tasks)
start = now()
loop = asyncio.get_event_loop()
done, pending = loop.run_until_complete(main())
for task in done:
    print("Task ret:", task.result())
print("Time:", now()-start)

asyncio's as_completed can be used as well:


import asyncio
import time
now = lambda: time.time()
async def do_some_work(x):
    print("waiting:", x)
    await asyncio.sleep(x)
    return "Done after {}s".format(x)
async def main():
    coroutine1 = do_some_work(1)
    coroutine2 = do_some_work(2)
    coroutine3 = do_some_work(4)
    tasks = [
        asyncio.ensure_future(coroutine1),
        asyncio.ensure_future(coroutine2),
        asyncio.ensure_future(coroutine3)
    ]
    for task in asyncio.as_completed(tasks):
        result = await task
        print("Task ret: {}".format(result))
start = now()
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
print("Time:", now()-start)

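As the examples show, coroutines can be called and combined very flexibly; the differences lie mainly in how results are handled: how they are returned and how the coroutine suspends.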
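One practical difference between these options is the order in which results come back. The sketch below, which assumes Python 3.7 or later and uses shortened delays, runs the same three delays through gather and through as_completed.

```python
import asyncio

async def do_some_work(x):
    await asyncio.sleep(x)
    return x

async def main():
    delays = (0.3, 0.1, 0.2)
    # gather returns results in the order the awaitables were passed in
    in_order = await asyncio.gather(*(do_some_work(d) for d in delays))
    # as_completed yields awaitables in the order they finish
    by_finish = []
    for fut in asyncio.as_completed([do_some_work(d) for d in delays]):
        by_finish.append(await fut)
    return in_order, by_finish

if __name__ == "__main__":
    print(asyncio.run(main()))
```

gather should give [0.3, 0.1, 0.2], matching the input order, while as_completed should give [0.1, 0.2, 0.3], shortest delay first.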

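Stopping coroutines

A future object has several states:

Pending, Running, Done, Cancelled

When a future is created, the task is pending; while the event loop is running it, it is running; once it finishes, it is done. To stop the event loop, the tasks have to be cancelled first. The loop's tasks can be retrieved with asyncio.Task.all_tasks() (asyncio.all_tasks() on Python 3.7 and later).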


import asyncio
import time
now = lambda :time.time()
async def do_some_work(x):
    print("Waiting:", x)
    await asyncio.sleep(x)
    return "Done after {}s".format(x)
coroutine1 = do_some_work(1)
coroutine2 = do_some_work(2)
coroutine3 = do_some_work(2)
tasks = [
    asyncio.ensure_future(coroutine1),
    asyncio.ensure_future(coroutine2),
    asyncio.ensure_future(coroutine3),
]
start = now()
loop = asyncio.get_event_loop()
try:
    loop.run_until_complete(asyncio.wait(tasks))
except KeyboardInterrupt as e:
    # asyncio.Task.all_tasks() was removed in Python 3.9;
    # asyncio.all_tasks(loop) is its replacement on Python 3.7 and later
    print(asyncio.all_tasks(loop))
    for task in asyncio.all_tasks(loop):
        print(task.cancel())
    loop.stop()
    loop.run_forever()
finally:
    loop.close()
print("Time:", now() - start)

Pressing Ctrl+C right after the event loop starts raises KeyboardInterrupt out of run_until_complete. The handler then loops over all the tasks and cancels each one. The output looks like this:


Waiting: 1
Waiting: 2
Waiting: 2
^C{<Task finished coro=<do_some_work() done, defined at /app/py_code/study_asyncio/simple_ex10.py:13> result='Done after 1s'>, <Task pending coro=<do_some_work() running at /app/py_code/study_asyncio/simple_ex10.py:15> wait_for=<Future pending cb=[Task._wakeup()]> cb=[_wait.<locals>._on_completion() at /usr/local/lib/python3.5/asyncio/tasks.py:428]>, <Task pending coro=<do_some_work() running at /app/py_code/study_asyncio/simple_ex10.py:15> wait_for=<Future pending cb=[Task._wakeup()]> cb=[_wait.<locals>._on_completion() at /usr/local/lib/python3.5/asyncio/tasks.py:428]>, <Task pending coro=<wait() running at /usr/local/lib/python3.5/asyncio/tasks.py:361> wait_for=<Future pending cb=[Task._wakeup()]>>}
False
True
True
True
Time: 1.0707225799560547

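True means the cancel succeeded. After loop.stop(), the event loop has to be started once more and only then closed; otherwise an exception is still raised.

Looping over the tasks and cancelling them one by one is one approach. But as shown earlier, the task list can be wrapped inside a main function, with the event loop driven from outside main. In that case main is the outermost task, and it is enough to deal with that wrapping main function.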
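A minimal sketch of that idea follows. It assumes Python 3.7 or later and that main waits on its children with asyncio.gather; instead of Ctrl+C, a hypothetical supervisor coroutine triggers the cancellation so that the example can run unattended.

```python
import asyncio

async def do_some_work(x):
    await asyncio.sleep(x)
    return x

async def main(children):
    # the caller passes in a list so it can inspect the child tasks afterwards
    children.extend(asyncio.ensure_future(do_some_work(d)) for d in (5, 5))
    return await asyncio.gather(*children)

async def supervisor():
    children = []
    outer = asyncio.ensure_future(main(children))
    await asyncio.sleep(0.05)  # give main time to start its children
    outer.cancel()             # a single cancel on the outermost task
    try:
        await outer
    except asyncio.CancelledError:
        pass
    return outer.cancelled(), [c.cancelled() for c in children]

if __name__ == "__main__":
    print(asyncio.run(supervisor()))
```

Cancelling the outer task cancels the gather it is waiting on, and gather in turn cancels the children that have not finished, so all three should report cancelled without looping over them by hand.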

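Event loops in different threads

Often the event loop is used to register coroutines, and some coroutines need to be added to it dynamically. A simple way to do this is with multiple threads: the current thread creates an event loop, then starts a new thread and runs the event loop there. The current thread is not blocked.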


import asyncio
from threading import Thread
import time
now = lambda :time.time()
def start_loop(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()
def more_work(x):
    print('More work {}'.format(x))
    time.sleep(x)
    print('Finished more work {}'.format(x))
start = now()
new_loop = asyncio.new_event_loop()
t = Thread(target=start_loop, args=(new_loop,))
t.start()
print('TIME: {}'.format(time.time() - start))
new_loop.call_soon_threadsafe(more_work, 6)
new_loop.call_soon_threadsafe(more_work, 3)

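After this code starts, the current thread is not blocked. The new thread runs the more_work calls registered with call_soon_threadsafe in order. Because time.sleep blocks synchronously, finishing both more_work calls takes roughly 6 + 3 seconds.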
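The sequential behaviour can be measured with a shortened variant. This is a sketch assuming Python 3.7 or later; run_demo and the log list are hypothetical additions, and unlike the listing above it also stops the loop so that the thread can be joined.

```python
import asyncio
import time
from threading import Thread

def start_loop(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()

def more_work(x, log):
    time.sleep(x)  # blocks the loop thread; nothing else runs on the loop meanwhile
    log.append(x)

def run_demo():
    log = []
    new_loop = asyncio.new_event_loop()
    t = Thread(target=start_loop, args=(new_loop,))
    t.start()
    start = time.time()
    new_loop.call_soon_threadsafe(more_work, 0.2, log)
    new_loop.call_soon_threadsafe(more_work, 0.1, log)
    # queued last, so the loop stops only after both callbacks have run
    new_loop.call_soon_threadsafe(new_loop.stop)
    t.join()
    new_loop.close()
    return log, time.time() - start

if __name__ == "__main__":
    print(run_demo())
```

The callbacks run in the order they were registered, and the elapsed time should be close to the sum of the two sleeps (about 0.3s) rather than the longer of the two.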

Coroutines in a new thread


import asyncio
import time
from threading import Thread
now = lambda :time.time()
def start_loop(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()
async def do_some_work(x):
    print('Waiting {}'.format(x))
    await asyncio.sleep(x)
    print('Done after {}s'.format(x))
def more_work(x):
    print('More work {}'.format(x))
    time.sleep(x)
    print('Finished more work {}'.format(x))
start = now()
new_loop = asyncio.new_event_loop()
t = Thread(target=start_loop, args=(new_loop,))
t.start()
print('TIME: {}'.format(time.time() - start))
asyncio.run_coroutine_threadsafe(do_some_work(6), new_loop)
asyncio.run_coroutine_threadsafe(do_some_work(4), new_loop)

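In this example the main thread creates new_loop and a child thread runs it as an endless event loop. The main thread registers new coroutine objects through run_coroutine_threadsafe, so the child thread's event loop handles them concurrently while the main thread stays unblocked. The total running time is about 6s.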
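run_coroutine_threadsafe returns a concurrent.futures.Future, so the main thread can also collect the results and then shut the loop down. The sketch below assumes Python 3.7 or later; run_demo is a hypothetical helper, the delays are shortened, and do_some_work returns its message instead of printing it.

```python
import asyncio
from threading import Thread

def start_loop(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()

async def do_some_work(x):
    await asyncio.sleep(x)
    return "Done after {}s".format(x)

def run_demo():
    new_loop = asyncio.new_event_loop()
    t = Thread(target=start_loop, args=(new_loop,))
    t.start()
    # each call returns a concurrent.futures.Future usable from this thread
    f1 = asyncio.run_coroutine_threadsafe(do_some_work(0.2), new_loop)
    f2 = asyncio.run_coroutine_threadsafe(do_some_work(0.1), new_loop)
    # result() blocks the calling thread until the coroutine has finished
    results = [f1.result(timeout=5), f2.result(timeout=5)]
    new_loop.call_soon_threadsafe(new_loop.stop)  # ask the loop thread to stop
    t.join()
    new_loop.close()
    return results

if __name__ == "__main__":
    print(run_demo())
```

Stopping the loop through call_soon_threadsafe matters here: loop.stop() should be invoked from the loop's own thread, and without it the child thread would keep the program alive forever.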
