If You Don't Know numpy, Don't Call Yourself a python Programmer

LibraryPKU 2019-07-02


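Author: 牧馬人 (this article was submitted by the author)

0. Preface

About seven or eight years ago I used pyOpenGL to draw a 3D model of the Earth's magnetopause, and that code is still running at a research institute today. Until then I had always considered myself a competent (read: excellent) python programmer who could do just about anything. But the display performance of the magnetopause model was discouraging: although the model had only a little over a hundred thousand vertices, dragging and zooming were very sluggish. In the end I cut the vertex count to about twenty thousand to balance model quality against responsiveness, and only barely delivered the job. From then on I began to doubt python's performance, and at one point even doubted whether python was still my tool of choice.

Fortunately, I later came across numpy. numpy is the fundamental package for scientific computing in python. It provides a multidimensional array object, various derived objects (masked arrays, matrices and so on), and functions and APIs for fast operations on arrays, covering mathematics, logic, shape manipulation, sorting, selection, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and more.

Only after getting to know numpy did I understand why the 3D magnetopause model had been slow: it stored its data in list (python arrays) rather than ndarray (numpy arrays). With numpy, a python programmer can write code whose speed rivals C. And only by knowing numpy can you learn to use data-processing and visualization modules such as pyOpenGL / pyOpenCV / pandas / matplotlib.

In fact, numpy's data structures, above all the array (numpy.ndarray), have become the de facto standard data structure for nearly all data-processing and visualization modules (much as python has become the go-to language in machine learning). A growing number of python-based scientific and mathematical packages use numpy arrays. Although these tools usually accept python's native arrays as arguments, they convert the input to numpy arrays before processing it, and they usually return numpy arrays as well. Within the python community, numpy keeps growing in importance and ubiquity. In other words, to use today's python-based scientific and mathematical tools efficiently (which covers most scientific computing tools), knowing python's native array type is not enough; you also need to know how to use numpy arrays.

Summary: in this era dominated by AI and ML, if you don't know numpy, don't call yourself a python programmer.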

1. list VS ndarray

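At the core of numpy is the ndarray object (the numpy array), which encapsulates n-dimensional arrays of a single data type. There are several important differences between numpy arrays and python arrays (list):

A numpy array has a fixed number of elements once it is created. Adding or removing elements of an ndarray means creating a new array and discarding the original. A python array, by contrast, can grow and shrink dynamically.
All elements of a numpy array must have the same data type, and therefore occupy the same amount of memory. python arrays have no such requirement.
numpy arrays come with methods covering a large number of mathematical operations and complex manipulations, and many of these methods have counterpart functions in the top-level numpy namespace. Compared with python arrays, numpy array methods are more powerful, run faster and lead to more concise code.

However, these differences do not capture where ndarray's real advantage lies. The essence of ndarray is two features of numpy: vectorization and broadcasting. Vectorization means the code contains no explicit loops or indexing; broadcasting means operations are applied implicitly, element by element, including between arrays of different but compatible shapes. Both ideas are a little abstract, so let's look at an example.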

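**Example** a and b are two integer arrays of equal length. Compute the array made up of the products of their corresponding elements.

Using python arrays:

c = list()
for i in range(len(a)):
    c.append(a[i]*b[i])

Using numpy arrays:

c = a*b

Doesn't this example show the power of vectorization and broadcasting? Take a moment to appreciate it!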

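Summary:

Vectorized code is more concise and easier to read
Fewer lines of code generally mean fewer bugs
The code more closely resembles standard mathematical notation
Vectorized code is more pythonic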
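To make the two ideas concrete, here is a minimal sketch. The sample values and the variable names (c_loop, c_vec, scaled, table) are made up for illustration only.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Vectorization: one expression replaces the explicit loop.
c_loop = [x * y for x, y in zip(a.tolist(), b.tolist())]
c_vec = a * b

# Broadcasting: the scalar 2 is stretched to the shape of a.
scaled = a * 2

# Broadcasting: a (3, 1) column combined with a (4,) row gives a (3, 4) table.
col = np.array([[0], [10], [20]])
table = col + a

print(c_vec)
print(scaled)
print(table.shape)
```

The loop and the vectorized expression produce the same numbers; the difference is that the numpy version pushes the loop down into compiled code.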

2. dtype AND shape

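As the Master says: get to know a partner's character before dating, and get to know an object's attributes before learning it. An ndarray object has many attributes, listed in the table below.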

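| Attribute | Description |
| --- | --- |
| ndarray.dtype | Element type |
| ndarray.shape | Structure of the array (its size along each axis) |
| ndarray.ndim | Rank, i.e. the number of axes or dimensions |
| ndarray.size | Total number of elements |
| ndarray.itemsize | Size of each element, in bytes |
| ndarray.flags | Memory layout information |
| ndarray.real | Real part of the elements |
| ndarray.imag | Imaginary part of the elements |
| ndarray.data | Buffer holding the actual array elements |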

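For the following three reasons, I consider dtype and shape the two most important attributes of ndarray, so important that the others can almost be ignored.

Nearly every pit we have fallen into was dug by dtype
Nearly all our confusion comes from shape not being what we expected
Much of our work consists of changing shape

ndarray.astype() changes the element type (it returns a converted copy), and ndarray.reshape() redefines the structure of the array. These two methods are just as important as the attributes they correspond to. Remember the two attributes and their two methods, and you can consider yourself initiated. To learn about the element types numpy supports, see 《數(shù)學(xué)建模三劍客MSN》.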
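A small sketch of the two methods, together with one typical dtype pit. The uint8 overflow case is my own choice of illustration, not an example from the original article.

```python
import numpy as np

a = np.arange(6)                 # integer dtype (the exact width depends on the platform)
f = a.astype(np.float64)         # a new array with the elements converted to float64
m = a.reshape(2, 3)              # the same data seen as 2 rows x 3 columns
n = a.reshape(3, -1)             # -1 asks numpy to infer the remaining axis

# A typical dtype pit: uint8 arithmetic wraps around modulo 256.
u = np.array([200, 100], dtype=np.uint8)
wrapped = u + u                  # the result stays uint8
safe = u.astype(np.int32) + u    # widen first to get the arithmetic sum

print(f.dtype, m.shape, n.shape)
print(wrapped, safe)
```

Checking dtype and shape right after creating or loading an array tends to catch this kind of surprise early.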

3. Creating Arrays

(1) Creating simple arrays

numpy.array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)
numpy.empty(shape, dtype=float, order='C')
numpy.zeros(shape, dtype=float, order='C')
numpy.ones(shape, dtype=float, order='C')
numpy.eye(N, M=None, k=0, dtype=float, order='C')

Examples:

>>> import numpy as np
>>> np.array([1, 2, 3])
array([1, 2, 3])
>>> np.empty((2, 3))    # uninitialized: the values are whatever was in memory
array([[2.12199579e-314, 6.36598737e-314, 1.06099790e-313],
       [1.48539705e-313, 1.90979621e-313, 2.33419537e-313]])
>>> np.zeros(2)
array([0., 0.])
>>> np.ones(2)
array([1., 1.])
>>> np.eye(3)
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

(2) Creating random arrays

numpy.random.random(size=None)
numpy.random.randint(low, high=None, size=None, dtype='l')

Examples:

>>> np.random.random(3)
array([0.29334156, 0.45858765, 0.99297047])
>>> np.random.randint(2, size=10)
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0])
>>> np.random.randint(5, size=(2, 4))
array([[4, 0, 2, 1],
       [3, 2, 2, 0]])
>>> np.random.randint(3,10,(2,4))
array([[4, 8, 9, 6],
       [7, 7, 7, 9]])

(3) Creating arrays over a numeric range

numpy.arange(start, stop, step, dtype=None)
numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)
numpy.logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None)

Examples:

>>> np.arange(5)
array([0, 1, 2, 3, 4])
>>> np.arange(0,5,2)
array([0, 2, 4])
>>> np.linspace(0, 5, 5)
array([0.  , 1.25, 2.5 , 3.75, 5.  ])
>>> np.linspace(0, 5, 5, endpoint=False)
array([0., 1., 2., 3., 4.])
>>> np.logspace(1,3,3)
array([  10.,  100., 1000.])
>>> np.logspace(1, 3, 3, endpoint=False)
array([ 10.        ,  46.41588834, 215.443469  ])

(4) Creating arrays from existing arrays

numpy.asarray(a, dtype=None, order=None)
numpy.empty_like(a, dtype=None, order='K', subok=True)
numpy.zeros_like(a, dtype=None, order='K', subok=True)
numpy.ones_like(a, dtype=None, order='K', subok=True)

Examples:

>>> np.asarray([1,2,3])
array([1, 2, 3])
>>> np.empty_like(np.asarray([1,2,3]))    # uninitialized: the values shown are arbitrary
array([0, 0, 0])
>>> np.zeros_like(np.asarray([1,2,3]))
array([0, 0, 0])
>>> np.ones_like(np.asarray([1,2,3]))
array([1, 1, 1])

(5) Constructing more complex arrays

[1] Repeating an array: tile

>>> a = np.arange(3)
>>> a
array([0, 1, 2])
>>> np.tile(a, 2)
array([0, 1, 2, 0, 1, 2])
>>> np.tile(a, (2,3))
array([[0, 1, 2, 0, 1, 2, 0, 1, 2],
       [0, 1, 2, 0, 1, 2, 0, 1, 2]])

[2] Repeating elements: repeat

>>> a = np.arange(3)
>>> a
array([0, 1, 2])
>>> a.repeat(2)
array([0, 0, 1, 1, 2, 2])

[3] Turning one-dimensional arrays into a grid: meshgrid

>>> lon = np.arange(30, 120, 10)
>>> lon
array([ 30,  40,  50,  60,  70,  80,  90, 100, 110])
>>> lat = np.arange(10, 50, 10)
>>> lat
array([10, 20, 30, 40])
>>> lons, lats = np.meshgrid(lon, lat)
>>> lons
array([[ 30,  40,  50,  60,  70,  80,  90, 100, 110],
       [ 30,  40,  50,  60,  70,  80,  90, 100, 110],
       [ 30,  40,  50,  60,  70,  80,  90, 100, 110],
       [ 30,  40,  50,  60,  70,  80,  90, 100, 110]])
>>> lats
array([[10, 10, 10, 10, 10, 10, 10, 10, 10],
       [20, 20, 20, 20, 20, 20, 20, 20, 20],
       [30, 30, 30, 30, 30, 30, 30, 30, 30],
       [40, 40, 40, 40, 40, 40, 40, 40, 40]])
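Why bother with a grid? A sketch of the usual reason: once lons and lats have the same shape, a value for every grid point can be computed in a single vectorized expression. The formula for field below is invented purely for illustration and is not a physical model.

```python
import numpy as np

lon = np.arange(30, 120, 10)     # 9 longitudes
lat = np.arange(10, 50, 10)      # 4 latitudes
lons, lats = np.meshgrid(lon, lat)

# One expression evaluates the (made-up) formula at all 36 grid points.
field = lons + 0.5 * lats

print(lons.shape, lats.shape)    # rows follow lat, columns follow lon
print(field[0, 0], field[-1, -1])
```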

[4] Gridding with a specified range and step: mgrid

>>> lats, lons= np.mgrid[10:50:10, 30:120:10]
>>> lats
array([[10, 10, 10, 10, 10, 10, 10, 10, 10],
       [20, 20, 20, 20, 20, 20, 20, 20, 20],
       [30, 30, 30, 30, 30, 30, 30, 30, 30],
       [40, 40, 40, 40, 40, 40, 40, 40, 40]])
>>> lons
array([[ 30,  40,  50,  60,  70,  80,  90, 100, 110],
       [ 30,  40,  50,  60,  70,  80,  90, 100, 110],
       [ 30,  40,  50,  60,  70,  80,  90, 100, 110],
       [ 30,  40,  50,  60,  70,  80,  90, 100, 110]])
>>> lats, lons = np.mgrid[10:50:5j, 30:120:10j]
>>> lats
array([[10., 10., 10., 10., 10., 10., 10., 10., 10., 10.],
       [20., 20., 20., 20., 20., 20., 20., 20., 20., 20.],
       [30., 30., 30., 30., 30., 30., 30., 30., 30., 30.],
       [40., 40., 40., 40., 40., 40., 40., 40., 40., 40.],
       [50., 50., 50., 50., 50., 50., 50., 50., 50., 50.]])
>>> lons
array([[ 30.,  40.,  50.,  60.,  70.,  80.,  90., 100., 110., 120.],
       [ 30.,  40.,  50.,  60.,  70.,  80.,  90., 100., 110., 120.],
       [ 30.,  40.,  50.,  60.,  70.,  80.,  90., 100., 110., 120.],
       [ 30.,  40.,  50.,  60.,  70.,  80.,  90., 100., 110., 120.],
       [ 30.,  40.,  50.,  60.,  70.,  80.,  90., 100., 110., 120.]])

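The second mgrid example above uses imaginary numbers: when the step is a complex number such as 5j, it is read as the number of points to generate, and the stop value is included. A complex number can be constructed as follows:

>>> complex(2,5)
(2+5j)

4. Array Operations

(1) Slicing and indexing

For one-dimensional arrays, indexing and slicing in numpy work just as they do for python's list, and are even more flexible.

>>> a = np.arange(9)
>>> a[-1]     # the last element
8
>>> a[2:5]    # elements at indices 2, 3 and 4 (the stop index is excluded)
array([2, 3, 4])
>>> a[:7:3]   # from index 0 up to, but not including, index 7, in steps of 3
array([0, 3, 6])
>>> a[::-1]   # the array in reverse order
array([8, 7, 6, 5, 4, 3, 2, 1, 0])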

Suppose there is a two-story building, and on each floor the rooms are laid out in 3 rows and 4 columns. We can then use a three-dimensional array to hold the number of occupants of each room (or any other numeric information, such as room area).

>>> a = np.arange(24).reshape(2,3,4)    # 2 floors, 3 rows, 4 columns
>>> a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
>>> a[1][2][3]                          # this works
23
>>> a[1,2,3]                            # but this is the canonical form
23
>>> a[:,0,0]                            # row 1, column 1 on every floor
array([ 0, 12])
>>> a[0,:,:]                            # every room on floor 1; equivalent to a[0] or a[0,...]
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> a[:,:,1:3]                          # columns 2 and 3 of every row on every floor
array([[[ 1,  2],
        [ 5,  6],
        [ 9, 10]],

       [[13, 14],
        [17, 18],
        [21, 22]]])
>>> a[1,:,-1]                           # the last room of each row on floor 2
array([15, 19, 23])

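Tip: the result of slicing or indexing a multidimensional array does not have a fixed number of dimensions. Each axis indexed with an integer is dropped, while each axis indexed with a slice is kept.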
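The tip is easiest to see by printing shapes. A short sketch using the same 2 x 3 x 4 building array:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

print(a[1, 2, 3].ndim)     # three integers leave a scalar with 0 dimensions
print(a[1, :, -1].shape)   # two integers drop two axes
print(a[0].shape)          # one integer drops one axis
print(a[0:1].shape)        # a slice keeps its axis, even at length 1
print(a[:, :, 1:3].shape)  # slices only: still three-dimensional
```

When a shape is not what you expected, checking whether an integer or a slice was used on each axis is usually the quickest way to find out why.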

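(2) Changing the structure of an array

The storage order of a numpy array is independent of its dimensions, so changing the dimensions is a cheap operation. With the exception of resize(), operations of this kind do not alter the array they are applied to or its storage order.

>>> a = np.array([[1,2,3],[4,5,6]])
>>> a.shape                  # the array's dimensions
(2, 3)
>>> a.reshape(3,2)           # returns an array with 3 rows and 2 columns
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> a.ravel()                # returns a one-dimensional array
array([1, 2, 3, 4, 5, 6])
>>> a.transpose()            # rows become columns (like a matrix transpose)
array([[1, 4],
       [2, 5],
       [3, 6]])
>>> a.resize((3,2))          # like reshape, but modifies the array itself
>>> a
array([[1, 2],
       [3, 4],
       [5, 6]])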

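np.rollaxis() changes the order of the axes by rolling one axis to a new position. It returns a new array object that is a view of the original data. Usage:

numpy.rollaxis(a, axis, start=0)

a: the array
axis: the axis to roll. The relative order of the other axes is unchanged
start: the axis is rolled until it lies before this position. The default is 0

Examples:

>>> a = np.ones((3,4,5,6))
>>> np.rollaxis(a, 3, 1).shape
(3, 6, 4, 5)
>>> np.rollaxis(a, 2).shape
(5, 3, 4, 6)
>>> np.rollaxis(a, 1, 4).shape
(3, 5, 6, 4)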
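The numpy documentation keeps rollaxis for backward compatibility and suggests np.moveaxis as the preferred alternative, since its second argument is simply the final position of the axis. A sketch that reproduces the three results above with moveaxis:

```python
import numpy as np

a = np.ones((3, 4, 5, 6))

# moveaxis(a, source, destination): destination is the final index of the axis.
print(np.moveaxis(a, 3, 1).shape)   # same result as rollaxis(a, 3, 1)
print(np.moveaxis(a, 2, 0).shape)   # same result as rollaxis(a, 2)
print(np.moveaxis(a, 1, 3).shape)   # same result as rollaxis(a, 1, 4)

# Both functions return views: no element data is copied.
rolled = np.rollaxis(a, 2)
print(np.shares_memory(a, rolled))
```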

(3) Merging arrays

[1] append

For programmers new to numpy, the biggest puzzle is that append() can no longer be used to add elements to an array; the method seems to have vanished altogether. In fact numpy does keep append(), only it is no longer a method of the array. It has been promoted to a function in the top-level numpy namespace, and instead of appending elements in place it merges arrays and returns the result as a new array.

>>> np.append([1, 2, 3], [[4, 5, 6], [7, 8, 9]])
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.append([[1, 2, 3]], [[4, 5, 6]], axis=0)
array([[1, 2, 3],
       [4, 5, 6]])
>>> np.append(np.array([[1, 2, 3]]), np.array([[4, 5, 6]]), axis=1)
array([[1, 2, 3, 4, 5, 6]])
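Because np.append() always builds a new array, calling it inside a loop copies the data again on every pass. The sketch below shows two common alternatives; the names squares and filled are hypothetical, and which approach suits best depends on the situation.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.append(a, 4)          # a new array; a itself is untouched
print(a, b, b is a)

# Alternative 1: collect into a python list and convert once at the end.
squares = []
for i in range(5):
    squares.append(i * i)
squares = np.array(squares)

# Alternative 2: when the final size is known, preallocate and fill.
filled = np.empty(5, dtype=np.int64)
for i in range(5):
    filled[i] = i * i

print(squares, filled)
```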

[2] concatenate

concatenate() is used much like append(), except that the arrays to be merged are passed together as a tuple.

>>> a = np.array([[1, 2], [3, 4]])
>>> b = np.array([[5, 6]])
>>> np.concatenate((a, b), axis=0)
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> np.concatenate((a, b.T), axis=1)
array([[1, 2, 5],
       [3, 4, 6]])
>>> np.concatenate((a, b), axis=None)
array([1, 2, 3, 4, 5, 6])

[3] stack

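Besides append() and concatenate(), arrays can be merged more directly by horizontal stacking (hstack), vertical stacking (vstack), depth stacking (dstack) and so on. If you are even lazier than I am, just use stack; it is enough.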

>>> a = np.arange(9).reshape(3,3)
>>> b = np.arange(9,18).reshape(3,3)
>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> b
array([[ 9, 10, 11],
       [12, 13, 14],
       [15, 16, 17]])
>>> np.hstack((a,b))                        # horizontal stacking
array([[ 0,  1,  2,  9, 10, 11],
       [ 3,  4,  5, 12, 13, 14],
       [ 6,  7,  8, 15, 16, 17]])
>>> np.vstack((a,b))                        # vertical stacking
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14],
       [15, 16, 17]])
>>> np.dstack((a,b))                        # depth stacking
array([[[ 0,  9],
        [ 1, 10],
        [ 2, 11]],


       [[ 3, 12],
        [ 4, 13],
        [ 5, 14]],


       [[ 6, 15],
        [ 7, 16],
        [ 8, 17]]])
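The differences within the stack family come down to which axis the inputs are joined along. A sketch comparing the resulting shapes, including np.stack(), which always creates a new axis:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = np.arange(9, 18).reshape(3, 3)

print(np.hstack((a, b)).shape)          # joined along the existing axis 1
print(np.vstack((a, b)).shape)          # joined along the existing axis 0
print(np.dstack((a, b)).shape)          # joined along a new third axis
print(np.stack((a, b)).shape)           # new leading axis by default
print(np.stack((a, b), axis=-1).shape)  # new trailing axis, like dstack for 2-D input
```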

(4) Splitting arrays

Splitting is the reverse of merging. The idea is the same, with one small difference: the split functions return a list of sub-arrays.

>>> a = np.arange(4).reshape(2,2)
>>> a
array([[0, 1],
       [2, 3]])
>>> x, y = np.hsplit(a, 2)    # horizontal split, returns a list
>>> x
array([[0],
       [2]])
>>> y
array([[1],
       [3]])
>>> x, y = np.vsplit(a, 2)    # vertical split, returns a list
>>> x
array([[0, 1]])
>>> y
array([[2, 3]])
>>> a = np.arange(8).reshape(2,2,2)
>>> a
array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])
>>> x,y = np.dsplit(a, 2)     # depth split, returns a list
>>> x
array([[[0],
        [2]],

       [[4],
        [6]]])
>>> y
array([[[1],
        [3]],

       [[5],
        [7]]])

(5) Sorting arrays

Sorting is not numpy's strongest point, yet the sort speed of python arrays still cannot come close to it.

[1] numpy.sort()

numpy.sort() returns a sorted copy of the input array.

numpy.sort(a, axis=-1, kind='quicksort', order=None)

a: the array to sort
axis: the axis along which to sort. If None, the array is flattened before sorting; the default, -1, sorts along the last axis
kind: the sorting algorithm. The default is 'quicksort'; other options include 'mergesort' and 'heapsort'
order: if the array has fields, the field to sort by

Examples:

>>> a = np.array([3, 1, 2])
>>> np.sort(a)
array([1, 2, 3])
>>> dt = np.dtype([('name', 'S10'),('age', int)])
>>> a = np.array([('raju',21),('anil',25),('ravi', 17), ('amar',27)], dtype = dt)
>>> a
array([(b'raju', 21), (b'anil', 25), (b'ravi', 17), (b'amar', 27)],
      dtype=[('name', 'S10'), ('age', '<i4')])
>>> np.sort(a, order='name')
array([(b'amar', 27), (b'anil', 25), (b'raju', 21), (b'ravi', 17)],
      dtype=[('name', 'S10'), ('age', '<i4')])

[2] numpy.argsort()

Returns the indices that would sort the array in ascending order.

numpy.argsort(a, axis=-1, kind='quicksort', order=None)

a: the array to sort
axis: the axis along which to sort. If None, the array is flattened before sorting; the default, -1, sorts along the last axis
kind: the sorting algorithm. The default is 'quicksort'; other options include 'mergesort' and 'heapsort'
order: if the array has fields, the field to sort by

Examples:

>>> a = np.array([3, 1, 2])
>>> np.argsort(a)
array([1, 2, 0], dtype=int64)
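The indices from argsort() are most useful for reordering other arrays in step with the sorted one. A minimal sketch, with made-up sample data that loosely mirrors the names and ages used above:

```python
import numpy as np

# Hypothetical sample data for illustration.
names = np.array(['raju', 'anil', 'ravi', 'amar'])
ages = np.array([21, 25, 17, 27])

order = np.argsort(ages)            # indices that sort ages in ascending order
print(order)
print(names[order])                 # names from youngest to oldest
print(ages[order])

oldest_two = order[::-1][:2]        # reverse for descending order, keep the first two
print(names[oldest_two])
```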

(6) Searching and filtering

[1] Indices of the maximum and minimum values of an array

numpy.argmax(a, axis=None, out=None)
numpy.argmin(a, axis=None, out=None)

[2] Indices of the non-zero elements of an array

numpy.nonzero(a)

[3] Indices of the elements that satisfy a given condition

numpy.where(condition[, x, y])

Examples:

>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.where(a < 5)
(array([0, 1, 2, 3, 4], dtype=int64),)
>>> a = a.reshape((2, -1))
>>> a
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> np.where(a < 5)
(array([0, 0, 0, 0, 0], dtype=int64), array([0, 1, 2, 3, 4], dtype=int64))
>>> np.where(a < 5, a, 10*a)    # with x and y: take from a where the condition holds, else from 10*a
array([[ 0,  1,  2,  3,  4],
       [50, 60, 70, 80, 90]])
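argmax(), argmin() and nonzero() are listed above without examples, so here is a small sketch. The sample array is made up; np.unravel_index() is used to turn the flat index from argmax() back into a (row, column) pair.

```python
import numpy as np

# Hypothetical sample data for illustration.
a = np.array([[3, 0, 7],
              [9, 0, 1]])

flat_max = np.argmax(a)                       # index into the flattened array
row, col = np.unravel_index(flat_max, a.shape)
print(flat_max, (row, col))

print(np.argmax(a, axis=0))                   # row index of the maximum in each column
print(np.argmin(a, axis=1))                   # column index of the minimum in each row

rows, cols = np.nonzero(a)                    # one index array per axis
print(rows, cols)
print(a[rows, cols])                          # the non-zero values themselves
```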

[4] Elements selected by a boolean array of the same shape

numpy.extract(condition, arr)

Examples:

>>> a = np.arange(12).reshape((3, 4))
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> condition = np.mod(a, 3)==0
>>> condition
array([[ True, False, False,  True],
       [False, False,  True, False],
       [False,  True, False, False]])
>>> np.extract(condition, a)
array([0, 3, 6, 9])
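For a boolean condition of the same shape, indexing the array directly with the condition gives the same elements as extract(), and the same mask can also be used to assign values. A short sketch (the copy b and the threshold 8 are arbitrary choices for illustration):

```python
import numpy as np

a = np.arange(12).reshape((3, 4))
condition = np.mod(a, 3) == 0

picked = a[condition]                    # boolean mask indexing returns a 1-D array
print(picked)
print(np.array_equal(picked, np.extract(condition, a)))

# A mask on the left-hand side assigns to the selected elements in place.
b = a.copy()
b[b > 8] = -1
print(b)
```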

(7) Adding and removing elements

[1] Insert values into the input array along the given axis, before the given index, and return a new array

numpy.insert(arr, obj, values, axis=None)

Examples:

>>> a = np.array([[1, 1], [2, 2], [3, 3]])
>>> a
array([[1, 1],
       [2, 2],
       [3, 3]])
>>> np.insert(a, 1, 5)
array([1, 5, 1, 2, 2, 3, 3])
>>> np.insert(a, 1, 5, axis=0)
array([[1, 1],
       [5, 5],
       [2, 2],
       [3, 3]])
>>> np.insert(a, 1, [5,7], axis=0)
array([[1, 1],
       [5, 7],
       [2, 2],
       [3, 3]])
>>> np.insert(a, 1, 5, axis=1)
array([[1, 5, 1],
       [2, 5, 2],
       [3, 5, 3]])

[2] Delete the sub-arrays at the given index along the given axis, and return a new array

numpy.delete(arr, obj, axis=None)

Examples:

>>> a = np.array([[1, 2], [3, 4], [5, 6]])
>>> a
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> np.delete(a, 1)
array([1, 3, 4, 5, 6])
>>> np.delete(a, 1, axis=0)
array([[1, 2],
       [5, 6]])
>>> np.delete(a, 1, axis=1)
array([[1],
       [3],
       [5]])

[3] Removing duplicate elements

numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None)

ar: the input array; unless axis is given, it is flattened if it is not one-dimensional
return_index: if True, also return the positions (indices) in the original array of the elements of the new array
return_inverse: if True, also return the positions (indices) in the new array of the elements of the original array
return_counts: if True, also return how many times each element of the deduplicated array occurs in the original array

Examples:

>>> a = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]])
>>> np.unique(a)
array([0, 1, 2, 3, 4])
>>> np.unique(a, axis=0)
array([[1, 0, 0],
       [2, 3, 4]])
>>> u, indices = np.unique(a, return_index=True)
>>> u
array([0, 1, 2, 3, 4])
>>> indices
array([1, 0, 6, 7, 8], dtype=int64)
>>> u, indices = np.unique(a, return_inverse=True)
>>> u
array([0, 1, 2, 3, 4])
>>> indices
array([1, 0, 0, 1, 0, 0, 2, 3, 4], dtype=int64)
>>> u, num = np.unique(a, return_counts=True)
>>> u
array([0, 1, 2, 3, 4])
>>> num
array([4, 2, 1, 1, 1], dtype=int64)

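(8) Array I/O

numpy introduces a binary file format for ndarray objects that stores the data, shape, dtype and other information needed to reconstruct the ndarray. A .npy file stores a single array; a .npz file stores multiple arrays.

[1] Saving a single array to a file

numpy.save(file, arr, allow_pickle=True, fix_imports=True)

file: the file to save to, with the extension .npy; if the path does not end in .npy, the extension is appended automatically
arr: the array to save
allow_pickle: optional, boolean. Allows object arrays to be saved using python pickles; pickle serializes objects before they are written to disk and deserializes them after they are read back
fix_imports: optional. Makes it easier for python2 to read data saved by python3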

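[2] Saving multiple arrays to a file

numpy.savez() writes multiple arrays to a single file. By default the arrays are stored in uncompressed raw binary format in a file with the extension .npz.

numpy.savez(file, *args, **kwds)

file: the file to save to, with the extension .npz; if the path does not end in .npz, the extension is appended automatically
args: the arrays to save. Keyword arguments can be used to give arrays a name; arrays passed as non-keyword arguments are automatically named arr_0, arr_1, …
kwds: the arrays to save under the given keyword names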

[3] Loading arrays from a file

numpy.load(file, mmap_mode=None, allow_pickle=True, fix_imports=True, encoding='ASCII')

file: a file-like object (supporting the seek() and read() methods) or the path of the file to read
mmap_mode: memory-map mode, None | 'r+' | 'r' | 'w+' | 'c'
allow_pickle: optional, boolean. Allows object arrays stored as python pickles to be loaded. Note that since NumPy 1.16.3 the default is False, for security reasons
fix_imports: optional. Helps python3 read pickled data that was saved by python2
encoding: the encoding used when reading python2 strings, 'latin1' | 'ASCII' | 'bytes'

Examples:

import numpy as np

a = np.array([[1,2,3],[4,5,6]])
b = np.arange(0, 1.0, 0.1)
c = np.sin(b)
# c is passed as the keyword argument sin_array
np.savez('runoob.npz', a, b, sin_array = c)
r = np.load('runoob.npz')
print(r.files)         # the names of the stored arrays
print(r['arr_0'])      # array a
print(r['arr_1'])      # array b
print(r['sin_array'])  # array c

[4] Reading and writing arrays as text files

numpy also supports storing data in text files. savetxt() stores data in a simple text format, and loadtxt() reads it back.

Examples:

import numpy as np

a = np.array([1,2,3,4,5])
np.savetxt('out.txt', a)
b = np.loadtxt('out.txt')
print(b)
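A round-trip sketch that exercises save(), savez(), load(), savetxt() and loadtxt() together. The file names and the keyword b_named are hypothetical, and the temporary directory is only there so that the sketch leaves no files behind.

```python
import os
import tempfile

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.arange(0, 1.0, 0.1)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, 'single.npy')     # hypothetical file names
    npz_path = os.path.join(tmp, 'several.npz')
    txt_path = os.path.join(tmp, 'table.txt')

    np.save(npy_path, a)                           # one array, binary
    a_back = np.load(npy_path)

    np.savez(npz_path, a, b_named=b)               # several arrays, binary
    with np.load(npz_path) as archive:             # the file is closed on exit
        names = sorted(archive.files)
        b_back = archive['b_named']

    np.savetxt(txt_path, a, fmt='%d', delimiter=',')   # text, human readable
    t_back = np.loadtxt(txt_path, delimiter=',', dtype=int)

print(a_back.dtype == a.dtype, np.array_equal(a_back, a))
print(names)
print(np.allclose(b_back, b))
print(t_back)
```

The binary formats keep dtype and shape exactly, whereas the text format has to be told the dtype again when it is read back.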

5. Common Functions

(1) Rounding functions

[1] Rounding to the nearest value

numpy.around(a, decimals=0, out=None)

Examples:


>>> np.around([-0.42, -1.68, 0.37, 1.64])
array([-0., -2.,  0.,  2.])
>>> np.around([-0.42, -1.68, 0.37, 1.64], decimals=1)
array([-0.4, -1.7,  0.4,  1.6])
>>> np.around([.5, 1.5, 2.5, 3.5, 4.5]) # rounds to nearest even value
array([ 0.,  2.,  2.,  4.,  4.])

[2] Rounding down and rounding up

numpy.floor(a)
numpy.ceil(a)

Examples:

>>> np.floor([-0.42, -1.68, 0.37, 1.64])
array([-1., -2.,  0.,  1.])
>>> np.ceil([-0.42, -1.68, 0.37, 1.64])
array([-0., -1.,  1.,  2.])

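(2) Mathematical functions

| Function | Description |
| --- | --- |
| numpy.deg2rad() / numpy.radians() | Degrees to radians |
| numpy.rad2deg() / numpy.degrees() | Radians to degrees |
| numpy.sin() | Sine |
| numpy.arcsin() | Inverse sine |
| numpy.cos() | Cosine |
| numpy.arccos() | Inverse cosine |
| numpy.tan() | Tangent |
| numpy.arctan() | Inverse tangent |
| numpy.hypot() | Hypotenuse of a right triangle |
| numpy.square() | Square |
| numpy.sqrt() | Square root |
| numpy.power() | Power |
| numpy.exp() | Exponential |
| numpy.log() | Natural logarithm |
| numpy.log2() | Base-2 logarithm |
| numpy.log10() | Base-10 logarithm |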
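A few of these functions in action. The input values are chosen only so that the results are easy to check by hand.

```python
import numpy as np

angles_deg = np.array([0, 30, 90])
angles_rad = np.deg2rad(angles_deg)
print(np.round(np.sin(angles_rad), 6))      # sine of 0, 30 and 90 degrees

print(np.hypot(3, 4))                       # hypotenuse of a 3-4 right triangle
print(np.power([1, 2, 3], 2), np.square([1, 2, 3]))

x = np.array([1.0, 8.0, 100.0])
print(np.log2(x), np.log10(x))
print(np.allclose(np.log(np.exp(x)), x))    # log() is the natural logarithm
```

All of these work element by element on whole arrays, so no loop is needed.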

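(3) Statistical functions

| Function | Description |
| --- | --- |
| numpy.sum(a[, axis, dtype, out, keepdims]) | Sum of the elements along the given axis |
| numpy.nansum(a[, axis, dtype, out, keepdims]) | Sum of the elements along the given axis, treating numpy.nan as 0 |
| numpy.cumsum(a[, axis, dtype, out]) | Cumulative sum of the elements along the given axis |
| numpy.prod(a[, axis, dtype, out, keepdims]) | Product of the elements along the given axis |
| numpy.diff(a[, n, axis]) | Differences between adjacent elements |
| numpy.ptp() | Difference between the maximum and minimum elements |
| numpy.var() | Variance |
| numpy.std() | Standard deviation |
| numpy.median() | Median of the elements |
| numpy.mean(a, axis=None, dtype=None, out=None, keepdims=<no value>) | Arithmetic mean of the elements |
| numpy.average() | Weighted average of the elements, given an array of weights |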
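The axis argument is what trips most people up with these functions, so here is a small sketch. The sample data, including the scores and weights, is made up for illustration.

```python
import numpy as np

# Hypothetical sample data for illustration.
a = np.array([[1.0, 2.0, 3.0],
              [4.0, np.nan, 6.0]])

print(np.sum(a, axis=0))        # column sums; nan propagates into the result
print(np.nansum(a, axis=0))     # column sums with nan counted as 0
print(np.nansum(a, axis=1))     # row sums with nan counted as 0

b = np.array([1, 4, 9, 16])
print(np.cumsum(b), np.diff(b), np.ptp(b))

scores = np.array([60.0, 80.0, 100.0])
weights = np.array([1, 1, 2])
print(np.mean(scores), np.average(scores, weights=weights))
```

With axis=0 the operation runs down the rows and yields one value per column; with axis=1 it runs across the columns and yields one value per row.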

6. A Quick Test

**Example** vertices is a set of random points in three-dimensional space and p is a point in that space. Find the point in vertices that is closest to p, and compute the distance between them.

Using python arrays:

import math

vertices = [[3,4,5], [7,8,9], [4,9,3]]
p = [2,7,4]
d = list()
for v in vertices:
    d.append(math.sqrt(math.pow(v[0]-p[0], 2)+math.pow(v[1]-p[1], 2)+math.pow(v[2]-p[2], 2)))
print(vertices[d.index(min(d))], min(d))

Using numpy arrays:


import numpy as np
vertices = np.array([[3,4,5], [7,8,9], [4,9,3]])
p = np.array([2,7,4])
d = np.sqrt(np.sum(np.square((vertices-p)), axis=1))
print(vertices[d.argmin()], d.min())

Generate 1000 points at random and compare the efficiency of the two approaches.
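One possible way to run that comparison is sketched below. The function names nearest_list and nearest_numpy, the seed and the repeat count of 200 are my own choices; the timings depend on the machine, so no particular speed-up is claimed here.

```python
import math
import timeit

import numpy as np

rng = np.random.default_rng(0)                 # fixed seed, an arbitrary choice
vertices = rng.random((1000, 3))
p = rng.random(3)
vertices_list = vertices.tolist()              # the same points as python lists
p_list = p.tolist()

def nearest_list():
    """Pure python: loop over the points and track the minimum distance."""
    d = [math.sqrt((v[0]-p_list[0])**2 + (v[1]-p_list[1])**2 + (v[2]-p_list[2])**2)
         for v in vertices_list]
    i = d.index(min(d))
    return i, d[i]

def nearest_numpy():
    """numpy: broadcast p over all points and reduce along axis 1."""
    d = np.sqrt(np.sum(np.square(vertices - p), axis=1))
    i = d.argmin()
    return i, d[i]

i_list, d_list = nearest_list()
i_np, d_np = nearest_numpy()
print(i_list == i_np, math.isclose(d_list, d_np))

t_list = timeit.timeit(nearest_list, number=200)
t_np = timeit.timeit(nearest_numpy, number=200)
print(f'list: {t_list:.4f}s  numpy: {t_np:.4f}s')
```

Both functions should agree on the nearest point; the printed times then show how the two approaches compare on your own machine.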

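[About the author]

許向武: CEO of 山東遠思信息科技有限公司, known online as 牧碼人 (天元浪子). A native of the old state of Qi and a descendant of Taigong. He left home young to make his own way in the world and later withdrew to 華不注山. He makes his living tapping on a keyboard and now and then plays in online game rooms, where he excels at giving away money and points, much to the delight of his fellow players.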