Commit 3fc3c43

Add cypy notebook
1 parent 7955cd5 commit 3fc3c43

2 files changed: +375 -4 lines changed

source-code/README.md

Lines changed: 7 additions & 4 deletions
@@ -5,10 +5,13 @@ Sample code for performing computations on a GPU.
 
 ## What is it?
 
-1. `pycuda.ipynb`: jupyter notebook illustrating pyCUDA.
-1. `curand.ipynb`: jupyter notebook illustrating generating random
+1. `pycuda.ipynb`: Jupyter notebook illustrating pyCUDA.
+1. `curand.ipynb`: Jupyter notebook illustrating generating random
 numbers on a GPU.
-1. `scikit_cuda.ipynb`: jupyter notebook illustrating linear algebra
+1. `scikit_cuda.ipynb`: Jupyter notebook illustrating linear algebra
 on a GPU device.
-1. `numba.ipynb`: jupyter notebook illustrating using numba for
+1. `numba.ipynb`: Jupyter notebook illustrating using numba for
 GPU computing.
+
+1. `cupy.ipynb`: Jupyter notebook illustrating some aspects of
+the `cupy` package.

source-code/cupy.ipynb

Lines changed: 368 additions & 0 deletions
@@ -0,0 +1,368 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "fc532d4b-d164-4aaf-b70b-ae1f0b9d30a3",
+   "metadata": {},
+   "source": [
+    "## Requirements"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "id": "c5430b2b-a033-4d85-a76a-eb9c958eb66d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import cupy as cp\n",
+    "import numpy as np\n",
+    "import scipy"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7f0a018f-5bb9-4c47-a7d5-07941b175cd6",
+   "metadata": {},
+   "source": [
+    "## Create data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "ed36c851-e698-4c1d-9256-b57808feee78",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "nr_rows, nr_cols = 2_000, 2_000"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "074bef9c-ad2e-46ae-8aa9-373fb1f1e8db",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "A = np.random.uniform(size=(nr_rows, nr_cols))\n",
+    "B = np.random.uniform(size=A.shape)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c2627a05-de0f-476e-b055-1a82b01b9cf4",
+   "metadata": {},
+   "source": [
+    "## Matrix-matrix multiplication"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "ef334f46-3e67-425a-b4b2-8d269aaa8ba8",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 975 ms, sys: 1.03 s, total: 2.01 s\n",
+      "Wall time: 127 ms\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "C = A@B"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "9cd7c352-e8d6-4066-8aeb-1f9c05f137e4",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 427 ms, sys: 2.04 s, total: 2.47 s\n",
+      "Wall time: 1.26 s\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "A_dev = cp.array(A, copy=True)\n",
+    "B_dev = cp.array(B, copy=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "daae2ec0-4532-489e-b23f-0a683153f590",
+   "metadata": {},
+   "source": [
+    "Although copying is requested, that doesn't seem to happen."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "63fab20c-3a22-46fe-9069-ae016d63ae28",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 964 ms, sys: 2.58 s, total: 3.55 s\n",
+      "Wall time: 3.67 s\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "C_dev = A_dev@B_dev"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "ade24d9f-da31-4d98-a029-d7504addfc37",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 2.44 ms, sys: 37 µs, total: 2.48 ms\n",
+      "Wall time: 2.05 ms\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "C_dev = A_dev@B_dev"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cdd42d52-58f6-4714-b804-a3aaac324df0",
+   "metadata": {},
+   "source": [
+    "If possible, it helps to create the data on the GPU directly."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "4ac0edcb-2a7a-4f5f-85a2-e5bbf09478d6",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 360 ms, sys: 15 ms, total: 375 ms\n",
+      "Wall time: 378 ms\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "X_dev = cp.random.uniform(0.0, 1.0, size=(nr_rows, nr_cols))\n",
+    "Y_dev = cp.random.uniform(0.0, 1.0, size=(nr_rows, nr_cols))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "f91ab450-6990-4117-9f5e-564600f35c9a",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 432 µs, sys: 1.49 ms, total: 1.92 ms\n",
+      "Wall time: 1.08 ms\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "Z_dev = X_dev@Y_dev"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fff63e52-7f55-4074-9690-5ebaf75d5a5e",
+   "metadata": {},
+   "source": [
+    "## Matrix power"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "d13ef081-3cad-48ba-a789-8fa951b5ac09",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "D = np.random.uniform(size=(1_000, 1_000))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "ce65be0a-18fd-4d29-a933-08cd61266a62",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "30.3 ms ± 5.26 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%timeit np.linalg.matrix_power(D, 10)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 21,
+   "id": "2cca5817-c795-4240-8a56-6361aade49cf",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "D_dev = cp.random.uniform(0.0, 1.0, size=(1_000, 1_000))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "id": "e09dab4d-7a43-4cce-b0cd-0bb82c6f49c1",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "43.7 ms ± 2.55 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%timeit cp.linalg.matrix_power(D_dev, 10)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "87ee8654-ca5d-456c-a19c-d46842f15adf",
+   "metadata": {},
+   "source": [
+    "## Singular Value Decomposition (SVD) and QR decomposition"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "id": "dbae0849-7d44-4d8e-8a2a-104d08e63016",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "261 ms ± 81.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%timeit\n",
+    "t = scipy.linalg.svd(D)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 22,
+   "id": "33e09563-e5a0-47ca-b860-7b0581d7a11d",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "728 ms ± 16.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%timeit\n",
+    "t_dev = cp.linalg.svd(D_dev)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "id": "2552a033-b7b0-423f-a162-ad559d696821",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "49.8 ms ± 4.96 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%timeit\n",
+    "t = scipy.linalg.qr(D)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 23,
+   "id": "b74986ef-b699-40db-ae50-c759b5c5a9f7",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "34 ms ± 36.5 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%timeit\n",
+    "t_dev = cp.linalg.qr(D_dev)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
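
A note on reading the timings in `cupy.ipynb`: CuPy generally launches GPU work asynchronously, so a `%%time` cell can return before the kernel has finished, and the first call to a routine usually pays one-time initialization costs. That may be what lies behind the 3.67 s versus 2.05 ms difference between the two identical `A_dev@B_dev` cells, and behind the remark that copying "doesn't seem to happen", although the commit itself does not confirm this. The sketch below is one possible way to time with explicit synchronization and to check that the device array is an independent copy. The helper names `synchronize` and `time_matmul`, the smaller matrix size, and the NumPy fallback for machines without a GPU are assumptions made for illustration; they are not part of the notebook.

```python
import time

import numpy as np

try:
    import cupy as cp
    cp.cuda.runtime.getDeviceCount()  # raises if no usable CUDA device
    xp, on_gpu = cp, True
except Exception:
    # Assumption for this sketch: fall back to NumPy when no GPU is available
    xp, on_gpu = np, False


def synchronize():
    """Block until queued GPU work has finished (no-op on the CPU)."""
    if on_gpu:
        cp.cuda.Stream.null.synchronize()


def time_matmul(a, b, nr_warmups=1, nr_runs=3):
    """Return a @ b and the wall time in seconds of each timed run."""
    for _ in range(nr_warmups):
        a @ b  # absorbs one-time costs such as library initialization
    times = []
    for _ in range(nr_runs):
        synchronize()
        start = time.perf_counter()
        c = a @ b
        synchronize()  # wait for the result, not just for the kernel launch
        times.append(time.perf_counter() - start)
    return c, times


nr_rows, nr_cols = 500, 500  # smaller than the notebook's 2_000 to keep it quick
A = np.random.uniform(size=(nr_rows, nr_cols))
B = np.random.uniform(size=A.shape)

A_dev = xp.array(A, copy=True)
B_dev = xp.array(B, copy=True)

# Change the host array and verify that the copy does not follow it
saved = A[0, 0]
A[0, 0] = -1.0
copy_is_independent = float(A_dev[0, 0]) == saved
A[0, 0] = saved

C_dev, times = time_matmul(A_dev, B_dev)
C_host = cp.asnumpy(C_dev) if on_gpu else C_dev
matches_cpu = bool(np.allclose(C_host, A @ B))
print(f"GPU used: {on_gpu}, times: {times}, matches CPU: {matches_cpu}")
```

With synchronization in place the first timed run and the later ones should be comparable, since the warm-up call absorbs the initialization cost. CuPy also ships `cupyx.profiler.benchmark`, which handles warm-up and synchronization itself and may be a better fit than `%timeit` for the matrix power, SVD and QR cells.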
