GPU hardware feature profiling with Nsight

peanut0613 2022. 7. 6. 17:17


- NVIDIA nvprof: supports older GPUs such as Kepler and Maxwell

- NVIDIA Nsight Compute: supports newer GPUs such as Pascal, Volta, and Turing

-> Cloud providers are phasing out pre-Pascal GPUs, so Nsight Compute should be used rather than nvprof.

I followed this article: https://developer.nvidia.com/blog/using-nsight-compute-to-inspect-your-kernels/

The Nsight Compute CLI is located at the path below.

# Location of Nsight Compute: it was tucked away inside the CUDA install directory

/usr/local/cuda-11.2/nsight-compute-2020.3.1/nv-nsight-cu-cli
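Since the directory name changes with the CUDA and Nsight Compute versions, the path can also be searched for instead of typed by hand. The following is a minimal sketch, assuming the same `/usr/local/cuda-*/nsight-compute-*/` layout as the path above; `find_ncu` is a hypothetical helper name, and other installs may use a different layout.

```python
import glob
import os


def find_ncu(cuda_root="/usr/local"):
    """Return candidate Nsight Compute CLI paths found under cuda_root.

    Assumes the layout <cuda_root>/cuda-*/nsight-compute-*/<binary>,
    which matches the path shown in this post.
    """
    candidates = []
    for name in ("ncu", "nv-nsight-cu-cli"):
        pattern = os.path.join(cuda_root, "cuda-*", "nsight-compute-*", name)
        candidates.extend(glob.glob(pattern))
    # Plain reverse lexical sort, so higher version strings tend to come first.
    return sorted(candidates, reverse=True)


# Prints an empty list if nothing matches the assumed layout.
print(find_ncu())
```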

The command below saves the list of available metrics to a file.

`/usr/local/cuda-11.4/nsight-compute-2021.2.2/nv-nsight-cu-cli --devices 0 --query-metrics >my_metrics.txt`
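The saved list is long, so it helps to search it by keyword before deciding which metrics to collect. Below is a small sketch that works on plain lines of text and makes no assumption about the exact column layout of `my_metrics.txt`; `filter_metric_lines` is a hypothetical helper, not part of Nsight Compute.

```python
def filter_metric_lines(lines, keyword):
    """Return the lines that contain keyword (case-insensitive).

    Trailing whitespace and newlines are stripped from each returned line.
    """
    keyword = keyword.lower()
    return [line.rstrip() for line in lines if keyword in line.lower()]


# Usage, assuming my_metrics.txt was written by the command above:
# with open("my_metrics.txt") as f:
#     for line in filter_metric_lines(f, "dram"):
#         print(line)
```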

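Run commands

- The two commands below produce no noticeable difference, and I am not yet sure why. (nv-nsight-cu-cli is the older name of the CLI now called ncu, and options written after ex.py are passed to the script rather than to the profiler, so both lines most likely run the same default profile.)

- They profile GPU metrics while the Python file runs.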

`sudo /usr/local/cuda-11.4/nsight-compute-2021.2.2/nv-nsight-cu-cli python3.7 ex.py --import --replay-mode application`

`sudo /usr/local/cuda-11.4/nsight-compute-2021.2.2/ncu python3.7 ex.py --replay-mode application`
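One detail worth checking in commands like these: the CLI treats everything after the program name as arguments of that program, so profiler options belong before `python3.7`. The sketch below assembles the argument list in that order. `build_ncu_command` is a hypothetical helper, and the path is the one used in this post.

```python
import shlex

# Path used in this post; adjust to the local install.
NCU = "/usr/local/cuda-11.4/nsight-compute-2021.2.2/ncu"


def build_ncu_command(ncu_options, target_argv, ncu_path=NCU):
    """Put profiler options before the target so ncu, not the script, receives them."""
    return [ncu_path, *ncu_options, *target_argv]


cmd = build_ncu_command(["--replay-mode", "application"], ["python3.7", "ex.py"])

# Print a copy-pasteable shell line; the list itself can be given to subprocess.run.
print(" ".join(shlex.quote(arg) for arg in cmd))
```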

Results

The output is printed as one long report. The full output can be seen at the reference link near the top of this post.

However, it is extremely slow.

from : https://docs.nvidia.com/nsight-compute/ProfilingGuide/#metric-collection

The number and type of metrics specified by a section has significant impact on the overhead during profiling. To allow you to quickly choose between a fast, less detailed profile and a slower, more comprehensive analysis, you can select the respective section set. See Overhead for more information on profiling overhead.

⇒ My understanding: collecting every metric makes the overhead very large, which is why profiling becomes slow.

Factors that affect overhead (see https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html#overhead)

- Number and type of collected metrics
- The collected section set
- Number of collected sections
- Number of profiled kernels
- GPU Architecture

⇒ To use Nsight Compute, first pick out the metrics you actually need, then collect only those, so that overhead is kept as low as possible.
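As a rough sketch of that recommendation, the helper below builds a command that collects only a chosen set of metrics through the CLI's `--metrics` option and optionally caps the number of profiled kernel launches with `--launch-count`. `build_selective_command` is a hypothetical name, and the two metric names are only examples that should be checked against the saved metric list for the GPU in use.

```python
def build_selective_command(ncu_path, metrics, target_argv, launch_count=None):
    """Build an ncu command that collects only the given metrics.

    metrics      -- metric names, joined into one comma-separated --metrics value
    launch_count -- optional cap on the number of profiled kernel launches
    """
    if not metrics:
        raise ValueError("at least one metric name is required")
    cmd = [ncu_path, "--metrics", ",".join(metrics)]
    if launch_count is not None:
        cmd += ["--launch-count", str(launch_count)]
    # The target program and its arguments always come last.
    return cmd + list(target_argv)


# Example metric names (assumptions; verify them for your GPU first).
wanted = ["dram__bytes_read.sum", "sm__cycles_elapsed.avg"]
print(build_selective_command("ncu", wanted, ["python3.7", "ex.py"], launch_count=10))
```

Fewer metrics and fewer profiled kernels address the first and fourth overhead factors listed above.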
