2021-09-07

Google Colaboratory: the fastest way to fetch the word2vec GoogleNews-vectors-negative300.bin.gz

!wget  "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz"
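
When a notebook is re-run in the same session, the archive may already be on disk, so it can help to skip the download in that case. The following is a minimal sketch only: `fetch_if_missing` and its `download` parameter are hypothetical names introduced here, not part of the original memo, and the helper simply wraps the standard library's `urllib.request.urlretrieve` instead of `wget`.

```python
import os
import urllib.request

# Same URL as the wget command above.
URL = "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz"


def fetch_if_missing(url, dest, download=urllib.request.urlretrieve):
    """Download url to dest unless dest already exists.

    Returns True if a download was started, False if it was skipped.
    The download callable is a parameter (an assumption of this sketch)
    so that it can be swapped out, for example in a dry run.
    """
    if os.path.exists(dest):
        return False
    download(url, dest)
    return True


# Example call (commented out because the archive is large):
# fetch_if_missing(URL, os.path.basename(URL))
```

The function only checks that the path exists; it does not verify that an earlier download finished completely.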

2021-06-24

Converting a video format with ffmpeg on a GPU in the free tier of Google Colaboratory

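I converted an 11-minute video using the free tier of Google Colaboratory.
The GPU was roughly 3 times faster (1min 54s versus 5min 30s, about 2.9x).
ffmpeg is installed by default.
Once you have an account, you can run this right away after uploading the video file from the browser or copying it from Google Drive.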

With the GPU: just under 2 minutes

%%time
!ffmpeg -i hoge.ts -vcodec h264_nvenc -s 640x360 out640GPU.mp4

Wall time: 1min 54s

With the CPU: 5.5 minutes

%%time
!ffmpeg -i hoge.ts -vcodec h264 -s 640x360 out640CPU.mp4

Wall time: 5min 30s
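
The speedup can be checked from the two wall times above. This is a small sketch; `wall_time_seconds` is a hypothetical helper written for this memo, and it only handles the "Nmin Ns" and "Ns" shapes that appear here, not every format `%%time` can print.

```python
import re


def wall_time_seconds(text: str) -> int:
    """Convert a string such as '1min 54s' or '42s' into whole seconds."""
    match = re.fullmatch(r"(?:(\d+)min\s*)?(\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized wall time: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = int(match.group(2))
    return minutes * 60 + seconds


gpu_seconds = wall_time_seconds("1min 54s")  # h264_nvenc run
cpu_seconds = wall_time_seconds("5min 30s")  # CPU h264 run
speedup = cpu_seconds / gpu_seconds

print(f"GPU {gpu_seconds}s, CPU {cpu_seconds}s, speedup {speedup:.2f}x")
# prints: GPU 114s, CPU 330s, speedup 2.89x
```

This is a single measurement on one video, so the ratio should be read as a rough figure.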

The GPU that was allocated

!nvidia-smi
 
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.27       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   45C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The CPU that was allocated

!cat /proc/cpuinfo

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 85
model name	: Intel(R) Xeon(R) CPU @ 2.00GHz
stepping	: 3
microcode	: 0x1
cpu MHz		: 2000.190
cache size	: 39424 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat md_clear arch_capabilities
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa
bogomips	: 4000.38
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 85
model name	: Intel(R) Xeon(R) CPU @ 2.00GHz
stepping	: 3
microcode	: 0x1
cpu MHz		: 2000.190
cache size	: 39424 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat md_clear arch_capabilities
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa
bogomips	: 4000.38
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:

2021-05-13

WSL: what to do when Symantec Endpoint Protection (SEP) on Windows is blocking network traffic

Symantec Endpoint Protection
"Network and Host Exploit Mitigation" > Options
Change Settings
"Firewall" tab
Unmatched IP Traffic Settings
Allow IP traffic: ON

2021-05-11

How to migrate from WSL version 1 to version 2 (WSL2)

> wsl --list --verbose
  NAME                   STATE           VERSION
* Ubuntu                 Stopped         1
  docker-desktop-data    Running         2
  docker-desktop         Running         2

> wsl --set-version Ubuntu 2
Conversion in progress, this may take a few minutes...
For information on key differences with WSL 2 please visit https://aka.ms/wsl2
Conversion complete.

> wsl --list --verbose
  NAME                   STATE           VERSION
* Ubuntu                 Stopped         2
  docker-desktop-data    Running         2
  docker-desktop         Running         2
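
With many distributions installed, the listing above can be scanned programmatically to find the ones still on version 1. The sketch below parses text already captured from `wsl --list --verbose`; `parse_wsl_list` is a hypothetical helper, and it assumes distribution names contain no spaces. When capturing the output with a subprocess, note that `wsl.exe` typically emits UTF-16LE text, so it may need decoding first.

```python
def parse_wsl_list(output: str):
    """Parse the text of `wsl --list --verbose` into a list of dicts."""
    distros = []
    for line in output.splitlines()[1:]:  # skip the NAME/STATE/VERSION header
        is_default = line.lstrip().startswith("*")  # '*' marks the default distro
        fields = line.replace("*", " ", 1).split()
        if len(fields) != 3:
            continue  # blank or unexpected line
        name, state, version = fields
        distros.append(
            {"name": name, "state": state, "version": int(version), "default": is_default}
        )
    return distros


# The listing from before the conversion, copied from this memo.
sample = """  NAME                   STATE           VERSION
* Ubuntu                 Stopped         1
  docker-desktop-data    Running         2
  docker-desktop         Running         2
"""

to_convert = [d["name"] for d in parse_wsl_list(sample) if d["version"] == 1]
print(to_convert)  # prints: ['Ubuntu']
```

Each name in `to_convert` is a candidate for `wsl --set-version <name> 2`.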

2021-04-17

Notes on installing PyTorch with CUDA 11 on Windows 10

1. Download and install CUDA 11.3
developer.nvidia.com

Test
Run from the Command Prompt

>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Mar_21_19:24:09_Pacific_Daylight_Time_2021
Cuda compilation tools, release 11.3, V11.3.58
Build cuda_11.3.r11.3/compiler.29745058_0

2. Install PyTorch
Run from the Command Prompt

> conda update --all
> conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge

Test

>python
Python 3.8.8 (default, Apr 13 2021, 15:08:03) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import print_function
>>> import torch
>>> x = torch.rand(5, 3)
>>> print(x)
tensor([[0.7916, 0.8205, 0.8727],
        [0.9985, 0.6213, 0.1450],
        [0.3143, 0.1618, 0.5846],
        [0.3925, 0.1929, 0.1867],
        [0.6710, 0.9961, 0.2065]])

>>> print('CUDA:', torch.cuda.is_available())
CUDA: True

2021-04-15

SIGNATE Japan Exchange Group News Analysis Challenge: notes on setting up the environment with Docker and Anaconda on Windows 10

1. Download the data
https://github.com/JapanExchangeGroup/J-Quants-Tutorial

From "Code" at the upper right, choose "Download ZIP"
Extract J-Quants-Tutorial-main.zip into any directory

2. Install Docker
https://hub.docker.com/editions/community/docker-ce-desktop-windows/

Run the following from PowerShell; if it produces no error, the installation is fine

> docker --version
Docker version 20.10.5, build 55c4c88

3. Pull the Docker image
Run from PowerShell

> docker pull continuumio/anaconda3:2019.03

The following should appear on Docker's Images screen
continuumio/anaconda3

4. Start the Docker image

# Move to the handson directory under the directory extracted in step 1
> cd hoge/J-Quants-Tutorial-main/handson

# Create the directory where the data will be placed
> mkdir data_dir

# Start the Docker container
Run from PowerShell
> docker run --name tutorial -v ${pwd}/data_dir:/path/to -v ${pwd}/Chapter02/archive:/opt/ml -v ${pwd}:/notebook -e PYTHONPATH=/opt/ml/src -p 8888:8888 -it continuumio/anaconda3:2019.03 jupyter notebook --ip 0.0.0.0 --allow-root --no-browser --no-mathjax --NotebookApp.disable_check_xsrf=True --NotebookApp.token='' --NotebookApp.password='' /notebook

5. Install the libraries
Open the CLI
In Docker's "Containers / Apps" view, hover over continuumio/anaconda3:2019.03
and click the ">_" button that appears

# bash
# apt-get update
# apt-get install -y --no-install-recommends g++ gcc
# pip install shap==0.37.0 slicer==0.0.3 xgboost==1.3.0.post0

2021-02-16

Notes on working with a DataFrame produced by aggregating with pandas pivot_table

It works if you treat the row and column labels as multi-level (a pandas MultiIndex), indexing one level at a time

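# aris_kadou_df is a DataFrame loaded beforehand (not shown in this memo).
# 稼働時間計 = total hours, 稼働金額計 = total cost, PJ区分 = project category
pivot_df = aris_kadou_df.pivot_table(
    index=['社員番号', '要員名', '所属部門'],
    columns=['PJ区分'],
    values=['稼働時間計', '稼働金額計'],
    aggfunc='sum')

# Combinations with no records come out as NaN; treat them as 0
pivot_df = pivot_df.fillna(0)

# Columns are two-level (value name, PJ区分), so select them level by level.
# 直接稼働率 = share of hours spent on direct projects
hours = pivot_df['稼働時間計']
pivot_df['直接稼働率'] = hours['直接プロジェクト'] / (
    hours['直接プロジェクト'] + hours['間接プロジェクト'] + hours['販管プロジェクト'])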
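
The same idea can be tried on a small self-contained table. This is only a sketch: the data and the English column names (`employee`, `category`, `hours`, `cost`) are made up for illustration and are not the author's dataset.

```python
import pandas as pd

# Hypothetical toy data; names and values are illustrative only.
records = pd.DataFrame({
    "employee": ["A", "A", "B", "B", "B"],
    "category": ["direct", "indirect", "direct", "direct", "sales"],
    "hours": [30.0, 10.0, 20.0, 20.0, 10.0],
    "cost": [300.0, 100.0, 200.0, 200.0, 100.0],
})

pivot = records.pivot_table(
    index=["employee"],
    columns=["category"],
    values=["hours", "cost"],
    aggfunc="sum",
).fillna(0)

# The columns form a two-level MultiIndex of (value name, category) tuples.
column_labels = list(pivot.columns)

# Selecting the first level gives a plain DataFrame keyed by category.
hours = pivot["hours"]
direct_rate = hours["direct"] / hours.sum(axis=1)

# A full tuple addresses exactly one column.
a_direct_hours = pivot.loc["A", ("hours", "direct")]

print(direct_rate)
```

Here employee A has 30 of 40 hours on direct work (0.75) and employee B has 40 of 50 (0.8). Summing across `axis=1` avoids naming every category, at the cost of including any category that happens to be present.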