
Before explaining the procedure, here is a summary:

  • Sign in at https://nipapublic.gabia.com (we suggest the slp_edu:latest image for the notebook).
  • Documentation: https://docs.kakaocloud.com/ha-gpu

Procedure

  1. Log on to the website: https://nipapublic.gabia.com

Then the following screen will appear:

[Figure: NIPA login screen]

Use the ID and password provided to you by NIPA.

  • Note that the following documents in the “Documentation” section may be useful:
    • User Guide (사용자 가이드): https://docs.kakaocloud.com/ha-gpu
    • How to Use Kubeflow (Kubeflow 사용방법): https://docs.kakaocloud.com/ha-gpu
    • Tutorial (튜토리얼): https://docs.kakaocloud.com/ha-gpu
  2. Create the notebook.

SentencePiece is a toolkit for subword tokenization. In this tutorial, we assume that you are using Ubuntu Linux. The SentencePiece GitHub page is at https://github.com/google/sentencepiece. You can install it with conda:

 conda config --add channels conda-forge
 conda config --set channel_priority strict
 conda install libsentencepiece sentencepiece sentencepiece-python sentencepiece-spm

or

 pip install sentencepiece
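As a quick sanity check that the Python bindings are importable (a minimal sketch; it just prints the installed version):

 # Print the installed SentencePiece version to confirm the install worked
 python -c "import sentencepiece as spm; print(spm.__version__)"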

STOP dataset

In this tutorial, we will use the STOP dataset, which was developed for benchmarking the Spoken Language Understanding (SLU) task. Information about the dataset can be found at https://facebookresearch.github.io/spoken_task_oriented_parsing/.

You may download it from the download page linked on that site.

Algorithm

We train a unigram language model tokenizer on the STOP training transcripts, with a vocabulary of 256 subword pieces and full character coverage.

Running it

 spm_train --input=train-all.trans_without_uttid.txt \
     --model_prefix=model_unigram_256 \
     --vocab_size=256 \
     --character_coverage=1.0 \
     --model_type=unigram
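
Training writes two files named after --model_prefix: model_unigram_256.model (the serialized tokenizer model) and model_unigram_256.vocab (the learned pieces with their scores).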

Doing tokenization and detokenization
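
SentencePiece ships the spm_encode and spm_decode command-line tools for this step. The sketch below is a minimal example; input.txt is a hypothetical file name standing in for whatever text you want to tokenize:

 # Tokenize: split each line of input.txt into subword pieces
 spm_encode --model=model_unigram_256.model \
     --output_format=piece < input.txt > tokenized.txt

 # Detokenize: reconstruct the original text from the pieces
 spm_decode --model=model_unigram_256.model \
     --input_format=piece < tokenized.txt > detokenized.txt

Using --output_format=id instead emits integer token IDs, which is usually what downstream models consume.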

Summary

In this page, we discussed how to log on to the NIPA cloud, install SentencePiece, and train a subword tokenizer on the STOP dataset.