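Elasticsearch - Adding a Data Node on an Ubuntu Server

This post walks through adding an Elasticsearch data node on an Ubuntu server.

A dedicated data node is worth adding when a master node is already running and you no longer want that master to store data, so that cluster operations (index creation and similar management work) are separated from data storage.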

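Preparing the Ubuntu server

Provision a server running Ubuntu, for example through a cloud service.

Installing ES

Install Elasticsearch on the prepared server.

Because a master node already exists, set up the new node to the same specification as the ES instance running as the master.

The existing master node runs Elasticsearch 6.4, so the same release is installed here. Refer to the official installation guide for that version.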

Importing the Elasticsearch PGP key

Install the signing key that Elastic publishes for its packages.

$ wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

Adding the APT repository

On Debian-based systems, install the package that lets APT fetch packages over HTTPS.

$ sudo apt-get install apt-transport-https

The repository definition has to be written to /etc/apt/sources.list.d/elastic-6.x.list.

$ echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
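Because tee -a appends, running the command above twice leaves a duplicate entry in the list file. A minimal sketch of an idempotent variant follows; add_line_once is a hypothetical helper name, not part of APT, and it needs to run as root when the target is under /etc/apt.

```shell
# Append a line to a file only if that exact line is not already present.
# add_line_once is a hypothetical helper, not an APT command.
add_line_once() {
  line=$1
  file=$2
  # -x: match the whole line, -F: treat the pattern as a fixed string
  if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Example (run as root):
# add_line_once "deb https://artifacts.elastic.co/packages/6.x/apt stable main" \
#   /etc/apt/sources.list.d/elastic-6.x.list
```

Calling it repeatedly leaves a single copy of the repository line.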

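Creating the ES install directory

Installing Elasticsearch creates its config files under the default path (/etc/elasticsearch).

They can be kept in any other directory you prefer.

Using the same path as on the existing master node is recommended.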

$ sudo mkdir -p /path/to/install/elasticsearch/data-node

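Installing the ES package

Before installing Elasticsearch, check that Java is installed.

Elasticsearch is written in Java, so a Java runtime is required; without one, the installation cannot proceed.

Once Java is in place, install Elasticsearch.

The package version can be pinned explicitly.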

$ sudo apt-get update && sudo apt-get install elasticsearch=6.4.2 -V
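The Java prerequisite mentioned before the install command could be checked with a small script like the following. This is a sketch only; require_cmd and check_java are hypothetical helper names, and it only confirms that a java binary is on the PATH, not that its version suits Elasticsearch 6.4.

```shell
# Succeed if the named command can be found on the PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print the first line of `java -version` output, or fail with a hint.
check_java() {
  if require_cmd java; then
    # java prints its version banner on stderr
    java -version 2>&1 | head -n 1
  else
    echo "java not found; install a JDK before installing elasticsearch" >&2
    return 1
  fi
}

# Example:
# check_java && sudo apt-get install elasticsearch=6.4.2 -V
```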

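Moving the ES config files and setting ownership

Once the installation finishes, the config files are in /etc/elasticsearch.

Move these config files to the directory created above (/path/to/install/elasticsearch/data-node).

The move can fail for lack of permissions, so I switched to the root user first. In my environment a single recursive move (mv -r) did not work, so I moved the files in separate steps. Note that mv has no -r option, and the shell's * pattern does not match hidden files, which is why .elasticsearch.keystore.initial_md5sum is moved on its own.

If you know a better way to move the files, use it.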

$ sudo su -l
$ cd /etc/elasticsearch
$ mv * /path/to/install/elasticsearch/data-node
$ mv .elasticsearch.keystore.initial_md5sum /path/to/install/elasticsearch/data-node
$ rm -rf /etc/elasticsearch
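As one alternative to moving the visible and hidden files separately, the whole directory contents can be copied in one step and the source removed afterwards. The sketch below assumes GNU cp as shipped with Ubuntu; move_config is a hypothetical helper name, and it should run as root for the real paths.

```shell
# Copy everything in src (hidden files included) to dest, then remove src.
# move_config is a hypothetical helper name.
move_config() {
  src=$1
  dest=$2
  mkdir -p "$dest" || return 1
  # "src/." refers to the directory's contents, so dotfiles are copied too;
  # -a preserves ownership, modes, and timestamps.
  cp -a "$src/." "$dest/" || return 1
  rm -rf "$src"
}

# Example (run as root):
# move_config /etc/elasticsearch /path/to/install/elasticsearch/data-node
```

The source directory is removed only after the copy succeeds, so a failed copy leaves the original files in place.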

Then return to the root directory and change the owner and group of the chosen path to elasticsearch.

$ chown -R elasticsearch:elasticsearch /path/to/install/elasticsearch

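Setting up directories for ES data, logs, and snapshots

Create the directories where Elasticsearch will store its indices, logs, and snapshots while it runs.

Using the same directory layout as on the existing master node is recommended.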

$ mkdir -p /data/elasticsearch/data-node/data
$ mkdir -p /data/elasticsearch/data-node/logs
$ mkdir -p /data/elasticsearch/data-node/snapshots

Change the owner and group so that Elasticsearch can read and write files in that location.

$ sudo chown -R elasticsearch:elasticsearch /data/elasticsearch

JVM options

jvm.options specifies how Elasticsearch uses system resources.

## JVM configuration


################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################


# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space


-Xms6g
-Xmx6g


################################################################
## Expert settings
################################################################
##
## All settings below this section are considered
## expert settings. Don't tamper with them unless
## you understand what you are doing
##
################################################################


## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly


## optimizations


# pre-touch memory pages used by the JVM during initialization
-XX:+AlwaysPreTouch


## basic


# explicitly set the stack size
-Xss1m


# set to headless, just in case
-Djava.awt.headless=true


# ensure UTF-8 encoding by default (e.g. filenames)
-Dfile.encoding=UTF-8


# use our provided JNA always versus the system one
-Djna.nosys=true


# turn off a JDK optimization that throws away stack traces for common
# exceptions because stack traces are important for debugging
-XX:-OmitStackTraceInFastThrow


# flags to configure Netty
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0


# log4j 2
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true


-Djava.io.tmpdir=${ES_TMPDIR}


## heap dumps


# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError


# specify an alternative path for heap dumps; ensure the directory exists and
# has sufficient space
-XX:HeapDumpPath=data


# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=logs/hs_err_pid%p.log


## JDK 8 GC logging


8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:logs/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m


# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m
# due to internationalization enhancements in JDK 9 Elasticsearch need to set the provider to COMPAT otherwise
# time/date parsing will break in an incompatible way for some date patterns and locals
9-:-Djava.locale.providers=COMPAT


# temporary workaround for C2 bug with JDK 10 on hardware with AVX-512
10-:-XX:UseAVX=2

These settings control the system resources given to the Elasticsearch service, GC logging, and so on.

Adjust them to suit your own server.
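For the heap size in particular, a common guideline in the Elasticsearch documentation is to give the heap no more than half of physical memory and to stay below the compressed-oops threshold (roughly 32 GB). The sketch below turns that guideline into a starting value; suggest_heap_gb is a hypothetical helper, and both the 26 GB cap and the 1 GB floor are assumptions chosen for illustration, not values from this post.

```shell
# Suggest a heap size in whole GB from total memory given in kB:
# half of RAM, at least 1, at most 26 (assumed cap, safely under ~32 GB).
suggest_heap_gb() {
  total_kb=$1
  cap_gb=26
  heap_gb=$(( total_kb / 1024 / 1024 / 2 ))
  if [ "$heap_gb" -lt 1 ]; then
    heap_gb=1
  fi
  if [ "$heap_gb" -gt "$cap_gb" ]; then
    heap_gb=$cap_gb
  fi
  echo "$heap_gb"
}

# Example on Linux:
# total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
# echo "-Xms$(suggest_heap_gb "$total_kb")g"
# echo "-Xmx$(suggest_heap_gb "$total_kb")g"
```

Whatever value is chosen, -Xms and -Xmx should be set to the same number, as the comment at the top of jvm.options says.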

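elasticsearch.yml settings

While jvm.options covers system resources, elasticsearch.yml describes the cluster that Elasticsearch joins and the node it runs as.

Because Elasticsearch runs as a server, the file also sets the network details and the location where data is stored.

In short, it holds every Elasticsearch setting other than system resources.

Set discovery.zen.ping.unicast.hosts to the IP address of the running master node.

In Elasticsearch 7.0 and later this option is named discovery.seed_hosts.

Consult the official documentation and use the settings that match the installed Elasticsearch version.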

cluster.name: nd.qa.listingsearch
node.name: data.${HOSTNAME}

node.master: false
node.data: true
node.ingest: false
node.ml: false

discovery.zen.ping.unicast.hosts: [ "<master_node_ip>:<port_number>" ]
discovery.zen.minimum_master_nodes: 1

path.data: /data/elasticsearch/data-node/data
path.logs: /data/elasticsearch/data-node/logs
path.repo: /data/elasticsearch/data-node/snapshots

indices.memory.index_buffer_size: 30%

network.host: 0.0.0.0
http.port: 9200
transport.tcp.port: 9300

xpack.monitoring.collection.enabled: true
http.cors.allow-origin: "*"
http.cors.enabled: true
http.cors.allow-methods : OPTIONS, HEAD, GET, POST, PUT, DELETE
http.cors.allow-headers : X-Requested-With,X-Auth-Token,Content-Type, Content-Length
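Before starting the service it can help to confirm that the file really describes a data-only node. The following sketch reads flat "key: value" lines such as the ones above; yml_get and is_data_only_node are hypothetical helpers, and this is not a YAML parser (nested keys, quoting, and trailing comments are not handled).

```shell
# Print the value of a top-level "key: value" line in a flat YAML file.
yml_get() {
  key=$1
  file=$2
  # Escape dots so "node.master" is matched literally.
  key_re=$(printf '%s' "$key" | sed 's/[.]/\\./g')
  sed -n "s/^${key_re}[[:space:]]*:[[:space:]]*//p" "$file" | head -n 1
}

# Succeed only if node.master is false and node.data is true
# (compared case-insensitively).
is_data_only_node() {
  conf=$1
  master=$(yml_get node.master "$conf" | tr '[:upper:]' '[:lower:]')
  data=$(yml_get node.data "$conf" | tr '[:upper:]' '[:lower:]')
  [ "$master" = "false" ] && [ "$data" = "true" ]
}

# Example:
# is_data_only_node /path/to/install/elasticsearch/data-node/elasticsearch.yml \
#   && echo "configured as a data-only node"
```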

Pointing ES at the new config path

Because the ES config files were moved to a separate directory, the config path has to be specified.

$ cd /usr/share/elasticsearch/bin
$ vim elasticsearch

Opening the elasticsearch file in vim shows the startup script, including its environment settings.

Add ES_PATH_CONF there.

ES_PATH_CONF=/path/to/install/elasticsearch/data-node
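A possible alternative to editing the startup script: as far as I understand the Debian package layout, the systemd unit reads its environment from /etc/default/elasticsearch, so ES_PATH_CONF can be set there instead. The sketch below updates or appends a KEY=value line in such a file; set_env_var is a hypothetical helper, and the file location should be verified against the installed version before relying on it.

```shell
# Set KEY=value in an environment file: replace an existing uncommented
# assignment, otherwise append one. Assumes the value contains no "|".
set_env_var() {
  file=$1
  key=$2
  value=$3
  if grep -q "^${key}=" "$file" 2>/dev/null; then
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" || return 1
    # Write back through the original file to keep its owner and mode.
    cat "$file.tmp" > "$file" && rm -f "$file.tmp"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example (run as root, then restart the service):
# set_env_var /etc/default/elasticsearch ES_PATH_CONF \
#   /path/to/install/elasticsearch/data-node
```

Keeping the setting in an environment file means a package upgrade that replaces the startup script does not undo it.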

Starting the ES service

Switch to the root account and start the Elasticsearch service.

$ sudo su -
$ systemctl start elasticsearch.service

systemctl start prints nothing when it succeeds, so check for yourself that the service actually started.

$ systemctl status elasticsearch.service

systemd records the service's output in its journal rather than printing it, so the journal is the place to look for details.

The command that shows the log for a specific unit is convenient for this.

$ sudo journalctl --unit elasticsearch
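Elasticsearch can take some time to come up after systemctl start returns, so a single status check may run too early. A minimal retry sketch follows; retry is a hypothetical helper, and the localhost:9200 address in the usage comment assumes the http.port value from the elasticsearch.yml above.

```shell
# Run a command up to N times, sleeping between attempts, until it succeeds.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example:
# retry 30 2 systemctl is-active --quiet elasticsearch.service \
#   && retry 30 2 curl -fsS http://localhost:9200/ \
#   && echo "elasticsearch is up"
```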