greenplum安装教程_greenplum安装部署

greenplum安装教程_greenplum安装部署greenplum install [TOC] OS:rhel7.5 x86_64 DB:GreenPlum db 6.x 下载源码包安装 gp download 安装前参考最佳实践对操作系统进行优…

greenplum安装参考

OS:rhel7.5 x86_64 DB:GreenPlum db 6.x

下载源码包安装   gp download  

安装前参考最佳实践对操作系统进行优化!!

greenplum or hgdw Best practices

1 pretasks

修改主机名

cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

# gp configure
192.168.80.146	gp1
192.168.80.147	gp2
192.168.80.148	gp3
192.168.80.149	gp4

代码100分

创建操作系统用户

使用root用户登录主节点主机系统,在主节点创建数据库用gpadmin户,切换到新用户运行运行数据仓库安装包。

代码100分groupadd -g 1001 gpadmin
useradd -u 1001 -g gpadmin gpadmin; echo gpadmin | passwd -f --stdin gpadmin

2 install

安装数据仓库软件

gp4.x/gp5.x 以前安装分为三部分    1. 安装master(一般是个bin的可执行文件,安装,并可以指定安装目录)    2. gpseginstall 安装各个seg    3. gp群参数校验      4. gpinitsystem 集群初始化     gp6.2 开始不提供zip 格式压缩包,仅提供rpm包,安装分为两步    1. 安装master(rpm -ivh / yum install -y),不可以指定安装目录,默认安装到/usr/local/    2. gp6 没有 gpseginstall工具。所以要么自己打包master 安装好的gp目录并传到seg上,要么各个节点单独yum 安装。      1.每个节点主机,单独yum      2.打包主节点的安装目录,并分发给seg主机。   3. 集群性能校验    4. gpinitsystem 集群初始化

先只在主节点安装软件

 rpm -ivh xxx.rpm

环境变量

添加下面初始化变量到gpadmin用户.bash_profile文件,使立即生效:source .bash_profile:

代码100分cat ~/.bash_profile
source /opt/gp/greenplum/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/hgdata/master/hgseg-1
export PGPORT=2345
export PGUSER=hgadmin
export PGDATABASE=gpdb


greenplum_path.sh — 该文件包含数据仓库的基础环境变量。默认自动生成

创建hostfile

创建几个host文件,用于后续使用gpssh,gpscp 等脚本host参数文件

hostfile_exkeys 内容是集群所有主机名或ip,用于ssh互信配置 hostfile_all 内容是集群所有主机名或ip,包含master,segment,standby等 hostfile_seg 内容是所有 segment主机名或ip

cat hostfile_exkeys
hgdw1
hgdw2
hgdw3
hgdw4
hgdw5
hgdw6
cat hostfile_all
hgdw1
hgdw2
hgdw3
hgdw4
hgdw5
hgdw6
cat hostfile_seg 
hgdw2
hgdw3
hgdw4
hgdw5
hgdw6


配置节点互信


配置互信

root用户和gpadin用户都需要做

1. 主机生成密钥
ssh-keygen -t rsa

2. 将公钥复制到各个节点机器的authorized_keys文件中

ssh-copy-id -i id_rsa.pub root@hgdw1
ssh-copy-id -i id_rsa.pub root@hgdw3
ssh-copy-id -i id_rsa.pub root@hgdw3
ssh-copy-id -i id_rsa.pub root@hgdw4
ssh-copy-id -i id_rsa.pub root@hgdw4
ssh-copy-id -i id_rsa.pub root@hgdw6

3. 使用gpssh-exkeys 工具,打通n-n的免密登陆

root和gpadmin用户都执行一遍
gpssh-exkeys -f  /home/gpadmin/hostfile_exkeys

hgssh-exkeys -f  /home/hgadmin/hostfile_exkeys
[STEP 1 of 5] create local ID and authorize on local host
  ... /root/.ssh/id_rsa file exists ... key generation skipped

[STEP 2 of 5] keyscan all hosts and update known_hosts file

[STEP 3 of 5] retrieving credentials from remote hosts
  ... send to hgdw2
  ... send to hgdw3
  ... send to hgdw4
  ... send to hgdw5
  ... send to hgdw6

[STEP 4 of 5] determine common authentication file content

[STEP 5 of 5] copy authentication files to all remote hosts
  ... finished key exchange with hgdw2
  ... finished key exchange with hgdw3
  ... finished key exchange with hgdw4
  ... finished key exchange with hgdw5
  ... finished key exchange with hgdw6

[INFO] completed successfully


验证
hgssh -f  /home/hgadmin/hostfile_all -e "date"
[hgdw1] date
[hgdw1] Wed Apr  1 15:47:36 CST 2020
[hgdw3] date
[hgdw3] Wed Apr  1 15:47:36 CST 2020
[hgdw4] date
[hgdw4] Wed Apr  1 15:47:35 CST 2020
[hgdw6] date
[hgdw6] Wed Apr  1 15:47:35 CST 2020
[hgdw5] date
[hgdw5] Wed Apr  1 15:47:36 CST 2020
[hgdw2] date
[hgdw2] Wed Apr  1 15:47:36 CST 2020

同步个节点环境变量

gpscp -f  /home/hgadmin/hostfile_seg /home/hgadmin/.bash_profile  hgadmin@=:/home/hgadmin/.bash_profile

在所有计算节点上安装配置数据仓库

root
source /usr/local/hgdw/hgdw_path.sh
hgssh -f  /home/hgadmin/hostfile_seg -e "mkdir -p /opt/software"
hgssh -f  /home/hgadmin/hostfile_seg -e "chmod 777 /opt/software"


-greenpllum
gpscp -f  /home/hgadmin/hostfile_seg /opt/software/greenplum-db-6.5.0-rhel7-x86_64.rpm  hgadmin@=:/opt/software/
su - root
source /usr/local/greenplum-db/greenplum_path.sh
gpssh -f  /home/hgadmin/hostfile_seg -e "yum install apr -y"
gpssh -f  /home/hgadmin/hostfile_seg -e "yum install apr-util -y"
gpssh -f  /home/hgadmin/hostfile_seg -e "rpm -ivh /opt/software/greenplum-db-6.5.0-rhel7-x86_64.rpm"
gpssh -f  /home/hgadmin/hostfile_seg -e "chown hgadmin:hgadmin /usr/local/greenplum* -R"

-hgdw
gpscp -f  /home/hgadmin/hostfile_seg /opt/software/hgdw-3.0.zip  hgadmin@=:/opt/software/
hgssh -f  /home/hgadmin/hostfile_seg -e "unzip -d /usr/local /opt/software/hgdw-3.0.zip"
hgssh -f  /home/hgadmin/hostfile_seg -e "chown hgadmin:hgadmin /usr/local/hgdw -R"


创建数据存储区域

创建或者选择一个用于主节点的数据存储区域。该目录需要有足够的磁盘空间并确保归属于gpadmin用户和组。


hgadmin
创建master 数据目录
mkdir -p /data/hgdata/master

创建segment 数据目录
本次计划每个主机安装2个 segment

hgssh -f   /home/hgadmin/hostfile_seg -e "mkdir -p /data/hgdata/hgsegp"
hgssh -f   /home/hgadmin/hostfile_seg -e "mkdir -p /data/hgdata/hgsegm"



同步系统时钟

在计算节点主机上,NTP应该配置主节点主机作为主时间源,而备用主节点主机作为备选时源。在主节点和备用主节点上配置NTP到用户首选的时间服务器。

配置NTP

安装相应软件包:
yum install ntp -y

检测是否安装成功:
rpm -qa | grep ntp
ntpdate-4.2.6p5-25.el7.centos.x86_64
ntp-4.2.6p5-25.el7.centos.x86_64


开机自启动
systemctl start ntpd.service
systemctl enable ntpd.service
systemctl enable ntpdate.service

查看状态
systemctl status  ntpd.service

主节点配置
vi /etc/ntp.conf

server 127.127.1.0 prefer
127.127.1.0 stratum 8

$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*LOCAL(0)        .LOCL.           5 l   43   64  377    0.000    0.000   0.000


备用主节点
vi /etc/ntp.conf
server gp1 prefer  

计算节点
vi /etc/ntp.conf
server gp1 prefer  

测试
ntpq -p


检查系统环境

瀚高数据仓库提供命令用以检查系统的配置和性能,这些命令可以在瀚高数据仓库安装的$GPHOME/bin目录下找到。初始化瀚高数据仓库系统之前,应执行硬件性能检查。

瀚高数据仓库提供的gpcheckperf命令可用来在瀚高数据仓库集群主机上检查硬件和系统级gpcheckperf在指定的主机上启动一个会话并执行下面的性能测试:

检查网络性能 检查磁盘I/O性能 检查内存带宽

在使用gpcheckperf之前,必须已经在所有相关需要做性能测试的主机之间建立了互信。如果还没有做,可以使用gpssh-exkeys命令来建立互信。gpcheckperf会调用gpssh和gpscp,所以这些命令必须已经存在在$PATH中。

使用gpadmin用户

1.检查网络性能

gpcheckperf -f /home/gpadmin/hostfile_gpcheckperfnet -r N -d /tmp > subnet1.out

/opt/gp/greenplum//bin/gpcheckperf -f /home/gpadmin/hostfile_gpcheckperfnet -r N -d /tmp

-------------------
--  NETPERF TEST
-------------------

====================
==  RESULT
====================
Netperf bisection bandwidth test
gp1 -> gp2 = 81.110000
gp3 -> gp4 = 80.940000
gp2 -> gp1 = 87.330000
gp4 -> gp3 = 87.120000

Summary:
sum = 336.50 MB/sec
min = 80.94 MB/sec
max = 87.33 MB/sec
avg = 84.12 MB/sec
median = 87.12 MB/sec

2.检查磁盘I/O和内存带宽

使用gpadmin用户登录主节点主机。
创建一个名为hostfile_gpcheckperfio的文件,包含所有计算节点主机名,每个名称一行。不要包含主节点主机名。


gpcheckperf -f /home/gpadmin/hostfile_gpcheckperfio -r ds -d /data/segmentp -d /data/segmentm > io.out

[gpadmin@gp1 ~]$ cat io.out 
/opt/gp/greenplum//bin/gpcheckperf -f /home/gpadmin/hostfile_gpcheckperfio -r ds -d /data/segmentp -d /data/segmentm

--------------------
--  DISK WRITE TEST
--------------------

--------------------
--  DISK READ TEST
--------------------

--------------------
--  STREAM TEST
--------------------

====================
==  RESULT
====================

 disk write avg time (sec): 18.36
 disk write tot bytes: 6273368064
 disk write tot bandwidth (MB/s): 334.08
 disk write min bandwidth (MB/s): 140.84 [gp4]
 disk write max bandwidth (MB/s): 193.24 [gp3]


 disk read avg time (sec): 8.63
 disk read tot bytes: 6273368064
 disk read tot bandwidth (MB/s): 761.29
 disk read min bandwidth (MB/s): 266.85 [gp3]
 disk read max bandwidth (MB/s): 494.44 [gp4]


 stream tot bandwidth (MB/s): 17422.21
 stream min bandwidth (MB/s): 6680.49 [gp4]
 stream max bandwidth (MB/s): 10741.71 [gp3]

初始化数据仓库

gpinitsystem,能够在主节点和每个计算节点实例上按照正确的顺序完成初始化和启动操作。 在数据仓库系统初始化并启动之后,就可以像使用常规的PostgreSQL数据库一样,通过连接到瀚高数据仓库主节点来创建和管理数据库。

创建数据仓库配置文件

数据仓库配置文件会告诉gpinitsystem命令按照什么样的方式配置系统。在软件的安装目录的$GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config文件可以作为配置的例子参考?? 1.  以gpadmin用户登录: $ su – gpadmin 2. 拷贝一个gpinitsystem_config示例文件。例如 $cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpinitsystem_config 3.  打开刚拷贝的文件并编辑。 DATA_DIRECTORY参数指定每个计算节点主机配置多少个计算节点实例。如果在host文件中为每个计算节点主机列出了多个网口,这些实例将平均分布到所有列出的网口上。 下面是一个gpinitsystem_config文件的例子: ARRAY_NAME=”HighGo DataWarehouse ” SEG_PREFIX=gpseg PORT_BASE=40000 declare -a DATA_DIRECTORY=(/data1/primary /data1/primary /data1/primary /data2/primary /data2/primary /data2/primary) MASTER_HOSTNAME=mdw MASTER_DIRECTORY=/data/master MASTER_PORT=5432 TRUSTED_SHELL=ssh CHECK_POINT_SEGMENT=8 ENCODING=UNICODE 4.  作为可选项,可以配置镜像计算节点实例,取消文件中的注释并根据环境情况配置参数。下面是gpinitsystem_config文件中可选配置镜像的例子: MIRROR_PORT_BASE=50000 REPLICATION_PORT_BASE=41000 MIRROR_REPLICATION_PORT_BASE=51000 declare -a MIRROR_DATA_DIRECTORY=(/data1/mirror /data1/mirror /data1/mirror /data2/mirror /data2/mirror /data2/mirror) 注意:可以在初始化时值配置主计算节点实例,而在之后使用gpaddmirrors命令部署镜像计算节点实例。 5.  保存关闭该文件。

su - gpadmin
cp /usr/local/hgdw/docs/cli_help/hgconfigs/hginitsystem_config /home/hgadmin/hginitsystem_config

cat hginitsystem_config 
# FILE NAME: gpinitsystem_config

# Configuration file needed by the gpinitsystem

################################################
#### REQUIRED PARAMETERS
################################################

#### Name of this Greenplum system enclosed in quotes.
ARRAY_NAME="HGDW"

#### Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg

#### Base number by which primary segment port numbers 
#### are calculated.
PORT_BASE=6000

#### File system location(s) where primary segment data directories 
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in 
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).
declare -a DATA_DIRECTORY=(/data/hgdata/hgsegp)

#### OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=hgdw1

#### File system location where the master data directory 
#### will be created.
MASTER_DIRECTORY=/data/hgdata/master

#### Port number for the master instance. 
MASTER_PORT=5432

#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh

#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=18

#### Default server-side character set encoding.
ENCODING=UNICODE

################################################
#### OPTIONAL MIRROR PARAMETERS
################################################

#### Base number by which mirror segment port numbers 
#### are calculated.
MIRROR_PORT_BASE=7000

#### File system location(s) where mirror segment data directories 
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the 
#### DATA_DIRECTORY parameter.
declare -a MIRROR_DATA_DIRECTORY=(/data/hgdata/hgsegm)


################################################
#### OTHER OPTIONAL PARAMETERS
################################################

#### Create a database of this name after initialization.
DATABASE_NAME=hgdw

#### Specify the location of the host address file here instead of
#### with the the -h option of gpinitsystem.
#MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem

初始化数据库

运行gpinitsystem命令将根据指定的配置文件创建一个瀚高数据仓库系统。 运行初始化命令

hginitsystem -c /home/hgadmin/hginitsystem_config -h /home/hgadmin/hostfile_seg -b 10GB -m 2000

命令将自动检查安装信息,确认可以连接到每个主机,可以访问配置文件中指定的每个目录。如果检查都通过了,将会提示确认配置信息。例如:

=> Continue with Greenplum creation? Yy/Nn

输入y以开始初始化。

该命令将初始化主节点实例和系统中的每个计算节点实例。每个计算节点实例的安装是并行的。 计算节点实例的数量决定这个过程使用的时间。

在成功安装初始化结束时, 命令将会启动瀚高数据仓库系统,并可以看到如下提示信息:

=> Database successfully started



hginitsystem -c /home/hgadmin/hginitsystem_config -h /home/hgadmin/hostfile_seg -b 10GB -m 2000 
20200401:18:46:57:026913 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking configuration parameters, please wait...
20200401:18:46:57:026913 hginitsystem:hgdw1:hgadmin-[INFO]:-Reading HGDW configuration file /home/hgadmin/hginitsystem_config
20200401:18:46:57:026913 hginitsystem:hgdw1:hgadmin-[INFO]:-Locale has not been set in /home/hgadmin/hginitsystem_config, will set to default value
20200401:18:46:57:026913 hginitsystem:hgdw1:hgadmin-[INFO]:-Locale set to en_US.utf8
20200401:18:46:57:026913 hginitsystem:hgdw1:hgadmin-[FATAL]:-Expected file " /home/hgadmin/hostfile_seg" not found Script Exiting!
[hgadmin@hgdw1 master]$ hginitsystem -c /home/hgadmin/hginitsystem_config -h /home/hgadmin/hostfile_seg -b 10GB -m 2000 
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking configuration parameters, please wait...
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Reading HGDW configuration file /home/hgadmin/hginitsystem_config
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Locale has not been set in /home/hgadmin/hginitsystem_config, will set to default value
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Locale set to en_US.utf8
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking configuration parameters, Completed
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Commencing multi-home checks, please wait...
.....
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Configuring build for standard array
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Commencing multi-home checks, Completed
20200401:18:47:08:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Building primary segment instance array, please wait...
.....
20200401:18:47:10:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Building group mirror array type , please wait...
.....
20200401:18:47:12:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking Master host
20200401:18:47:12:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking new segment hosts, please wait...
..........
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checking new segment hosts, Completed
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HGDW Database Creation Parameters
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:---------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master Configuration
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:---------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master instance name       = HGDW
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master hostname            = hgdw1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master port                = 5432
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master instance dir        = /data/hgdata/master/gpseg-1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master LOCALE              = en_US.utf8
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HGDW segment prefix   = gpseg
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master Database            = hgdw
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master connections         = 2000
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master buffers             = 10GB
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Segment connections        = 6000
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Segment buffers            = 10GB
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Checkpoint segments        = 18
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Encoding                   = UNICODE
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Postgres param file        = Off
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Initdb to be used          = /usr/local/hgdw/bin/initdb
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-GP_LIBRARY_PATH is         = /usr/local/hgdw/lib
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HEAP_CHECKSUM is           = on
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HBA_HOSTNAMES is           = 0
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Ulimit check               = Passed
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Array host connect type    = Single hostname per node
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master IP address [1]      = ::1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master IP address [2]      = 172.17.105.139
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master IP address [3]      = 192.168.122.1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Master IP address [4]      = fe80::f816:3eff:fe21:d1e6
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Standby Master             = Not Configured
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Number of primary segments = 1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total Database segments    = 5
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Trusted shell              = ssh
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Number segment hosts       = 5
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Mirror port base           = 7000
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Number of mirror segments  = 1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Mirroring config           = ON
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Mirroring type             = Group
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:----------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HGDW Primary Segment Configuration
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:----------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw2 	/data/hgdata/hgsegp/gpseg0 	6000 	2 	0
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw3 	/data/hgdata/hgsegp/gpseg1 	6000 	3 	1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw4 	/data/hgdata/hgsegp/gpseg2 	6000 	4 	2
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw5 	/data/hgdata/hgsegp/gpseg3 	6000 	5 	3
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw6 	/data/hgdata/hgsegp/gpseg4 	6000 	6 	4
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:---------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HGDW Mirror Segment Configuration
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:---------------------------------------
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw3 	/data/hgdata/hgsegm/gpseg0 	7000 	7 	0
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw4 	/data/hgdata/hgsegm/gpseg1 	7000 	8 	1
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw5 	/data/hgdata/hgsegm/gpseg2 	7000 	9 	2
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw6 	/data/hgdata/hgsegm/gpseg3 	7000 	10 	3
20200401:18:47:20:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-hgdw2 	/data/hgdata/hgsegm/gpseg4 	7000 	11 	4

Continue with HGDW creation Yy|Nn (default=N):
> y
20200401:18:47:43:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Building the Master instance database, please wait...
20200401:18:50:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Starting the Master in admin mode
20200401:18:50:52:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Commencing parallel build of primary segment instances
20200401:18:50:52:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
.....
20200401:18:50:52:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
...........................................................................................................................................................................................
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Parallel process exit status
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as completed           = 5
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as killed              = 0
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as failed              = 0
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Deleting distributed backout files
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Removing back out file
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-No errors generated from parallel processes
20200401:18:54:00:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Restarting the HGDW instance in production mode
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Starting hgstop with args: -a -l /home/hgadmin/gpAdminLogs -m -d /data/hgdata/master/gpseg-1
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Gathering information and validating the environment...
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Obtaining HGDW Master catalog information
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Obtaining Segment details from master...
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-HGDW Version: "postgres (HGDW Database) 3.0"
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Commencing Master instance shutdown with mode="smart"
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Master segment instance directory=/data/hgdata/master/gpseg-1
20200401:18:54:00:032368 hgstop:hgdw1:hgadmin-[INFO]:-Stopping master segment and waiting for user connections to finish ...
server shutting down
20200401:18:54:02:032368 hgstop:hgdw1:hgadmin-[INFO]:-Attempting forceful termination of any leftover master process
20200401:18:54:02:032368 hgstop:hgdw1:hgadmin-[INFO]:-Terminating processes for segment /data/hgdata/master/gpseg-1
20200401:18:54:03:032394 hgstart:hgdw1:hgadmin-[INFO]:-Starting hgstart with args: -a -l /home/hgadmin/gpAdminLogs -d /data/hgdata/master/gpseg-1
20200401:18:54:03:032394 hgstart:hgdw1:hgadmin-[INFO]:-Gathering information and validating the environment...
20200401:18:54:03:032394 hgstart:hgdw1:hgadmin-[INFO]:-HGDW Binary Version: "postgres (HGDW Database) 3.0"
20200401:18:54:03:032394 hgstart:hgdw1:hgadmin-[INFO]:-HGDW Catalog Version: "301908232"
20200401:18:54:03:032394 hgstart:hgdw1:hgadmin-[INFO]:-Starting Master instance in admin mode
20200401:18:54:11:032394 hgstart:hgdw1:hgadmin-[INFO]:-Obtaining HGDW Master catalog information
20200401:18:54:11:032394 hgstart:hgdw1:hgadmin-[INFO]:-Obtaining Segment details from master...
20200401:18:54:11:032394 hgstart:hgdw1:hgadmin-[INFO]:-Setting new master era
20200401:18:54:11:032394 hgstart:hgdw1:hgadmin-[INFO]:-Master Started...
20200401:18:54:11:032394 hgstart:hgdw1:hgadmin-[INFO]:-Shutting down master
20200401:18:54:12:032394 hgstart:hgdw1:hgadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
........
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-Process results...
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-----------------------------------------------------
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-   Successful segment starts                                            = 5
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-   Failed segment starts                                                = 0
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-----------------------------------------------------
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-Successfully started 5 of 5 segment instances 
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-----------------------------------------------------
20200401:18:54:21:032394 hgstart:hgdw1:hgadmin-[INFO]:-Starting Master instance hgdw1 directory /data/hgdata/master/gpseg-1 
20200401:18:54:29:032394 hgstart:hgdw1:hgadmin-[INFO]:-Command pg_ctl reports Master hgdw1 instance active
20200401:18:54:29:032394 hgstart:hgdw1:hgadmin-[INFO]:-Connecting to dbname="template1" connect_timeout=15
20200401:18:54:29:032394 hgstart:hgdw1:hgadmin-[INFO]:-No standby master configured.  skipping...
20200401:18:54:29:032394 hgstart:hgdw1:hgadmin-[INFO]:-Database successfully started
20200401:18:54:29:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Completed restart of HGDW instance in production mode
20200401:18:54:30:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Commencing parallel build of mirror segment instances
20200401:18:54:30:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
.....
20200401:18:54:30:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
............
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Parallel process exit status
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as completed           = 5
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as killed              = 0
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Total processes marked as failed              = 0
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:------------------------------------------------
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Scanning utility log file for any warning messages
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[WARN]:-*******************************************************
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[WARN]:-Scan of log file indicates that some warnings or errors
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[WARN]:-were generated during the array creation
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Please review contents of log file
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-/home/hgadmin/gpAdminLogs/hginitsystem_20200401.log
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-To determine level of criticality
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-These messages could be from a previous run of the utility
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-that was called today!
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[WARN]:-*******************************************************
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-HGDW Database instance successfully created
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-------------------------------------------------------
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-To complete the environment configuration, please 
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-update hgadmin .bashrc file with the following
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-1. Ensure that the hgdw_path.sh file is sourced
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/data/hgdata/master/gpseg-1"
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-   to access the HGDW scripts for this instance:
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-   or, use -d /data/hgdata/master/gpseg-1 option for the HGDW scripts
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-   Example gpstate -d /data/hgdata/master/gpseg-1
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Script log file = /home/hgadmin/gpAdminLogs/hginitsystem_20200401.log
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-To initialize a Standby Master Segment for this HGDW instance
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Review options for gpinitstandby
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-------------------------------------------------------
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-The Master /data/hgdata/master/gpseg-1/pg_hba.conf post gpinitsystem
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-has been configured to allow all hosts within this new
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-new array must be explicitly added to this file
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-Refer to the HGDW Admin support guide which is
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-located in the /usr/local/hgdw/docs directory
20200401:18:54:42:027116 hginitsystem:hgdw1:hgadmin-[INFO]:-------------------------------------------------------
[hgadmin@hgdw1 master]$ 

失败回退


安装中途失败,提示使用 bash /home/gpadmin/gpAdminLogs/backout_gpinitsystem_gpadmin_* 回退,执行该脚本即可

删除重建

安装完成,出于种种原因,若需要集群删除重装,使用 gpdeletesystem 工具

使用命令


gpdeletesystem -d /data/hgdata/master/gpseg-1 -f

-d 后面跟 MASTER_DATA_DIRECTORY(master 的数据目录),会清除master,segment所有的数据目录。
-f force, 终止所有进程,强制删除。

环境变量确认

必须在瀚高数据仓库主节点(包括备用主节点)设置相应的环境变量。在$GPHOME目录下的greenplum_path.sh文件提供了诸多瀚高数据仓库相关的环境变量。可以在gpadmin用户启动脚本(比如.bashrc)中加载这个文件。 瀚高数据仓库的管理命令还需要设置MASTER_DATA_DIRECTORY环境变量。这个变量应该指向gpinitsystem命令初始化时指定的主节点数据目录位置。

vi ~/.bash_profile
source /opt/gp/greenplum/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=hgdw
#export LD_LIBRARY_PATH=$GPHOME/lib:$GPHOME/ext/python/lib


如果有备用主节点,应该将环境变量的文件拷贝到备用主节点。例如:

scp ~/.bash_profile gp2:/home/gpadmin

3 常用命令

查看状态
gpstate
快速打开
gpstart
快速关闭
gpstop -M fast
不关闭数据库的情况下生效某些参数
stop -u  生效

4 poststep

优化数据库参数,参考相关文档 greenplum or hgdw Best practices(Turing)

5 其他配置

允许客户端连接

瀚高数据仓库第一次初始化之后,只允许gpadmin用户从本地进行连接(gpinitsystem时指定的系统用户)。如果希望其他用户或者客户端机器也能够访问瀚高数据仓库,必须设置数据库的访问权限。

hgdw=# show listen_addresses ;
 listen_addresses 
------------------
 *
(1 row)
 listen_addresses默认就是*

修改pg_hba.conf文件

cat pg_hba.conf  |grep -v "#"
host	all	gpadmin	192.168.80.147/32	trust
host	replication	gpadmin	192.168.80.147/32	trust
host	all	gpadmin	192.168.122.1/32	trust
host	replication	gpadmin	192.168.122.1/32	trust
local    all         gpadmin         ident
host     all         gpadmin         127.0.0.1/28    trust
host     all         gpadmin         192.168.80.146/32       trust
host     all         gpadmin         192.168.122.1/32       trust
host     all         gpadmin         192.168.80.147/32       trust
host     all         gpadmin         192.168.122.1/32       trust
host     all         gpadmin         ::1/128       trust
host     all         gpadmin         fe80::a00:27ff:fe7a:276e/128       trust
local    replication gpadmin         ident
host     replication gpadmin         samenet       trust
# gp configure
host	all	all	192.168.80.0/24		md5

创建数据库并加载数据

验证安装成功之后,用户就可以创建数据库并加载数据了。关于创建数据库、模式、表或其他数据库对象的更多信息,请查看瀚高数据仓库管理员手册。


test_product.csv

0001,T 恤 , 衣服 ,1000,500,2017-09-20
0002, 打孔器 , 办公用品 ,500,320,2017-09-11
0003, 运动 T 恤 , 衣服 ,4000,2800,
0004, 菜刀 , 厨房用具 ,3000,2800,2017-09-20
0005, 高压锅 , 厨房用具 ,6800,5000,2017-01-15
0006, 叉子 , 厨房用具 ,500,,2017-09-20
0007, 切菜板 , 厨房用具 ,880,790,2017-04-28
0008, 圆珠笔 , 办公用品 ,100,,2017-11-11


nohup gpfdist -d /home/gpadmin/exttable -p 8888 -l /home/gpadmin/exttable/gpfdist.log 2>&1 &


create external table test_ext(
    p_id int,
    p_name text,
    p_type text,
    sale_price int,
    pruchase_price int,
    regist_date date
 )
 location ("gpfdist://192.168.80.146:8888/test_product.csv")
format "CSV" (delimiter as "," null as "")
;

hgdw=# select * from test_ext;
 p_id |  p_name   |  p_type   | sale_price | pruchase_price | regist_date 
------+-----------+-----------+------------+----------------+-------------
    1 | T恤       |  衣服     |       1000 |            500 | 2017-09-20
    2 |  打孔器   |  办公用品 |        500 |            320 | 2017-09-11
    3 |  运动T恤  |  衣服     |       4000 |           2800 | 
    4 |  菜刀     |  厨房用具 |       3000 |           2800 | 2017-09-20
    5 |  高压锅   |  厨房用具 |       6800 |           5000 | 2017-01-15
    6 |  叉子     |  厨房用具 |        500 |                | 2017-09-20
    7 |  切菜板   |  厨房用具 |        880 |            790 | 2017-04-28
    8 |  圆珠笔   |  办公用品 |        100 |                | 2017-11-11
(8 rows)


create table test_product as select * from test_ext; 

Created By Chris on 20200327

 

 

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 举报,一经查实,本站将立刻删除。
转载请注明出处: https://daima100.com/8464.html

(0)
上一篇 2023-02-27
下一篇 2023-02-27

相关推荐

  • Python字符串的切割函数

    Python字符串的切割函数在Python中,字符串是一种基本的数据类型。字符串的操作在编程中非常重要,字符串的切割也是其中的一个重要操作。Python提供了丰富的字符串处理函数,其中包括字符串的切割函数codesplit()/code。字符串切割函数将一个字符串分割为多个子字符串,返回一个列表。在本文中,我们将从多个角度对Python字符串的切割函数进行详细阐述。

    2024-02-13
    45
  • 实现Python和JavaScript字符串以给定字符串开始的判断函数

    实现Python和JavaScript字符串以给定字符串开始的判断函数字符串以给定字符串开头的判断,可以通过比较字符串的前若干个字符和给定字符串是否相等来实现。在Python中可以使用字符串切片(slicing)或字符串的方法startswith()来实现,而在JavaScript中可以使用字符串的方法slice()或startsWith()来实现。

    2024-01-19
    53
  • MySQL5.7特性:JSON数据类型「建议收藏」

    MySQL5.7特性:JSON数据类型「建议收藏」概述 MySQL5.7的发行声明中,官方称之为里程碑式的版本,除了运行速度大幅度提升之外,还添加了之前版本没有的功能,如本文所述的原生JSON数据类型功能。 在此版本之前,MySQL所有的JSON数…

    2023-02-14
    111
  • redis 6.0下redis-cluster

    redis 6.0下redis-cluster伴随着Redis6.0的发布,作为最令人怦然心动的特性之一,Redis官方同时推出Redis集群的proxy了:redis-cluster-proxy,https://github.com/Redis

    2023-02-24
    108
  • 使用Python Tkinter实现复选框

    使用Python Tkinter实现复选框Python Tkinter是一个强大的GUI开发库,可以用来开发各种桌面应用程序。在Tkinter中,复选框是一种常见的GUI控件,用户可以通过勾选或取消勾选复选框来选择或取消选择相应的选项。

    2024-03-27
    26
  • MySQL学习(3)-[亲测有效]

    MySQL学习(3)-[亲测有效]ps:此随笔基于mysql 5.7.*版本。 已知root账户密码进行登录 格式:mysql [-h地址] [-p端口] -u用户名 -p密码 省略不写地址或端口则自动使用默认。(地址:localh

    2023-06-02
    92
  • 使用random模块生成随机数

    使用random模块生成随机数Python中的random模块包含了许多随机数函数。其中,random.random是生成0到1之间的随机实数的函数。可以用这个函数来模拟实验或者测试中的随机性。

    2024-04-11
    20
  • 书写高质量SQL的一些建议[亲测有效]

    书写高质量SQL的一些建议[亲测有效]1、查询SQL尽量不要使用select *,而是select具体字段。 反例子: select * from employee; 正例子: selectid,namefrom employee; 理…

    2023-02-23
    106

发表回复

您的电子邮箱地址不会被公开。 必填项已用*标注