GPText - how to setup a GPText 3.10 sandbox on a Greenplum 6.24.2 Vagrant Singlenode VM

Article ID: 296911


Products

VMware Tanzu Greenplum

Issue/Introduction

Tanzu Greenplum Text enables processing mass quantities of raw text data (such as social media feeds or e-mail databases) into mission-critical information that guides business and project decisions. Tanzu Greenplum Text joins the Greenplum Database massively parallel-processing database server with Apache SolrCloud enterprise search. Tanzu Greenplum Text includes powerful text search as well as support for text analysis. 

This setup guide shows how to quickly install a GPText sandbox on a pre-existing Greenplum Vagrant Singlenode VM.

Environment

Product Version: 6.23

Resolution

Reference Documents:

For instructions on how to create a Vagrant Singlenode Greenplum VM on your local machine, please see the documentation at https://github.com/greenplum-db/go-gpdb

Please note that https://github.com/greenplum-db/go-gpdb will only work on Intel-based Macs.

GPText installation instructions can be found at https://docs.vmware.com/en/VMware-Greenplum-Text/3.10/greenplum-text/topics-installing.html 

Setup Guide Steps:

  • Download the GPText 3.10 installation package to your Vagrant Singlenode Greenplum VM from https://network.pivotal.io/products/vmware-greenplum#/releases/1287433/file_groups/13262. Choose the package that matches the VM's operating system; the VM in this example runs CentOS 7.5, so the RHEL 7 package is used:
[gpadmin@gptext-m ~]$ file greenplum-text-3.10.0-rhel7_x86_64.tar.gz 
greenplum-text-3.10.0-rhel7_x86_64.tar.gz: gzip compressed data, from Unix, last modified: Thu Dec 15 12:33:58 2022
[gpadmin@gptext-m ~]$ 
[gpadmin@gptext-m ~]$ cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
[gpadmin@gptext-m ~]$
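
The package choice above can also be derived from the release file rather than read by eye. The following is a minimal sketch, assuming the file naming pattern of the 3.10.0 package shown above; `el_major` and `gptext_package_name` are hypothetical helper names, not part of GPText.

```shell
# Hypothetical helper: extract the major OS version from a release string
# such as the contents of /etc/redhat-release.
el_major() {
    # Print the first run of digits that follows the word "release".
    printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: build the tarball name, assuming the naming pattern
# greenplum-text-<version>-rhel<major>_x86_64.tar.gz seen in this guide.
gptext_package_name() {
    version="$1"; major="$2"
    printf 'greenplum-text-%s-rhel%s_x86_64.tar.gz\n' "$version" "$major"
}

# Usage on the VM (not executed here):
#   major=$(el_major "$(cat /etc/redhat-release)")
#   pkg=$(gptext_package_name 3.10.0 "$major")
#   gzip -t "$pkg" && echo "$pkg passed the gzip integrity test"
```

`gzip -t` only confirms that the archive is not truncated or corrupted; it does not verify that the download is the intended release.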
  • Confirm pre-existing Greenplum 6.24.2 installation on the Vagrant Singlenode VM:
[gpadmin@gptext-m ~]$ gpdb env
INFO[2023-04-27 11:57:03] Listing all the environment installed        

Found 1 installation, choose from the list

Index   Environment File           Master Port        Status              GPCC Instance Name                        GPCC Instance URL
------  -----------------------    -----------------  ------------------  ----------------------------------------  ------------------------------------------
1       env_6.24.2_20230427115239  3000               RUNNING             


Enter your choice from the above list (eg.s 1 or 2 etc): 1

Source the environment file to set the environment

source /usr/local/src/gpdbinstall/env/env_6.24.2_20230427115239

[gpadmin@gptext-m ~]$ 
[gpadmin@gptext-m ~]$ source /usr/local/src/gpdbinstall/env/env_6.24.2_20230427115239
[gpadmin@gptext-m ~]$ 
 
  • Run a yum update, then install the prerequisite packages 'java-1.8.0-openjdk', 'nc' and 'lsof':
[gpadmin@gptext-m ~]$ sudo yum update
[gpadmin@gptext-m ~]$ sudo yum install java-1.8.0-openjdk nc lsof
 
  • Ensure that 'java-1.8.0-openjdk' is the Java provider and confirm that the Java version is 1.8.0:
[gpadmin@gptext-m ~]$ sudo alternatives --config java

There is 1 program that provides 'java'.

  Selection    Command
-----------------------------------------------
*+ 1           java-1.8.0-openjdk.x86_64 (/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.362.b08-1.el7_9.x86_64/jre/bin/java)

Enter to keep the current selection[+], or type selection number: 
[gpadmin@gptext-m ~]$ 
[gpadmin@gptext-m ~]$ java -version
openjdk version "1.8.0_362"
OpenJDK Runtime Environment (build 1.8.0_362-b08)
OpenJDK 64-Bit Server VM (build 25.362-b08, mixed mode)
[gpadmin@gptext-m ~]$ 
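
The Java version check above can be scripted so that a wrong default runtime is caught before the installer runs. This is a minimal sketch; `is_java_8` is a hypothetical helper name, and it only inspects the version string in the format shown in the transcript above.

```shell
# Hypothetical helper: succeed only if the given line of `java -version`
# output reports a 1.8.x runtime.
is_java_8() {
    case "$1" in
        *'version "1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the VM (java prints its version to stderr, hence 2>&1):
#   is_java_8 "$(java -version 2>&1 | head -n 1)" \
#       || echo "Java 1.8 is not the default java" >&2
```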

 

  • Extract the GPText installer and the 'gptext_install_config' file from the archive, then make the installer executable:
[gpadmin@gptext-m ~]$ tar xvfz greenplum-text-3.10.0-rhel7_x86_64.tar.gz 
./greenplum-text-3.10.0-rhel7_x86_64.bin
gptext_install_config
[gpadmin@gptext-m ~]$ 
[gpadmin@gptext-m ~]$ chmod +x /home/gpadmin/greenplum-text-3.10.0-rhel7_x86_64.bin 
[gpadmin@gptext-m ~]$ 
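  • With sudo, create the installation directories and ensure that they're owned by gpadmin. Note that the installer run further below defaults to /usr/local/greenplum-text-3.10.0, not the greenplum-text-3.10 directory created here: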
[gpadmin@gptext-m ~]$ sudo mkdir /usr/local/greenplum-text-3.10
[gpadmin@gptext-m ~]$ sudo mkdir /usr/local/greenplum-solr
[gpadmin@gptext-m ~]$ sudo chown gpadmin:gpadmin /usr/local/greenplum-text-3.10
[gpadmin@gptext-m ~]$ sudo chmod 775 /usr/local/greenplum-text-3.10
[gpadmin@gptext-m ~]$ sudo chown gpadmin:gpadmin /usr/local/greenplum-solr
[gpadmin@gptext-m ~]$ sudo chmod 775 /usr/local/greenplum-solr
[gpadmin@gptext-m ~]$ 
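
Before running the installer, it can help to confirm that the directories just created are usable by the current user. A minimal sketch, assuming it is run as gpadmin; `dir_ready` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the path is an existing directory
# that the current user can write to and enter.
dir_ready() {
    [ -d "$1" ] && [ -w "$1" ] && [ -x "$1" ]
}

# Usage on the VM (not executed here), after the sudo commands above:
#   for d in /usr/local/greenplum-text-3.10 /usr/local/greenplum-solr; do
#       dir_ready "$d" || echo "gpadmin cannot use $d" >&2
#   done
```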
  • Make sure that gpadmin has full access to the Greenplum binary directories:
[gpadmin@gptext-m ~]$ sudo chown -R gpadmin:gpadmin /usr/local/6.24.2/
[gpadmin@gptext-m ~]$ 
  • Replace all instances of 'localhost' in the 'gptext_install_config' file with the hostname of the singlenode VM, as below:
[gpadmin@gptext-m ~]$ hostname
gptext-m
[gpadmin@gptext-m ~]$ 
[gpadmin@gptext-m ~]$ sed -i 's/localhost/gptext-m/g' gptext_install_config 
[gpadmin@gptext-m ~]$ 
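
The substitution above edits the file in place without keeping a copy. A slightly more defensive variant is sketched below: it keeps a backup and fails if any 'localhost' entry survives. `replace_localhost` is a hypothetical helper name, and the sketch assumes the host name contains no characters that are special to sed (such as '/' or '&').

```shell
# Hypothetical helper: replace every "localhost" in a config file with the
# given host name, keeping the original as <file>.bak.
replace_localhost() {
    file="$1"; host="$2"
    sed -i.bak "s/localhost/${host}/g" "$file" || return 1
    # Return failure if any occurrence survived the substitution.
    ! grep -q 'localhost' "$file"
}

# Usage on the VM (not executed here):
#   replace_localhost gptext_install_config "$(hostname)" \
#       || echo "gptext_install_config still references localhost" >&2
```

If the installer later reports connection problems, the backup file makes it easy to compare the edited configuration with the original using `diff`.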
  • Run the GPText installer:
[gpadmin@gptext-m ~]$ ./greenplum-text-3.10.0-rhel7_x86_64.bin -c gptext_install_config 

********************************************************************************
Provide the installation path for Greenplum Text Search or press ENTER to
accept the default installation path: /usr/local/greenplum-text-3.10.0
********************************************************************************



********************************************************************************
Install Greenplum Text Search into </usr/local/greenplum-text-3.10.0>? [yes|no]
********************************************************************************

yes

********************************************************************************
/usr/local/greenplum-text-3.10.0 does not exist.
Create /usr/local/greenplum-text-3.10.0 ? [yes|no]
(Selecting no will exit the installer)
********************************************************************************

yes
unpacking finished successfully
check installer
Not setting GPTEXT_JAVA_HOME in gptext_install_config:
  /home/gpadmin/gptext_install_config. Use default java instead
20230427:12:11:22:001841 install:gptext-m:gpadmin-[INFO]:-Validating hosts connection...
20230427:12:11:23:001841 install:gptext-m:gpadmin-[INFO]:-Detect zookeeper cluster
20230427:12:11:25:001841 install:gptext-m:gpadmin-[INFO]:-Detect SolrCloud cluster
20230427:12:11:26:001841 install:gptext-m:gpadmin-[INFO]:-Variable GPTEXT_CUSTOM_CONFIG_DIR is not set. Skip detecting GPText custom configuration directory.
20230427:12:11:26:001841 install:gptext-m:gpadmin-[INFO]:-Install GPText binary to host(s)...
20230427:12:11:26:001841 install:gptext-m:gpadmin-[INFO]:-Adding dynamic_library_path guc value: /usr/local/greenplum-text-3.10.0/lib/gpdb6 ...
20230427:12:11:27:001841 install:gptext-m:gpadmin-[INFO]:-Try to deploy zookeeper cluster
20230427:12:11:28:001841 install:gptext-m:gpadmin-[INFO]:-Deploy SolrCloud instances ...
20230427:12:11:28:001841 install:gptext-m:gpadmin-[INFO]:-Variable GPTEXT_CUSTOM_CONFIG_DIR is not set. Skip creating GPText custom configuration directory.
20230427:12:11:28:001841 install:gptext-m:gpadmin-[INFO]:-Create GPText config info ...
20230427:12:11:30:001841 install:gptext-m:gpadmin-[INFO]:------------------------------------------------
20230427:12:11:30:001841 install:gptext-m:gpadmin-[INFO]:-start zookeeper:
20230427:12:11:30:001841 install:gptext-m:gpadmin-[INFO]:------------------------------------------------
20230427:12:11:30:001841 install:gptext-m:gpadmin-[INFO]:-Starting zookeeper instance for segment at /data/master/zoo1 on host gptext-m
20230427:12:11:30:001841 install:gptext-m:gpadmin-[INFO]:-Starting zookeeper instance for segment at /data/master/zoo0 on host gptext-m
20230427:12:11:30:001841 install:gptext-m:gpadmin-[INFO]:-Starting zookeeper instance for segment at /data/master/zoo3 on host gptext-m
20230427:12:11:30:001841 install:gptext-m:gpadmin-[INFO]:-Starting zookeeper instance for segment at /data/master/zoo4 on host gptext-m
20230427:12:11:30:001841 install:gptext-m:gpadmin-[INFO]:-Starting zookeeper instance for segment at /data/master/zoo2 on host gptext-m
20230427:12:11:30:001841 install:gptext-m:gpadmin-[INFO]:-Create gpsolr path in zookeeper
20230427:12:11:33:001841 install:gptext-m:gpadmin-[INFO]:-Create nodes in zookeeper
20230427:12:11:50:001841 install:gptext-m:gpadmin-[INFO]:-Cleaning up temp files on hosts...
20230427:12:11:50:001841 install:gptext-m:gpadmin-[INFO]:-Installation complete.
[gpadmin@gptext-m ~]$ 
  • Source the GPText environment file:
[gpadmin@gptext-m ~]$ source /usr/local/greenplum-text-3.10.0/greenplum-text_path.sh 
[gpadmin@gptext-m ~]$ 
  • Create a new database and run the 'gptext-installsql' utility against it to create the 'gptext' schema and UDFs:
[gpadmin@gptext-m ~]$ createdb brianh
[gpadmin@gptext-m ~]$ 
[gpadmin@gptext-m ~]$ gptext-installsql brianh
20230427:12:15:47:003587 gptext-installsql:gptext-m:gpadmin-[INFO]:-Install GPText udf ...
20230427:12:15:47:003587 gptext-installsql:gptext-m:gpadmin-[INFO]:-Creating 'gptext' schema and UDFs in database brianh...
20230427:12:15:47:003587 gptext-installsql:gptext-m:gpadmin-[INFO]:-Validating gptext installation
20230427:12:15:47:003587 gptext-installsql:gptext-m:gpadmin-[INFO]:-Done.
[gpadmin@gptext-m ~]$ 
  • Start GPText and then verify the GPText state:
[gpadmin@gptext-m ~]$ gptext-start
20230427:12:17:19:003741 gptext-start:gptext-m:gpadmin-[INFO]:-Execute GPText cluster start.
20230427:12:17:20:003741 gptext-start:gptext-m:gpadmin-[INFO]:-Check zookeeper cluster state ...
20230427:12:17:21:003741 gptext-start:gptext-m:gpadmin-[INFO]:------------------------------------------------
20230427:12:17:21:003741 gptext-start:gptext-m:gpadmin-[INFO]:-Start GPText's instances.
20230427:12:17:21:003741 gptext-start:gptext-m:gpadmin-[INFO]:------------------------------------------------
20230427:12:17:21:003741 gptext-start:gptext-m:gpadmin-[INFO]:-   Host       Solr Dir
20230427:12:17:21:003741 gptext-start:gptext-m:gpadmin-[INFO]:-   gptext-m   /data/primary/solr0
20230427:12:17:21:003741 gptext-start:gptext-m:gpadmin-[INFO]:-   gptext-m   /data/primary/solr1
20230427:12:17:43:003741 gptext-start:gptext-m:gpadmin-[INFO]:-Start command execute success, checking whether instances are working ...
20230427:12:17:49:003741 gptext-start:gptext-m:gpadmin-[INFO]:-Double checking whether processes are running ...
20230427:12:17:51:003741 gptext-start:gptext-m:gpadmin-[INFO]:-GPText start success..
20230427:12:17:51:003741 gptext-start:gptext-m:gpadmin-[INFO]:-Done.
[gpadmin@gptext-m ~]$ 
[gpadmin@gptext-m ~]$ 
[gpadmin@gptext-m ~]$ gptext-state
20230427:12:17:55:004432 gptext-state:gptext-m:gpadmin-[INFO]:-Execute GPText state ...
20230427:12:17:55:004432 gptext-state:gptext-m:gpadmin-[INFO]:-Check zookeeper cluster state ...
20230427:12:17:56:004432 gptext-state:gptext-m:gpadmin-[INFO]:-Check GPText cluster status...
20230427:12:17:56:004432 gptext-state:gptext-m:gpadmin-[INFO]:-Current GPText Version: 3.10.0
20230427:12:17:56:004432 gptext-state:gptext-m:gpadmin-[INFO]:-All nodes are up and running.
20230427:12:17:56:004432 gptext-state:gptext-m:gpadmin-[INFO]:-Done.
[gpadmin@gptext-m ~]$ 
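
For scripted checks, the state verification above can be reduced to a pass/fail test on the message printed by 'gptext-state'. This is a minimal sketch that matches the text seen in the transcript above; `gptext_is_healthy` is a hypothetical helper name, and relying on the log wording (rather than on an exit status) is an assumption of this sketch.

```shell
# Hypothetical helper: succeed only if the captured gptext-state output
# contains the healthy-cluster message shown in this guide.
gptext_is_healthy() {
    printf '%s\n' "$1" | grep -q 'All nodes are up and running'
}

# Usage on the VM (not executed here):
#   gptext_is_healthy "$(gptext-state 2>&1)" \
#       || echo "GPText does not report all nodes up" >&2
```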
  • Verify Greenplum & GPText versions:
[gpadmin@gptext-m ~]$ psql brianh -c "select version();"
                                                                                                version
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.26 (Greenplum Database 6.24.2 build commit:2605b114ee5662dc6d972568ce50961fae74c69a) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Apr 21 2023 05:19:53
(1 row)

[gpadmin@gptext-m ~]$ 
[gpadmin@gptext-m ~]$ psql brianh -c "select gptext.version();"
             version             
---------------------------------
 Greenplum Text Analytics 3.10.0
(1 row)

[gpadmin@gptext-m ~]$
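
The version query above can also be checked from a script. The sketch below extracts the version number from a result in the format shown above ("Greenplum Text Analytics 3.10.0"); `gptext_version_number` is a hypothetical helper name, and the usage comment relies on the standard psql options `-A` (unaligned output) and `-t` (rows only).

```shell
# Hypothetical helper: print the last space-separated field of the
# gptext.version() result, which is the version number in the format
# shown in this guide.
gptext_version_number() {
    printf '%s\n' "${1##* }"
}

# Usage on the VM (not executed here):
#   actual=$(gptext_version_number "$(psql -At -d brianh -c 'select gptext.version();')")
#   [ "$actual" = "3.10.0" ] || echo "unexpected GPText version: $actual" >&2
```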