Threat Detection On-Premise Deployment Guide

Perforce Helix Threat Detection
On-Premise Installation and Deployment Guide
Version 2
Prerequisites and Terminology
Each server needs to be identified as an ‘analytics’ or ‘reporting’ server. There needs to be an
odd number of analytics servers, and one of the analytics servers is identified as the ‘master’
server.
Some valid configurations are: one analytics server and one reporting server; three analytics
servers and two reporting servers; etc.
The steps to configure the two types of servers are given separately in this document.
The base OS for all servers is Ubuntu 14.04 LTS. The hostnames for the servers are arbitrary
but the instructions in this document will refer to the master analytics server using these
names:
● <SPARK_MASTER>
● <NAMENODE>
● <HMASTER>
● <ZOOKEEPER>
In all cases the tag must be replaced with the actual hostname of the master analytics server.
The reporting server will also be referred to as <REPORTING>. This can be replaced with the
hostname of any of the reporting servers.
Before you begin you should have the following files available. The files will be copied onto the
analytics and reporting servers in the steps below.
● Analytics deployment bundle:
wget --no-check-certificate ...
● Reporting deployment bundle:
wget --no-check-certificate …
Reference Architecture
The architecture consists of the following components:
● Investigator / API Server
● Analytics Master
● Analytics Data
Investigator / API Server
This component is responsible for taking the results of the analytics and presenting them through a
consumption interface. This includes the Investigator interface, static reports, and a RESTful
interface for integration with 3rd party systems.
Analytics Master
This component is responsible for managing the analytics data-tier and for the orchestration of
the analytics jobs.
Analytics Data
This component is responsible for storing the data and running the analytical models on the
data which creates the metrics such as baselines, behaviour risk scores, entity risk scores, and
others. It stores and serves up log data to the Analytics Master component, as well as storing
the metrics that result from the analytical models.
System Requirements
These system requirements provide guidelines on the resources that are necessary to run the
software based on typical usage. These guidelines are subject to re-evaluation based on usage
patterns within each organization, which can vary.
POC (Proof of Concept) System
Maximum: 30 days of data / 1k users
CPU Cores   16
Memory      32 GB
HDD         100 GB
Network     GigE
Production System
Investigator / API Server (x2 for High Availability System)
            Minimum     Recommended
CPU Cores   8           16
Memory      16 GB       24 GB
HDD         100 GB      100 GB
Network     GigE        10GbE
Analytics Master (x2 for High Availability System)
            Minimum     Recommended
CPU Cores   8           16
Memory      8 GB        16 GB
HDD         100 GB      100 GB
Network     GigE        10GbE
Analytics Data (x3-5 for High Availability System)
            Minimum     Recommended
CPU Cores   8           16
Memory      32 GB       48 GB
HDD         100 GB      70 GB / 1k users / month
Network     GigE        10GbE
Setup — Analytics (Single Server)
Assumption: The wget utility is installed and part of the $PATH.
Assumption: The default /etc/hosts file should look like this:
127.0.0.1    localhost
127.0.1.1    $ANALYTICS_SERVER_NAME
After running the set-up of the Analytics server the /etc/hosts file will look like this:
127.0.0.1            localhost
$ACTUAL_IP_ADDRESS   $ANALYTICS_SERVER_NAME
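For example, with a hypothetical Analytics server named analytics01 at IP address 10.0.0.5, the resulting file would be:
127.0.0.1    localhost
10.0.0.5     analytics01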
Start with an Ubuntu 14.04 LTS base image.
Create the [interset] user on the Analytics server:
The [ubuntu] user's password will need to be provided when executing the [adduser] command.
A new password will also need to be provided for the [interset] user being created; remember
this password for future use. When creating the [interset] user, other information will be
requested (e.g. Name, Phone, etc.); pressing [ENTER] at those prompts accepts the default,
which is a BLANK value. The last question is a 'Y' or 'N' confirmation to proceed.
As the [ubuntu] user:
sudo adduser interset
sudo usermod -a -G sudo interset
sudo visudo
Add the following line after #includedir /etc/sudoers.d
interset ALL=(ALL) NOPASSWD:ALL
To close the editor, hit [CTRL+X], then ‘Y’, then [ENTER] to save.
Switch to the interset user:
sudo su - interset
Create ssh key:
ssh-keygen
Hit [ENTER] at each prompt:
ssh-copy-id interset@<HOSTNAME>
Where <HOSTNAME> is each of the servers involved in your deployment (i.e. all Analytics
servers and all Reporting servers).
NOTE: When executing the above command for any other server(s), it is assumed that the
[interset] user has been previously created on each of the servers you will execute this
command for.
The above command expects a ‘yes’ response to proceed with connecting and also requests
the [interset] user’s password on the given server to do so.
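For example, if your deployment consisted of the hypothetical hosts analytics01, analytics02, analytics03, reporting01 and reporting02, the key could be copied to all of them in one loop:
for host in analytics01 analytics02 analytics03 reporting01 reporting02; do
    ssh-copy-id interset@$host    # answer 'yes' and enter the interset password for each host
done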
Download the analytics-deploy file to the /opt/interset directory.
sudo mkdir /opt/interset
sudo chown -R interset:interset /opt/interset
cd /opt/interset
sudo wget analytics-2.1.x.xxx-bin.zip
Where “x.xxx” is the current build number.
Wait for the download to finish.
sudo apt-get install unzip
unzip analytics-2.1.x.xxx-bin.zip
rm -f analytics-2.1.x.xxx-bin.zip
ln -s analytics-2.1.x.xxx analytics
cd analytics/automated_install/
Where “x.xxx” is the current build number.
Run shell script as sudo:
Note: The script will initially need input from the user for the I.P. address of the Analytics
server and heap sizes; please have that information available. When entering the
memory heap size, only enter the number.
sudo bash deploy.sh
This script does take some time to execute; look for the following message for
confirmation it has completed:
Execution of [deploy.sh] complete.
Format HDFS (name node only):
cd /opt/interset/hadoop/bin
./hdfs namenode -format
Answer “Yes” to the prompt to re-format the filesystem in the Storage Directory.
Note: Do NOT run the format command if HDFS is already running, as it will cause data
loss. If this is a first-time setup then HDFS will not be running.
You should see something like the following (note the 'Exiting with status 0' on the third-last line): https://gist.github.com/Tamini/eb63dda92cc688a9db22
Start the HDFS services:
cd /opt/interset/hadoop/sbin
./start-dfs.sh
Answer ‘yes’ to the prompt asking if you wish to continue connecting.
After entering the above start commands, type 'jps' as a check; the output will look like the
following (ignore the process IDs/numbers):
13342 Jps
13278 DataNode
13219 SecondaryNameNode
13135 NameNode
For a Single Server deployment there is no need to have the SecondaryNameNode
running; however, it is not harmful to leave it running.
Another good check is to load up the HDFS web-ui. By default it can be found at:
http://hostname:50070
Where hostname is the namenode running HDFS.
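As an optional check (not part of the original procedure), you can confirm from the command line that the web UI responds, assuming the default port of 50070:
curl -s -o /dev/null -w "%{http_code}\n" http://<NAMENODE>:50070/
# an output of 200 suggests the NameNode web UI is being served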
Start HBase (namenode only):
cd /opt/interset/hbase/bin/
./start-hbase.sh
To check everything is running as it should, use 'jps' again; it should output the following (ignore
numbers):
For the name node:
16619 HMaster
16799 HRegionServer
16369 SecondaryNameNode
16521 HQuorumPeer
16182 DataNode
17063 Jps
16026 NameNode
The HBase web-ui is also available at:
http://hostname:60010
Start Spark:
cd /opt/interset/spark/sbin
./start-all.sh
As a test, use the 'jps' command; the output should look like the following (on a single-server
deployment):
28352 HMaster
28258 HQuorumPeer
29140 Worker
28538 HRegionServer
27723 DataNode
27915 SecondaryNameNode
28957 Master
29422 Jps
27567 NameNode
As a quick test, run one of the examples that came with spark:
/opt/interset/spark/bin/run-example SparkPi 10
It will output a lot of info and a line approximating the value of Pi.
Install the cron task to periodically run analytics:
cd /opt/interset/analytics/bin/cron/
./install.sh
The above should output the following:
Crontab installed
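To confirm the cron task is in place (an optional check), you can list the [interset] user's crontab:
crontab -l
# the output should include the entry installed by install.sh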
Create the supporting schema:
cd /opt/interset/analytics/bin
./sql.sh --dbServer $ZOOKEEPER --action migrate
Where $ZOOKEEPER is the Analytics server.
Source [.bashrc] to pick up newly set environment variables:
source ~/.bashrc
Configuring the [interset.conf] configuration file:
cd /opt/interset/analytics/conf
vi interset.conf
Configure the [ingestFolder], [ingestingFolder] and [ingestedFolder] to be the desired
locations. Defaults will work if the file is left unaltered.
Configure the [reportServers] setting with the complete list of all your Reporting servers.
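A minimal sketch of the relevant settings, assuming the default folder locations and two hypothetical reporting hosts named reporting01 and reporting02 (the exact [reportServers] syntax may differ; check the comments in the shipped interset.conf):
ingestFolder = /tmp/ingest
ingestingFolder = /tmp/ingest/ingesting
ingestedFolder = /tmp/ingest/ingested
ingesterrorFolder = /tmp/ingest/ingesterror
reportServers = reporting01,reporting02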
Starting the ingest process:
/opt/interset/analytics/bin/ingest.sh
Running jps will now show the “Ingest” process as running.
Log file for the ingest is located in:
tail -f /opt/interset/analytics/logs/ingest.0.log
NOTE: The settings in the conf file can be modified on the fly without restarting the
process/service. Changing the ingest folder locations will change where the system
looks (i.e. the ingest, ingesting, ingested and ingesterror folders) to pick up new log files.
You have now completed the setup of the Analytics server and the server is ready to
ingest logs.
Setup — Reporting (Single Server or Distributed)
Assumption: The wget utility is installed and part of the $PATH.
Assumption: The default /etc/hosts file is expected to look like this:
127.0.0.1    localhost
127.0.1.1    $REPORTING_SERVER_NAME
After running the set-up of the Reporting server the /etc/hosts file will look like this:
127.0.0.1            localhost
$ACTUAL_IP_ADDRESS   $REPORTING_SERVER_NAME
Start with an Ubuntu 14.04 LTS base image.
Create the [interset] user on the Reporting server:
The [ubuntu] user's password will need to be provided when executing the [adduser] command.
A new password will also need to be provided for the [interset] user being created; remember
this password for future use. When creating the [interset] user, other information will be
requested (e.g. Name, Phone, etc.); pressing [ENTER] at those prompts accepts the default,
which is a BLANK value. The last question is a 'Y' or 'N' confirmation to proceed.
As the [ubuntu] user:
sudo adduser interset
sudo usermod -a -G sudo interset
sudo visudo
Add the following line after #includedir /etc/sudoers.d
interset ALL=(ALL) NOPASSWD:ALL
To close the editor, hit [CTRL+X], then ‘Y’, then [ENTER] to save.
Switch to the [interset] user:
sudo su - interset
Install needed packages:
sudo apt-get install unzip
Install Java 8:
sudo add-apt-repository ppa:webupd8team/java
(Press ENTER to continue)
sudo apt-get update
sudo apt-get install -y oracle-java8-installer
If prompted, read the license and respond ‘yes’ to the prompt to continue
sudo apt-get -fy install
To check, you can run java -version and you should see output like:
java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
Set JAVA_HOME:
echo " " >> ~/.bashrc
echo "export JAVA_HOME=/usr/lib/jvm/java-8-oracle" >> ~/.bashrc
Source [.bashrc] to pick up newly set environment variables:
source ~/.bashrc
Install and start Investigator:
Download the Reporting archive to [/opt/interset]:
sudo mkdir /opt/interset/
sudo chown -R interset:interset /opt/interset
cd /opt/interset/
sudo wget investigator-2.1.x.xxx-deploy.zip
Where “x.xxx” is the current build number.
Unpack it into /opt/interset:
sudo apt-get install unzip
unzip investigator-2.1.x.xxx-deploy.zip
rm -f investigator-2.1.x.xxx-deploy.zip
ln -s investigator-2.1.x.xxx/ reporting
cd reporting
Set up Report Generation:
sudo mkdir /var/interset/
sudo mkdir /var/interset/reportGen/
sudo chown -R interset:interset /var/interset
sudo mkdir /opt/interset/reportGen/
sudo chown -R interset:interset /opt/interset/reportGen
tar -xvf /opt/interset/reporting/reportgen-templates.tar.gz
cp -rf /opt/interset/reporting/reportgen-*/* /opt/interset/reportGen/
chmod +x /opt/interset/reportGen/bin/phantomjs
sudo echo " " >> /home/interset/.bashrc
sudo echo "export PATH=\$PATH:/opt/interset/reportGen/bin" >> /home/interset/.bashrc
sh /opt/interset/reportGen/scripts/setupReportsEnvironment.sh
rm -rf /opt/interset/reporting/reportgen-*
source ~/.bashrc
Set up nginx:
sudo apt-get install nginx
Answer ‘y’ to the prompt asking if you wish to continue.
sudo mv /opt/interset/reporting/nginx.conf /etc/nginx/sites-available/default
sudo service nginx restart
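Optionally (not part of the original procedure), you can verify the nginx configuration before relying on it:
sudo nginx -t
# the output should report that the configuration syntax is ok and the test is successful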
cd /opt/interset/reporting
vi investigator.yml
Change the line:
url: jdbc:phoenix:$ANALYTICS:2181
So that $ANALYTICS is your Analytics server.
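For example, with a hypothetical Analytics server named analytics01 the line would read:
url: jdbc:phoenix:analytics01:2181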
Create the log folder:
mkdir logs
Create the Users Database:
This will set up the database and create two users:
cd /opt/interset/reporting
java -jar investigator-2.1.x.xxx.jar db migrate investigator.yml
Where “x.xxx” is the current build number.
The users are:
● User name: user, password: password
● User name: admin, password: password
Start the reporting app:
nohup java -jar investigator-2.1.x.xxx.jar server investigator.yml &
Where “x.xxx” is the current build number.
There is a log file for the Reporting server that you may wish to monitor:
tail -f /opt/interset/reporting/logs/reporting.log
The reporting web UI will be available at:
http://<REPORTING>/
Where <REPORTING> is the Reporting server.
You have now completed the setup of the Reporting server and the server is ready to
display the results of the Analytics.
The accounts user / password and admin / password can be used to log in.
Upgrading from 2.0
Analytics
1. Obtain the updated analytics bundle.
2. Unpack the bundle into a new analytics-2.1... directory under /opt/interset.
3. Create an analytics symlink to point to the new unpacked directory (see the sketch after this list).
4. Create ssh key:
   ssh-keygen (Hit [ENTER] at each prompt)
   ssh-copy-id interset@<REPORTS SERVER HOSTNAME>
5. Migrate changes from the old .../deploy/conf/ingest.conf to the new
   .../analytics/conf/interset.conf file.
6. Update the analytics database:
   cd /opt/interset/analytics/bin
   ./sql.sh --dbServer <ZOOKEEPER> --action baseline
   ./sql.sh --dbServer <ZOOKEEPER> --action migrate
7. Re-run analytics (daily.sh). (This will also copy new search indices to the reporting
   server.)
8. You can now remove the old deploy-2.0… directory and associated deploy symlink.
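A minimal sketch of steps 2-3, assuming a hypothetical build number of 2.1.x.xxx:
cd /opt/interset
unzip analytics-2.1.x.xxx-bin.zip         # unpack the new bundle
ln -sfn analytics-2.1.x.xxx analytics     # repoint the existing analytics symlink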
Reporting
1. Stop the reporting server.
2. Obtain the updated reporting bundle.
3. Unpack the bundle into a new investigator-2.1... directory under /opt/interset.
4. Update the reporting symlink to point to the new directory (see the sketch after this list).
5. Migrate changes from investigator.yml (replace '$ANALYTICS' with the analytics
   server name).
6. Copy the reporting database investigator-db.mv.db from the previous folder to the new
   folder.
7. Update the reporting database:
   java -jar /opt/interset/reporting/investigator-2.1.x.jar db migrate /opt/interset/reporting/investigator.yml
   Where "x" is the version number.
8. Start the server.
9. You can now remove the old investigator-2.0… directory.
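A minimal sketch of steps 3-4 and 6, assuming a hypothetical build number of 2.1.x.xxx and that the database file sits at the top of the old deployment directory (verify its actual location before copying):
cd /opt/interset
unzip investigator-2.1.x.xxx-deploy.zip                                 # unpack the new bundle
ln -sfn investigator-2.1.x.xxx reporting                                # repoint the existing reporting symlink
cp investigator-2.0.*/investigator-db.mv.db /opt/interset/reporting/    # carry the users database forward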
Usage
1. Put some log files into the Watch Folder of the Analytics server (see the example at the end of this section).
   ○ The Watch Folder location is set in: /opt/interset/analytics/conf/interset.conf
   ○ The default Watch Folder locations are:
     ingestFolder = /tmp/ingest
     ingestingFolder = /tmp/ingest/ingesting
     ingestedFolder = /tmp/ingest/ingested
     ingesterrorFolder = /tmp/ingest/ingesterror
   ○ NOTE: Folders can be modified on the fly prior to ingesting a subsequent dataset.
2. You can monitor the ingest of the dataset via the ingest log file:
   ○ tail -f /opt/interset/analytics/logs/ingest.0.log
   ○ Once all the log files are ingested and processed (this can be verified in the
     "ingesting" and "ingested" folders), use the web UI (i.e. Reporting API) to see the
     results of the analytics.
The web UI will be available at:
http://<REPORTING>/
Where <REPORTING> is the Reporting server.
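For example, assuming the default folder locations and a hypothetical log file named auth-2015-01.log:
cp auth-2015-01.log /tmp/ingest/                    # drop the file into the Watch Folder
tail -f /opt/interset/analytics/logs/ingest.0.log   # follow the ingest progress
ls /tmp/ingest/ingested/                            # the file lands here once it has been processed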
Restrictions – Customer may not:
• Transfer, resell or sublicense of any of Customer's rights under this Agreement, nor
  distribute, rent, lease, lend or otherwise make the Licensed Materials available to any third
  party;
• Use or permit the use of the Licensed Materials to provide any form of timesharing,
  outsourcing, rental or third party bureau service purposes, commercial or otherwise;
• Exploit the Licensed Materials other than as permitted under the Agreement;
• Access or Use the Licensed Materials (A) to develop any software or technology having the
  same primary function as the Licensed Software, (B) in any development or test procedure
  that seeks to develop like software or other technology, or determine if such software or
  other technology performs in a similar manner as the Licensed Materials, (C) or otherwise
  build a competitive software or other technology; or
• Retain any Licensed Materials or any copies thereof following the expiry or termination of
  this Agreement.