Optimization
Big Picture
Row64 is optimized to run at maximum speeds across a wide range of hardware.
Default Optimization
Threading Visitors
By default, Row64 Server uses hardware optimization and CPU multi-threading to enhance speeds.
Visitors viewing dashboards are placed in separate threads on the server. The messages to create and filter dashboards are also threaded. This feature is referred to as multi-tenancy in this document.
Threading Panes
In Row64 Dashboards, each pane is drawn and calculated by a different thread by default. This dramatically reduces loading time and increases calculation speed.
Threaded Compute
With the default server settings, compute algorithms are divided across CPU threads. Thread distribution is automated and drives performance to its maximum speed.
GPU Display
All aspects of the Row64 Platform are displayed using the GPU. This focus on speed creates a real-time and interactive user experience.
Threaded Streaming
By default, Row64 uses a thread per incoming message, which moves massive amounts of streaming data and displays it in real time.
Threaded Text
Traditionally, text and paragraph data is a major bottleneck in dashboard loading and evaluation. Row64 automatically threads and optimizes text data without the need for any extra configuration.
Custom Optimization
Custom Optimization: Big Picture
Increasing optimization should be considered when:
- You have many CPU cores available (more than 32)
- You want to use GPU Compute on the server (by default, GPU Compute is on Studio and Client)
- You have users with slow internet connections but high-speed servers
Decreasing optimization should be considered when:
- Your focus is on high-speed streaming to dashboards
- (In this case, it is better to turn CPU and GPU Compute off, which focuses threads on streaming.)
Do I Need a GPU?
Row64 Server:
A GPU is not required on the server, although having a GPU makes it easier to set up features like streaming with a UI.
If you don't want to use GPU Compute, you don't have to; it is turned off by default in the configuration.
Row64 Client:
Row64 renders the dashboard using the GPU.
Integrated GPUs, even ones as old as 2012, will work.
Compute features will be automatically enabled based on the available hardware.
Row64 Studio:
A GPU is required, but Studio works on integrated GPUs and machines as old as 2012.
Compute features will be automatically enabled based on the available hardware.
Compatible Platforms and Needed CPU Cores
The Row64 Platform is available on a variety of platforms and setups. Row64 will work on the x86 architecture, data center hardware, and virtual machines. AMD, Intel, and Arm chipsets also work well.
Row64 does not require a specific number of CPU cores to run; the entire Platform will dynamically adapt to your hardware setup.
Configuration
The following diagram shows the configuration settings for GPU and CPU Compute. By running a benchmark first and determining crossover points and min/max values, you can precisely optimize your performance.
Algorithms Driven by Compute
Dashboard loading time, evaluation, and filtering are accelerated when Compute features are enabled. Here is a list of some of the affected features:
Drivers and Setup
Vulkan GPU Compute
Row64 Server uses Vulkan GPU Compute for enhanced low-level speed and cross-hardware compatibility.
GPU drivers, however, currently have some limitations, since advanced GPU analytics is a newer field.
After detailed testing across various cards, our current analysis is that AMD Linux compute drivers are a weaker area. For this reason, in version 3.5, we recommend Nvidia hardware for server GPU features. This could change as drivers are updated, in which case we will conduct additional tests.
If you decide to run row64bench with the -GPU flag on AMD hardware, be aware that GPU crashes could require a reboot. Momentum is growing on Linux GPU features, so we are optimistic that these limitations will ease in the future.
Hardware Check
You will need to check if your system has a compatible NVIDIA GPU. If you don't already know your GPU model, you can use the following command to display the information:
lspci -v | grep -E "NVIDIA"
If you are under enterprise support and are interested in AMD or Intel GPU Compute, please let us know so we can track interest.
Start With Default Drivers
Ubuntu third-party drivers often install correctly during the Ubuntu install. Check the box below when installation arrives at this page:
This is the simplest and first option for getting the right drivers for GPU Compute installed.
If Default Drivers Don't Work
If the default Ubuntu drivers don't work, follow the instructions at the following link:
https://documentation.ubuntu.com/server/how-to/graphics/install-nvidia-drivers/index.html
NOTE: Ensure you install the latest "server-open" driver. The non-open driver will not work due to recent changes.
For example:
sudo ubuntu-drivers list --gpgpu
sudo ubuntu-drivers install nvidia-driver-570-server-open
Optimization with Mission Center
A simple way to view hardware resource utilization is with Mission Center. This can be helpful to see how multi-threading and GPU resources are used while you run tests.
You can install Mission Center by using the following command:
sudo snap install mission-center
BIOS Settings for GPU Compute
GPU Compute is optional, but to get drivers and diagnostic tools like Mission Center to work, you may need to change some settings in BIOS.
Disable Secure Boot
Disabling Secure Boot is commonly required for Nvidia drivers to work with Compute, especially the 5000 Series drivers. This area is continually changing, however, so run tests to check whether this is still needed on your system.
Disable Integrated Graphics
Do not modify your integrated graphics (iGPU) settings until your Nvidia drivers are working. Disabling the iGPU in BIOS makes it easier for diagnostic applications like Mission Center to report on the dedicated GPU, instead of the iGPU.
Install Bench
Install Row64 Bench
Row64 Bench is a benchmarking command line tool. It is designed to test multi-threading and GPU analytics to find the best server settings.
Use the following link to download Row64 Bench:
row64bench-3.5_amd64.deb
After downloading the file, run the following command in the terminal:
sudo dpkg -i row64bench-3.5_amd64.deb
Basic Bench Test
Initiate a basic CPU test with the following command:
./row64bench -MIN
To run a test for both the CPU and GPU, use the following command:
./row64bench -GPU -MIN
NOTE: The GPU test will report which GPU is being used, and will also provide your hardware setup details. This test is especially helpful if you have multiple GPUs, as it allows you to track which one is being utilized.
Standard Bench Tests
First, go to the bench directory:
cd /opt/row64server/row64bench/
To test that you can successfully access the CPU and GPU, run:
./row64bench -INFO
To run a minimal test with 1,000 records to check if the GPU is working:
./row64bench -GPU -MIN
To run the default CPU test with 100 million records:
./row64bench
To run a CPU and GPU test (currently not recommended on AMD GPUs):
./row64bench -GPU
To run a CPU and GPU test with 500 million records:
./row64bench -GPU -500M
To run CPU and GPU testing with 1 billion records:
./row64bench -GPU -1B
Other Bench Flags
Additional Bench flags include:
-100K = 100 thousand record test
-1M = 1 million record test
-10M = 10 million record test
Single record-set tests, which are useful for testing GPU limits:
-S500M = Single test of 500 million records
-S750M = Single test of 750 million records
-S1B = Single test of 1 billion records
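When hunting for crossover points, it can help to plan a series of runs from small to large. The sketch below only builds and prints command lines from the flags listed in this section; it does not run Bench. The helper name sweep_commands and the SIZE_FLAGS table are hypothetical, the record counts are the nominal sizes stated above, and combining -GPU with every size flag is an assumption (only -MIN, -500M, and -1B are shown with -GPU in this document).

```python
import shlex

# Nominal record counts for the size flags described in this section.
# None stands for "no size flag", which runs the default 100 million test.
SIZE_FLAGS = {
    "-MIN": 1_000,
    "-100K": 100_000,
    "-1M": 1_000_000,
    "-10M": 10_000_000,
    None: 100_000_000,
    "-500M": 500_000_000,
    "-1B": 1_000_000_000,
}


def sweep_commands(max_records, gpu=False):
    """Build row64bench command lines, smallest test first, up to max_records."""
    commands = []
    for flag, count in sorted(SIZE_FLAGS.items(), key=lambda item: item[1]):
        if count > max_records:
            continue
        args = ["./row64bench"]
        if gpu:
            args.append("-GPU")  # assumption: -GPU combines with any size flag
        if flag is not None:
            args.append(flag)
        commands.append(shlex.join(args))
    return commands


# Print the plan; run the commands by hand from the bench directory.
for command in sweep_commands(max_records=10_000_000, gpu=True):
    print(command)
```

Starting small keeps the first runs short, which is useful for confirming the GPU works before committing to a long test.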
GPU QA and Correctness Test
GPU driver issues aren't uncommon, especially with cutting-edge products. Because of this, it's important to run QA testing on your card to confirm that the results are 100% correct.
To do this, just pass the -QA flag. For a quick test, run the following command:
./row64bench -MIN -GPU -QA
For a larger test, use the following command:
./row64bench -GPU -QA
NOTE: You need to include the -GPU flag for the -QA flag to run. If the GPU is not part of the test, then there's no CPU/GPU comparison.
Terminal Feedback
When Bench runs, it will log the current test in the terminal. Bench compares the timing for the CPU evaluation with 1, 2, 4, and 8 threads.
When the -GPU flag is used, you will see the GPU timing in the output. Data transfers to and from the GPU over the PCIe bus consume the majority of the time in GPU analytics, so Bench also splits GPU time into On_Card evaluation time and Transfer time.
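As a rough illustration of how to read these numbers, the sketch below computes the speedup and parallel efficiency for each thread count, plus the share of GPU time spent on transfers. All timings are made-up placeholder values, not Row64 Bench output, and the function names are hypothetical.

```python
def speedups(timings_ms):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread time."""
    base = timings_ms[1]
    return {
        threads: (base / t, base / t / threads)
        for threads, t in sorted(timings_ms.items())
    }


def transfer_share(on_card_ms, transfer_ms):
    """Fraction of total GPU time spent moving data over the PCIe bus."""
    return transfer_ms / (on_card_ms + transfer_ms)


# Hypothetical CPU timings in milliseconds for 1, 2, 4, and 8 threads.
cpu_ms = {1: 800.0, 2: 420.0, 4: 230.0, 8: 140.0}
for threads, (speedup, efficiency) in speedups(cpu_ms).items():
    print(f"{threads} thread(s): {speedup:.2f}x speedup, {efficiency:.0%} efficiency")

# Hypothetical GPU split: 30 ms on the card, 90 ms in transfers.
share = transfer_share(on_card_ms=30.0, transfer_ms=90.0)
print(f"transfer share of GPU time: {share:.0%}")
```

A high transfer share suggests the GPU only pays off once the record count is large enough to amortize the PCIe cost.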
Analyze the Bench Logs
When Bench completes, it outputs logs into the newest folder in the following directory:
/var/log/row64bench/
There are four sets of logs:
bench_BIGFLOAT.csv
bench_BIGINT.csv
bench_FLOAT.csv
bench_INT.csv
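If you prefer to locate the logs from a script, a minimal sketch follows. It assumes "newest" can be judged by folder modification time, since the folder naming scheme is not described here. The helper names are hypothetical, and reading /var/log may require elevated permissions on some systems.

```python
from pathlib import Path

# The four log names listed above.
LOG_NAMES = ["bench_BIGFLOAT.csv", "bench_BIGINT.csv", "bench_FLOAT.csv", "bench_INT.csv"]


def newest_run_dir(log_root):
    """Return the most recently modified subfolder of log_root, or None."""
    root = Path(log_root)
    if not root.is_dir():
        return None
    run_dirs = [p for p in root.iterdir() if p.is_dir()]
    # Assumption: the newest run is the folder with the latest modification time.
    return max(run_dirs, key=lambda p: p.stat().st_mtime, default=None)


def find_logs(run_dir):
    """Map each expected log name to its path, skipping any that are missing."""
    return {name: run_dir / name for name in LOG_NAMES if (run_dir / name).is_file()}


run_dir = newest_run_dir("/var/log/row64bench/")
if run_dir is None:
    print("No bench runs found")
else:
    for name, path in find_logs(run_dir).items():
        print(name, "->", path)
```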
You can drag the bench logs into Studio (which also runs on Linux) and make a scatter plot with them. Then, figure out the crossover points, and feed that information back into the configuration.
Example: 1 Million Record Analysis
This example used the following hardware:
- CPU: AMD 9950X
- GPU: Nvidia 5090
Analysis: CPU 2 Thread overtakes CPU 1 Thread at 600,000 records. If we have CPU_COMPUTE_THREADS set to 2 (the default), then the best value for CPU_COMPUTE_MIN_RECORDS is 600,000.
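A crossover like the one above can also be estimated numerically instead of read off a plot. The sketch below takes two timing curves over the same record counts and interpolates linearly to find where the second curve becomes faster. The column layout of the bench CSV files is not described here, so the example works on plain lists; the function name is hypothetical, and the timings are placeholder values chosen to land on the 600,000 figure, not measured results.

```python
def crossover(records, baseline_ms, candidate_ms):
    """Estimate the record count where candidate_ms becomes faster than baseline_ms.

    records must be in ascending order, and both timing lists must align with it.
    Returns None if the candidate never overtakes the baseline.
    """
    previous = None
    for count, base, cand in zip(records, baseline_ms, candidate_ms):
        diff = cand - base  # zero or negative once the candidate is at least as fast
        if diff <= 0:
            if previous is None:
                return count  # already faster at the smallest size tested
            prev_count, prev_diff = previous
            # Linear interpolation to the point where the difference reaches zero.
            return prev_count + (count - prev_count) * prev_diff / (prev_diff - diff)
        previous = (count, diff)
    return None


# Placeholder timings in milliseconds (not Row64 Bench output).
records = [200_000, 400_000, 800_000, 1_000_000]
one_thread_ms = [2.0, 4.0, 8.0, 10.0]
two_thread_ms = [3.0, 4.5, 7.5, 9.0]

point = crossover(records, one_thread_ms, two_thread_ms)
print(f"2 threads overtake 1 thread near {point:,.0f} records")
```

The same function can compare an 8-thread CPU curve against a GPU curve to suggest a value for GPU_COMPUTE_MIN_RECORDS.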
Example: 100 Million Record Analysis
This example used the following hardware:
- CPU: AMD 9950X
- GPU: Nvidia 5090
Analysis: GPU Compute overtakes the 8-core CPU at around 20 million records. If CPU_COMPUTE_THREADS is set to 8, GPU_COMPUTE_MIN_RECORDS should be set to 20,000,000.
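One way to picture how the two thresholds from these examples divide the range of record counts is sketched below. The selection logic is an assumption made for illustration, not Row64's documented behavior. Only the setting names CPU_COMPUTE_MIN_RECORDS and GPU_COMPUTE_MIN_RECORDS come from this document; the function name and return labels are hypothetical.

```python
def pick_backend(record_count, cpu_min_records, gpu_min_records, gpu_enabled=True):
    """Illustrative only: choose a compute path from two record-count thresholds.

    cpu_min_records plays the role of CPU_COMPUTE_MIN_RECORDS and
    gpu_min_records the role of GPU_COMPUTE_MIN_RECORDS. How the server
    actually applies these settings is assumed here, not confirmed.
    """
    if gpu_enabled and record_count >= gpu_min_records:
        return "gpu"
    if record_count >= cpu_min_records:
        return "cpu-multithread"
    return "cpu-single"


# Thresholds taken from the 1 million and 100 million record examples.
for count in (100_000, 5_000_000, 50_000_000):
    backend = pick_backend(count, cpu_min_records=600_000, gpu_min_records=20_000_000)
    print(f"{count:>11,} records -> {backend}")
```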
Example: 1 Billion Record Analysis
This example used the following hardware:
- CPU: AMD 9950X
- GPU: Nvidia 5090
Analysis: The GPU is the fastest, and each step in threading is faster. Row64 CPU 1 Thread is highly optimized to be faster than conventional CPU Compute.
Threaded Compression
Threaded compression is most useful when network speeds are slow.
Threaded compression will:
- Reduce the time spent sending data across the network (messages are smaller)
- ON THE SERVER: Slightly increase the time to process while compressing
- ON THE CLIENT: Add parsing time on message received. By default, Row64 doesn't need to parse because it sends a RAM layout.
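The trade-off in the list above can be put into simple arithmetic: compression wins when the network time it saves exceeds the added compression and parsing time. The sketch below is a back-of-the-envelope model, not a description of Row64 internals. The function names, message size, compression ratio, and timing costs are all hypothetical placeholders.

```python
def transfer_seconds(size_mb, bandwidth_mbps):
    """Time to send size_mb megabytes over a link of bandwidth_mbps megabits/s."""
    return size_mb * 8 / bandwidth_mbps


def total_seconds(size_mb, bandwidth_mbps, ratio=1.0, compress_s=0.0, parse_s=0.0):
    """End-to-end time: server compression + network transfer + client parsing."""
    return compress_s + transfer_seconds(size_mb / ratio, bandwidth_mbps) + parse_s


def break_even_mbps(size_mb, ratio, compress_s, parse_s):
    """Bandwidth below which compression is a net win in this simple model."""
    saved_megabits = size_mb * 8 * (1 - 1 / ratio)
    return saved_megabits / (compress_s + parse_s)


# Hypothetical message: 100 MB, 4:1 compression, 0.3 s to compress, 0.2 s to parse.
size_mb, ratio, compress_s, parse_s = 100, 4, 0.3, 0.2
for bandwidth in (100, 10_000):
    plain = total_seconds(size_mb, bandwidth)
    packed = total_seconds(size_mb, bandwidth, ratio, compress_s, parse_s)
    print(f"{bandwidth:>6} Mbps: uncompressed {plain:.2f} s, compressed {packed:.2f} s")

limit = break_even_mbps(size_mb, ratio, compress_s, parse_s)
print(f"break-even bandwidth: {limit:.0f} Mbps")
```

In this model, links slower than the break-even bandwidth benefit from compression, while faster links are better served by the default uncompressed RAM layout.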